DeepSeek Launches Janus Pro: A New Multimodal AI Model Family
DeepSeek, a prominent AI company, has unveiled a new collection of multimodal AI models that it asserts can surpass the capabilities of OpenAI's DALL-E 3.
The new models, named Janus Pro, can be downloaded from the AI development platform Hugging Face. They range in size from 1 billion to 7 billion parameters. Parameter count is a rough proxy for a model's problem-solving ability, and models with more parameters generally perform better than those with fewer.
The Janus Pro models are licensed under the MIT license, allowing for unrestricted commercial use.
Janus Pro, which DeepSeek describes as a “novel autoregressive framework,” can analyze existing images as well as generate new ones. DeepSeek claims that its largest model, Janus Pro 7B, outperforms DALL-E 3, as well as other models such as PixArt-alpha and Stability AI's Stable Diffusion XL, on two AI evaluation benchmarks: GenEval and DPG-Bench.
Some of the competing models are older, and Janus Pro is limited to generating and analyzing small images with a resolution of up to 384 x 384 pixels. Even so, the performance of the Janus Pro family is notable given the models' compact sizes.
DeepSeek states that “Janus Pro surpasses previous unified model and matches or exceeds the performance of task-specific models.” The company also points to the model's high flexibility and effectiveness, calling it a strong contender for next-generation unified multimodal models.
DeepSeek, a Chinese AI laboratory primarily funded by the quantitative trading firm High-Flyer Capital Management, gained significant attention when its chatbot app topped the Apple App Store charts. The efficiency of its language models has led analysts and technologists alike to question whether the U.S. can maintain its dominance in the AI sector, and whether demand for AI chips will hold up.