MiniMax, A Rising AI Player in China, Launches Competitive Models
Chinese companies are increasingly developing artificial intelligence (AI) models that can compete with those created by well-established U.S. companies like OpenAI. This week, a startup named MiniMax, which is supported by major players such as Tencent and Alibaba, introduced three new AI models: MiniMax-Text-01, MiniMax-VL-01, and T2A-01-HD.
MiniMax-Text-01 is designed for text processing, while MiniMax-VL-01 has the capability to understand both images and text. The third model, T2A-01-HD, is specialized in audio generation, particularly for producing speech.
According to MiniMax, MiniMax-Text-01 has 456 billion parameters and performs better than Google’s Gemini 2.0 Flash on various benchmarks like MATH and SimpleQA, which evaluate a model's ability to solve math problems and answer factual questions. In general, having more parameters can help models perform better in various tasks.
For MiniMax-VL-01, the company claims it can compete with Anthropic’s Claude 3.5 Sonnet on tasks that require combined text and image comprehension, such as ChartQA, where the model answers questions based on charts and graphs. However, despite its strong performance, MiniMax-VL-01 falls short of Gemini 2.0 Flash on several evaluations, and OpenAI’s GPT-4o and Meta’s Llama 3.1 also outperform it in certain areas.
One notable feature of MiniMax-Text-01 is its large context window. A model's context window is the amount of input it can consider before generating an output. MiniMax-Text-01 can handle 4 million tokens, enough to analyze approximately 3 million words at once, or more than five copies of "War and Peace." That is about 31 times the size of the context windows of GPT-4o and Llama 3.1.
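The context-window comparison can be checked with some quick arithmetic. The sketch below is illustrative only: the 4 million token figure comes from the article, while the words-per-token ratio (0.75), the length of "War and Peace" (about 587,000 words), and the 128,000-token window for GPT-4o and Llama 3.1 are assumptions brought in for the calculation.

```python
# Figure quoted in the article.
MINIMAX_CONTEXT_TOKENS = 4_000_000

# Assumptions for illustration; none of these values appear in the article.
WORDS_PER_TOKEN = 0.75             # rough rule of thumb for English text
WAR_AND_PEACE_WORDS = 587_000      # approximate length of an English translation
BASELINE_CONTEXT_TOKENS = 128_000  # assumed window for GPT-4o and Llama 3.1


def approx_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> float:
    """Convert a token count into an approximate word count."""
    return tokens * words_per_token


def context_summary() -> dict:
    """Reproduce the article's three comparisons from the constants above."""
    words = approx_words(MINIMAX_CONTEXT_TOKENS)
    return {
        "approx_words": words,
        "war_and_peace_copies": words / WAR_AND_PEACE_WORDS,
        "ratio_to_baseline": MINIMAX_CONTEXT_TOKENS / BASELINE_CONTEXT_TOKENS,
    }


if __name__ == "__main__":
    for name, value in context_summary().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions the numbers line up with the article: roughly 3 million words, a little over five copies of the novel, and a window about 31 times the size of the baseline. Actual token-to-word ratios vary by tokenizer and language, so the word count is only a ballpark.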
T2A-01-HD is an audio generator tailored for speech. It offers a synthetic voice with customizable cadence, tone, and tenor, and supports around 17 languages, including English and Chinese. It can also clone a voice from just 10 seconds of sample audio.
Although MiniMax did not release benchmark results comparing T2A-01-HD with other audio models, its outputs seem to match the quality of audio models developed by Meta and various startups like PlayAI.
MiniMax's text and vision models, but not the audio model, are available through platforms like GitHub and Hugging Face. However, MiniMax-Text-01 and MiniMax-VL-01 are not completely open-source, because the company has not shared all the components needed to recreate them, such as their training data. They are also released under a restrictive license that prevents developers from using the models to improve competing AI systems. Platforms with over 100 million monthly active users must also seek special permission from MiniMax to use these models.
Founded in 2021 by former employees of SenseTime, one of China’s largest AI companies, MiniMax has been working on various projects, including Talkie, an AI-driven role-playing platform, and text-to-video models that integrate with Hailuo, one of its services.
Some of MiniMax’s applications have sparked controversy. For instance, Talkie was removed from the Apple App Store last December for unclear technical reasons. This app features AI avatars of well-known public figures like Donald Trump and Taylor Swift, who likely did not provide consent for their likenesses to be used.
In December, Broadcast magazine highlighted that MiniMax's video generation technology was able to replicate the logos of British television channels, raising concerns about whether the models were trained using copyrighted content from those channels. Additionally, reports indicate that MiniMax is facing a lawsuit from iQiyi, a Chinese video streaming service, which accuses the company of unlawfully training its models on iQiyi's copyrighted recordings.
MiniMax's latest model releases come at a time when the Biden administration has suggested stricter regulations on the export of AI technologies to Chinese companies. Previously, Chinese firms faced limitations in purchasing advanced AI chips, and the newly proposed rules may further tighten these restrictions on necessary technology for developing sophisticated AI systems.
The Biden administration also recently announced additional measures to prevent advanced chips from reaching China. Companies that want to export certain chips will need to meet broader licensing requirements and perform due diligence to keep their products from reaching Chinese clients.