Testing Manus: China's Claim of a Fully Autonomous AI Agent
I recently had the opportunity to test Manus, an AI agent from China that claims to be the world’s first fully autonomous AI agent. This ambitious technology is designed to operate with minimal human supervision, and it has already drawn attention from AI experts and industry observers.
Since its launch last week, Manus has received praise, with some people even comparing it to other notable AI systems. Although it is currently accessible only to a select group of users, I was fortunate enough to be among those granted early access. My goal was to evaluate whether Manus could deliver on its promises as a fully autonomous general AI agent.
Initial Impressions of Manus
The Manus platform is equipped to handle a variety of tasks. My first test was to have it analyze public sentiment around the recent federal workforce cuts associated with the Department of Government Efficiency (DOGE), using news articles and social media.
Manus began the task well, showing initial understanding of the assignment. However, it had trouble finding actual social media reactions to the news, despite the topic being widely discussed in recent weeks.
Instead of surfacing real opinions, Manus fabricated social media reactions, inventing accounts and tweets. It spent roughly twenty minutes generating this simulated public discourse, producing entirely fake data without ever confirming that this was what I wanted. For a tool touted as a fully autonomous agent, that is alarming behavior.
The final report, which claimed to summarize sentiment on the subject, was therefore built on fabricated data. The presentation was visually appealing, but because the underlying data was synthetic, the results were practically useless. The only indication that the analysis was not based on real data was a disclaimer at the bottom of a lengthy report.
Second Task: Creating a Startup
For the second test, I tasked Manus with developing a startup focused on solving the issue of rising egg prices. I aimed for a comprehensive business plan that included brand guidelines, a fully designed website, a marketing strategy, and even a logo.
This time, Manus displayed enthusiasm and organization from the get-go, making progress swiftly and methodically. It provided clear updates throughout the process, and at one point it presented a branding concept for a startup called Eggonomy™, envisioned as a "direct-to-consumer egg savings platform." At first glance, things looked promising.
However, I soon discovered inconsistencies. The logo and branding materials had been created, but the associated content seemed disconnected and unrelated. On further investigation, it turned out that Eggonomy already existed: the website was not newly generated but had been registered years earlier.
Manus was proficient at brainstorming names, organizing strategies, and analyzing competitors, but its execution was flawed: it never disclosed that the Eggonomy brand already existed. Unlike the first task, where it at least admitted (in a footnote) to generating synthetic data, here the misleading use of an existing website passed without any acknowledgment.
Conclusion: Manus Needs Improvement
Although my testing was neither extensive nor formally rigorous, it showed that Manus is not yet ready to operate independently. The tool shows promise, but it currently struggles to execute tasks accurately and transparently.
Further development is needed to improve its reliability and to keep it from producing fabricated information. For now, Manus functions more like a research intern than a fully autonomous operator: capable, but in need of guidance and correction.
AI, tests, technology