
Google's New AI Model Gemma 3: A Boon for Creative Writers but Limited Elsewhere

Published March 13, 2025

On Tuesday, Google introduced Gemma 3, an open-source AI model based on Gemini 2.0 that delivers strong performance for its size. It runs entirely on a single GPU, yet it outperforms many larger AI models that require significant computing resources.

The new model family includes variants ranging from 1 billion to 27 billion parameters, positioning them as practical choices for developers aiming to deploy AI directly on devices like smartphones and workstations. According to Clement Farabet, VP of Research at Google DeepMind, and Tris Warkentin, Director at Google DeepMind, these are the most advanced, portable, and responsibly developed open models available.
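The single-GPU claim is easier to judge with a back-of-the-envelope estimate of how much memory the weights alone occupy. The sketch below is illustrative only: the bytes-per-parameter values are the usual storage sizes for common numeric formats rather than figures published by Google, and the estimate leaves out activations, the context cache, and runtime overhead. The function name is hypothetical.

```python
# Rough estimate of the memory needed to hold model weights only.
# Assumption: decimal gigabytes (10^9 bytes); activations and cache are excluded.

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Return approximate weight storage in GB for a given numeric format."""
    # One billion parameters at one byte each is one (decimal) gigabyte.
    return params_billions * BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    # The smallest and largest sizes mentioned in the article.
    for size in (1, 27):
        for dtype in BYTES_PER_PARAM:
            print(f"{size}B @ {dtype}: {weight_memory_gb(size, dtype):.1f} GB")
```

Under these assumptions, the 27-billion-parameter variant needs roughly 54 GB for its weights at 16-bit precision and about 13.5 GB at 4-bit, which suggests why lower-precision formats matter when fitting the largest model on one GPU.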

Despite its smaller size, Gemma 3 has outperformed much larger models in recent benchmarks, including Meta’s Llama-405B and OpenAI’s o3-mini. The 27-billion-parameter instruction-tuned version scored 1339 Elo on the LMSys Chatbot Arena, placing it among the top ten models overall.
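To put the 1339 score in context, an Elo-style rating gap can be translated into an expected head-to-head preference rate. The sketch below uses the standard Elo expected-score formula. The rival rating of 1300 is a hypothetical value chosen for illustration, not a leaderboard figure, and the arena's exact scoring method may differ in detail.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (scale factor 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    gemma_27b = 1339           # score reported in the article
    hypothetical_rival = 1300  # assumed value, for illustration only
    p = elo_expected_score(gemma_27b, hypothetical_rival)
    print(f"Expected preference rate: {p:.3f}")
```

Two models with equal ratings come out at 0.5, so a gap of a few dozen points corresponds to only a modest edge in pairwise comparisons.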

Gemma 3 is also multimodal, handling text, images, and short videos in its larger variants. Its context window extends to 128,000 tokens, a sixteenfold increase over Gemma 2's 8,000 tokens, which lets it take in far longer inputs in a single pass.

With support for over 140 languages and out-of-the-box compatibility for 35 languages, Gemma 3 is a fitting choice for developers creating applications for diverse international audiences. Since its launch last year, Google's Gemma family has accumulated over 100 million downloads and generated more than 60,000 derivatives, resulting in a vibrant ecosystem dubbed the "Gemmaverse."

Developers can deploy applications powered by Gemma 3 through several channels, including Vertex AI and the Google GenAI API, or run the model locally, depending on their infrastructure needs.

Evaluating Gemma 3's Performance

We conducted several tests to determine Gemma 3's effectiveness across various tasks. Here’s a breakdown of its performance in different areas.

Creative Writing

Gemma 3 excelled in creative writing. Despite having just 27 billion parameters, it substantially outperformed Claude 3.7 Sonnet, a model noted for its creative writing abilities. It also produced the longest narrative of all the models we tested, with the sole exception of Longwriter, which is built specifically for long narratives.

The quality of its output was equally impressive. The content was engaging and original, and it steered clear of the trite openings that many AI models tend to use. Gemma 3 created rich, immersive worlds with coherent narratives, keeping character names, settings, and descriptions consistent with the context of the story.

This consistency is particularly beneficial for creative writers, as competing models often stumble over cultural references or overlook essential details, disrupting immersion. The extended format also allowed the story to progress naturally, with smooth transitions between narrative elements. The model effectively captured emotions, actions, thoughts, and dialogue, making for a realistic reading experience.

When prompted for a twist ending, Gemma 3 managed to deliver without compromising the internal logic of the narrative, a challenge that other models often mishandled. For writers seeking a reliable AI assistant for fiction projects, Gemma 3 clearly stands out.

Document Summarization and Information Retrieval

However, when it came to analyzing documents, Gemma 3 fell short. During our test with a 47-page IMF document, the model initially accepted the file but failed to complete its analysis, stalling midway through. Repeated attempts produced the same outcome, indicating challenges with long-form content.

We also tried copying and pasting the document's text into the interface, but it remained unresponsive. This limitation might be tied to the implementation within Google’s AI Studio rather than an inherent flaw in Gemma 3. It's conceivable that running the model locally could yield better performance, but users relying on Google's platform may face restrictions.
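A quick estimate gives a sense of how a document of this length compares with the advertised context window. The sketch below is a rough calculation, not a measurement: the words-per-page and tokens-per-word figures are assumed rules of thumb, the function name is hypothetical, and real token counts depend on the tokenizer and the document's layout.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # upper limit cited for Gemma 3


def estimate_tokens(pages: int, words_per_page: int = 500,
                    tokens_per_word: float = 1.3) -> int:
    """Rough token count; both defaults are assumptions, not measured values."""
    return round(pages * words_per_page * tokens_per_word)


if __name__ == "__main__":
    tokens = estimate_tokens(47)  # the 47-page document from the test
    share = tokens / CONTEXT_WINDOW_TOKENS
    print(f"~{tokens} tokens, about {share:.0%} of the window")
```

Under these assumptions, a 47-page document comes to roughly 30,000 tokens, well below the 128,000-token limit, which is consistent with the stall originating in the platform rather than in the model's context capacity.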

Sensitive Topics

Google AI Studio applies strict content filters, which are adjustable through a series of sliders. We tested the boundaries by requesting advice for ethically dubious scenarios, and the model firmly refused to engage in any form of suggestive content creation.

Our attempts to modify or bypass these filters were largely unsuccessful. Google’s safety settings theoretically regulate how much the model can engage in discussions of harassment, hate speech, and explicit content, but users often find the restrictions overly stringent.

Even when trying to explore sensitive topics for legitimate creative purposes, the model consistently declined to participate. Those wanting to work on such content may need to consider alternative methods or carefully word their prompts to navigate the censorship.

Multimodal Performance

As a multimodal model, Gemma 3 is capable of processing images without relying on a separate model. However, during our testing, we faced some limitations on the Google AI Studio platform, which did not support direct image processing.

Testing through Hugging Face's interface showed that the model could identify the essential components of an image and provide relevant analysis, though not always accurately. In one instance, it misread a financial chart and speculated about Bitcoin's price in 2024, which did not correspond to the image's actual content.

Although Gemma 3's multimodal features work adequately, smaller model variants may not deliver the level of precision seen in specialized larger vision models.

Reasoning Capabilities

Gemma 3 struggles with the complex logical deduction required for non-mathematical reasoning tasks. In tests involving mystery problems from the BIG-bench dataset, the model failed to identify crucial clues or draw logical conclusions.

Attempting to guide the AI through step-by-step reasoning triggered its content filters, resulting in a refusal to respond.

Is Gemma 3 Right for You?

Your experience with Gemma 3 may vary based on your specific needs. Creative writers will find it to be an excellent option, as it can produce detailed and engaging narratives that surpass some larger models like Claude 3.7 and GPT-4.5.

If your focus is on fiction, blogging, or safe-for-work content creation, Gemma 3 offers remarkable quality at no cost and operates efficiently on standard hardware.

Developers creating multilingual applications can benefit from its support for over 140 languages, allowing for streamlined development without requiring separate language models.

Small businesses with limited resources will appreciate the ability to run advanced AI functionality on a single GPU, easing the cost constraints typically associated with adopting AI technologies.

The open-source nature of Gemma 3 provides distinct advantages over closed models, allowing for customization, domain-specific fine-tuning, and deep integration into existing systems without API restrictions.

For applications demanding strict privacy measures, the model can operate entirely offline.

That said, users needing to analyze lengthy texts or engage with sensitive topics may encounter significant roadblocks. Tasks that require nuanced reasoning or exploration of controversial subjects are generally better suited for larger, closed-source models with more flexibility.

Gemma 3 also falls short on tasks requiring mathematical reasoning, coding assistance, or complex operations that many would typically expect of AI. It is best suited for creative text generation.

In conclusion, while Gemma 3 won't dethrone the most advanced proprietary or open-source reasoning models in every area, its blend of performance, efficiency, and modifiability makes it an intriguing option for AI enthusiasts and open-source advocates looking for control and local operation.
