Apple Introduces MGIE, the Natural Language AI for Advanced Image Editing
Apple has stepped up its game in the artificial intelligence (AI) landscape by collaborating with the University of California, Santa Barbara. The result is an AI model called Multimodal Large Language Model-Guided Image Editing, better known as MGIE. The tool edits images through natural-language instructions, much as people converse with AI systems like ChatGPT.
What Is MGIE?
MGIE stands out for its ability to interpret a user's text input and turn it into precise image editing commands. It then uses a diffusion model to carry out the edit while taking the original image's details into account. The approach builds on Multimodal Large Language Models (MLLMs), which can understand a blend of text and images. Unlike AI systems that handle only text or only images, MLLMs can read an instruction alongside the picture it refers to, which lets them adapt to diverse contexts.
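To make the two-step idea concrete, here is a toy sketch in Python. It is not MGIE's code and does not use its API: the Image, EditCommand, interpret, apply_edit and edit names are all hypothetical, a simple keyword lookup stands in for the multimodal language model, and a dictionary update stands in for the diffusion model. It only illustrates the flow described above, in which free text is first turned into an explicit command using the image as context, and the command is then applied to the image.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Image:
    """Toy stand-in for a photo: object name -> short description."""
    objects: Dict[str, str] = field(default_factory=dict)


@dataclass
class EditCommand:
    """An explicit edit derived from a free-text instruction."""
    action: str                  # "remove" or "set"
    target: str                  # which object to edit
    value: Optional[str] = None  # new description for "set"


def interpret(instruction: str, image: Image) -> EditCommand:
    """Hypothetical stand-in for the MLLM step (keyword matching only)."""
    text = instruction.lower()
    # Use the image as context: only objects actually present can be removed.
    for name in image.objects:
        if text.startswith("remove") and name in text:
            return EditCommand("remove", name)
    if "redhead" in text and "person" in image.objects:
        return EditCommand("set", "person", "red hair")
    raise ValueError(f"cannot interpret instruction: {instruction!r}")


def apply_edit(image: Image, command: EditCommand) -> Image:
    """Hypothetical stand-in for the diffusion step; returns a new Image."""
    objects = dict(image.objects)  # leave the original untouched
    if command.action == "remove":
        objects.pop(command.target, None)
    elif command.action == "set":
        objects[command.target] = command.value
    return Image(objects)


def edit(image: Image, instruction: str) -> Image:
    """Full toy pipeline: interpret the text, then apply the edit."""
    return apply_edit(image, interpret(instruction, image))


photo = Image({"traffic cone": "orange", "person": "brown hair"})
print(edit(photo, "remove the traffic cone from the foreground").objects)
print(edit(photo, "make this person a redhead").objects)
```

In a real system both stand-ins would be learned models rather than hand-written rules, but the division of labor is the same: one component decides what the instruction means for this particular image, and the other produces the edited result.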
MGIE's Editing Power
MGIE changes how we issue commands to image editing software. For instance, if a user provides an instruction like 'remove the traffic cone from the foreground', MGIE works out which part of the image is meant and performs the edit. This approach is not only versatile but is also reported to yield more accurate results than existing instruction-based methods, such as InstructPix2Pix, which builds on Stable Diffusion. In practice, a user could change the hair color of a person in a photograph by simply instructing 'make this person a redhead', and MGIE would handle the rest, applying the change realistically.
The Advantages of Open Source
Apple's decision to make MGIE open source was a strategic one. The tech titan built MGIE on pre-existing open-source models, and to abide by those models' licensing terms it has shared its work on GitHub. The decision not only fulfills a legal obligation but also invites developers worldwide to participate, which could accelerate the tool's improvement and diversification. The open-source release positions Apple as an influential player in the AI community and might influence future industry standards for AI-powered image editing.
MGIE's Applications
MGIE's technology could enhance Apple's product line significantly. Imagine giving Siri a voice command to edit a photo; MGIE could interpret the command and apply the edit on various Apple devices. For now, AI developers can access MGIE's codebase on GitHub and contribute to the evolving project.