Apple is working on technology that lets you edit real, camera-captured photos with natural-language text prompts and possibly even voice instructions.
In the last couple of years, AI has saturated just about every facet of the image creation and photo editing sphere, and not always in ways that deliver great results.
However, one area where it’s been oddly lacking is in using natural language text prompts to edit real photos, instead of just creating visuals from digital whole cloth.
This is what Apple is now working on in cooperation with researchers from the University of California, Santa Barbara. What’s more, the software is already available online as a free, open-source beta.
If there’s one company that should be able to nail its own version of AI-powered photo editing, it’s Apple, though its current effort is still a work in progress.
The new AI tech from Apple and the UC Santa Barbara researchers is called “MGIE”, which is an acronym for MLLM-Guided Image Editing. MLLM, by the way, stands for Multimodal Large Language Model.
The AI system lets users describe in natural language the edits they want made to a photo, and it then returns the finished result.
In a paper released in February by UC Santa Barbara and Apple researchers, the creators of the AI platform argue that image editing through AI and natural language could be more efficient.
They add the caveat that current large language models need to have their natural language abilities beefed up further for this to work.
In their abstract, the researchers state,
“Instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks. However, human instructions are sometimes too brief for current methods to capture and follow. Multimodal large language models (MLLMs) show promising capabilities in cross-modal understanding and visual-aware response generation via LMs.”
According to them, MGIE helps solve the above problem by enabling “Expressive instruction-based editing”.
Specifically, “MGIE can produce concise and clear instructions that guide the editing process effectively. This not only improves the quality of the edits but also enhances the overall user experience.”
In reference to the kinds of Photoshop-style edits that so many of us are accustomed to today, the paper further adds,
“MGIE can perform common Photoshop-style edits, such as cropping, resizing, rotating, flipping, and adding filters. The model can also apply more advanced edits, such as changing the background, adding or removing objects, and blending images.”
The writers of the paper also explain that MGIE can optimize the overall quality of photos through corrections to brightness, contrast, sharpness, and color balance and by adding artistic effects such as sketch or paint styles.
A user can request all of these through natural language instructions to the AI.
MGIE can also perform local edits, modifying specific parts of an image such as eyes or faces while leaving other elements untouched.
The paper then explains that “the model can also modify the attributes of these regions or objects, such as shape, size, color, texture, and style.”
You as a user could, for example, ask MGIE to “increase the contrast of the sky in this image” and have the AI do this while leaving the existing contrast of the landscape untouched.
The online beta of MGIE is available here, where you can upload your images and give the software a try.
Having taken the software for a spin, I’d say that the results aren’t exactly impressive yet, but then again, this particular type of editing AI is just getting off the ground.
The wait time for an edit to complete can be long too. Single image edits can take several minutes or longer, though the page tells you how far along they are.
Once MGIE, or some other software that works similarly, gets refined further, it could be extremely useful for easing the learning curve of high-quality photo editing.
Adding voice control would make the AI’s abilities even more impressive, particularly for users with physical disabilities that interfere with mouse and keyboard use.
So far, Apple hasn’t mentioned any specific product plans for MGIE, but the company has claimed that Siri, its existing voice assistant AI, will soon get updates that make it much smarter.
This isn’t directly related to MGIE, but the two technologies could definitely be fused in the near future.