Salesforce AI has developed a new editing algorithm called EDICT that performs text-to-image diffusion generation with an invertible process, given any existing diffusion model

With recent advances in technology and artificial intelligence, innovations have come quickly. Whether it is generating text with the hugely popular ChatGPT model or creating an image from text, much is now possible. Several current text-to-image models can not only produce a new image from a text description but also edit an existing one. Creating an image is usually easier than editing an existing one, because many fine details must be preserved during editing. For precise text-guided image editing, the researchers developed a new algorithm, EDICT (Exact Diffusion Inversion via Coupled Transformations), which performs text-guided image editing with the help of diffusion models.

Text-to-image generation is a task in which a machine learning model is trained to produce an image from a given textual description. The model learns to associate text descriptions with images and generates new images that match the given description. EDICT performs text-to-image diffusion generation using any existing diffusion model. In image generation, diffusion models are generative models that use a diffusion process to produce new images. The process starts from random noise, which is then iteratively refined through a series of denoising steps until it becomes a final image that matches the target.

Diffusion models are trained to generate a clean image from a noisy one with the help of a text description. To edit an image, noise is added to the original, and this partially noised image is used as the starting point for a new generation guided by the chosen text. EDICT works by finding a noisy image that will reproduce the exact original image when supplied with the original text prompt. It is a kind of noise inversion technique. This way, if the original text is altered slightly, the edited image remains mostly unchanged apart from the required modifications.
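The noise-then-regenerate editing described above can be illustrated with a minimal sketch. It assumes the standard DDPM-style noising formula, in which an image is blended with Gaussian noise; the function name `add_noise`, the parameter `alpha_bar` (the fraction of signal variance kept), and the flat list of pixel values are illustrative choices, not part of EDICT or any library.

```python
import math
import random

def add_noise(image, alpha_bar, rng):
    """Blend an image with Gaussian noise (DDPM-style forward noising).

    image     : list of pixel values (a toy stand-in for a real image tensor)
    alpha_bar : assumed fraction of signal variance kept, in [0, 1]
    rng       : random.Random instance supplying the noise
    """
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * px + noise_scale * rng.gauss(0.0, 1.0) for px in image]

image = [0.2, -0.5, 0.9, 0.0]

# Two different noise draws give two different starting points for editing.
noisy_a = add_noise(image, 0.5, random.Random(0))
noisy_b = add_noise(image, 0.5, random.Random(1))
```

Because the starting point depends on a random draw rather than on an exact inversion of the image, nothing in this procedure guarantees that regenerating with the original text returns the original image. That is the gap an exact inversion is meant to close.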

The team behind EDICT illustrates the algorithm with an example. When an image of a cat surfing is created by editing an existing image of a surfing dog with the standard approach, many subtle details are lost, such as the waves and the color of the board. This is because that method simply adds noise to the original image to create the new one. In the EDICT technique, the generation process is inverted by finding a noisy image that will exactly generate the original image. Given the text caption, this noisy image regenerates the actual image of the surfing dog. In other words, the noisy image obtained by inverting the model is denoised back into the original image. The text is then tweaked by simply replacing the word "dog" with the word "cat", and the result is an edited, relatively detailed image of a cat surfing. EDICT works by making two identical copies of an image and alternately refining each one with details from the other in a reversible way.
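The idea of two coupled copies that update each other reversibly can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_eps` is a made-up stand-in for a learned noise predictor, `cond` is a single number standing in for the text prompt, and the coefficients `a`, `b`, and `p` are arbitrary values chosen for the demo. The exact update rule in the paper may differ.

```python
import math

def toy_eps(v, t, cond):
    """Hypothetical stand-in for a text-conditioned noise predictor."""
    return [math.tanh(vi + cond) + 0.1 * t for vi in v]

def denoise_step(x, y, t, cond, a=0.9, b=0.1, p=0.93):
    """One coupled generation step: each copy is updated using the other."""
    x_i = [a * xi + b * e for xi, e in zip(x, toy_eps(y, t, cond))]
    y_i = [a * yi + b * e for yi, e in zip(y, toy_eps(x_i, t, cond))]
    # Mixing keeps the two copies close to each other.
    x_new = [p * xi + (1 - p) * yi for xi, yi in zip(x_i, y_i)]
    y_new = [p * yi + (1 - p) * xn for yi, xn in zip(y_i, x_new)]
    return x_new, y_new

def invert_step(x_new, y_new, t, cond, a=0.9, b=0.1, p=0.93):
    """Exact algebraic inverse of denoise_step, undone in reverse order."""
    y_i = [(yn - (1 - p) * xn) / p for yn, xn in zip(y_new, x_new)]
    x_i = [(xn - (1 - p) * yi) / p for xn, yi in zip(x_new, y_i)]
    y = [(yi - b * e) / a for yi, e in zip(y_i, toy_eps(x_i, t, cond))]
    x = [(xi - b * e) / a for xi, e in zip(x_i, toy_eps(y, t, cond))]
    return x, y

def edit(image, source_cond, target_cond, steps=10):
    """Invert with the source prompt, then regenerate with the target prompt."""
    x, y = list(image), list(image)  # two identical copies of the image
    for t in range(1, steps + 1):
        x, y = invert_step(x, y, t, source_cond)
    for t in range(steps, 0, -1):
        x, y = denoise_step(x, y, t, target_cond)
    return x

image = [0.2, -0.5, 0.9, 0.0]
same = edit(image, source_cond=0.0, target_cond=0.0)    # original prompt
edited = edit(image, source_cond=0.0, target_cond=0.5)  # altered prompt
```

With the same conditioning on both passes, the regenerated values match the input up to floating-point rounding, because each step can be solved exactly for its inputs. Changing the conditioning on the way back changes the result, which is the toy analogue of swapping "dog" for "cat".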

This new approach looks promising, as existing approaches to text-guided image editing are inconsistent and do not fully preserve the details of the original image. By inverting the generation process, the important content of the image can be preserved. Given the growing innovation in and demand for image generation models, EDICT appears to be a strong competitor to existing methods.


Check out the paper, GitHub, and SF blog. All credit for this research goes to the researchers on this project.


Tania Malhotra is a final-year undergraduate at the University of Petroleum and Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is passionate about data science and has strong analytical and critical thinking skills, along with a keen interest in acquiring new skills, leading groups, and managing work in an organized manner.

