InstructPix2Pix: Editing Images Based on Text Prompts

Instruct the model to get a different result.

Researchers from the University of California, Berkeley presented InstructPix2Pix, a new method for editing images based on text instructions. You simply give the system a picture and a written instruction describing the desired change, and it returns the edited image.

To gather training data for the method, the creators combined the GPT-3 language model with Stable Diffusion's text-to-image capabilities to produce a large dataset of image editing examples. At inference time, InstructPix2Pix generalizes to real images and user-written instructions.

The model edits images quickly because it performs the edit in a single forward pass and requires no per-example fine-tuning or inversion.

While InstructPix2Pix produces fairly accurate edits, it still inherits biases from the data and models it is built upon. For example, in the results flight attendants are usually women and doctors are usually men.

The model also can't perform viewpoint changes, can make undesired excessive changes, sometimes fails to isolate the specified object, and has difficulty reorganizing or swapping objects with each other.

Check out the project here.
