  • Date: Thursday, May 30, 2024

    Vision and AI

    More information
    Time
    12:15 - 13:15
    Title
    Editing methods for Text-to-Image Models
    Location
    Jacob Ziskind Building
    Room 1
    Lecturer
    Hadas Orgad
    Technion
    Organizer
    Department of Computer Science and Applied Mathematics
    Seminar
    Abstract
    Text-to-image generative diffusion models are trained on huge amounts of web-scraped image-caption pairs. As a result, these models encode real-world information and correlations, such as the identity of the President of the United States or the color of the sky. While this knowledge is useful and allows easy, efficient generation of beautiful images from simple prompts, it may also be outdated, reflect assumptions and biases (e.g., that doctors are always white men), or violate copyrights (as demonstrated in recent lawsuits over models imitating artistic styles). However, model providers and creators currently have no efficient means to update models without either retraining them, which is costly in computation and time and might also require data curation, or requiring explicit prompt engineering from the end user. In this talk, we will discuss three of our recent papers, which aim to offer a fast and practical way to control model behavior post-training. We modify a small, targeted part of the model that is responsible for a specific step in the deep network's computation. This is done without training, by editing the model weights using a closed-form solution. The different papers target different parts of the model, as well as various types of information encoded in it: implicit assumptions, factual associations, artistic style, social biases, and harmful content. We will also discuss some of the interpretability aspects and insights that can be gained from these editing methods. Overall, the methods presented in the talk offer a fast and practical means for safe deployment of text-to-image models.
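
    The abstract does not give the update formula, so the following is only a minimal sketch of what a training-free, closed-form weight edit can look like, not the method from the papers. It assumes a single linear layer W, one input embedding k whose output should change, and a target output v, and it solves the ridge-regularized problem min ||W'k - v||^2 + lam * ||W' - W||^2 in closed form. The names rank_one_edit, matvec, k, v, and lam are hypothetical and chosen for illustration.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x_j for w, x_j in zip(row, x)) for row in W]


def rank_one_edit(W, k, v, lam=0.1):
    """Closed-form minimizer of ||W' k - v||^2 + lam * ||W' - W||_F^2.

    Illustrative assumption, not the papers' formula:
    W' = W + (v - W k) k^T / (lam + k . k)
    """
    Wk = matvec(W, k)
    scale = 1.0 / (lam + sum(k_j * k_j for k_j in k))
    return [
        [w + (v_i - wk_i) * k_j * scale for w, k_j in zip(row, k)]
        for row, v_i, wk_i in zip(W, v, Wk)
    ]


# Toy layer: identity mapping on 2-dimensional embeddings.
W = [[1.0, 0.0], [0.0, 1.0]]
k = [1.0, 0.0]      # embedding whose output we want to change
v = [0.0, 2.0]      # desired output for k after the edit
other = [0.0, 1.0]  # unrelated embedding, orthogonal to k

# A small lam makes the edit nearly exact; a larger lam keeps W' closer to W.
W_new = rank_one_edit(W, k, v, lam=1e-6)
```

    In this sketch the edited layer maps k almost exactly to v, while inputs orthogonal to k, such as other, produce the same output as before. This illustrates how an edit can stay targeted without any gradient-based training.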
    Lecture