Manipulating Diffusion Models

Large-scale text-to-image generative models can synthesize diverse images that convey highly complex visual concepts. However, it remains a challenge to provide users with control over the generated content. In this project, we present a new framework that takes text-to-image synthesis to the realm of image-to-image translation: given a guidance image and a target text prompt, our framework generates a new image that complies with the target text while preserving the semantic layout of the source image. Specifically, we observe and empirically demonstrate that fine-grained control over the generated structure can be achieved by manipulating spatial features and their self-attention inside the model, requiring no training or fine-tuning.
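The core idea, injecting spatial features and self-attention from the guidance image's pass into the target pass, can be illustrated with a minimal NumPy sketch. This is an illustrative toy (random tokens, a single attention head), not the actual model code: the queries and keys (which determine the attention map, and hence the spatial layout) are taken from the source pass, while the values come from the target pass.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v, qk_override=None):
    """Single-head self-attention over n spatial tokens of dim d.
    If qk_override is given, queries and keys (and hence the attention
    map, which encodes spatial layout) are taken from a *source* pass,
    while the values still come from the current pass."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    if qk_override is not None:
        q, k = qk_override                    # injected source queries/keys
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return attn @ v, (q, k)

rng = np.random.default_rng(0)
n, d = 16, 8                                  # 16 spatial tokens, 8 channels
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
src = rng.normal(size=(n, d))                 # features from the guidance-image pass
tgt = rng.normal(size=(n, d))                 # features from the target-text pass

_, src_qk = self_attention(src, w_q, w_k, w_v)          # record source attention
edited, _ = self_attention(tgt, w_q, w_k, w_v, src_qk)  # inject it into the target pass
```

Because only the attention map is swapped, the output keeps the target's content (values) arranged according to the source's layout.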

Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel "TokenFlow: Consistent Diffusion Features for Consistent Video Editing" ICLR 2024. [project page]

Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel "Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation" CVPR 2023. [project page]

The Power of DINO-ViT Features

Vision Transformers (ViTs) are novel and powerful backbones for image analysis. When trained in a self-distillation manner (DINO), they yield intriguing local and global representations. In this project, we set out to explore these representations, identify their strengths and advantages, and show how they give rise to new applications, enabling tasks such as segmentation and point correspondences to be performed in a zero-shot manner.


Narek Tumanyan, Assaf Singer, Shai Bagon and Tali Dekel "DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video" arXiv 2024. [project page]

Shir Amir, Yossi Gandelsman, Shai Bagon, and Tali Dekel "Deep vit features as dense visual descriptors" ECCVW 2022. [project page]

Narek Tumanyan, Omer Bar-Tal, Shai Bagon, and Tali Dekel "Splicing ViT Features for Semantic Appearance Transfer" CVPR 2022. [project page]

Narek Tumanyan, Omer Bar-Tal, Shir Amir, Shai Bagon, and Tali Dekel "Disentangling Structure and Appearance in ViT Feature Space" ACM Trans. on Graphics 2023

Tal Zimbalist, Ronnie Rosen, Keren Peri-Hanania, Yaron Caspi, Bar Rinott, Carmel Zeltser-Dekel, Eyal Bercovich, Yonina C. Eldar and Shai Bagon "Detecting bone lesions in X-ray under diverse acquisition conditions" SPIE Journal of Medical Imaging, Vol. 11, Issue 2, 2024.

Amit Aflalo, Shai Bagon, Tamar Kashti, and Yonina Eldar "DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering" ICCVW (SG2RL) 2023 [project page]

Multiplex Imaging

These projects are the fruit of a collaboration with Leeat Keren's lab, which uses MIBI-TOF (Multiplexed Ion Beam Imaging by Time of Flight). This technology produces high-dimensional images depicting sub-cellular protein expression and localization in situ.


  • CombPlex (COMBinatorial multiPLEXing): a combinatorial staining platform coupled with an algorithmic framework that exponentially increases the number of proteins that can be measured with C channels, from C up to 2^C - 1, and is applicable to any mass-spectrometry-based or fluorescence-based microscopy platform.

    Raz Ben-Uri,  Lior Ben Shabat, Omer Bar-Tal, Yuval Bussi, Noa Maimon, Tal Keidar Haran, Idan Milo, Ofer Elhanani, Alexander Rochwarger, Christian M. Schürch, Shai Bagon, Leeat Keren "Escalating High-dimensional Imaging using Combinatorial Channel Multiplexing and Deep Learning" (biorxiv, 2023)  [project page]
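The counting argument behind CombPlex, that C measured channels can encode up to 2^C - 1 proteins because each protein is assigned a distinct nonzero binary staining code, can be sketched in a few lines of Python. The decoding here is an idealized, noise-free version for a single protein per pixel; the actual framework uses a learned reconstruction:

```python
from itertools import product

def staining_codes(c):
    """All distinct nonzero binary codes of length c: one per protein,
    giving 2**c - 1 proteins from c measured channels."""
    return [bits for bits in product((0, 1), repeat=c) if any(bits)]

def channel_signal(code_by_protein, present):
    """Idealized channel readout: a channel fires if any present protein
    is stained in it (binary, noise-free)."""
    c = len(next(iter(code_by_protein.values())))
    return tuple(
        int(any(code_by_protein[p][ch] for p in present)) for ch in range(c)
    )

codes = staining_codes(3)                     # 3 channels -> 7 proteins
code_by_protein = {p: code for p, code in enumerate(codes)}
```

In this noise-free setting, a pixel containing a single protein produces a readout equal to that protein's code, so it can be identified by a table lookup.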


Drop the GAN

Single-image and single-video generative models can generate diverse variations given a single input image or video. In these projects, we revisit classical patch-based methods and show that they can handle tasks that were considered "GAN only".
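A bare-bones version of the patch-nearest-neighbor principle, replacing every patch of a candidate output with its nearest patch from the input, can be sketched with NumPy. This toy uses grayscale images, non-overlapping output patches, and brute-force search; the actual method works coarse-to-fine with overlapping patches and aggregation:

```python
import numpy as np

def extract_patches(img, k):
    """All overlapping k x k patches of a 2D image, flattened to rows."""
    h, w = img.shape
    return np.stack([
        img[i:i + k, j:j + k].ravel()
        for i in range(h - k + 1) for j in range(w - k + 1)
    ])

def nn_replace(target, source, k):
    """Replace each non-overlapping k x k patch of `target` by its
    nearest-neighbor patch (L2 distance) from `source`."""
    bank = extract_patches(source, k)
    out = target.copy()
    for i in range(0, target.shape[0] - k + 1, k):
        for j in range(0, target.shape[1] - k + 1, k):
            patch = target[i:i + k, j:j + k].ravel()
            idx = np.argmin(((bank - patch) ** 2).sum(axis=1))
            out[i:i + k, j:j + k] = bank[idx].reshape(k, k)
    return out
```

Every patch of the result is, by construction, a patch of the input, which is what keeps the output's local statistics faithful to the source.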

Niv Granot, Ben Feinstein, Assaf Shocher, Shai Bagon, Michal Irani "Drop The GAN: In Defense of Patch Nearest Neighbors as Single Image Generative Models" CVPR 2022. [project page]

Niv Haim, Ben Feinstein, Niv Granot, Assaf Shocher, Shai Bagon, Tali Dekel, Michal Irani "VGPNN: Diverse Generation from a Single Video Made Possible" ECCV 2022. [project page]

Assessment of COVID-19 in Lung Ultrasound by Combining Anatomy and Sonographic Artifacts Cues using Deep Learning

When assessing the severity of COVID-19 from lung ultrasound (LUS) frames, both anatomical phenomena (e.g., the pleural line, the presence of consolidations) and sonographic artifacts, such as A-lines and B-lines, are of importance. While ultrasound devices aim to provide an accurate visualization of the anatomy, the orientation of the sonographic artifacts differs between probe types. This difference poses a challenge in designing a unified deep neural network capable of handling all probe types.

In this work we improve upon Roy et al. (2020): we train a simple deep neural network to assess the severity of COVID-19 from LUS data. To address the challenge of handling both linear and convex probes in a unified manner, we employ two strategies: First, we augment the input frames of convex probes with a "rectified" version in which A-lines and B-lines assume a horizontal/vertical aspect close to that achieved with linear probes. Second, we explicitly inform the network of the presence of important anatomical features and artifacts. We use a known Radon-based method for detecting the pleural line and B-lines, and feed the detected lines as inputs to the network.
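The role of the projection-based line cue can be illustrated with a toy NumPy example: in a rectified frame, horizontal artifacts such as A-lines produce sharp peaks in the row-sum projection of the image (the 0-degree special case of the Radon transform; the frame, threshold, and function name below are illustrative, and the detected lines would be fed to the network as extra inputs):

```python
import numpy as np

def horizontal_line_rows(img, thresh=0.5):
    """Rows whose summed intensity stands out from the rest of the frame:
    a crude cue for horizontal artifacts (e.g., A-lines).  The row-sum
    projection is the 0-degree slice of the Radon transform."""
    proj = img.sum(axis=1)                       # one value per row
    score = (proj - proj.mean()) / (proj.std() + 1e-8)
    return np.where(score > thresh * score.max())[0]

# toy rectified frame: faint speckle plus two bright horizontal lines
rng = np.random.default_rng(0)
frame = 0.05 * rng.random((64, 64))
frame[20] = 1.0
frame[40] = 1.0
```

Detecting lines at other orientations (as needed before rectification) amounts to taking the same projection at other Radon angles.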

Michael Roberts, Oz Frank, Shai Bagon, Yonina C. Eldar, and Carola-Bibiane Schönlieb. "AI and Point of Care Image Analysis for COVID-19." In Artificial Intelligence in Covid-19, pp. 85-119. Springer, Cham, 2022.

Oz Frank, Nir Schipper, Mordehay Vaturi, Gino Soldati, Andrea Smargiassi, Riccardo Inchingolo, Elena Torri, Tiziano Perrone, Federico Mento, Libertario Demi, Meirav Galun, Yonina C. Eldar, Shai Bagon "Integrating Domain Knowledge Into Deep Networks for Lung Ultrasound With Applications to COVID-19" IEEE Transactions on Medical Imaging (2021)

[A recorded talk at the Acoustics Virtually Everywhere, The 179th Meeting of the Acoustical Society of America]

Across Scales & Across Dimensions: Temporal Super-Resolution using Deep Internal Learning

When a very fast dynamic event is recorded with a low-framerate camera, the resulting video suffers from severe motion blur (due to exposure time) and motion aliasing (due to low sampling rate in time). True Temporal Super-Resolution (TSR) is more than just Temporal-Interpolation (increasing framerate). It also recovers new high temporal frequencies beyond the temporal Nyquist limit of the input video, thus resolving both motion-blur and motion-aliasing. In this paper we propose a "Deep Internal Learning" approach for true TSR. We train a video-specific CNN on examples extracted directly from the low-framerate input video. Our method exploits the strong recurrence of small space-time patches inside a single video sequence, both within and across different spatio-temporal scales of the video. We further observe (for the first time) that small space-time patches recur also across dimensions of the video sequence, i.e., by swapping the spatial and temporal dimensions. In particular, the higher spatial resolution of video frames provides strong examples as to how to increase the temporal resolution of that video. Such internal video-specific examples give rise to strong self-supervision, requiring no data but the input video itself. This results in Zero-Shot Temporal-SR of complex videos, which removes both motion blur and motion aliasing, outperforming previous supervised methods trained on external video datasets.
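The "across dimensions" observation can be made concrete with a small NumPy sketch: treating the video as a (T, H, W) array, temporal subsampling of the video against itself yields (low-framerate, high-framerate) training pairs, and swapping the temporal axis with a spatial axis turns spatial detail into additional temporal examples. This is a simplified view of how training examples are mined from the input video itself:

```python
import numpy as np

def across_scales_pair(video, factor=2):
    """A (low-framerate, high-framerate) training pair obtained by
    temporally subsampling the input video against itself."""
    return video[::factor], video

def across_dimensions(video):
    """Swap time with the vertical spatial axis: columns of each frame
    become 'frames' of a new sequence, so the higher spatial resolution
    supervises the temporal resolution."""
    return np.swapaxes(video, 0, 1)   # (T, H, W) -> (H, T, W)

video = np.zeros((8, 32, 32))         # toy clip: 8 frames of 32 x 32 pixels
```

After the swap, the "temporal" axis of the new sequence has the spatial sampling rate of the original frames, which is exactly the extra supervision the paper exploits.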

[project page] [github]

InGAN: Capturing and Remapping the "DNA" of a Natural Image

Generative Adversarial Networks (GANs) typically learn a distribution of images in a large image dataset, and are then able to generate new images from this distribution. However, each natural image has its own internal statistics, captured by its unique distribution of patches. In this paper we propose an "Internal GAN" (InGAN) - an image-specific GAN - which trains on a single input image and learns its internal distribution of patches. It is then able to synthesize a plethora of new natural images of significantly different sizes, shapes and aspect-ratios - all with the same internal patch-distribution (same "DNA") as the input image. In particular, despite large changes in global size/shape of the image, all elements inside the image maintain their local size/shape. InGAN is fully unsupervised, requiring no additional data other than the input image itself. Once trained on the input image, it can remap the input to any size or shape in a single feedforward pass, while preserving the same internal patch distribution. InGAN provides a unified framework for a variety of tasks, bridging the gap between textures and natural images.
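InGAN's key property, changing the global size and shape of the image while keeping its internal patch distribution fixed, can be illustrated on a toy periodic texture with NumPy. Here plain tiling stands in for the learned generator, so this is only an illustration of the property, not of the GAN itself:

```python
import numpy as np

def unique_patches(img, k):
    """The set of all distinct k x k patches (as flat tuples) in a 2D image."""
    h, w = img.shape
    return {
        tuple(img[i:i + k, j:j + k].ravel())
        for i in range(h - k + 1) for j in range(w - k + 1)
    }

rng = np.random.default_rng(0)
motif = rng.integers(0, 9, size=(2, 2))
texture = np.tile(motif, (4, 4))        # the "input image": 8 x 8
wider = np.tile(motif, (4, 8))          # "retargeted" to 8 x 16

# global size/shape changed; the internal patch distribution did not
assert unique_patches(texture, 2) == unique_patches(wider, 2)
```

For natural images the generator has to synthesize, rather than tile, but the training objective plays the analogous role: every patch of the output should look like a patch of the input.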

[project page] [github]

Discrete Energy Minimization


Matlab code implementing discrete multiscale optimization presented in:
Shai Bagon and Meirav Galun "A Unified Multiscale Framework for Discrete Energy Minimization" (arXiv 2012),
and Shai Bagon and Meirav Galun "A Multiscale Framework for Challenging Discrete Optimization" (NIPS Workshop on Optimization for Machine Learning 2012).



Correlation clustering

Matlab code implementing optimization algorithms presented in:
Shai Bagon and Meirav Galun "Large Scale Correlation Clustering Optimization" (arXiv 2011).
May be applicable to other graph partitioning problems as well.



Sketch the Common

Matlab code implementing the sketching part of Shai Bagon, Or Brostovsky, Meirav Galun and Michal Irani's "Detecting and Sketching the Common" (CVPR 2010).

[project page] [github]

Matlab Wrappers


Matlab wrapper to Veksler, Boykov, Zabih and Kolmogorov's implementation of the Graph Cut algorithm. Please use the following citation if you use this software. A simple example of image segmentation using graph cuts is included.



Robust P^n

Matlab wrapper to Lubor Ladicky, Pushmeet Kohli and Philip Torr's "Minimizing Robust Higher Order Potentials using Move Making Algorithms". This software is for research purposes only; please use the following citations in any resulting publication.
Note: this wrapper additionally supports varying weights for the nodes participating in a higher-order potential, as described in the tech report.



Approximate Nearest Neighbors

Matlab class providing an interface to the ANN library of David Mount and Sunil Arya.



EDISON mean-shift Segmentation

Matlab wrapper for EDISON mean-shift image segmentation.