Towards Discovery and Attribution of Open-world GAN Generated Images

ICCV 2021

Sharath Girish*

Saksham Suri*

Saketh Rambhatla

[Paper]

[GitHub]

Many new GANs are released every year, so a collection of generated images may come from several unknown sources. This can also happen in an online fashion, where incoming images are drawn from multiple unknown GAN sources. Our approach discovers and attributes unknown GAN sources while using only a small initial labeled set of GANs. It attributes images from seen GANs with high accuracy, and it identifies and clusters images from unknown GAN sources with high purity.

Abstract

With the recent progress in Generative Adversarial Networks (GANs), it is imperative for media and visual forensics to develop detectors for identifying and attributing images to the model generating them. Existing works have been shown to attribute GAN-generated images with high accuracy. However, they work in a closed-set scenario and fail to generalize to GANs unseen during training. Therefore, they do not scale with a steady influx of new GANs. We present an iterative algorithm for discovering images generated from previously unseen GANs by exploiting the fact that all GANs leave distinct fingerprints on their generated images. Our algorithm consists of multiple components involving network training, out-of-distribution detection, clustering, merge and refine steps. We provide extensive experiments to show that our algorithm discovers unseen GANs with high accuracy and also generalizes to GANs trained on unseen real datasets. Our experiments demonstrate that our approach is effective at discovering new GANs and can be used in an open-world setup.

Approach Overview

Our approach consists of an iterative multistep pipeline which improves the attributed samples while at the same time learning on the go and attributing previously unassigned ones.

Network Training At each step, we train the network on the initial labelled training set together with the pseudo-labelled clustered set, and then use the trained network to extract features.
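As a rough illustration of how the two sets might be combined before training, the sketch below gives every discovered cluster a fresh pseudo-label placed after the known class ids. The function name `build_training_set` and the integer labelling scheme are assumptions made for this example and are not taken from the released code.

```python
def build_training_set(labelled, clusters):
    """Combine labelled samples with pseudo-labelled clusters.

    labelled: list of (sample, class_id) pairs with integer class ids.
    clusters: list of clusters, each a list of samples from one discovered group.
    Returns (training_pairs, number_of_classes).
    Hypothetical helper: the labelling scheme is an assumption.
    """
    known_ids = [class_id for _, class_id in labelled]
    next_id = max(known_ids) + 1 if known_ids else 0
    training = list(labelled)
    for offset, members in enumerate(clusters):
        pseudo_label = next_id + offset  # one new class per discovered cluster
        training.extend((sample, pseudo_label) for sample in members)
    return training, next_id + len(clusters)
```

With two known GANs and two discovered clusters, the classifier in this sketch would be trained over four classes, two of which are pseudo-labels.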

Out-of-Distribution Detection Using the features extracted, we predict whether the remaining (unclustered) samples belong to an existing clustered set or are out of distribution with respect to these. For the samples predicted as in-distribution, the attribution is performed using the network.
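The page does not say which out-of-distribution detector is used, so the following is only a minimal stand-in: a sample counts as in-distribution if its feature vector lies within a distance threshold of the nearest class centroid. The nearest centroid also stands in for the network's prediction, and `threshold` is a hypothetical parameter.

```python
import math

def centroid(vectors):
    """Mean of a non-empty list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def split_in_and_out(features, class_features, threshold):
    """Split unclustered samples into attributed and out-of-distribution.

    features: feature vectors of the remaining (unclustered) samples.
    class_features: dict mapping class_id -> feature vectors of that class.
    Returns (attributed, ood): attributed is a list of (index, class_id),
    ood is a list of indices. Distance thresholding is an assumption here.
    """
    centroids = {c: centroid(vs) for c, vs in class_features.items()}
    attributed, ood = [], []
    for idx, feat in enumerate(features):
        nearest, dist = min(
            ((c, math.dist(feat, m)) for c, m in centroids.items()),
            key=lambda pair: pair[1],
        )
        if dist <= threshold:
            attributed.append((idx, nearest))  # stand-in for the network's prediction
        else:
            ood.append(idx)  # left for the clustering step
    return attributed, ood
```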

Clustering At each step, we overcluster the unclustered set to create groupings of same-source images.
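Overclustering means asking for more clusters than the expected number of sources, so that each cluster is small but likely to hold a single source. The sketch below illustrates the idea with a plain k-means using a deterministic farthest-point initialisation. The choice of k-means and the names `overcluster` and `k` are assumptions for illustration, since the clustering algorithm is not named on this page.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def overcluster(points, k, iterations=20):
    """Plain k-means; call it with k larger than the expected number of sources.
    Returns one cluster index per point."""
    centers = [list(c) for c in farthest_point_init(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Move each center to the mean of its members; keep it if it has none.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

For two well-separated sources and `k=4`, each source is split across several small clusters, which the later merge step is meant to recombine.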

Merging and Refining The merge step reduces the number of clusters produced by overclustering. Although merging lowers the cluster count, it also lowers purity. To counter this, we perform a refine step that discards some impure samples/clusters based on heuristics.
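The merge and refine heuristics are not spelled out on this page, so the sketch below uses simple placeholders: clusters are merged when their centroids are closer than `merge_dist`, and refinement drops samples farther than `max_radius` from their cluster centroid, then drops clusters with fewer than `min_size` samples. All three parameters and both function names are hypothetical.

```python
import math

def centroid(vectors):
    """Mean of a non-empty list of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def merge_clusters(clusters, merge_dist):
    """Merge clusters whose centroids lie within merge_dist (union-find)."""
    parent = list(range(len(clusters)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    cents = [centroid(c) for c in clusters]
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if math.dist(cents[i], cents[j]) <= merge_dist:
                parent[find(j)] = find(i)
    merged = {}
    for i, members in enumerate(clusters):
        merged.setdefault(find(i), []).extend(members)
    return list(merged.values())

def refine_clusters(clusters, max_radius, min_size):
    """Drop outlying samples, then drop clusters that became too small."""
    refined = []
    for members in clusters:
        center = centroid(members)
        kept = [m for m in members if math.dist(m, center) <= max_radius]
        if len(kept) >= min_size:
            refined.append(kept)
    return refined
```

Merging lowers the cluster count at the risk of mixing sources, and refining then trades some coverage for purity by throwing samples away.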

Qualitative Analysis

Samples from clusters discovered by our approach for unseen GANs, with the majority class in parentheses. Note that the clusters do not merely capture object structure and semantics; they reflect the underlying source.

Some example clusters merged by our approach during the merge step. The merge step successfully combines clusters having the same majority GAN source.
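The majority source and the purity of a cluster can be computed as in the sketch below, using the standard definition of purity as the fraction of samples that belong to the majority class. Whether this matches the exact metric reported in the paper is an assumption, and the source names in the usage note are made up.

```python
from collections import Counter

def cluster_purity(true_sources):
    """Return (majority_source, purity) for one non-empty cluster, given the
    ground-truth source of each sample in it."""
    majority, count = Counter(true_sources).most_common(1)[0]
    return majority, count / len(true_sources)

def average_purity(clusters):
    """Sample-weighted purity over all clusters: the fraction of samples
    that agree with the majority source of their own cluster."""
    total = sum(len(c) for c in clusters)
    agreeing = sum(Counter(c).most_common(1)[0][1] for c in clusters)
    return agreeing / total
```

For example, a cluster with three images from a hypothetical `gan_a` and one from `gan_b` has majority source `gan_a` and purity 0.75.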

Paper and Supplementary Material

S. Girish*, S. Suri*, S. Rambhatla, A. Shrivastava.
Towards Discovery and Attribution of Open-world GAN Generated Images
(Paper | Supplementary | arXiv)
