https://github.com/kahsolt/neural-classifier-prototypes
Evolving from a pure-color image or random noise towards class-wise inherent prototypes for a neural classifier
- Host: GitHub
- URL: https://github.com/kahsolt/neural-classifier-prototypes
- Owner: Kahsolt
- License: mit
- Created: 2022-10-10T09:16:56.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-21T12:30:06.000Z (about 3 years ago)
- Last Synced: 2025-01-03T16:29:00.681Z (about 1 year ago)
- Language: Python
- Size: 38.1 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Neural-Classifier-Prototypes
Evolving from a pure-color image or random noise towards class-wise inherent prototypes for a neural classifier
----
For a well-trained neural classifier `f`, we call an input `x` a prototype of class `y` iff `f(x) == y` and `d(f(x)) / d(x) == 0`.
That is, `x` is an extremum of `f`: it yields zero gradient on `f` and is correctly classified by `f` (with the highest confidence).
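In code, this condition can be checked numerically; here is a minimal sketch (not the repo's code), where differentiating the top logit and the tolerance `tol` are illustrative assumptions:

```python
import torch

def is_prototype(f, x, y, tol=1e-6):
    """Check whether x is (numerically) a class-y prototype of classifier f."""
    x = x.clone().requires_grad_(True)
    logits = f(x)                        # shape [1, num_classes]
    pred = logits.argmax(dim=-1).item()
    logits[0, pred].backward()           # d f(x) / d x, taken w.r.t. the top logit
    grad_norm = x.grad.norm().item()     # should vanish at a prototype
    return pred == y and grad_norm < tol
```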
Due to the inherent nature of NN models, a 1000-class classifier may have far more than 1000 class-wise prototypes, in contrast to centroid-based clustering models.
Even so, there are vastly more inputs that are NOT exactly equal to any prototype, and they produce a non-zero gradient on `f`.
An adversarial attacker usually moves along the direction of maximum gradient `max. d(f(x)) / d(x)`, looking for a small perturbation `dx` such that `f(x + dx) != y`, e.g. with the naive FGSM or PGD method.
We now consider the opposite: for any given input `x`, find its closest class-wise prototypes inherent to `f`.
In addition, we are quite curious about this: when `x` starts evolving/alienating from a pure blank image or Gaussian noise, **to what degree** will `f` start to recognize it (with high confidence), and is the generated texture also recognizable to a human being?
Yeah, you might've got it - we are treating adversarial attacks as a kind of generative model, with the frozen classifier as its discriminator counterpart ;)
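Before the experiments, here is a minimal hand-rolled sketch of what such a prototype search could look like: a targeted, PGD-style loop that pulls the prediction towards a chosen class while also penalizing the input-gradient norm, i.e. pushing `d(f(x)) / d(x)` towards 0. The name `evolve_to_prototype`, the loss composition and the default hyperparameters are illustrative assumptions, not the repo's actual implementation.

```python
import torch
import torch.nn.functional as F

def evolve_to_prototype(f, x0, target, eps=0.1, alpha=1e-3, steps=100, lam=1.0):
    """Evolve x0 towards a prototype of class `target` inside an L-inf eps-ball."""
    y = torch.tensor([target])
    x = x0.clone()
    for _ in range(steps):
        x.requires_grad_(True)
        loss_cls = F.cross_entropy(f(x), y)
        # input gradient of the classification loss; create_graph=True lets us
        # also differentiate its norm, driving d(f(x)) / d(x) towards 0
        g, = torch.autograd.grad(loss_cls, x, create_graph=True)
        loss = loss_cls + lam * g.pow(2).sum()
        step, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x - alpha * step.sign()           # PGD-style signed step
            x = x0 + (x - x0).clamp(-eps, eps)    # project back into the eps-ball
            x = x.clamp(0.0, 1.0)                 # assume inputs live in [0, 1]
    return x.detach()
```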
### Experiments
#### Alienate from blank / pure color image
Employing any label-agnostic, gradient-based adversarial attack (we try `PGD`, `MIFGSM` and `PGDL2`), for each starting input to the given classifier `f`:
- Find perturbations `dx_{i}` st. `f(c + dx_{i}) == y_{i}` and `d(f(c + dx_{i})) / d(c + dx_{i}) -> 0`, where `c` is a constant (pure-color) image
- Find perturbations `dx_{i}` st. `f(c + dx_{i}) == y_{i}` and `| d(f(c + dx_{i})) / d(c + dx_{i}) | -> inf`
We test model `f=resnet18` (pretrained on ImageNet), victim dataset `X=imagenet-1k`, attack setting `atk=('pgd', 0.1, 0.001)`
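Concretely, this setting could be instantiated as follows (a minimal sketch reusing the hypothetical `evolve_to_prototype` helper above; the mid-gray start and the target class index are arbitrary choices, not the repo's):

```python
import torch
from torchvision.models import resnet18

f = resnet18(pretrained=True).eval()       # frozen classifier
c = torch.full((1, 3, 224, 224), 0.5)      # pure-color (mid-gray) start image
# eps/alpha mirror atk=('pgd', 0.1, 0.001); class index 0 is an arbitrary target
proto = evolve_to_prototype(f, c, target=0, eps=0.1, alpha=1e-3)
```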
#### Alienate from random noise
For each starting input to the given classifier `f`:
- Find perturbations `dx_{i}` st. `f(r + dx_{i}) == y_{i}` and `d(f(r + dx_{i})) / d(r + dx_{i}) -> 0`, where `r` follows some stochastic distribution
- Find perturbations `dx_{i}` st. `f(r + dx_{i}) == y_{i}` and `| d(f(r + dx_{i})) / d(r + dx_{i}) | -> inf`
We test model `f=resnet18` (pretrained on ImageNet), victim dataset `X=imagenet-1k`, attack setting `atk=('pgd', 0.1, 0.001)`
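The noise start only changes the initial image; one plausible choice (the exact distribution is an assumption, and `f` and `evolve_to_prototype` are reused from the sketches above):

```python
import torch

# Gaussian noise around mid-gray, clamped to the valid pixel range
r = (0.5 + 0.25 * torch.randn(1, 3, 224, 224)).clamp(0.0, 1.0)
proto = evolve_to_prototype(f, r, target=0, eps=0.1, alpha=1e-3)
```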
#### Alienate from a given picture
- Find perturbations `dx_{i}` st. `f(x + dx_{i}) == y_{i}` and `d(f(x + dx_{i})) / d(x + dx_{i}) -> 0`, where `i` enumerates over all target classes of `f`
- Find perturbations `dx_{i}` st. `f(x + dx_{i}) == y_{i}` and `| d(f(x + dx_{i})) / d(x + dx_{i}) | -> inf`
We test model `f=resnet18` (pretrained on ImageNet), victim dataset `X=imagenet-1k`, attack setting `atk=('pgd', 0.1, 0.001)`
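One way to make "closest" concrete is to run the search once per target class and rank by perturbation size; the L2 ranking and the `closest_prototypes` name are assumptions, again building on the hypothetical helper above.

```python
import torch

def closest_prototypes(f, x, num_classes=1000, **atk_kwargs):
    """Rank target classes by how small a perturbation turns x into a prototype."""
    ranking = []
    for y in range(num_classes):
        p = evolve_to_prototype(f, x, target=y, **atk_kwargs)
        ranking.append((y, (p - x).norm().item()))   # (class, L2 distance)
    return sorted(ranking, key=lambda t: t[1])       # nearest prototype first
```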
```
log\<model>_<dataset>_<attack>_e<eps>_a<alpha>.pkl
resnet18_imagenet-min-svhn_pgd_e3e-2_a1e-3
```
----
by Armit
2022/10/09