Classification of the MNIST dataset using Graph Neural Networks
- Host: GitHub
- URL: https://github.com/awikramanayake/gnn-mnist-workspace
- Owner: AWikramanayake
- Created: 2022-11-15T21:09:01.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-11-29T05:33:01.000Z (about 2 years ago)
- Last Synced: 2024-12-01T08:16:23.128Z (2 months ago)
- Topics: ai, deep-learning, graph-neural-networks, machine-learning, neural-networks
- Language: Jupyter Notebook
- Homepage:
- Size: 394 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# GNN-mnist-workspace
The purpose of this project is to practice using GNNs to work with pixel-based data.
The first step of this project is to convert the MNIST dataset from pixel form to graph form. This step may seem redundant, as the dataset is readily available in graph form (in fact, it can be loaded in graph form directly within Spektral). However, I have chosen to do this step manually for two reasons: firstly, it serves as practice for my research, which also involves converting pixels to graphs; secondly, creating the graphs manually makes it easier to change features of the graphs, and thus hopefully improve network performance.
At present, the DatasetGeneration class produces graphs similar to those that can be loaded directly within Spektral (https://graphneural.network/datasets/#mnist). The node features are the vectorized digits, except that here the pixels are in a binary on/off state: pixels with brightness > 0.4 in the original MNIST dataset are considered 'on', and all others are 'off'. This results in a loss of information, but it was a conscious design choice to match my research, and it can trivially be undone to match the Spektral dataset. Edges are created between adjacent 'on' nodes.
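As a rough illustration of the conversion described above, the binarization and 8-neighbourhood edge construction might look like the following. This is a minimal sketch, not the actual DatasetGeneration code; the function name and output format are assumptions.

```python
import numpy as np

def image_to_graph(img, threshold=0.4):
    """Binarize an MNIST image and connect adjacent 'on' pixels.

    Returns node features (one binary feature per pixel) and a dense
    adjacency matrix. Illustrative sketch only.
    """
    on = img > threshold                      # pixels brighter than 0.4 are 'on'
    h, w = on.shape
    x = on.astype(np.float32).reshape(-1, 1)  # vectorized binary node features
    a = np.zeros((h * w, h * w), dtype=np.float32)
    # 8-connected neighbourhood offsets
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    for r in range(h):
        for c in range(w):
            if not on[r, c]:
                continue
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and on[rr, cc]:
                    a[r * w + c, rr * w + cc] = 1.0
    return x, a
```

A dense adjacency matrix is used here purely for readability; for 28x28 images a sparse representation would be the more practical choice.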
A visualization of the process is shown below:
Figure 1: an example image from the MNIST dataset
Figure 2: the image after the pixels are set to the binary on/off state
Figure 3: the resulting graph

In addition to fine-tuning the model and hyperparameters, the next steps consist of improving the features of the graph.
The current implementation notably lacks edge features.
One possible improvement would be to add edges to more distant neighbours, with the edges weighted by distance.
Figure 4: a graph with additional longer edges

However, introducing this many edges may be computationally prohibitive, so finding other potential edge and node features might be a more prudent approach. And of course, as other work in this area has shown, creating nodes that do not correspond to pixels 1:1 may be an even better approach [1].
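The distance-weighted extension could be prototyped along these lines. This is a hypothetical sketch; the radius cutoff and inverse-distance weighting are illustrative choices, not the repository's implementation.

```python
import numpy as np

def weighted_adjacency(on_coords, radius=2.5):
    """Connect 'on' pixels within `radius` pixels of each other,
    weighting each edge by inverse distance (illustrative choice)."""
    diff = on_coords[:, None, :] - on_coords[None, :, :]
    d = np.linalg.norm(diff, axis=-1)          # pairwise distance matrix
    mask = (d > 0) & (d <= radius)             # no self-loops, capped range
    w = np.zeros_like(d)
    w[mask] = 1.0 / d[mask]
    return w
```

The radius cap is what keeps the edge count from growing quadratically with the number of 'on' pixels, which is the computational concern raised above.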
[1]: Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J., & Bronstein, M. M. (2017). Geometric deep learning on graphs and manifolds using mixture model CNNs. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2017.576
## Extended Neighbourhoods update: 29/11/2022
Extended neighbourhoods reaching bright pixels one layer beyond the initial neighbourhood-of-8 have been successfully implemented. Additional extensions can easily be added using the distance matrix and additional loops in the GenerateDataset class.
Initial testing with a simple model suggests that the extended neighbourhoods do in fact improve the model. This was not a given, as the addition could have been made redundant by the convolution process. However, with a simple model (see the "Extended Neighbourhood Test/Control" notebooks), the network attains a greater final accuracy in fewer epochs.
The computational time is not considerably worse despite the increase in the density of the adjacency matrix, at least for this simple model.
This change introduces a new parameter that needs to be optimized: the scaling of edge weights with distance.
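One simple parameterization of that scaling is a Gaussian decay with a tunable sigma. This is an assumption on my part; the README does not specify the weighting function actually used.

```python
import numpy as np

def edge_weight(d, sigma=1.0):
    # Gaussian decay of edge weight with distance; sigma controls
    # how quickly longer edges are discounted
    return np.exp(-(d ** 2) / (2.0 * sigma ** 2))
```

With a small sigma, long edges vanish and the graph effectively reverts to the original 8-neighbourhood; with a large sigma, all edges within the cutoff receive near-equal weight, so sigma is the knob that would need tuning.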