Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mohannadcse/normalizer-authorship
https://github.com/mohannadcse/normalizer-authorship
Last synced: 4 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/mohannadcse/normalizer-authorship
- Owner: Mohannadcse
- License: mit
- Created: 2022-10-02T17:14:16.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-18T17:02:40.000Z (over 1 year ago)
- Last Synced: 2024-12-06T15:50:12.576Z (20 days ago)
- Language: C++
- Size: 407 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
## The normalizer implementation consists of 2 components:
1- [normalized transformations](src/LibToolingAST_Normalizer)2- [generating normalized Dataset](src/PyProject/normalizer)
## Preparing Projects
### Clang TransformationsFollow the instructions [here](https://github.com/EQuiw/code-imitator/blob/master/src/LibToolingAST/README.md)
### Python Project
- Export the python project path `i.e., export PYTHONPATH="/nobackup/code-imitator-normalizer/src/PyProject:$PYTHONPATH"`- Follow the instructions [here](https://github.com/EQuiw/code-imitator/blob/master/src/PyProject/README.md)
## Dataset
The dataset shold follow the same structure of the original dataset https://github.com/EQuiw/code-imitator/tree/master/data/dataset_2017/dataset_2017_8## Running the Normalizer
1- Download the [dataset](https://github.com/EQuiw/code-imitator/tree/master/data/dataset_2017/dataset_2017_8)
2- Specifying the location of the `datasetpath` and `normalizedresultspath` inside the python file `src/PyProject/Configuration.py`
3- Specify the attack configuration as defined [here](src/PyProject/Configuration.py#L90)
4- Provide the corresponding pickle files (classification models) to the specified attack in the previous step. The default location is defined [here](src/PyProject/Configuration.py#L30)
5- The verification of the output before and after the applying the transformer can be enable/disabled from [here](src/PyProject/Configuration.py#L96). It's by default enabled.6- To run the normalizer and generate the normalized dataset `python3 src/PyProject/normalizer/normalizer.py `, where the current implementation supports the following categories: `API, Control, Misc, Declaration, all`
## Ouput of the Normalizer
Besides generating the normalized files for the provided dataset in the specified location `normalizedresultspath`, the normalizer also generates two CSV files `tranfromation_results.csv` and `count_results.csv` to report about the performed transformations for each `cpp` file.# Debugging the Normalizer
We added debugging feature to the normalizer to normalize just a signle `.cpp` file. The variable `single` inside the python file `src/PyProject/Configuration.py` enables this debugging. Otherwise, it should be zero.# Disclaimer
The normalizer has been developed to normalize code transformations defined [here](https://github.com/EQuiw/code-imitator/tree/master/src/LibToolingAST). Therefore, many of its libraries have been used as is or adapted.# Citation
If you are using our implementation, please cite our NeurIPS '22 paper. You may use the following BibTex entry:
```
@inproceedings{
wang2022robust,
title={Robust Learning against Relational Adversaries},
author={Yizhen Wang and Mohannad Alhanahnah and Xiaozhu Meng and Ke Wang and Mihai Christodorescu and Somesh Jha},
booktitle={Advances in Neural Information Processing Systems},
editor={Alice H. Oh and Alekh Agarwal and Danielle Belgrave and Kyunghyun Cho},
year={2022},
url={https://openreview.net/forum?id=WBp4dli3No6}
}
```