Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/spirals-team/introclassjava

A dataset of Java bugs for automatic repair, derived from the C bugs of IntroClass
https://github.com/spirals-team/introclassjava

program-repair

Last synced: 5 days ago
JSON representation

A dataset of Java bugs for automatic repair, derived from the C bugs of IntroClass

Awesome Lists containing this project

README

        

# IntroClassJava: A Benchmark of Buggy Java Programs

If you use IntroClassJava, please cite the following technical report:

Thomas Durieux and Martin Monperrus. "[IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs](https://hal.archives-ouvertes.fr/hal-01272126/document)". Technical Report hal-01272126, University of Lille; 2015.

```
@techreport{durieux:hal-01272126,
TITLE = {{IntroClassJava: A Benchmark of 297 Small and Buggy Java Programs}},
AUTHOR = {Durieux, Thomas and Monperrus, Martin},
URL = {https://hal.archives-ouvertes.fr/hal-01272126/document},
INSTITUTION = {{Universite Lille 1}},
YEAR = {2016},
NUMBER = {hal-01272126},
}
```

This benchmark has been automatically generated from the IntroClass benchmark for C from the Autorepair Benchmark Suite, a joint project between Carnegie-Mellon University and the University of Massachusetts ().

## Main statistics
| Project | # wb ok | # wb ko | # bb ok | # bb ko | # both ko | # program |
|-----------|---------|---------|---------|---------|-----------|-----------|
| digits | 15 | 60 | 24 | 51 | 36 | 75 |
| grade | 1 | 88 | 0 | 89 | 88 | 89 |
| checksum | 0 | 11 | 4 | 7 | 7 | 11 |
| median | 9 | 48 | 6 | 51 | 42 | 57 |
| smallest | 7 | 45 | 5 | 47 | 40 | 52 |
| syllables | 1 | 12 | 0 | 13 | 12 | 13 |
| 6 | 33 | 264 | 39 | 258 | 225 | 297 |

## Directory Overview

```
introclassJava/
|-lib/
|--data/
|---dataset.xml
|--CToJava.py
|--evalIntroClassJava.py
|-dataset/
|--checksum/
|---f4a823174201234546789abcdeffff.../
|----000/
|-----src/
|------main/
|-------java/introclassJava
|--------digits_f4a823174201234546789abcdeffff_000.java
|------test/
|-------java/introclassJava
|--------digits_f4a823174201234546789abcdeffff_000BackboxTest.java
|----001/
|--------
|---09F911029D74E35BD84156C5635688C0.../
|--digits/
|-...
```

The folder ```lib``` contains the python scripts use to transform the C dataset to a Java dataset.

The file ```lib/data/dataset.xml``` contains the dataset IntroClass transformed into xml via the following command:
```console
srcml --language=C --literal --operator --modifier `find IntroClass -name "*.c"` -o IntroClassJava/lib/data/dataset.xml
```

The folder ```dataset``` contains the assignment programs:

* checksum -- compute a simple checksum of a string
* digits -- reverse the digits of an integer "123" -> "321"
* grade -- compute the letter grade corresponding to a percentage
* median -- give the median of three numbers
* smallest -- give the smallest of three numbers
* syllables -- give the number of English syllables in a string

Each subdirectory below represents a student's submitted repository which contains several revisions. Each revision is a maven project.