Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-msr
A curated repository of software engineering repository mining data sets
https://github.com/dspinellis/awesome-msr
-
Data Sets
- TravisTorrent - Provides free and easy-to-use Travis CI build analyses.
- AndroidTimeMachine - Graph-based dataset of commit history of 8,431 real-world Android apps.
- AndroZoo - Collection of Android Applications.
- Bug Prediction Dataset - Collection of models and metrics from Eclipse JDT Core, PDE UI, Equinox Framework, Lucene, Mylyn, and their histories.
- Eclipse AERI stacktraces - Collection of stacktraces of Exceptions encountered by users of the Eclipse IDE, as retrieved by the AERI reporting system.
- Enron Spreadsheets and Emails - All the spreadsheets and emails used in the paper 'Enron's Spreadsheets and Related Emails: A Dataset and Analysis'.
- GitHub Bug Dataset - Bug Dataset of 15 Java open-source projects characterized by static source code metrics.
- GitHub on Google BigQuery - GitHub data accessible through Google's BigQuery platform.
- KaVE - Developer tool interaction data.
- Linux Kernel 4.21 Call Graphs - The Linux Kernel 4.21 Call Graphs produced using [CScout](https://github.com/dspinellis/cscout/).
- Maven Dependency Graph - Snapshot of the whole Maven Central taken on September 6, 2018, stored in a graph database.
- RepoReapers Data Set - Data set containing a collection of _engineered software projects_ from GHTorrent.
- Software Heritage Graph Dataset - Graph of the development history and file metadata of >80 million software projects from various forges (GitHub, GitLab, Debian, PyPI, Google Code, etc.) in a deduplicated and unified representation ([paper here](https://dl.acm.org/citation.cfm?id=3341907)).
- STAMINA - (STAte Machine INference Approaches) data are used to benchmark techniques for learning deterministic finite state machines (FSMs).
- Stack Exchange - Anonymized dump of all user-contributed content on the Stack Exchange network.
- Ultimate Debian Database (UDD) - Data about various aspects of Debian (e.g. packages, bugs, maintainers) in the same SQL database.
- Unified Bug Dataset - Static source-code-based datasets, including the Bugcatchers Bug Dataset, the [Bug Prediction Dataset](http://bug.inf.usi.ch/index.php), the [Eclipse Bug Dataset](https://www.st.cs.uni-saarland.de/softevo/bug-data/eclipse/), the [GitHub Bug Dataset](http://www.inf.u-szeged.hu/~ferenc/papers/GitHubBugDataSet/), and some datasets from the [PROMISE](http://promise.site.uottawa.ca/SERepository/datasets-page.html) repository.
- GHTorrent - Scalable, queryable, offline mirror of data offered through the GitHub REST API.
- CoREBench - Collection of 70 realistically Complex Regression Errors that were systematically extracted from the repositories and bug reports of four open-source software projects: Make, Grep, Findutils, and Coreutils.
-
Research Outlets
- Empirical Software Engineering journal
- MSR: Mining Software Repositories conference
- PROMISE: Predictive Models and Data Analytics in Software Engineering conference
- ACM Transactions on Software Engineering and Methodology (TOSEM)
- ESEC/FSE: ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
- ICSE: International Conference on Software Engineering
- IEEE Software magazine
- IEEE Transactions on Software Engineering
- Journal of Systems and Software
- SANER: IEEE International Conference on Software Analysis, Evolution and Reengineering
-
Repositories
- SIR - Software-artifact infrastructure repository; Java, C, C++, and C# software together with test suites and fault data.
- PROMISE - About 20 datasets related to software engineering research.
- FLOSSmole - Collaborative collection and analysis of free/libre/open source project data.
- Zenodo - Software data collections in CERN's open-access repository.
- Software Engineering Artifacts Can Really Assist Future Tasks
- Empirical Software Engineering
- Mining Software Repositories
-
Tools
- Boa - Domain-specific language and infrastructure that eases mining software repositories.
- ckjm - Chidamber and Kemerer Java Metrics.
- Designite - Compute source code metrics and detect a variety of implementation, design, and architecture smells for C#.
- GrimoireLab - Free/Libre/Open Source tools for Software Development Analytics.