Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-software-engineering-research

A curated list of software engineering research, data sets, and tools.
https://github.com/FudanSELab/awesome-software-engineering-research

  • PapersWithCode

  • Data Sets and Benchmarks

    • API Usage Pattern Recommendation

      • FOCUS - A context-aware collaborative-filtering system that exploits cross relationships among OSS projects to suggest additional API invocations and concrete API usage patterns. Paper: "FOCUS: A Recommender System for Mining API Function Calls and Usage Patterns".
    • Bug Localization

    • Others

      • Stack Exchange - Anonymized dump of all user-contributed content on the Stack Exchange network.
    • Code Language

      • Code-LMs - A multi-lingual code corpus used to train language models, together with some pretrained language models for code, e.g., GPT-2 and PolyCoder. [Paper](https://arxiv.org/pdf/2202.13169.pdf).
    • Test Oracle Generation

    • Microservice System

      • CodeXGLUE - A benchmark dataset and open challenge for code intelligence. It includes 14 datasets for 10 diversified code intelligence tasks covering the following scenarios: 1) code-code (clone detection, defect detection, cloze test, code completion, code repair, and code-to-code translation); 2) text-code (natural language code search, text-to-code generation); 3) code-text (code summarization); 4) text-text (documentation translation).
      • Project CodeNet - Provides the AI-for-Code research community with a large-scale, diverse, and high-quality curated dataset to drive innovation in AI techniques. The dataset contains approximately 14 million code samples, each of which is an intended solution to one of 4000 coding problems. Project CodeNet aims to do for AI for Code what ImageNet did for computer vision.
    • Library-Oriented Code Generation

      • PyCodeGPT - Training on Sketches for Library-Oriented Code Generation
    • API Misuse

      • MUBench - A benchmark for API-misuse detectors, based on the MUBench benchmarking dataset. Problems using MUBench can be reported to its maintainers; questions can be directed to Sven Amann.
      • CryptoAPI-Bench
    • Variable Misuse

      • great - [Great](https://github.com/VHellendoorn/ICLR20-Great), the dataset for the variable-misuse task, described in the ICLR 2020 paper "Global Relational Models of Source Code" (https://openreview.net/forum?id=B1lnbRNtwr). The repository contains the data and code to replicate the paper's results on models of source code that combine global and structural information, including the Graph-Sandwich model family and the GREAT (Graph-Relational Embedding Attention Transformer) model.
    • Programming-Language Understanding and Repair

      • PLUR - PLUR (Programming-Language Understanding and Repair) is a collection of source code datasets suitable for graph-based machine learning. It provides scripts for downloading, processing, and loading the datasets through a unified API and shared data structures.
    • API Recommendation

      • BIKER - Data set from "API Method Recommendation without Worrying about the Task-API Knowledge Gap", ASE, [Paper](https://dl.acm.org/doi/10.1145/3238147.3238191), including about 400 API retrieval tasks from Stack Overflow.
    • Feature Location

  • Other Resource Lists