https://github.com/aida-ugent/occupation_coding_datasets



# Occupation coding datasets
1. GenEasy: A collection of 500 synthetic job listings linked to selected ESCO occupation codes, generated with GPT-4.
2. GenHard: Identical to GenEasy, but with job titles that diverge from the textual descriptors of their respective codes.
3. Real_indeed: A set of 100 genuine job listings sourced from Indeed, annotated manually.

Each dataset consists of columns for ID, job title, description, label, and other potential supplementary data.
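
The column layout described above can be checked when loading a file. The following is a minimal sketch, assuming the datasets are distributed as CSV files with a header row; the column names `id`, `title`, `description`, and `label`, as well as the helper `read_listings`, are hypothetical and may not match the actual files.

```python
import csv
import io

# Assumed column names; the actual headers in the dataset files may differ.
REQUIRED_COLUMNS = ("id", "title", "description", "label")


def read_listings(handle, required=REQUIRED_COLUMNS):
    """Read job listings from an open CSV file and check the expected columns."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in required if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Supplementary columns, if present, are kept in each row dict.
    return list(reader)


# Placeholder row for illustration only, not taken from the datasets.
sample = io.StringIO(
    "id,title,description,label\n"
    "1,Example title,Example description,0000.0\n"
)
rows = read_listings(sample)
print(len(rows), rows[0]["title"])
```

To read a real file, pass an open handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to one of the dataset files.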