{"id":13564323,"url":"https://github.com/paperswithcode/axcell","last_synced_at":"2025-04-05T17:07:47.917Z","repository":{"id":37590543,"uuid":"194139116","full_name":"paperswithcode/axcell","owner":"paperswithcode","description":"Tools for extracting tables and results from Machine Learning papers","archived":false,"fork":false,"pushed_at":"2022-11-28T06:53:08.000Z","size":661,"stargazers_count":402,"open_issues_count":1,"forks_count":55,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-03-29T16:08:05.214Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paperswithcode.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-06-27T17:44:50.000Z","updated_at":"2025-03-28T18:00:35.000Z","dependencies_parsed_at":"2022-09-07T07:01:07.637Z","dependency_job_id":null,"html_url":"https://github.com/paperswithcode/axcell","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperswithcode%2Faxcell","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperswithcode%2Faxcell/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperswithcode%2Faxcell/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperswithcode%2Faxcell/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paperswithcode","download_url":"https://codeload.github.com/paperswithcode/axcell/tar.gz/refs/heads/master","host":{"name":"GitHu
b","url":"https://github.com","kind":"github","repositories_count":247369952,"owners_count":20927928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T13:01:29.685Z","updated_at":"2025-04-05T17:07:47.900Z","avatar_url":"https://github.com/paperswithcode.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# AxCell: Automatic Extraction of Results from Machine Learning Papers\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/axcell-automatic-extraction-of-results-from/scientific-results-extraction-on-pwc)](https://paperswithcode.com/sota/scientific-results-extraction-on-pwc?p=axcell-automatic-extraction-of-results-from)\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/axcell-automatic-extraction-of-results-from/scientific-results-extraction-on-nlp-tdms-exp)](https://paperswithcode.com/sota/scientific-results-extraction-on-nlp-tdms-exp?p=axcell-automatic-extraction-of-results-from)\n\nThis repository is the official implementation of [AxCell: Automatic Extraction of Results from Machine Learning Papers](https://arxiv.org/abs/2004.14356).\n\n![pipeline](https://user-images.githubusercontent.com/13535078/81287158-33e01000-905a-11ea-8573-d716373efbdd.png)\n\n## Requirements\n\nTo create a [conda](https://www.anaconda.com/distribution/) environment named `axcell` and install requirements run:\n\n```setup\nconda env create -f environment.yml\n```\n\nAdditionally, `axcell` requires `docker` (that can be run without `sudo`). 
Run `scripts/pull_docker_images.sh` to download the necessary images.\n\n## Datasets\nWe publish the following datasets:\n* [ArxivPapers](https://github.com/paperswithcode/axcell/releases/download/v1.0/arxiv-papers.csv.xz)\n* [SegmentedTables \u0026 LinkedResults](https://github.com/paperswithcode/axcell/releases/download/v1.0/segmented-tables.json.xz)\n* [PWCLeaderboards](https://github.com/paperswithcode/axcell/releases/download/v1.0/pwc-leaderboards.json.xz)\n\nSee the [datasets](notebooks/datasets.ipynb) notebook for an example of how to load the datasets listed above. The [extraction](notebooks/extraction.ipynb) notebook shows how to use `axcell` to extract text and tables from papers.\n\n## Evaluation\n\nSee the [evaluation](notebooks/evaluation.ipynb) notebook for a full example of how to evaluate AxCell on the PWCLeaderboards dataset.\n\n## Training\n\n* [pre-training language model](notebooks/training/lm.ipynb) on the ArxivPapers dataset\n* [table type classifier](notebooks/training/table-type-classifier.ipynb) and [table segmentation](notebooks/training/table-segmentation.ipynb) on the SegmentedTables dataset\n\n## Pre-trained Models\n\nYou can download pretrained models here:\n\n- [axcell](https://github.com/paperswithcode/axcell/releases/download/v1.0/models.tar.xz) \u0026mdash; an archive containing the taxonomy, abbreviations, table type classifier and table segmentation model. 
See the [results-extraction](notebooks/results-extraction.ipynb) notebook for an example of how to load and run the models.\n- [language model](https://github.com/paperswithcode/axcell/releases/download/v1.0/lm.pth.xz) \u0026mdash; [ULMFiT](https://arxiv.org/abs/1801.06146) language model pretrained on the ArxivPapers dataset\n\n## Results\n\nAxCell achieves the following performance:\n\n| Dataset | Macro F1 | Micro F1 |\n| ---------- |---------------- | -------------- |\n| [PWC Leaderboards](https://paperswithcode.com/sota/scientific-results-extraction-on-pwc)     |     21.1         |      28.7       |\n| [NLP-TDMS](https://paperswithcode.com/sota/scientific-results-extraction-on-nlp-tdms-exp)    |     19.7         |      25.8       |\n\n## License\n\nAxCell is released under the [Apache 2.0 license](LICENSE).\n\n## Citation\nThe pipeline is described in the following paper:\n```bibtex\n@inproceedings{axcell,\n    title={AxCell: Automatic Extraction of Results from Machine Learning Papers},\n    author={Marcin Kardas and Piotr Czapla and Pontus Stenetorp and Sebastian Ruder and Sebastian Riedel and Ross Taylor and Robert Stojnic},\n    year={2020},\n    booktitle={2004.14356}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaperswithcode%2Faxcell","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaperswithcode%2Faxcell","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaperswithcode%2Faxcell/lists"}