{"id":20652432,"url":"https://github.com/thesofakillers/infoshare","last_synced_at":"2025-06-19T09:40:33.253Z","repository":{"id":62694371,"uuid":"487113598","full_name":"thesofakillers/infoshare","owner":"thesofakillers","description":"Official repository for the paper: \"Probing LLMs for Joint Encoding of Linguistic Categories.\" Findings of EMNLP 2023.","archived":false,"fork":false,"pushed_at":"2023-12-30T19:20:45.000Z","size":33004,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-07T19:03:32.054Z","etag":null,"topics":["deep-learning","dependency-parsing","interpretability","machine-learning","parts-of-speech","probing","syntax","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thesofakillers.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-04-29T21:22:33.000Z","updated_at":"2023-10-13T14:28:52.000Z","dependencies_parsed_at":"2023-01-22T07:01:18.795Z","dependency_job_id":"c1141694-609f-45ad-ba69-0587f742475f","html_url":"https://github.com/thesofakillers/infoshare","commit_stats":null,"previous_names":["thesofakillers/infoshare"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/thesofakillers/infoshare","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesofakillers%2Finfoshare","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesofakillers%2Finfoshare/tags","releases_url":"https://repos.ecosyste.ms
/api/v1/hosts/GitHub/repositories/thesofakillers%2Finfoshare/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesofakillers%2Finfoshare/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thesofakillers","download_url":"https://codeload.github.com/thesofakillers/infoshare/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesofakillers%2Finfoshare/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260726497,"owners_count":23053167,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","dependency-parsing","interpretability","machine-learning","parts-of-speech","probing","syntax","transformers"],"created_at":"2024-11-16T17:34:57.736Z","updated_at":"2025-06-19T09:40:28.235Z","avatar_url":"https://github.com/thesofakillers.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Probing LLMs for Joint Encoding of Linguistic Categories\n\n[![Paper](https://img.shields.io/static/v1.svg?logo=arxiv\u0026label=Paper\u0026message=Open%20Paper\u0026color=green)](https://arxiv.org/abs/2310.18696)\n\nOfficial repository for the paper: \"Probing LLMs for Joint Encoding of\nLinguistic Categories.\" Findings of EMNLP 2023.\n\nhttps://arxiv.org/abs/2310.18696\n\n## Requirements and Setup\n\nDetails such as python and package versions can be found in the generated\n[pyproject.toml](pyproject.toml) and [poetry.lock](poetry.lock) files.\n\nWe recommend using an environment manager such 
as\n[conda](https://docs.conda.io/en/latest/). After setting up your environment\nwith the correct Python version, please proceed with the installation of the\nrequired packages. We provide a [requirements.txt](requirements.txt) file for\nthis.\n\n```terminal\npip install -r requirements.txt\n```\n\nThis `requirements.txt` file is generated by running the following\n\n```terminal\nsh gen_pip_reqs.sh\n```\n\n## Repository contents\n\n```bash\n.\n├── data/                            # Where data is kept\n├── experiments/                     # arrays of images\n├── images/                          # more individual images\n├── lisa/                            # SLURM jobs and configs\n├── infoshare/\n│   ├── datamodules/                 # handle data loading, processing\n│   ├── models/                      # Model implementations\n│   ├── run\n│   │   ├── test.py                  # run testing\n│   │   ├── test_xlingual.py         # run testing across languages\n│   │   └── train.py                 # run training\n│   ├── __init__.py\n│   └── utils.py                     # general utils\n├── notebooks/                       # see notebooks/README.md\n├── reports/                         # LaTeX and more\n├── README.md                        # you are here\n├── lswsd_lemmas.txt                 # lemmas used for LSWSD\n├── poetry.lock                      # dependencies metadata\n├── pyproject.toml                   # project metadata\n├── gen_pip_reqs.sh                  # script for generating requirements.txt\n└── requirements.txt                 # required packages for pip\n```\n\nThe above was generated with\n\n```bash\ntree . 
-L 3 --dirsfirst -I \"*.eps|*.png|*.pdf|lightning_logs|*pycache*|backup\"\n```\n\nfollowed by some manual edits.\n\n## Citation\n\nIf you use this code or find our work otherwise useful, please consider citing\nour paper:\n\n```bibtex\n@inproceedings{starace2023probing,\n  title={Probing LLMs for Joint Encoding of Linguistic Categories},\n  author={Starace, Giulio and Papakostas, Konstantinos and Choenni, Rochelle and Panagiotopoulos, Apostolos and Rosati, Matteo and Leidinger, Alina and Shutova, Ekaterina},\n  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},\n  pages={7158--7179},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthesofakillers%2Finfoshare","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthesofakillers%2Finfoshare","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthesofakillers%2Finfoshare/lists"}