{"id":31944449,"url":"https://github.com/johngiorgi/seq2rel-ds","last_synced_at":"2025-10-14T10:29:39.558Z","repository":{"id":42077587,"uuid":"351242937","full_name":"JohnGiorgi/seq2rel-ds","owner":"JohnGiorgi","description":"This is a companion repository to seq2rel (https://github.com/JohnGiorgi/seq2rel) which aims to make it easy to generate training data.","archived":false,"fork":false,"pushed_at":"2022-04-13T16:40:27.000Z","size":1026,"stargazers_count":2,"open_issues_count":8,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2023-03-03T22:31:14.262Z","etag":null,"topics":["coreference-resolution","entity-extraction","information-extraction","relation-extraction","seq2rel","seq2seq"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JohnGiorgi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-03-24T22:40:52.000Z","updated_at":"2022-12-02T14:10:29.000Z","dependencies_parsed_at":"2022-08-12T04:10:56.834Z","dependency_job_id":null,"html_url":"https://github.com/JohnGiorgi/seq2rel-ds","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"purl":"pkg:github/JohnGiorgi/seq2rel-ds","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JohnGiorgi%2Fseq2rel-ds","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JohnGiorgi%2Fseq2rel-ds/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JohnGiorgi%2Fseq2rel-ds/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JohnGiorgi%2Fseq2rel-ds/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JohnGiorgi","download_url":"https://codeload.github.com/JohnGiorgi/seq2rel-ds/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JohnGiorgi%2Fseq2rel-ds/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279018780,"owners_count":26086452,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coreference-resolution","entity-extraction","information-extraction","relation-extraction","seq2rel","seq2seq"],"created_at":"2025-10-14T10:29:37.898Z","updated_at":"2025-10-14T10:29:39.552Z","avatar_url":"https://github.com/JohnGiorgi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# seq2rel: Datasets\n\n[![ci](https://github.com/JohnGiorgi/seq2rel-ds/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/JohnGiorgi/seq2rel-ds/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/JohnGiorgi/seq2rel-ds/branch/main/graph/badge.svg?token=69PIN7H6UW)](https://codecov.io/gh/JohnGiorgi/seq2rel-ds)\n[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n![GitHub](https://img.shields.io/github/license/JohnGiorgi/seq2rel?color=blue)\n\nThis is a companion repository to 
[`seq2rel`](https://github.com/JohnGiorgi/seq2rel), which makes it easy to preprocess training data.\n\n## Installation\n\nThis repository requires Python 3.8 or later.\n\n### Setting up a virtual environment\n\nBefore installing, you should create and activate a Python virtual environment. If you need pointers on setting up a virtual environment, please see the [AllenNLP install instructions](https://github.com/allenai/allennlp#installing-via-pip).\n\n### Installing the library and dependencies\n\nIf you _do not_ plan on modifying the source code, install from `git` using `pip`:\n\n```bash\npip install git+https://github.com/JohnGiorgi/seq2rel-ds.git\n```\n\nOtherwise, clone the repository and install from source using [Poetry](https://python-poetry.org/):\n\n```bash\n# Install Poetry for your system: https://python-poetry.org/docs/#installation\ncurl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python\n\n# Clone and move into the repo\ngit clone https://github.com/JohnGiorgi/seq2rel-ds\ncd seq2rel-ds\n\n# Install the package with Poetry\npoetry install\n```\n\n## Usage\n\nInstalling this package gives you access to a simple command-line tool, `seq2rel-ds`. To see the list of available commands, run:\n\n```bash\nseq2rel-ds --help\n```\n\n\u003e Note, you can also call the underlying Python files directly, e.g. 
`python path/to/seq2rel_ds/main.py --help`.\n\nTo preprocess a dataset (and in most cases, download it), call one of the commands, e.g.\n\n```bash\nseq2rel-ds cdr main \"path/to/cdr\"\n```\n\n\u003e Note, you have to include `main` because [`typer`](https://typer.tiangolo.com/) does not support default commands.\n\nThis will create the preprocessed `tsv` files under the specified output directory, e.g.\n\n```\ncdr\n ┣ train.tsv\n ┣ valid.tsv\n ┗ test.tsv\n```\n\nwhich can then be used to train a [`seq2rel`](https://github.com/JohnGiorgi/seq2rel) model.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohngiorgi%2Fseq2rel-ds","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohngiorgi%2Fseq2rel-ds","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohngiorgi%2Fseq2rel-ds/lists"}