{"id":21165857,"url":"https://github.com/sidmohan0/gpt_annotator","last_synced_at":"2025-03-14T16:44:46.950Z","repository":{"id":202970664,"uuid":"708258575","full_name":"sidmohan0/gpt_annotator","owner":"sidmohan0","description":null,"archived":false,"fork":false,"pushed_at":"2023-10-26T17:38:34.000Z","size":76,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-21T10:09:53.545Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sidmohan0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.MD","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-22T02:15:17.000Z","updated_at":"2023-10-22T02:18:32.000Z","dependencies_parsed_at":"2023-10-26T18:34:16.406Z","dependency_job_id":null,"html_url":"https://github.com/sidmohan0/gpt_annotator","commit_stats":null,"previous_names":["sidmohan0/gpt_annotator"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidmohan0%2Fgpt_annotator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidmohan0%2Fgpt_annotator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidmohan0%2Fgpt_annotator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sidmohan0%2Fgpt_annotator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sidmohan0","download_url":"https://codeload.github.com/sidmohan0/gpt_annotator/ta
r.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243615336,"owners_count":20319726,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-20T14:46:57.722Z","updated_at":"2025-03-14T16:44:46.927Z","avatar_url":"https://github.com/sidmohan0.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n# GPT Annotator\n\n## Overview\n\nGPT Annotator is a tool that uses GPT models to annotate textual data for Named Entity Recognition (NER). It accepts `.md`, `.txt`, `.pdf`, `.docx`, and `.html` files as input.\n\n## Features\n\n- Extract text from multiple file formats.\n- Tokenize and process sentences using NLTK.\n- Generate annotated data suitable for NER tasks. Output is currently written as JSONL.\n\n## Requirements\n\n- Python 3.x\n- NLTK\n- OpenAI API key\n- PyPDF2\n- python-docx\n\n## Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/sidmohan0/gpt_annotator.git\ncd gpt_annotator\n\n# Install dependencies\npip install -r requirements.txt\n```\n\n## Usage\n\nSet up your `.env` file with the following variables:\n\n```\nSAMPLES=\u003cSAMPLE_SIZE\u003e\nMODEL=\u003cGPT_MODEL_NAME\u003e\nOPENAI_API_KEY=\u003cYOUR_API_KEY\u003e\nPATH=\u003cYOUR_PATH\u003e\n```\n\nRun the main script:\n\n```bash\npython main.py\n```\n\n## Contributing\n\nSee [`CONTRIBUTING.md`](CONTRIBUTING.md) for guidelines on how to contribute to this project.\n\n## License\n\nThis project is licensed under the MIT License. 
See [`LICENSE.MD`](LICENSE.MD) for more details.\n\n---\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsidmohan0%2Fgpt_annotator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsidmohan0%2Fgpt_annotator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsidmohan0%2Fgpt_annotator/lists"}