{"id":22458874,"url":"https://github.com/s1998/submit","last_synced_at":"2025-07-20T12:35:08.908Z","repository":{"id":67610950,"uuid":"592700445","full_name":"s1998/submit","owner":"s1998","description":null,"archived":false,"fork":false,"pushed_at":"2023-01-26T22:51:17.000Z","size":2632,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-27T14:17:05.780Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/s1998.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-24T10:43:42.000Z","updated_at":"2023-01-24T10:45:52.000Z","dependencies_parsed_at":"2023-03-25T22:31:32.931Z","dependency_job_id":null,"html_url":"https://github.com/s1998/submit","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/s1998/submit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s1998%2Fsubmit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s1998%2Fsubmit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s1998%2Fsubmit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s1998%2Fsubmit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/s1998","download_url":"https://codeload.github.com/s1998/submit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s1998%2
Fsubmit/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266127236,"owners_count":23880423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-06T08:42:06.156Z","updated_at":"2025-07-20T12:35:08.871Z","avatar_url":"https://github.com/s1998.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n## Submission\n\n- [Installation](#installation)\n- [Code](#code)\n    - [Training](#training)\n    - [Server](#server)\n- [Discussion](#discussion)\n\n### Installation\n\nRun the following commands:\n\nconda install flask==2.0.1 click==7.1.2\npip install allennlp==2.1.0 allennlp-models==2.1.0\npip install transformers spacy\npython -m spacy download en_core_web_sm\n\nSee requirementsNew.txt for the library versions.\n\n### Code\n\nThe code has two main parts:\n\n#### Training\n\nThe notebooks folder contains the code used to obtain the models.\n\n(These were originally Colab notebooks that have been downloaded as Python files; let me know if the actual notebooks are needed.)\n\nAbout the files in notebooks:\n\n-   notebooks/sailesaichalleng.py\n\n    This file uses (NLI trained) BART in zero-shot fashion to divide sentences into questions and statements.\n    This produces the files questions.csv and statements.csv.\n\n    Then the top samples of each file (250 from each) are labelled manually.\n    This helps fix errors and filter out filler questions.\n    This finally creates two labelled datasets: questions_lablled.csv and 
statements_labelled.csv\n\n-   notebooks/salesaichallenge2.py\n\n    This notebook uses the files questions_lablled.csv and statements_labelled.csv to create the model that is used in the final API endpoint; it appears to work reasonably well.\n\n-   notebooks/newsalesaichallenge3.py\n\n    There are two steps here.\n    First, figure out whether the sentence mentions a task and when it is supposed to be done (this is an important criterion for filtering, but we should be able to do without it in the next iteration, for example with a better labelling budget or a more carefully tuned zero-shot model + prompts).\n    This can be done using Semantic Role Labelling.\n    Once we have identified the sentences that have a verb and a time period, we use (NLI trained) BART in zero-shot fashion to check whether the task is to be done in the future.\n    This creates the file srlTimeV3.csv.\n\n    The top samples from this file are also labelled, which creates srlTimeV3labelled.csv.\n\n-   notebooks/newsalesaichallenge4.py\n\n    Using the samples from srlTimeV3labelled.csv, we create a second classifier that detects statements that could be potential next steps after the meeting.\n\n\nThe models from notebooks 2 and 4 are used in the final deployed API.\n\nThe second task is significantly harder (to annotate as well as to model).\n\n#### Server\n\nTo run the server, use the command: `python app.py`\n\nSee the file sendReq.py for an example API request.\n\nThe output has the following format:\n\n`question` contains the list of meaningful questions.\n\n`followup` contains the sentences that denote an action to be done in the future (this part is significantly harder and needs more work, e.g. better models, more labelled data, etc.).\n\nBoth the questions and followups are sorted in order of confidence. 
The most confident ones are printed at the end.\n\n\n```\n\n{\n    \"question\" : [\n        {\n            \"question\" : ...,\n            \"context\" : ...\n        },\n        {\n            \"question\" : ...,\n            \"context\" : ...\n        }, .....\n    ],\n    \"followup\" : [\n        {\n            \"followup\" : ...,\n            \"context\" : ...\n        },\n        {\n            \"followup\" : ...,\n            \"context\" : ...\n        }, .....\n    ]\n}\n\n```\n\n### Discussion\n\nRegarding the model size:\n\n\nRegarding the model performance:\n\n\nThe server is based on an earlier Flask app repository:\n\nhttps://github.com/rajatdiptabiswas/flask-api-hugging-face-fb-bart-large-mnli\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs1998%2Fsubmit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fs1998%2Fsubmit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs1998%2Fsubmit/lists"}