Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/thammegowda/thammegowda
README
https://github.com/thammegowda/thammegowda
Last synced: about 1 month ago
JSON representation
README
- Host: GitHub
- URL: https://github.com/thammegowda/thammegowda
- Owner: thammegowda
- Created: 2021-08-18T17:51:12.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2024-09-06T20:56:35.000Z (2 months ago)
- Last Synced: 2024-09-07T00:09:07.891Z (2 months ago)
- Size: 37.1 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.adoc
Awesome Lists containing this project
README
[cols="2a,2a"]
|===image::https://stackexchange.com/users/flair/1632148.png[float="right",align="center", link="https://stackexchange.com/users/1632148/thamme-gowda?tab=accounts"]
- š Iām currently working on neural machine translation, imbalanced learning
- š± Iām currently learning ...
- šÆ Iām looking to collaborate on ...
- š¤ Iām looking for help with ...
- š¬ Ask me about ... anything
- š« How to reach me: https://twitter.com[@thammegowda^]
- š Pronouns: he/him/his
- ā” Fun fact: ...
|=== Tools
- link:https://github.com/thammegowda/mtdata[mtdata^], datasets downloader
- link:https://github.com/marian-nmt/sotastream[sotastream^], streaming approach to training data
- link:https://github.com/isi-nlp/rtg[RTG^], NMT toolkit
- link:https://github.com/isi-nlp/boteval[BotEval^], chatbot eval
- link:https://github.com/isi-nlp/nlcodec[nlcodec^], vocabulary manager
- link:https://github.com/thammegowda/awkg[awkg^], pythonic awk
- link:https://github.com/USCDataScience/sparkler[Sparkler^], a webcrawler on Apache Spark
- Parser-Indexer: https://github.com/USCDataScience/parser-indexer[Java^], https://github.com/USCDataScience/parser-indexer-py[Python^]
- Web apps (I wasn't trying to be a web dev):
* link:https://github.com/USCDataScience/supervising-ui[Supervising UI^], for image labeling
* link:https://github.com/thammegowda/nllb-serve[NLLB Serve^], an MT webapp to serve huggingface models|===
== Research
https://scholar.google.com/citations?hl=en&user=7Ed3-tMAAAAJ&view_op=list_works&sortby=pubdate[__See my latest publications on Google Scholar__^]
[columns="m,"]
|===
| Repo | Description | Status | Note| https://github.com/marian-nmt/marian-dev/tree/master/src/python[PyMarian^]
| Python bindings to Marian C++; `pip install pymarian`
| Completeā
| https://arxiv.org/abs/2408.11853[Paper^]; https://pypi.org/project/pymarian[PyPI^]| https://github.com/isi-nlp/boteval[BotEval^]
| Facilitating human evaluation of chatbots; `pip install boteval`
| Completeā
| https://aclanthology.org/2024.acl-demos.11/[Paper @ ACL2024 Demos^]; https://justin-cho.com/boteval[Demos^]; https://pypi.org/project/boteval[PyPI^]| https://github.com/marian-nmt/wmt23-metrics[Cometoid^]
| Distilling strong reference based metrics into stronger reference-less metrics
| Completeā
| https://aclanthology.org/2023.wmt-1.62/[Paper @ WMT2023^]; https://huggingface.co/collections/marian-nmt/cometoid-wmt23-metrics-66903bb137eadb9c5768d5f2[Models on Huggingface^]| https://github.com/marian-nmt/sotastream[sotastream^]
| A streaming approach to machine translation training. `pip install sotastream`
| Completeā
| https://aclanthology.org/2023.nlposs-1.13/[Paper @ NLP OSS 2023^] ; https://pypi.org/project/sotastream[PyPI^]| https://github.com/thammegowda/016-many-eng-v2[016-many-eng-v2^]
| Many-to-English (v2)
| Completeā
|| https://github.com/thammegowda/015-nmt-ablation[015-nmt-ablation^]
| Transformer ablation, showing that model can work without encoder.
| Completeā
|| https://github.com/thammegowda/014-udhr-dataset[014-udhr-dataset^]
| Parallel sentence alignment from Universal Declaration of Human Rights corpus
| WIP/Incompleteā
|| https://github.com/thammegowda/013-nmt-codeswitching[013-nmt-codeswitching^]
| Done
| Completeā
| https://arxiv.org/abs/2210.05096[Paper^]
| https://github.com/thammegowda/012-macrobert[012-macrobert^]
| Macro sampling in BERT
| Didn't workā
| Maybe we should revisit| https://github.com/thammegowda/011-imb-learn[011-imb-learn^]
| Imbalanced machine learning: case studies in image recognition, text classification, and machine translation
| Incompleteā
| https://gowda.ai/011-imb-learn/[Docs^]| https://github.com/thammegowda/010-hyperparam-theory[010-hyperparam-theory^]
| A theory on hyperparameter
| Incompleteā
| Book idea! Needs more time. š| https://github.com/thammegowda/009-nmt-toolkits[009-nmt-toolkits^]
| A survey of NMT toolkits
| Incompleteā
| _Lost interest_| https://github.com/thammegowda/008-asr-eval-macro[008-asr-eval-macro^]
| Macro-averaged evaluation for automatic speech recognition
| Incompleteā
| (Some positive results, but needs more evidence)| https://github.com/thammegowda/007-mt-eval-macro[007-mt-eval-macro^]
| Macro Average: Rare Types are Important Too
| Completeā
| https://aclanthology.org/2021.naacl-main.90/[NAACL 2021^]| https://github.com/thammegowda/006-many-to-eng[006-many-to-eng]
| Many-to-English machine translation tools, data, and pretrained models
| Completeā
| https://aclanthology.org/2021.acl-demo.37/[ACL 2021 Demos^]. http://rtg.isi.edu/many-eng/[Demo page^]| https://github.com/thammegowda/005-nmt-imbalance[005-nmt-imbalance^]
|Finding the optimal vocabulary size for neural machine translation
| Complete ā
| https://aclanthology.org/2020.findings-emnlp.352/[EMNLP 2020 Findings^]| https://github.com/thammegowda/005-nmt-imbalance-old[005-nmt-imbalance-old^]
| Neural machine translation with imbalanced classes
| Complete ā
| Rejected from *ACL; https://arxiv.org/abs/2004.02334v1[Arxiv link^]| https://github.com/thammegowda/004-nmt-learning-curve[004-nmt-learning-curve^]
| NMT learning curve revisited.
| Complete ā
| Not published| https://github.com/thammegowda/image-forensics-MFSec17[image-forensics-MFSec17^]
| An Approach for Automatic and Large Scale Image Forensics
| Complete ā
| https://dl.acm.org/doi/abs/10.1145/3078897.3080536[MFSec 2017^]| https://github.com/uscdataScience/autoextractor[autoextractor^]
| Clustering webpages based on structure and style similarity
| Completeā
| https://ieeexplore.ieee.org/abstract/document/7785739[IEEE IRI 2016^]|===