Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/davidberenstein1957/davidberenstein1957
👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, Committing
Last synced: 10 days ago
- Host: GitHub
- URL: https://github.com/davidberenstein1957/davidberenstein1957
- Owner: davidberenstein1957
- Created: 2022-10-02T07:53:42.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-22T17:06:02.000Z (8 months ago)
- Last Synced: 2024-05-22T17:31:57.158Z (8 months ago)
- Size: 65.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
### Hi there
From failing to study medicine ➡️ BSc industrial engineer ➡️ MSc computer scientist. \
Life can be strange, so better enjoy it. \
I'm sure I do by: 👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, Committing.

# Conference slides
- No data? No problem! - [synthetic data to the rescue](https://www.canva.com/design/DAGViIBmdic/yUJ02U4pP9qTLvChf--gVg/edit?utm_content=DAGViIBmdic&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton)
- Practical AI Podcast - [Towards high-quality (maybe synthetic) datasets](https://practicalai.fm/290)
- Code Together Podcast | Intel Software - [Scaling LLM Datasets with Less Effort Using Argilla](https://www.youtube.com/watch?v=9kOSjMFxCCc)
- Mastering LLMs - [Creating, curating, and cleaning data for LLMs](https://docs.google.com/presentation/d/12n-_ivhTQQpeTKAIvmuxnUxkJ19zvtJzKBwvZn-t8rQ/edit?usp=sharing)
- 🧼 From GPU-poor to data-rich - [data quality practices for LLM fine-tuning](https://www.canva.com/design/DAGF-GwfVmI/ryeuPyHCz3WZl8P8MIEi_A/edit?utm_content=DAGF-GwfVmI&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton)
- Deeplearning.ai LLM workshop - [get started with Argilla for human- and distilabel for AI feedback](https://youtube.com/live/JNdRV7CDXKM?feature=shared)
- NLP Healthcare Summit 2023 - [Smart Shortcuts for Bootstrapping a Healthcare NER Project](https://youtu.be/t68kC5Dk4LA)
- Anyscale Ray Europe Meetup - [Smart shortcuts for Bootstrapping a Text Classification project](https://youtu.be/tdGvtMv8IiE)

# employers 👨🏽‍💻
- [Hugging Face 🤗](https://www.huggingface.co/) (2024-current) - The AI community building the future
- [Argilla](https://www.argilla.io/) (2022-current) - data annotation and monitoring for enterprise NLP
- [Pandora Intelligence](https://www.pandoraintelligence.com/) (2020-2022) - an independent intelligence company specialized in security risks

# open source ⭐️
## maintainer
- [observers](https://github.com/cfahlgren1/observers) - A Lightweight Library for AI Observability
- [dataset-viber](https://github.com/davidberenstein1957/data-viber) - Data viber is your chill repo for data collection and vibe checks
- [concise-concepts](https://github.com/davidberenstein1957/concise-concepts) - a word similarity approach to few-shot NER
- [fast-sentence-transformers](https://github.com/davidberenstein1957/fast-sentence-transformers) - simply, faster, sentence-transformers
- [classy-classification](https://github.com/davidberenstein1957/classy-classification) - a quick and dirty few-shot text classification solution
- [crosslingual-coreference](https://github.com/davidberenstein1957/crosslingual-coreference) - a multi-lingual CoRef resolver using cross-lingual training
- [adept-augmentations](https://github.com/argilla-io/adept-augmentations) - a Python library aimed at dissecting and augmenting NER training data
- [spacy-setfit](https://github.com/davidberenstein1957/spacy-setfit) - a Python library aimed at facilitating easy SetFit usage in spaCy

## contributions 🫱🏾‍🫲🏼
- [Haystack](https://github.com/deepset-ai/haystack) - small feature and CI/CD updates
  - [InMemoryDatabase](https://github.com/deepset-ai/haystack/pull/7888) - Serialization + to and from disk methods
  - [GitHub Actions](https://github.com/deepset-ai/haystack/pull/7890) - caching for pip environment
- [spaCy](https://github.com/explosion/spaCy) - several additions to the spacy-universe
  - [spanmarker](https://github.com/tomaarsen/SpanMarkerNER/pull/16) - added `.pipe()` method to spaCy integration
  - [spacy-dbpedia-spotlight](https://github.com/MartinoMensio/spacy-dbpedia-spotlight) - added a batch processing functionality
  - [spacy-fishing](https://github.com/Lucaterre/spacyfishing) - added a batch processing functionality + bug fixes
  - [spacy-opentapioca](https://github.com/UB-Mannheim/spacyopentapioca) - added a batch processing functionality
- [streamlit-url-fragment](https://github.com/ktosiek/streamlit-url-fragment) - resolved Python versioning issues
- [allennlp-models](https://github.com/allenai/allennlp-models) - added a batch processing functionality
- [mutate](https://github.com/infinitylogesh/mutate) - resolved Python versioning issues and added `PyPI` support
- [rebel](https://github.com/Babelscape/rebel) - added a batch processing functionality
- [trl](https://github.com/huggingface/trl/pull/665) - updated RLHF documentation for `PPOTrainer`

# volunteering
- [Bonfari](https://bonfari.nl/) - small- to medium-scale sustainable projects in the Gambia 🇬🇲
- [510 red-cross](https://www.510.global/) - occasional projects to improve humanitarian aid with data

# Contacts
[![Gmail](https://img.shields.io/badge/Gmail-D14836?style=for-the-badge&logo=gmail&logoColor=white)](mailto:[email protected])
[![LinkedIn](https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/davidberenstein/)
[![Twitter](https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/davidberenstei)