https://github.com/davidberenstein1957/davidberenstein1957

👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, 🏆 Committing

### Hi there 👋
From failing to study medicine ➡️ BSc industrial engineer ➡️ MSc computer scientist. \
Life can be strange, so better enjoy it. \
I'm sure I do by: 👨🏽‍🍳 Cooking, 👨🏽‍💻 Coding, 🏆 Committing.

# Conference slides 📖

- No data? No problem! - [synthetic data to the rescue](https://www.canva.com/design/DAGViIBmdic/yUJ02U4pP9qTLvChf--gVg/edit?utm_content=DAGViIBmdic&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton)
- Practical AI Podcast - [Towards high-quality (maybe synthetic) datasets](https://practicalai.fm/290)
- Code Together Podcast | Intel Software - [Scaling LLM Datasets with Less Effort Using Argilla](https://www.youtube.com/watch?v=9kOSjMFxCCc)
- Mastering LLMs - [Creating, curating, and cleaning data for LLMs](https://docs.google.com/presentation/d/12n-_ivhTQQpeTKAIvmuxnUxkJ19zvtJzKBwvZn-t8rQ/edit?usp=sharing)
- 🧼 From GPU-poor to data-rich - [data quality practices for LLM fine-tuning](https://www.canva.com/design/DAGF-GwfVmI/ryeuPyHCz3WZl8P8MIEi_A/edit?utm_content=DAGF-GwfVmI&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton)
- Deeplearning.ai LLM workshop - [get started with Argilla for human- and distilabel for AI feedback](https://youtube.com/live/JNdRV7CDXKM?feature=shared)
- NLP Healthcare Summit 2023 - [Smart Shortcuts for Bootstrapping a Healthcare NER Project](https://youtu.be/t68kC5Dk4LA)
- Anyscale Ray Europe Meetup - [Smart shortcuts for Bootstrapping a Text Classification project](https://youtu.be/tdGvtMv8IiE)

# employers 👨🏽‍💻

- [Hugging Face 🤗](https://www.huggingface.co/) (2024-current) - The AI community building the future
- [Argilla](https://www.argilla.io/) (2022-current) - data annotation and monitoring for enterprise NLP
- [Pandora Intelligence](https://www.pandoraintelligence.com/) (2020-2022) - an independent intelligence company, specialized in security risks

# open source ⭐️

## maintainer 🤓

- [observers](https://github.com/cfahlgren1/observers) - A Lightweight Library for AI Observability
- [dataset-viber](https://github.com/davidberenstein1957/data-viber) - Data viber is your chill repo for data collection and vibe checks
- [concise-concepts](https://github.com/davidberenstein1957/concise-concepts) - a word similarity approach to few-shot NER
- [fast-sentence-transformers](https://github.com/davidberenstein1957/fast-sentence-transformers) - simply, faster, sentence-transformers
- [classy-classification](https://github.com/davidberenstein1957/classy-classification) - a quick and dirty few-shot text classification solution
- [crosslingual-coreference](https://github.com/davidberenstein1957/crosslingual-coreference) - a multi-lingual CoRef resolver using cross-lingual training
- [adept-augmentations](https://github.com/argilla-io/adept-augmentations) - a Python library aimed at dissecting and augmenting NER training data
- [spacy-setfit](https://github.com/davidberenstein1957/spacy-setfit) - a Python library for easy SetFit usage in spaCy

## contributions 🫱🏾‍🫲🏼

- [Haystack](https://github.com/deepset-ai/haystack) - small feature and CI/CD updates
  - [InMemoryDatabase](https://github.com/deepset-ai/haystack/pull/7888) - serialization and to/from disk methods
  - [GitHub Actions](https://github.com/deepset-ai/haystack/pull/7890) - caching for pip environment
- [spaCy](https://github.com/explosion/spaCy) - several additions to the spacy-universe
  - [spanmarker](https://github.com/tomaarsen/SpanMarkerNER/pull/16) - added `.pipe()` method to spaCy integration
  - [spacy-dbpedia-spotlight](https://github.com/MartinoMensio/spacy-dbpedia-spotlight) - added batch processing functionality
  - [spacy-fishing](https://github.com/Lucaterre/spacyfishing) - added batch processing functionality and bug fixes
  - [spacy-opentapioca](https://github.com/UB-Mannheim/spacyopentapioca) - added batch processing functionality
- [streamlit-url-fragment](https://github.com/ktosiek/streamlit-url-fragment) - resolved Python versioning issues
- [allennlp-models](https://github.com/allenai/allennlp-models) - added batch processing functionality
- [mutate](https://github.com/infinitylogesh/mutate) - resolved Python versioning issues and added `PyPI` support
- [rebel](https://github.com/Babelscape/rebel) - added batch processing functionality
- [trl](https://github.com/huggingface/trl/pull/665) - updated RLHF documentation for `PPOTrainer`

# volunteering 🌍

- [Bonfari](https://bonfari.nl/) - small- to medium-scale sustainable projects in Gambia 🇬🇲
- [510 red-cross](https://www.510.global/) - occasional projects to improve humanitarian aid with data

# Contacts

[![Gmail](https://img.shields.io/badge/Gmail-D14836?style=for-the-badge&logo=gmail&logoColor=white)](mailto:[email protected])
[![LinkedIn](https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/davidberenstein/)
[![Twitter](https://img.shields.io/badge/Twitter-1DA1F2?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/davidberenstei)