Projects in Awesome Lists by davanstrien
A curated list of projects in awesome lists by davanstrien.
https://github.com/davanstrien/haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
Last synced: 22 Mar 2025
https://github.com/davanstrien/ocr-bench
Per-collection OCR leaderboards using VLM-as-judge
Last synced: 04 Apr 2026
https://github.com/davanstrien/flyswot
Command Line Interface for running 🤗 Transformers Image Classification locally
cli command-line-tool computer-vision glam huggingface-transformers image-classification python
Last synced: 17 Mar 2025
https://github.com/davanstrien/huggingface-tldr
Experimental tl;dr summaries for datasets on the Hugging Face Hub!
ai artificial-intelligence chrome-extension datasets huggingface
Last synced: 11 Apr 2025
https://github.com/davanstrien/python-introduction-for-digital-collections
Workshop materials on Python as part of a series of Library Carpentry workshops at the British Library
Last synced: 11 Apr 2025
https://github.com/davanstrien/auto_dataset_card
Wouldn't it be nice to generate parts of our dataset card automagically?
Last synced: 11 Apr 2025
https://github.com/davanstrien/llm-pubmed-query-generation-evaluation
LLM PubMed Query Generation Evaluation
Last synced: 27 Oct 2025
https://github.com/davanstrien/imagein
Find illustrations in historic books using computer vision
data-centric-ai fsdl fullstack-deeplearning
Last synced: 08 Mar 2026
https://github.com/davanstrien/hugit-cli
Push ImageFolder-style image datasets to the 🤗 Hub from the command line
cli datasets huggingface-datasets
Last synced: 05 Apr 2025
https://github.com/davanstrien/data-lifeboat-converter
Convert Data Lifeboats from Flickr Foundation to Hugging Face datasets and Spaces
Last synced: 02 Apr 2026
https://github.com/davanstrien/uk-web-archive-open-data-wellcome-project
Experimenting with web archive data
Last synced: 17 Mar 2025
https://github.com/davanstrien/altoxml2dataset
Convert ALTO XML datasets to a Hugging Face datasets-compatible format
Last synced: 29 Oct 2025
https://github.com/davanstrien/computer-vision-for-the-humanities-an-introduction-to-deep-learning-for-image-classification
This repository contains alternative setup instructions for a forthcoming Programming Historian lesson 'Computer Vision for the Humanities: an introduction to deep learning for image classification'.
computer-vision deep-learning glam
Last synced: 26 Aug 2025
https://github.com/davanstrien/digitised-books-ocr-and-metadata
British Library Digitised Books c. 1510 - c. 1900: JSONL (OCR derived text & metadata)
Last synced: 17 Mar 2025
https://github.com/davanstrien/computer-vision-dhnordic-2020-workshop
An introduction to computer vision for working with maps: workshop at DHN 2020
Last synced: 17 Mar 2025
https://github.com/davanstrien/programming-historian-computer-vision-lessons-submission
Materials for Programming Historian Lessons on Computer Vision
Last synced: 17 Mar 2025
https://github.com/davanstrien/website-classification
Trying to classify web archives using metadata...
Last synced: 27 Oct 2025
https://github.com/davanstrien/introduction-to-digital-scholarship-and-open-research
Workshop materials for a course introducing digital scholarship and open research
digital-scholarship open-access open-data open-scholarship open-science reproducible-experiments reproducible-research reproducible-science
Last synced: 03 Feb 2026
https://github.com/davanstrien/flyswot-gym
🦾 training flyswot models (fairly) easily
Last synced: 23 Jul 2025
https://github.com/davanstrien/nlpwithpytorchbook
NLP with PyTorch notes/notebooks
Last synced: 26 Sep 2025
https://github.com/davanstrien/open-access-dissertation-literature-notes
A collection of notes for the literature review section of a dissertation on Open Access and the funding of APCs
Last synced: 03 Jan 2026