Projects in Awesome Lists by caltechlibrary
A curated list of projects in awesome lists by caltechlibrary.
https://github.com/caltechlibrary/handprint
Apply different text recognition services to images of handwritten documents.
amazon-rekognition amazon-textract google-api google-cloud google-vision-api handwritten-text-recognition htr library-automation machine-learning microsoft-azure ocr optical-character-recognition python
Last synced: 07 Apr 2025
https://github.com/caltechlibrary/datatools
A set of tools for working with JSON, CSV and Excel workbooks
csv data-munging excel-workbook json shell-scripting structured-data xlsx
Last synced: 09 Apr 2025
https://github.com/caltechlibrary/dibs
DIBS is an implementation of a basic controlled digital lending (CDL) system using IIIF to make scanned books available for time-limited viewing.
caltech-library cdl controlled-digital-lending folio folio-lsp iiif library-management library-systems python tind tind-lsp
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/waystation
Automatically archive your repository's GitHub Pages in the Wayback Machine.
archiving automation documentation github-action github-actions github-automation github-pages internet-archive preservation wayback-machine
Last synced: 12 Jun 2025
https://github.com/caltechlibrary/dataset
dataset is a command line tool, Go package, shared library and Python package for working with JSON objects as collections
Last synced: 29 May 2026
https://github.com/caltechlibrary/commonpy
Collection of common Python utility functions and classes used in other Caltech Library programs.
data-utilities file-utilities file-utils network-utilities python python-functions python3 utility-classes utility-function
Last synced: 11 Aug 2025
https://github.com/caltechlibrary/bun
A Python package for a basic CLI and GUI user interface
cli command-line graphical-user-interface gui python python3 user-interface
Last synced: 07 Mar 2026
https://github.com/caltechlibrary/baler
Bad link reporter for GitHub repositories
automation github-action github-actions issues links markdown reporting testing url urls
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/documentarist
Process Caltech Archives' digital documents and photos, and annotate each page or image with information about its contents
annotation annotator document-classification document-image-classification document-image-processing handwriting-recognition handwritten-character-recognition handwritten-mathematical-symbols handwritten-text-recognition htr image-classification image-recognition image-tagging machine-learning math-recognition tagging
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/iga
IGA is the InvenioRDM GitHub Archiver, a standalone program as well as a GitHub Action that lets you automatically archive GitHub software releases in an InvenioRDM repository.
archives archiving automation code-preservation github-action github-actions github-automation invenio invenio-rdm preservation reproducibility reproducible-research research-data-management research-software software-archiving software-preservation source-code-archiving
Last synced: 24 Jun 2025
https://github.com/caltechlibrary/caltechdata_api
Python library for using the CaltechDATA API
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/foliage
Foliage is the FOLIo chAnGe Editor, a tool to do bulk changes in FOLIO using the network API.
bulk-operation editor folio folio-lsp
Last synced: 27 Oct 2025
https://github.com/caltechlibrary/pokapi
Simple Python object-oriented interface for getting records from FOLIO
api folio folio-lsp library-systems python python-library
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/caltechdata
The CaltechDATA InvenioRDM source code
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/codemeta2cff
GitHub Action converting a codemeta file to CITATION.cff
Last synced: 06 Apr 2026
https://github.com/caltechlibrary/newt
Newt is a microservice for integrating Postgres+PostgREST and Pandoc
data-router microservice pandoc postgresql postgrest
Last synced: 03 Sep 2025
https://github.com/caltechlibrary/cloud-init-examples
This repository includes an example of cloud-init YAML files for use with multipass VMs.
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/crossrefapi
A fork of a Go package for working politely with the CrossRef API.
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/pairtree
A simple encoder/decoder for converting object identifiers into a Pair Tree Path (path)
file-system-layout library-science pairtree
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/caltechauthors
The CaltechAUTHORS InvenioRDM source code
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/epxml_to_datacite
Transform Eprints XML to DataCite XML and mint DOIs in Eprints repositories
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/template
Template repository for software projects by the Caltech Library
best-practices bestpractices github-template github-templates template-project
Last synced: 10 Aug 2025
https://github.com/caltechlibrary/coif
Cover image finder
api book book-cover book-covers book-jacket books cover cover-art cover-image isbn python python3
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/eprints2archives
Send records from an EPrints server to the Internet Archive and other web archives
archiving eprints internet-archive memento preservation python terminal web-archives web-archiving
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/boffo
Boffo is an add-on for Google Sheets written by the Caltech Library. It lets you select item barcodes in a spreadsheet and retrieve information about the item records from a FOLIO server.
folio folio-lsp google-apps-script google-scripts google-sheet google-sheets javascript library-automation
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/eprints2bags
Download records and documents from an EPrints server and put them in BagIt format.
archives bagit digital-repositories eprints preservation
Last synced: 06 Oct 2025
https://github.com/caltechlibrary/rdmworkbook
bookdown files for "The Research Data Management Workbook"
Last synced: 11 Aug 2025
https://github.com/caltechlibrary/namaste
Go package and command line implementation of "NAMe AS TExt" metadata embedding for directories.
namaste name-as-text repository-utilities
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/urlup
Find the ultimate destination for URLs after following redirections.
http redirection redirects url
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/corsproxy
Simple CORS proxy server suitable for use as a system daemon on CentOS/RHEL systems
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/splitit
Split range values produced by Caltech.tind.io in spreadsheet output
caltech-library holdings library-automation library-management-system tind
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/convert_codemeta
Convert and validate codemeta files using crosswalk
Last synced: 27 Aug 2025
https://github.com/caltechlibrary/pubarchiver
Package up microPublication.org and other journals for archiving into Portico and PMC
archive archiving crossref crossref-data datacite datacite-metadata jats jats-xml journal pmc portico preservation publications
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/git-desktop
Modified version of the Software Carpentry git-novice lesson that uses GitHub Desktop
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/libguine
Caltech Library customizations for LibGuides CMS
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/eprinttools
EPrintTools is a Go package, command line utilities and a service for working with EPrints 3.3.x EPrint XML and REST API
Last synced: 05 Jan 2026
https://github.com/caltechlibrary/collaborator_reports
Generate collaborator reports from data sources
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/sidetrack
Simple debug tracing package for Python, with optimization support.
debugging logging python python3 software-testing telemetry tracing
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/persistent_url_resolver
This repository contains a new version of Caltech Library's Persistent URL Resolver, based on AWS S3.
Last synced: 08 Mar 2026
https://github.com/caltechlibrary/dataciteapi
A Golang package and command line utility for working with the public DataCite API
Last synced: 07 Feb 2026
https://github.com/caltechlibrary/wsfn
Go package for standardizing web service functionality across our library's Go projects
Last synced: 15 Jun 2026
https://github.com/caltechlibrary/liblog
Experiment in tracking website content changes
Last synced: 20 Mar 2026
https://github.com/caltechlibrary/inveniordm-migrate
Scripts to migrate content into Invenio RDM
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/acacia
Automated CaltechAUTHORS Catalog Ingest Agent
Last synced: 02 Mar 2026
https://github.com/caltechlibrary/2019-01-22-python-workshop
Workshop page for CPA Python Workshop
Last synced: 27 Aug 2025
https://github.com/caltechlibrary/popstar
Phone-Oriented Processing SofTware for ARchives
archiving digitization document-processing iphone libraries scanning shortcuts-app workflow-automation
Last synced: 20 Mar 2026
https://github.com/caltechlibrary/py_dataset
Python package of dataset (https://github.com/caltechlibrary/dataset) for working with JSON objects as collections on disk
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/caltechlibrary.github.io
Caltech Library's Digital Library Development sandbox.
Last synced: 14 Apr 2025
https://github.com/caltechlibrary/py-cli-template
GitHub template project for non-web Python application projects. To use this, DO NOT CLONE OR FORK this repository; click on "Use this template". After it's used to create a new repo, this will run a GitHub Actions workflow to update files and directories, so give it a minute and refresh your browser to see the finished result.
boilerplate generator github-actions project-template python templates
Last synced: 27 Aug 2025
https://github.com/caltechlibrary/wos_reports
Scripts to generate reports from Web of Science
Last synced: 30 Jul 2025
https://github.com/caltechlibrary/searchtools
A Python3 package for working with Elasticsearch and LunrJS.
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/irdm_harvester
Automatically harvest publications for an InvenioRDM repository
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/cl-js
CL-js is a JavaScript library for integration with feeds and other library supplied services. It provides functionality through a global CL object.
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/dibsiiif
Scripts to support the Caltech Library DIBS application
controlled-digital-lending iiif
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/htr-test-cases
Images of documents for testing HTR.
handwritten-text-recognition htr machine-learning ocr text-recognition
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/caltechdata_backup
Back up data from CaltechDATA (Invenio 3) repository
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/html_footer
Universal footer code to be embedded in various systems for a consistent look.
Last synced: 24 Oct 2025
https://github.com/caltechlibrary/codemeta-pandoc-examples
This repository describes how to generate a CITATION.cff, about.md and installer.sh from a codemeta.json file using Pandoc.
citation-cff codemeta installer-automation installer-script pandoc pandoc-templates
Last synced: 20 Mar 2026
https://github.com/caltechlibrary/metagenesys
Take the information from a Python setup.cfg file and generate a codemeta.json file
Last synced: 20 Mar 2026
https://github.com/caltechlibrary/adage
Authors Dimensional Analysis and General Exploration
Last synced: 22 Jan 2026
https://github.com/caltechlibrary/r-carpentry
Repository for Carpentry Lessons based on R
Last synced: 23 Jan 2026
https://github.com/caltechlibrary/manage_dois
Scripts for managing custom DOIs
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/command-line-for-librarians
MMWConf presentation Fall 2016
Last synced: 22 Jan 2026
https://github.com/caltechlibrary/caltechdata_map
Map interface for CaltechDATA files
Last synced: 22 Jan 2026
https://github.com/caltechlibrary/caltechpeople_demo
Python web service demo of unified search across Archives, Thesis and Authors for Caltech People.
Last synced: 11 Feb 2026
https://github.com/caltechlibrary/irdmtools
A Go and Python package for working with InvenioRDM repositories.
Last synced: 02 Apr 2026
https://github.com/caltechlibrary/doitools
A Go package for working with DOIs
Last synced: 31 Oct 2025
https://github.com/caltechlibrary/trinomial
A very simple name anonymization library
Last synced: 09 Oct 2025
https://github.com/caltechlibrary/library-shell-curl-and-api
An intermediate exploration of Bash, curl and working with content from web APIs
Last synced: 02 Feb 2026
https://github.com/caltechlibrary/tccon-caltechdata
Scripts for uploading TCCON data to CaltechDATA
Last synced: 11 Oct 2025
https://github.com/caltechlibrary/dotpath
Go, Python and TypeScript modules for working with maps, dicts and objects, accessing them using dot path notation.
Last synced: 05 Jul 2025
https://github.com/caltechlibrary/2020-03-09-python-workshop
Workshop page for the 2020 CPA Python Workshop
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/archives-hale-processing
Scripts and CSV files for processing the George Ellery Hale Papers for the Caltech Archives.
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/caltechdata_migrate
Assorted scripts for migrating content to CaltechDATA
Last synced: 27 Aug 2025
https://github.com/caltechlibrary/dataset-instruction
Instructional content for the dataset package
Last synced: 11 Feb 2026
https://github.com/caltechlibrary/book-template
A Bookdown book template with Caltech customizations
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/caltechdata_plot
Demo plotting tool that uses a Bokeh server to produce an interactive plot by calling the CaltechDATA (Invenio 3) API
bokeh bokeh-server invenio plot
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/cold
Controlled Object Lists and Datum
library-systems metadata-management
Last synced: 14 Oct 2025
https://github.com/caltechlibrary/caltechdatasplash
Splash page for CaltechDATA
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/digital-signage
Project to manage content for Caltech Library's digital signage
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/irdm-queue-portal
A basic view of content currently in an InvenioRDM community queue
Last synced: 14 Oct 2025
https://github.com/caltechlibrary/refoliate
REstore FOLIo sAved insTancEs
folio library-automation library-management
Last synced: 13 Jul 2025
https://github.com/caltechlibrary/dataset-demo
This is a demo of dataset for T2T3 and OR 2017
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/docuserve_analysis
R Scripts for processing DocuServe data.
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/glaser-1
Donald Glaser: Education at Caltech
Last synced: 13 Apr 2025
https://github.com/caltechlibrary/vaugment
Tracking Volunteer Augmentation of ArchivesSpace Records
Last synced: 04 Jul 2025