https://github.com/phette23/scrape_rda_vocabs
RDA content, media, & carrier vocabs
https://github.com/phette23/scrape_rda_vocabs
cataloging koha marc rda
Last synced: 4 days ago
JSON representation
RDA content, media, & carrier vocabs
- Host: GitHub
- URL: https://github.com/phette23/scrape_rda_vocabs
- Owner: phette23
- Created: 2025-01-31T20:20:02.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-06T19:48:55.000Z (about 1 year ago)
- Last Synced: 2025-06-06T20:55:22.543Z (about 1 year ago)
- Topics: cataloging, koha, marc, rda
- Language: JavaScript
- Homepage:
- Size: 22.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
README
# Scrape RDA media vocabs
Scrape the RDA MARC 336/7/8 content/media/carrier vocabularies. I should have used an already structured source of these vocabularies before scraping them, like [the RDA Registry](https://www.rdaregistry.info/termList/), though unless I'm missing something that doesn't show the relationship between the codes and the terms. LC's [Linked Data Service](https://id.loc.gov/) does, though ([example](https://id.loc.gov/vocabulary/contentTypes/crd.html)).
## Setup & Use
`pnpm install`. Only needed for scrape.js, other scripts have no dependencies.
node 22+ because of the JSON import in koha-sql.js
`node scrape.js` to get data off of LC's website and into a JSON file.
`node koha-sql.js` to take the JSON and turn it into SQL `INSERT` statements for Koha authorized values.
`node csv vocabs.json` to convert the JSON to CSV.
`node yaml vocabs.json` to convert the JSON to Koha Authorized Values YAML.