https://github.com/gbv/k10plus-cleanup
Scripts to clean up PICA records in K10plus catalogue
- Host: GitHub
- URL: https://github.com/gbv/k10plus-cleanup
- Owner: gbv
- Created: 2023-08-16T08:33:45.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-28T10:43:32.000Z (over 1 year ago)
- Last Synced: 2024-11-10T03:18:04.109Z (3 months ago)
- Language: Shell
- Size: 9.77 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# K10plus cleanup
This repository contains scripts to clean up PICA records in the K10plus catalogue.
Analysis of the K10plus catalogue with **QA Catalogue** reveals errors such as
typos and wrongly used fields. Some of these can be fixed automatically using
Catmandu and [Catmandu::PICA].

## Installation

Requires Catmandu and a current version of [Catmandu::PICA] as listed in `cpanfile`. Install with:

    cpanm --installdeps .

## Usage
Create or reuse a `.fix` file in the Catmandu fix language. Either call `catmandu`
as documented with a selected data source (e.g. records retrieved via unAPI or
SRU, or a plain PICA dump file) or use the tiny shell script `fix`:

    ./fix $FIX_FILE [$INPUT_FILE] [$OUTPUT_FILE]

The default output file name is the fix file name with its extension changed to `.patch`.
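
As a rough illustration of the defaults described in this section, the following sketch derives an output name from a fix file name and picks an input format from the file extension. The function names `default_output` and `input_format` and the returned format labels are hypothetical and are not taken from the actual `fix` script, which may work differently.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the documented defaults;
# they are not part of the repository's `fix` script.

# Derive the default output name: replace a trailing ".fix" with ".patch".
default_output() {
    printf '%s\n' "${1%.fix}.patch"
}

# Map an input file name to a format label (labels are illustrative only).
input_format() {
    case "$1" in
        *.dat.gz) echo "normalized-gz" ;;  # gzipped normalized PICA
        *.dat)    echo "normalized" ;;     # normalized PICA
        *)        echo "plain" ;;          # everything else: PICA Plain
    esac
}
```

With these definitions, `default_output bk-74.50.fix` prints `bk-74.50.patch`. The `*.dat.gz` pattern has to come before `*.dat` only for readability here, since a name ending in `.dat.gz` does not match `*.dat` anyway.
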
The input format is detected from the file extension:
- `*.dat.gz` is gzipped [normalized PICA]
- `*.dat` is [normalized PICA]
- everything else is [PICA Plain]

Script `kxpo` can be used to get K10plus records via SRU in online expansion form:

    ./kxpo 1543420281                        # get via PPN
    ./kxpo pica.isb=9783894017316 --total 1  # get first matching record by ISBN

To validate a Patch file and give statistics, run (given `$patchfile` has extension `.patch`):

    picadata count $patchfile
    picadata fields $patchfile

## Examples

See `*.fix` files in this repository for examples:
fix file | description | JIRA issue
---------|-------------|------------
[ddc-edition.fix](ddc-edition.fix) | unify DDC edition statement in `045F$e` |
[bk-74.50.fix](bk-74.50.fix) | deleted BK class `74.50X` |
[bk-remove-invalid.fix](bk-remove-invalid.fix) | remove invalid BK notation `XX.XX` |

## See also

- [Einführung in die Verarbeitung von PICA-Daten](https://pro4bib.github.io/pica/) (in German)
  with a section on editing PICA with Catmandu::PICA

[Catmandu::PICA]: https://metacpan.org/release/Catmandu-PICA
[normalized PICA]: https://format.gbv.de/pica/normalized
[PICA Plain]: https://format.gbv.de/pica/plain