https://github.com/sts10/catalogers_toolkit
An experiment in adding a cleaning layer on top of pymarc
- Host: GitHub
- URL: https://github.com/sts10/catalogers_toolkit
- Owner: sts10
- License: mit
- Created: 2026-04-24T20:29:53.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-15T18:58:38.000Z (about 1 month ago)
- Last Synced: 2026-05-15T20:04:40.560Z (about 1 month ago)
- Language: mIRC Script
- Homepage:
- Size: 43 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.markdown
- License: LICENSE
README
# Cataloger's Toolkit
An experiment in adding a data-cleaning layer on top of [pymarc](https://gitlab.com/pymarc/pymarc), aimed at collection assessment.
Note that pymarc is NOT currently declared as a dependency of this project, so it must be installed and imported separately.
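Because pymarc has to be installed separately, it can help to confirm that both packages are importable before running anything else. The following is only a minimal sketch using the standard library: the `missing_packages` helper and the `REQUIRED` list are hypothetical names for illustration and are not part of this project.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Both package names come from this README; the helper itself is hypothetical.
REQUIRED = ["pymarc", "catalogers_toolkit"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages, install them first: " + ", ".join(missing))
    else:
        print("pymarc and catalogers_toolkit are both importable.")
```

If `pymarc` shows up as missing, `pip install pymarc` should resolve it.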
Run `python examples/example.py` for an example of what this code can do.
## To use in a script
First, install the package from GitHub:
```sh
pip install git+https://github.com/sts10/catalogers_toolkit.git
```
Then import the CRecord class:
```python
from catalogers_toolkit import CRecord
```
### When using a notebook such as Google Colab
For Google Colab, I think you'll want to start your notebook with something like this:
```python
from google.colab import drive

# In a notebook cell, shell commands such as pip need a leading "!"
!pip install pymarc
!pip install git+https://github.com/sts10/catalogers_toolkit.git

from pymarc import MARCReader
from catalogers_toolkit import CRecord

# Set a variable called marc_file
with open(marc_file, "rb") as fh:
    reader = MARCReader(fh, to_unicode=True, force_utf8=True, utf8_handling="replace")
    for record in reader:
        c_record = CRecord(record)
        print("OCN: " + c_record.ocn)
```
See the `/examples` directory for more example usage.