https://github.com/obrad1984/dspace-v510-metadata-simplifier
This Python script enriches a Scopus-exported CSV file: it skips duplicates via the DSpace API, fetches additional data from the Scopus API, formats the result for repository import, and logs its actions. A GUI and an automated workflow streamline accurate, duplicate-free metadata preparation.
dspace dspace-api dublin-core metadata-extraction repository-management scopus-api
- Host: GitHub
- URL: https://github.com/obrad1984/dspace-v510-metadata-simplifier
- Owner: obrad1984
- License: apache-2.0
- Created: 2024-12-14T16:04:52.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-11-17T19:26:36.000Z (7 months ago)
- Last Synced: 2025-11-17T21:18:11.553Z (7 months ago)
- Topics: dspace, dspace-api, dublin-core, metadata-extraction, repository-management, scopus-api
- Language: Python
- Homepage: https://doi.org/10.5281/zenodo.17634681
- Size: 687 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# dspace-v510-metadata-simplifier
(GitHub repository)
# DSpace Repository Metadata Entry Tools
## Intro
This Python script is designed to simplify and streamline metadata entry for DSpace v5.10 repositories. Manual metadata entry in DSpace can be complex and error-prone; this tool automates common tasks[...]
Although DSpace v5.10 is now deprecated, many institutions still rely on it for their digital collections. This script is dedicated solely to DSpace v5.10, addressing its unique requirements and makin[...]
We invite the community to fork this code, adapt it to their own needs, and contribute improvements. Together, we can make metadata entry for DSpace v5.10 repositories effortless and efficient!
## Description
This Python script is designed to facilitate the preparation of bibliographic metadata for import into the institutional repository by transforming and enriching data from a CSV file exported from Sco[...]
1. User Interactivity via GUI:
The script uses a graphical file dialog (via Tkinter) to let the user select the input CSV file (containing publication metadata, including DOIs) and specify the output file location for the processed[...]
2. Duplicate Checking:
For each row in the input CSV, the script checks if the DOI (Digital Object Identifier) already exists in the VinaR repository by sending a POST request to the VinaR REST API. If the DOI already exist[...]
3. Metadata Enrichment:
For every DOI that is not a duplicate, the script queries the Scopus API (via Elsevier's REST API) to retrieve additional metadata, such as authors, page numbers, publication date, abstract, ISSN/IS[...]
4. Data Transformation and Output:
The retrieved and processed metadata is formatted according to the requirements of the VinaR repository and written as a new row in the output CSV file. The output CSV includes specific headers needed[...]
5. Error Handling and Logging:
The script prints informative messages to the console, such as when DOIs are found to be duplicates, when rows are skipped, or if there are network/API errors.
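
The five steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names, the `DOI` input column, and the `dc.identifier.doi` output header are assumptions, and the repository lookup and Scopus query are passed in as plain callables so that no real endpoint or request format is implied. Skipping a row when enrichment fails is also a choice made for this sketch only.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def choose_files():
    """Step 1 (hypothetical helper): ask for input/output paths with Tkinter dialogs."""
    import tkinter as tk
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialogs
    src = filedialog.askopenfilename(
        title="Select Scopus CSV", filetypes=[("CSV files", "*.csv")]
    )
    dst = filedialog.asksaveasfilename(
        title="Save processed CSV as", defaultextension=".csv"
    )
    root.destroy()
    return src, dst


def process_rows(
    rows: Iterable[Dict[str, str]],
    doi_exists: Callable[[str], bool],
    enrich: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Steps 2, 3 and 5: skip duplicates, enrich the rest, report what happened."""
    output = []
    for row in rows:
        doi = (row.get("DOI") or "").strip()  # "DOI" column name is an assumption
        if not doi:
            print("Skipping row without a DOI")
            continue
        if doi_exists(doi):  # stands in for the repository duplicate check
            print(f"Duplicate DOI, skipping: {doi}")
            continue
        try:
            extra = enrich(doi)  # stands in for the Scopus lookup
        except Exception as exc:  # e.g. a network or API error
            print(f"Could not enrich {doi}: {exc}")
            continue
        output.append({**row, **extra})
    return output


def write_csv(records, fieldnames, handle):
    """Step 4: write only the columns the repository import expects."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)


if __name__ == "__main__":
    # In-memory demo with made-up DOIs; no network access or GUI needed.
    sample = io.StringIO("Title,DOI\nPaper A,10.1000/a\nPaper B,10.1000/b\n")
    already_in_repository = {"10.1000/b"}
    records = process_rows(
        csv.DictReader(sample),
        doi_exists=lambda doi: doi in already_in_repository,
        enrich=lambda doi: {"dc.identifier.doi": doi},
    )
    out = io.StringIO()
    write_csv(records, ["Title", "dc.identifier.doi"], out)
    print(out.getvalue())
```

Passing the two lookups in as functions keeps the row-handling logic testable offline; in a real run they would wrap the HTTP calls to the repository and to Scopus.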
## .env.example File
The `.env.example` file contains template environment variable definitions required to run the script, such as API keys and endpoint URLs.
**Do not store real credentials in `.env.example`.** Instead, copy `.env.example` to a new file named `.env` and fill in your actual credentials and settings there.
The script automatically loads variables from your `.env` file to securely provide access to APIs and services.
**Instructions:**
1. Copy `.env.example` to a new file named `.env` in the root project directory.
2. Edit the `.env` file and enter your API keys, access tokens, and other configuration details as indicated.
3. Keep your `.env` file private and do not commit it to version control (e.g., GitHub).
4. The script will use the variables from your `.env` file during execution.
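
As a rough illustration of how variables from a `.env` file can reach the script, the sketch below parses `KEY=VALUE` lines with the standard library only. It is not necessarily how this script loads its settings (a package such as `python-dotenv` is a common alternative), and the variable names `EXAMPLE_API_KEY` and `EXAMPLE_API_URL` are placeholders; use the names listed in `.env.example`.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Read KEY=VALUE lines into os.environ without overriding existing variables."""
    loaded = {}
    env_file = Path(path)
    if not env_file.is_file():
        return loaded
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Ignore blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


def require(name):
    """Return a setting or fail early with a readable message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing setting {name}; add it to your .env file")
    return value


if __name__ == "__main__":
    load_env()
    # Placeholder name: replace with a variable defined in .env.example.
    try:
        print("API URL:", require("EXAMPLE_API_URL"))
    except RuntimeError as exc:
        print(exc)
```

Failing early on a missing variable gives a clearer message than an authentication error from the API later in the run.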
### Approved
The script was tested and approved on [VinaR](https://vinar.vin.bg.ac.rs/), the institutional repository of the Vinča Institute of Nuclear Sciences, University of Belgrade.
## Copyright 2025 Obrad Vučkovac
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)