https://github.com/obrad1984/dspace-v510-metadata-simplifier

This Python script enriches a Scopus-exported CSV file: it skips duplicates via the DSpace API, fetches additional data from the Scopus API, formats the result for repository import, and logs its actions. A GUI and an automated workflow help produce accurate, duplicate-free metadata.

dspace dspace-api dublin-core metadata-extraction repository-management scopus-api

# dspace-v510-metadata-simplifier
(GitHub repository)

# DSpace Repository Metadata Entry Tools

## Intro
This Python script is designed to simplify and streamline metadata entry for DSpace v5.10 repositories. Manual metadata entry in DSpace can be complex and error-prone; this tool automates common tasks[...]

Although DSpace v5.10 is now deprecated, many institutions still rely on it for their digital collections. This script is dedicated solely to DSpace v5.10, addressing its unique requirements and makin[...]

We invite the community to fork this code, use it for your own needs, and contribute improvements – together, we can make metadata entry for DSpace v5.10 repositories effortless and efficient!

## Description
This Python script is designed to facilitate the preparation of bibliographic metadata for import into the institutional repository by transforming and enriching data from a CSV file exported from Sco[...]

1. User Interactivity via GUI:
The script uses a graphical file dialog (via Tkinter) to let the user select the input CSV file (containing publication metadata, including DOIs) and specify the output file location for the processed[...]

2. Duplicate Checking:
For each row in the input CSV, the script checks if the DOI (Digital Object Identifier) already exists in the VinaR repository by sending a POST request to the VinaR REST API. If the DOI already exist[...]

3. Metadata Enrichment:
For every DOI that is not a duplicate, the script queries the Scopus API (via Elsevier's REST API) to retrieve additional metadata, such as authors, page numbers, publication date, abstract, ISSN/IS[...]

4. Data Transformation and Output:
The retrieved and processed metadata is formatted according to the requirements of the VinaR repository and written as a new row in the output CSV file. The output CSV includes specific headers needed[...]

5. Error Handling and Logging:
The script prints informative messages to the console, such as when DOIs are found to be duplicates, when rows are skipped, or if there are network/API errors.
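
The five steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the script's actual code: the input column names `DOI` and `Title`, the output headers in `OUTPUT_HEADERS`, and the `doi_exists` and `fetch_metadata` callables (standing in for the repository and Scopus API requests) are all hypothetical names chosen for the example.

```python
import csv

# Hypothetical output columns; the real header set is defined by the repository.
OUTPUT_HEADERS = ["dc.identifier.doi", "dc.title", "dc.description.abstract"]


def choose_files():
    """Step 1: ask for the input and output CSV paths with Tkinter dialogs."""
    import tkinter as tk  # imported lazily so the rest works without a display
    from tkinter import filedialog

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialogs
    csv_types = [("CSV files", "*.csv")]
    in_path = filedialog.askopenfilename(title="Select Scopus CSV", filetypes=csv_types)
    out_path = filedialog.asksaveasfilename(
        title="Save output as", defaultextension=".csv", filetypes=csv_types
    )
    root.destroy()
    return in_path, out_path


def process(in_file, out_file, doi_exists, fetch_metadata):
    """Steps 2-5: skip duplicates, enrich, write, and log to the console.

    doi_exists(doi) -> bool and fetch_metadata(doi) -> dict are supplied by
    the caller. Returns a (written, skipped) tuple.
    """
    reader = csv.DictReader(in_file)
    writer = csv.DictWriter(out_file, fieldnames=OUTPUT_HEADERS)
    writer.writeheader()
    written = skipped = 0
    for row in reader:
        doi = (row.get("DOI") or "").strip()
        if not doi:
            print("Skipping row without a DOI")
            skipped += 1
            continue
        if doi_exists(doi):
            print(f"Duplicate DOI, skipping: {doi}")
            skipped += 1
            continue
        try:
            extra = fetch_metadata(doi)
        except OSError as exc:  # urllib.error.URLError derives from OSError
            print(f"Network/API error for {doi}: {exc}")
            skipped += 1
            continue
        writer.writerow({
            "dc.identifier.doi": doi,
            "dc.title": row.get("Title", ""),
            "dc.description.abstract": extra.get("abstract", ""),
        })
        written += 1
    return written, skipped
```

In a real run, the two paths returned by `choose_files()` would be opened with `newline=""` and an explicit encoding such as `utf-8` before being passed to `process`. Passing the two API calls in as functions keeps the loop easy to test without network access.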

## .env.example File

The `.env.example` file contains template environment variable definitions required to run the script, such as API keys and endpoint URLs.
**Do not store real credentials in `.env.example`.** Instead, copy `.env.example` to a new file named `.env` and fill in your actual credentials and settings there.
The script automatically loads variables from your `.env` file to securely provide access to APIs and services.

**Instructions:**
1. Copy `.env.example` to a new file named `.env` in the root project directory.
2. Edit the `.env` file and enter your API keys, access tokens, and other configuration details as indicated.
3. Keep your `.env` file private and do not commit it to version control (e.g., GitHub).
4. The script will use the variables from your `.env` file during execution.
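
For readers curious how such loading works, here is a minimal standard-library sketch. It is a stand-in written for illustration: the source does not say which loader the script uses (a package such as python-dotenv is a common choice), and the function names `load_env` and `require` are hypothetical.

```python
import os


def load_env(path=".env"):
    """Read KEY=VALUE lines from path into os.environ; return the parsed dict."""
    loaded = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            key = key.strip()
            value = value.strip().strip("'\"")  # drop optional surrounding quotes
            loaded[key] = value
            os.environ.setdefault(key, value)  # keep values already set in the shell
    return loaded


def require(name):
    """Return a required setting or fail early with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing setting {name}; add it to your .env file")
    return value
```

Calling `load_env()` once at start-up and then `require(...)` for each needed variable makes a missing key fail immediately with a readable message instead of surfacing later as an API error.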

### Approved
The script was tested and approved on [VinaR](https://vinar.vin.bg.ac.rs/), the institutional repository of the VINCA Institute of Nuclear Sciences - University of Belgrade.

## License
Copyright 2025 Obrad Vučkovac

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0).