https://github.com/dataesr/bso-parser-html

Extract structured metadata (affiliations, author names and ORCID iDs, keywords, ...) from raw HTML pages

Host: GitHub
URL: https://github.com/dataesr/bso-parser-html
Owner: dataesr
License: mit
Created: 2021-06-14T07:46:09.000Z (about 5 years ago)
Default Branch: main
Last Pushed: 2024-04-05T11:57:04.000Z (about 2 years ago)
Last Synced: 2025-09-11T10:28:08.348Z (9 months ago)
Language: Python
Homepage:
Size: 249 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# bso-parser-html

Extract structured metadata from raw HTML.

The extracted metadata includes, when available:

- affiliations
- keywords
- author names
- author affiliations
- author ORCID iDs
- abstract
- acknowledgments
- funding
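
To make the list above concrete, here is a minimal sketch of how a few of these fields could be pulled from a raw HTML page. It is not this project's implementation or API: it uses only the Python standard library and assumes the page exposes Highwire Press style `citation_*` meta tags, which many publishers use with variations. The names `MetaTagCollector` and `extract_metadata`, the exact tag names, and the sample page are all illustrative assumptions.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta name="citation_*"> tags (assumed tag names, see above)."""

    def __init__(self):
        super().__init__()
        self.authors = []
        self.keywords = []
        self.abstract = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = (attr.get("content") or "").strip()
        if not content:
            return
        if name == "citation_author":
            # Each author tag starts a new author record.
            self.authors.append({"name": content, "affiliations": [], "orcid": None})
        elif name == "citation_author_institution" and self.authors:
            # Institution tags are attached to the most recent author.
            self.authors[-1]["affiliations"].append(content)
        elif name == "citation_author_orcid" and self.authors:
            self.authors[-1]["orcid"] = content
        elif name == "citation_keywords":
            # Assumes keywords are separated by semicolons.
            self.keywords.extend(k.strip() for k in content.split(";") if k.strip())
        elif name == "citation_abstract":
            self.abstract = content


def extract_metadata(html):
    """Return a dict of the metadata found in the page's meta tags."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    # Page-level affiliations: de-duplicated union of author affiliations.
    affiliations = []
    for author in parser.authors:
        for affiliation in author["affiliations"]:
            if affiliation not in affiliations:
                affiliations.append(affiliation)
    return {
        "affiliations": affiliations,
        "keywords": parser.keywords,
        "authors": parser.authors,
        "abstract": parser.abstract,
    }


# Hypothetical sample page; all values are made up for illustration.
sample_html = """
<html><head>
<meta name="citation_author" content="Jane Doe">
<meta name="citation_author_institution" content="Example University">
<meta name="citation_author_orcid" content="0000-0000-0000-0000">
<meta name="citation_author" content="John Roe">
<meta name="citation_author_institution" content="Example University">
<meta name="citation_keywords" content="open science; bibliometrics">
</head><body></body></html>
"""

metadata = extract_metadata(sample_html)
```

Fields with no matching tag stay empty or `None`, which mirrors the "when available" caveat above. Acknowledgments and funding usually appear in the page body rather than in meta tags, so they would need publisher-specific parsing that this sketch does not attempt.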