https://github.com/dataesr/bso-parser-html
Extract structured metadata (affiliations, author names and ORCID iDs, keywords, ...) from raw HTML pages
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/dataesr/bso-parser-html
- Owner: dataesr
- License: MIT
- Created: 2021-06-14T07:46:09.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2024-04-05T11:57:04.000Z (about 2 years ago)
- Last Synced: 2025-09-11T10:28:08.348Z (9 months ago)
- Language: Python
- Homepage:
- Size: 249 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# bso-parser-html
Extract structured metadata from raw HTML.
The extracted metadata includes, when available:
- affiliations
- keywords
- author names
- author affiliations
- author ORCID iDs
- abstract
- acknowledgments
- funding
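
To illustrate the kind of extraction listed above, here is a minimal sketch using only the Python standard library. It is not the API of bso-parser-html, whose actual functions are not shown in this README. The function `extract_metadata`, the class `MetaTagCollector`, and the output keys are hypothetical names chosen for illustration. The sketch assumes the page exposes Highwire-style `<meta>` tags (`citation_author`, `citation_author_institution`, `citation_author_orcid`, `citation_keywords`, `citation_abstract`), which many publisher pages do but not all, and it assumes keywords are separated by semicolons.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags in document order."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content:
            self.pairs.append((name.lower(), content.strip()))


def extract_metadata(html):
    """Hypothetical helper: build a metadata dict from citation_* meta tags."""
    collector = MetaTagCollector()
    collector.feed(html)

    authors, keywords, abstract = [], [], None
    for name, content in collector.pairs:
        if name == "citation_author":
            # Each author tag opens a new record; the institution and ORCID
            # tags that follow are attached to the most recent author.
            authors.append({"name": content, "affiliations": [], "orcid": None})
        elif name == "citation_author_institution" and authors:
            authors[-1]["affiliations"].append(content)
        elif name == "citation_author_orcid" and authors:
            authors[-1]["orcid"] = content
        elif name == "citation_keywords":
            # Assumption: keywords are separated by semicolons.
            keywords.extend(k.strip() for k in content.split(";") if k.strip())
        elif name == "citation_abstract":
            abstract = content

    # Deduplicated, sorted list of every affiliation seen on the page.
    affiliations = sorted({a for author in authors for a in author["affiliations"]})
    return {
        "authors": authors,
        "affiliations": affiliations,
        "keywords": keywords,
        "abstract": abstract,
    }


# Made-up sample page; the ORCID value is a placeholder, not a real identifier.
SAMPLE_HTML = """
<html><head>
<meta name="citation_author" content="Ada Lovelace">
<meta name="citation_author_institution" content="University of London">
<meta name="citation_author_orcid" content="0000-0000-0000-0000">
<meta name="citation_author" content="Alan Turing">
<meta name="citation_author_institution" content="University of Manchester">
<meta name="citation_keywords" content="open science; metadata">
</head><body></body></html>
"""

result = extract_metadata(SAMPLE_HTML)
print(result)
```

Fields that are absent from the page stay empty or `None`, which mirrors the "when available" wording above. Acknowledgments and funding usually live in the page body rather than in `<meta>` tags, so this sketch does not attempt them.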