https://github.com/clariah/hhucap

Current historical studies of career mobility often focus on linking personal records such as baptism records. More qualitative sources, such as biographies, contain vital information as well but are labour-intensive to process. We propose a combination of Robust Semantic Parsing and Linked Data conversion tools to automatically derive career patterns from 35,000 biographies in the Biography Portal for the period 1815-1940. Substantively, we ask what career patterns looked like and how they changed over the long nineteenth century. Methodologically, we evaluate to what extent current CLARIAH tools are able to automate this process. We will advance the semantic parsing tools by improving the set of linguistic expressions related to HISCO, adding an OCR cleaning step to the pipeline, and experimenting with alternative CLARIAH tools for Dutch. This will result in a detailed report on the performance of CLARIAH tools on this data.

advertisements biographies biographynet career-mobility clariah newspaper nlp


# hhucap

Update 2020-03-20:
The code for the simple tagger tool is available via: https://github.com/cltl/SimpleTagger