https://github.com/lhncbc/metamaplite
A near real-time named-entity recognizer
- Host: GitHub
- URL: https://github.com/lhncbc/metamaplite
- Owner: LHNCBC
- License: other
- Created: 2018-02-08T19:17:44.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-01-16T14:17:39.000Z (12 months ago)
- Last Synced: 2024-05-20T02:20:39.186Z (8 months ago)
- Topics: metamap, named-entity-recognition, umls
- Language: Java
- Homepage: https://metamap.nlm.nih.gov/MetaMapLite.shtml
- Size: 1010 KB
- Stars: 54
- Watchers: 8
- Forks: 14
- Open Issues: 16
Metadata Files:
- Readme: README.bioc
- License: LICENSE.md
## A BioC XML to BioC XML implementation of MetaMapLite
The class gov.nih.nlm.nls.metamap.lite.BioCProcess allows MetaMapLite
to process BioC XML input and write the results as BioC XML.

### Using from Java

Below is an example of BioC processing with MetaMapLite:

    // Initialize MetaMapLite
    Properties defaultConfiguration = getDefaultConfiguration();
    String configPropertiesFilename =
      System.getProperty("metamaplite.propertyfile",
                         "config/metamaplite.properties");
    Properties configProperties = new Properties();
    // set any in-line properties here.
    FileReader fr = new FileReader(configPropertiesFilename);
    configProperties.load(fr);
    fr.close();

    configProperties.setProperty("metamaplite.semanticgroup",
                                 "acab,anab,bact,cgab,dsyn,emod,inpo,mobd,neop,patf,sosy");
    Properties properties =
      Configuration.mergeConfiguration(configProperties,
                                       defaultConfiguration);
    BioCProcess process = new BioCProcess(properties);

    // read BioC XML collection
    Reader inputReader = new FileReader(inputFile);
    BioCFactory bioCFactory = BioCFactory.newFactory("STANDARD");
    BioCCollectionReader collectionReader =
      bioCFactory.createBioCCollectionReader(inputReader);
    BioCCollection collection = collectionReader.readCollection();

    // Run named entity recognition on collection
    BioCCollection newCollection = process.processCollection(collection);

    // write out the annotated collection
    File outputFile = new File(outputFilename);
    Writer outputWriter = new PrintWriter(outputFile, "UTF-8");
    BioCCollectionWriter collectionWriter =
      bioCFactory.createBioCCollectionWriter(outputWriter);
    collectionWriter.writeCollection(newCollection);
    outputWriter.close();

This process attempts to preserve any annotation present in the
original BioC XML input. Some annotations applied to BioC
structures before entity lookup, including tokenization and
part-of-speech tagging, are discarded before writing out the final
annotated collection. This occurs in the entity lookup class
BioCEntityLookup (Java package: gov.nih.nlm.nls.metamap.lite).

### Omissions in BioC version
The current version of the BioCProcess class does not call the
abbreviation detector or the negation detector. This should be
relatively simple to add (particularly the abbreviation detector) and
will probably be added in the next release.
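
Returning to the configuration step in the Java example above: the properties loaded from the file and set in-line are combined with a set of defaults before BioCProcess is constructed. The sketch below illustrates that general layering idea using only java.util.Properties. It is not MetaMapLite's Configuration.mergeConfiguration, whose exact behavior may differ. The class PropertyLayeringSketch, the helper layer, the key example.default.only, and the default value shown for metamaplite.semanticgroup are all hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical illustration; not part of MetaMapLite.
public class PropertyLayeringSketch {

    /**
     * Returns a new Properties object holding every key from defaults,
     * with any key that also appears in overrides taking the override value.
     * Neither argument is modified.
     */
    public static Properties layer(Properties overrides, Properties defaults) {
        Properties merged = new Properties();
        // Copy the defaults first ...
        for (String name : defaults.stringPropertyNames()) {
            merged.setProperty(name, defaults.getProperty(name));
        }
        // ... then let explicitly configured values replace them.
        for (String name : overrides.stringPropertyNames()) {
            merged.setProperty(name, overrides.getProperty(name));
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        Properties defaults = new Properties();
        defaults.setProperty("metamaplite.semanticgroup", "all"); // illustrative default
        defaults.setProperty("example.default.only", "kept");     // hypothetical key

        // Stands in for loading config/metamaplite.properties from disk.
        Properties fromFile = new Properties();
        fromFile.load(new StringReader("metamaplite.semanticgroup=dsyn,sosy\n"));

        Properties merged = layer(fromFile, defaults);
        System.out.println(merged.getProperty("metamaplite.semanticgroup")); // dsyn,sosy
        System.out.println(merged.getProperty("example.default.only"));      // kept
    }
}
```

Copying into a fresh Properties object, rather than mutating either input, keeps the defaults reusable if several differently configured processors are created in the same program.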