https://github.com/searchivarius/medline-query-with-entities
pubmed-query-with-entities
https://github.com/searchivarius/medline-query-with-entities
Last synced: about 1 month ago
JSON representation
pubmed-query-with-entities
- Host: GitHub
- URL: https://github.com/searchivarius/medline-query-with-entities
- Owner: searchivarius
- Created: 2015-10-19T18:18:48.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2021-05-12T00:16:20.000Z (almost 4 years ago)
- Last Synced: 2025-03-17T18:19:57.198Z (about 2 months ago)
- Language: Java
- Size: 18.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
A self-sufficient Solr/Lucene 5.x query application for the index created using https://github.com/searchivarius/medline-index-with-entities
The index can be used as a regular SOLR instance. If one needs to use annotations, there is a special extension to do so. The query language is hard for humans to use. Therefore, there is a special simplified query language and an API to transform this simpler query language into the format that the extension can understand.
Query language description
==The query contains several terms that will be OR-ed together (you can also AND them if you wish). Alternatively, you can generate a query using the API, surround it with brackets, and combine this piece with other SOLR constructs.
Summary of term types:
1) Keyword, e.g., ESR1
2) Quoted phrase, e.g., "heart attack"
3) Concept query. It is a keyword or a quoted phrase followed by a pseudo-field name in brackets, e.g.:
esr1[gene]
Another example with quotes:
"heptanoic acid"[chemical]
Case doesn't matter. To query any gene, use the query *[gene]. Note that there should be an asterisk before the field name.The list of pseudo fields is:Chemical , Disease, DNAMutation, Gene, ProteinMutation, SNP, Species
4) A concept id query. This is very similar to the concept query, but the field name should be suffixed with _id. For example, the following query:
5888[gene_id]
will search for concept IDs 5888, which are also annotated as genes.I hope that I understood your explanation of concept_id queries correctly. If not, it can probably be fixed.
5) Note on complex names. Those need to be quoted. You also need escape brackets and quotes if they are a part of the name, e.g. (note the backslash):
"\\[Ca2+]i"Again, a test application can be found at:
https://github.com/searchivarius/medline-index-with-entities/blob/master/src/main/java/edu/cmu/lti/oaqa/annographix/solr/SimpleQueryApp.java