https://github.com/jbesomi/korono
👑Korono: question answering platform for COVID-19 papers
bm25 covid covid-19 covid19 qa question-answering search-engine
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/jbesomi/korono
- Owner: jbesomi
- Created: 2020-04-04T17:18:26.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-04-06T16:52:27.000Z (almost 5 years ago)
- Last Synced: 2023-03-08T11:23:16.692Z (almost 2 years ago)
- Topics: bm25, covid, covid-19, covid19, qa, question-answering, search-engine
- Language: Jupyter Notebook
- Homepage: https://jbesomi.github.io/Korono/
- Size: 2.73 MB
- Stars: 12
- Watchers: 1
- Forks: 5
- Open Issues: 2
Metadata Files:
- Readme: README.md
README
👑 Korono
A Question-Answering system for COVID-19 papers
Introduction •
Getting started •
Under the hood •
Server and Client API
Introduction
**Information.** The number of documents related to COVID-19 is increasing exponentially. With so much material, it is getting harder for the research community to find the relevant pieces of information.
**Search-engine-on-steroids.** Korono is a question-answering platform designed to help researchers find information about COVID-19. You can think of Korono as a search-engine-on-steroids.
**Working principle.** The Korono engine works in two phases: a search-engine phase and a question-answering phase. First, given a query `q`, the search engine returns the papers relevant to that query. Then an answer is extracted from each returned paper and displayed.
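The two phases can be sketched as a small pipeline. This is only an illustration under stated assumptions: `answer_query`, `overlap_retrieve`, and `first_sentence` are hypothetical names, not Korono's API, and the two stand-in functions use naive word overlap and first-sentence extraction in place of the real search engine and language model.

```python
from typing import Callable, Dict, List


def answer_query(
    q: str,
    papers: List[str],
    retrieve: Callable[[str, List[str]], List[str]],
    extract: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Phase 1: select relevant papers. Phase 2: extract one answer per paper."""
    relevant = retrieve(q, papers)
    return [{"context": p, "question": q, "answer": extract(q, p)} for p in relevant]


def overlap_retrieve(q: str, papers: List[str]) -> List[str]:
    """Stand-in retriever (assumption): keep papers sharing a word with the query."""
    terms = set(q.lower().split())
    return [p for p in papers if terms & set(p.lower().split())]


def first_sentence(q: str, paper: str) -> str:
    """Stand-in extractor (assumption): return the first sentence of the paper."""
    return paper.split(".")[0].strip()


papers = [
    "Coronavirus is an infectious disease. It spreads through droplets.",
    "Graph theory studies networks.",
]
results = answer_query("what is coronavirus", papers, overlap_retrieve, first_sentence)
print(results)
```

Passing the two phases in as functions keeps them independent, so either stand-in could be swapped for a real ranking function or a real model without touching the other.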
Getting started
You can either use the online version (coming soon) or run your own server.
Run a server locally:
```
./run_server.sh
```
Run the client and ask a question:
```python
>>> from korono import client
>>> client.get_answers("What is coronavirus?")
```
Under the hood
**Search engine**. The search engine uses a ranking algorithm known as Okapi BM25, where BM stands for _best matching_. BM25 is a bag-of-words retrieval function that ranks documents by how often the query terms appear in each document, adjusted for document length and for how rare each term is across the corpus.
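To make the ranking concrete, here is a minimal pure-Python sketch of BM25 scoring. It is not Korono's implementation: the function name `bm25_scores`, the whitespace tokenization, the parameter defaults `k1=1.5` and `b=0.75`, and the smoothed IDF variant are all assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(
    query: List[str], docs: List[List[str]], k1: float = 1.5, b: float = 0.75
) -> List[float]:
    """Score each tokenized document against the tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (assumed variant): always positive.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "coronavirus is an infectious disease".split(),
    "bm25 is a ranking function".split(),
    "the coronavirus outbreak and the coronavirus response".split(),
]
query = "coronavirus disease".split()
scores = bm25_scores(query, docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The first document matches both query terms, including the rarer one, so it outranks the third document even though the third repeats "coronavirus" twice.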
**Question answering**. The answers are extracted from the corpus using [Transformers](https://transformer.huggingface.co/), large neural network language models. As of now, only the `distilbert-base-uncased-distilled-squad` model is supported. We plan to support more models soon.
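As a rough sketch of what extractive question answering with this model can look like, the snippet below uses the `question-answering` pipeline from the Hugging Face `transformers` package. The helper names `build_qa_pipeline` and `extract_answer` are hypothetical and this is not necessarily how Korono calls the model; it only assumes that `transformers` is installed and that the model can be downloaded.

```python
from typing import Any, Callable, Dict


def build_qa_pipeline() -> Callable[..., Dict[str, Any]]:
    """Load the extractive QA pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline(
        "question-answering",
        model="distilbert-base-uncased-distilled-squad",
    )


def extract_answer(
    qa: Callable[..., Dict[str, Any]], question: str, context: str
) -> Dict[str, Any]:
    """Run the QA callable and return the answer text with its span and score."""
    out = qa(question=question, context=context)
    return {
        "answer": out["answer"],
        "start": out["start"],
        "end": out["end"],
        "score": out["score"],
    }
```

With the model available, `extract_answer(build_qa_pipeline(), "What is coronavirus?", "Coronavirus is an infectious disease.")` returns the answer span found in the context together with a confidence score.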
Server and Client API
#### Server API
- `load_data.get_df()`
Returns the underlying dataset.
- `load_data.get_metadata_df()`
Returns the CORD-19 metadata pandas DataFrame.
- `korono_model.answer_question(question, context)`
Given a question and a context, returns the answer.
- `korono.model.get_summary(text)`
Given a text, returns its abstractive summary.
- `korono_model.find_start_end_index_substring(context, answer)`
Returns the start and end indices, if they exist, of the `answer` string in the `context` string.
#### Client API
- `client.get_answers_json(question)`
Returns a JSON object of the form:
```json
{
  "results": [
    {
      "context": "coronavirus is an infectious disease",
      "question": "what is coronavirus?",
      "answer": "an infectious disease"
    }
  ]
}
```
- `client.get_answers(question)`
Returns a list of all answers.
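The sketch below shows one way to work with a response of the form documented above, assuming it arrives as valid JSON text. The helpers `answers_from_json` and `find_span` are hypothetical and are not part of the Korono client or server. `find_span` only illustrates the idea behind locating an answer inside its context, and its exclusive end index is an assumption.

```python
import json
from typing import List, Optional, Tuple

# Example payload copied from the documented response shape.
response_text = """
{
  "results": [
    {
      "context": "coronavirus is an infectious disease",
      "question": "what is coronavirus?",
      "answer": "an infectious disease"
    }
  ]
}
"""


def answers_from_json(text: str) -> List[str]:
    """Collect the "answer" field of every entry under "results"."""
    payload = json.loads(text)
    return [item["answer"] for item in payload["results"]]


def find_span(context: str, answer: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of `answer` in `context`, end exclusive, or None."""
    start = context.find(answer)
    if start == -1:
        return None
    return start, start + len(answer)


answers = answers_from_json(response_text)
item = json.loads(response_text)["results"][0]
span = find_span(item["context"], item["answer"])
print(answers, span)
```

Returning `None` when the answer is absent keeps the "if they exist" case explicit instead of exposing the `-1` sentinel from `str.find`.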