https://github.com/pdoup/atd

Task for the Advanced Topics in Databases course - DWS MSc Spring '22

databases nlp postgresql search-engine

# ATD
Task for the Advanced Topics in Databases (ATD) course - DWS MSc Spring 2022

![ATD](https://naftemporiki.gr/fu/p/1496776/638/399/0x000000000167f33c/2/megaro-maksimou.jpg)

### TODO

- [X] Create a crawler to fetch articles and save them in CSV files
- [X] Add them to Postgres
- [X] Connect Postgres with Python using a connector (psycopg3)
- [X] Read credentials from config file
- [X] Create the output directory in `extract_body.py` if it does not exist
- [X] Fix `article_path.csv`
- [X] Add threshold to relevant docs in `text_query.py`
- [X] Add columns to show in `text_query.py`
- [X] Show lines that contain the keywords (grep maybe?)
- [X] Add `requirements.txt`
- [X] Fix: in `text_query.py`:301, check whether the list is empty
- [X] Move `links.csv` to `csv_files`
- [X] ~~Add show vector in `text_query.py` output~~
- [X] Use GIN index on docvec column
- [X] Displaying docvec is troublesome in the terminal
- [X] Add comments
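
### Example sketch

Several of the items above (reading credentials from a config file, connecting through psycopg3, the relevance threshold, and the GIN index on `docvec`) fit together roughly as shown below. This is a minimal sketch and not the code in this repository: the table name `articles`, the `title` column, the `[postgresql]` config section, the index name, and the function names are all assumptions made for illustration. Only `docvec` comes from the list above, and it is assumed here to be a `tsvector` column.

```python
import configparser

# Hypothetical schema: a table `articles` with a `title` column and a
# tsvector column `docvec`. Adjust the names to the real schema.

# A GIN index lets Postgres answer `docvec @@ query` without a full scan.
CREATE_INDEX_SQL = (
    "CREATE INDEX IF NOT EXISTS articles_docvec_idx "
    "ON articles USING GIN (docvec);"
)

# Rank matching documents and keep only those at or above a threshold.
SEARCH_SQL = """
SELECT title, ts_rank(docvec, query) AS rank
FROM articles, plainto_tsquery(%(terms)s) AS query
WHERE docvec @@ query
  AND ts_rank(docvec, query) >= %(threshold)s
ORDER BY rank DESC
LIMIT %(limit)s;
"""


def load_config(config_text, section="postgresql"):
    """Parse INI-style text and return one section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return dict(parser[section])


def search(params, terms, threshold=0.1, limit=10):
    """Run the ranked query; `params` are connection keyword arguments."""
    import psycopg  # psycopg 3, imported lazily so the helpers above work without it

    with psycopg.connect(**params) as conn:
        cursor = conn.execute(
            SEARCH_SQL,
            {"terms": terms, "threshold": threshold, "limit": limit},
        )
        return cursor.fetchall()
```

Assuming a config file with a `[postgresql]` section holding keys such as `host`, `dbname`, `user`, and `password`, usage could look like `search(load_config(open("config.ini").read()), "some keywords")`. The default threshold of `0.1` is arbitrary and would need tuning against real data.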