https://github.com/zenwor/equilibrium
Article Management System
cosine-similarity crud data-science machine-learning python pytorch tfidf tfidf-vectorizer
- Host: GitHub
- URL: https://github.com/zenwor/equilibrium
- Owner: zenwor
- License: mit
- Created: 2023-12-23T15:21:02.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-01T18:39:04.000Z (over 1 year ago)
- Last Synced: 2025-08-17T11:41:51.484Z (10 months ago)
- Topics: cosine-similarity, crud, data-science, machine-learning, python, pytorch, tfidf, tfidf-vectorizer
- Language: Jupyter Notebook
- Homepage:
- Size: 32.9 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Equilibrium
Equilibrium is an "Article Management System", created as a small project for the "Scripting Languages" course (Faculty of Sciences, University of Novi Sad).
The dataset used in the final model can be found here.
Installation
Installation is a very simple process:
- Clone the repository using: git clone https://github.com/LukaNedimovic/equilibrium
- Run the following to install dependencies: pip install -r requirements.txt
Motivation
- Researching machine learning models for building text-based recommendation systems.
- Creating a console application with a "modern" GUI (multiple input boxes on screen at the same time; selection and movement using arrow keys).
Features
Equilibrium is a slightly more complex CRUD application: one can create an account, log in, create an article, delete it, interact with it (like / dislike / save), search for articles by keyword, and get a recommendation for an article similar to the one currently being read.
An administrator account is already created and has full access to the platform: it can delete any article, view keyword statistics, and so on.
Machine Learning Model
The main motivation behind this project was to learn more about how some machine learning concepts work, especially embeddings.
I tried implementing three different models:
- Article x Tag Model - where the premise is that similar articles share more of the same tags
- Collaborative Filtering Model - where articles are suggested by predicting a user's rating from other users' ratings
- TF-IDF - a standard method, combined with cosine similarity, which gave the best performance (while being the simplest of the three models)
I wanted to create a neural network (NN) model that would eventually perform well, but noticed the following:
- I don't have sufficient data to create a well-performing NN (at my current knowledge level, at the very least)
- It's better to use a simpler model if possible
- A hybrid model was possible and the most "modern" choice, but it would require more time to implement and train
TF-IDF performed extremely well on the given dataset, was quick to train, and was easy to implement. My wish to learn more about NN models was also fulfilled by creating the two "less good" models, so it balanced out.
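To make the TF-IDF plus cosine similarity idea concrete, here is a minimal sketch that uses only the Python standard library. It is not the code from this repository: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `recommend`), the smoothed IDF formula, and the three sample articles are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            # Term frequency times a smoothed inverse document frequency
            # (the smoothing is an assumption of this sketch).
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(docs, current_index, top_k=1):
    """Return indices of the top_k documents most similar to the current one."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[current_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != current_index
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_k]]


# Hypothetical article texts, not taken from the project's dataset.
articles = [
    "PyTorch tutorial on training neural networks",
    "Recipe for a quick tomato pasta dinner",
    "Training deep neural networks with PyTorch",
]
print(recommend(articles, 0))  # prints [2]
```

The first and third articles share several terms, so the third is recommended for a reader of the first, while the recipe shares none and scores zero. On a real dataset a library vectorizer would likely be more practical than this hand-rolled version.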
What can be improved?
I believe this project is good, especially for a "first semester" one. However, some meaningful changes can (and hopefully will) be made:
- Path generation - a function (or some other interesting workaround) should be implemented to generate the paths. It should be platform-agnostic, too.
- Generalized prompt rendering - it would be fun to create a module that renders these prompts dynamically, like a miniature version of HTML. It could also be useful for programming newbies, who would then be able to create very nice console UIs.
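One possible shape for the path generation idea above is a small helper built on `pathlib`, which picks the right separator for the current operating system. This is only a sketch: the helper name `build_path`, the `root` parameter, and the `data` / `articles.csv` names are hypothetical and do not come from the repository.

```python
from pathlib import Path


def build_path(*parts, root=None):
    """Join path parts under a root directory in a platform-agnostic way.

    If no root is given, the current working directory is used
    (an assumption of this sketch).
    """
    base = Path(root) if root is not None else Path.cwd()
    return base.joinpath(*parts)


# Hypothetical directory and file names, used only for illustration.
dataset_path = build_path("data", "articles.csv", root="equilibrium")
print(dataset_path)
```

Because the parts are passed separately instead of being concatenated into one string, the same call works on Windows, Linux, and macOS.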