{"id":35032150,"url":"https://github.com/pyramidheadshark/adaptive_search_pg","last_synced_at":"2026-04-16T05:04:46.748Z","repository":{"id":329636762,"uuid":"1120275967","full_name":"pyramidheadshark/adaptive_search_pg","owner":"pyramidheadshark","description":"Adaptive Semantic Search Engine based on PostgreSQL (pgvector) \u0026 FastAPI. Implements a Feedback Loop to dynamically re-rank results using Log-Decay algorithms","archived":false,"fork":false,"pushed_at":"2025-12-23T18:41:13.000Z","size":11503,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-31T10:46:38.779Z","etag":null,"topics":["docker","fastapi","feedback-loop","machine-learning","nlp","pgvector","postgresql","python","ranking-algorithms","semantic-search","sentence-transformers","student-project"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pyramidheadshark.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-12-20T21:21:09.000Z","updated_at":"2025-12-23T18:41:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pyramidheadshark/adaptive_search_pg","commit_stats":null,"previous_names":["pyramidheadshark/adaptive_search_pg"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pyramidheadshark/adaptive_search_pg","repository_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/pyramidheadshark%2Fadaptive_search_pg","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyramidheadshark%2Fadaptive_search_pg/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyramidheadshark%2Fadaptive_search_pg/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyramidheadshark%2Fadaptive_search_pg/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pyramidheadshark","download_url":"https://codeload.github.com/pyramidheadshark/adaptive_search_pg/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pyramidheadshark%2Fadaptive_search_pg/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31872036,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"online","status_checked_at":"2026-04-16T02:00:06.042Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","fastapi","feedback-loop","machine-learning","nlp","pgvector","postgresql","python","ranking-algorithms","semantic-search","sentence-transformers","student-project"],"created_at":"2025-12-27T07:15:05.192Z","updated_at":"2026-04-16T05:04:46.744Z","avatar_url":"https://github.com/pyramidheadshark.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 Adaptive Semantic 
Search Engine\r\n\r\n![Python](https://img.shields.io/badge/python-3.11-blue.svg)\r\n![FastAPI](https://img.shields.io/badge/FastAPI-0.125-009688.svg)\r\n![PostgreSQL](https://img.shields.io/badge/PostgreSQL-16-336791.svg)\r\n![Pgvector](https://img.shields.io/badge/Extension-pgvector-orange.svg)\r\n![Docker](https://img.shields.io/badge/Docker-Compose-2496ED.svg)\r\n![License](https://img.shields.io/badge/license-MIT-green.svg)\r\n\r\n\u003e **Course project**: Integrating semantic search and statistical ranking methods into the architecture of the PostgreSQL relational DBMS\r\n\r\n---\r\n\r\n### 👨‍🎓 Project information\r\n\r\n* **Authors:** Nikita Smirnov ([@pyramidheadshark](https://github.com/pyramidheadshark)), Kirill Melnikov ([@Chaberis](https://github.com/Chaberis))\r\n* **University:** RTU MIREA, 3rd year\r\n* **Course:** Software Tools for Data Manipulation\r\n\r\n---\r\n\r\n## 📖 The problem and the solution\r\n\r\nClassic vector search (Vector Search / Cosine Similarity) is static. It returns results based solely on the semantic similarity of the text. If the model considers a document relevant, that document always stays at the top, even if users disagree.\r\n\r\nThis project implements **Dynamic Re-ranking**. 
The system \"listens\" to user clicks and adjusts document weights in real time using a hybrid ranking formula.\r\n\r\n### 📊 Benchmark results\r\n\r\nBelow is an analysis of the effectiveness of the ranking algorithms (Linear vs Log-Decay vs Sigmoid), obtained from a simulation on the NFCorpus dataset.\r\n\r\n![Benchmark Results](data/old/dashboard_img_results.png)\r\n\r\n**Key takeaway:** The **Log-Decay** (logarithmic decay) strategy performed best, delivering a rapid increase in relevance (MRR) without the risk of overfitting or click manipulation.\r\n\r\n---\r\n\r\n## 🛠 Tech stack\r\n\r\n* **Core:** Python 3.11, FastAPI\r\n* **Database:** PostgreSQL 16 + `pgvector`\r\n* **ML:** `sentence-transformers` (model `all-MiniLM-L6-v2`)\r\n* **Analytics:** Pandas, Plotly\r\n* **Infrastructure:** Docker Compose, uv\r\n\r\n---\r\n\r\n## 🚀 Installation and launch\r\n\r\n### Requirements\r\n\r\n* Docker \u0026 Docker Compose\r\n* A HuggingFace token (for downloading the dataset)\r\n\r\n### Step-by-step instructions\r\n\r\n1. **Set up the environment:**\r\n    Create a `.env` file and add your token:\r\n\r\n    ```bash\r\n    cp .env.example .env\r\n    # Set HF_TOKEN=... inside .env\r\n    ```\r\n\r\n2. **Start the containers:**\r\n\r\n    ```bash\r\n    make build\r\n    make up\r\n    ```\r\n\r\n    *The ML model is downloaded on the first run.*\r\n\r\n3. **Load the data (ETL):**\r\n    The script downloads the NFCorpus dataset, vectorizes it, and stores it in Postgres.\r\n\r\n    ```bash\r\n    make load\r\n    ```\r\n\r\n4. **Check that the API works:**\r\n\r\n    ```bash\r\n    curl -X POST \"http://localhost:8000/api/v1/search\" \\\r\n         -H \"Content-Type: application/json\" \\\r\n         -d '{\"query\": \"vitamin c benefits\", \"limit\": 3}'\r\n    ```\r\n\r\n---\r\n\r\n## 🧪 Reproducing the study\r\n\r\nThe project includes a simulation module that emulates user behavior (clicks, noise, errors) and compares the mathematical strategies.\r\n\r\n1. 
**Run the benchmark (3 scenarios: effectiveness, noise, saturation):**\r\n\r\n    ```bash\r\n    docker compose exec app python -m src.scripts.benchmark\r\n    ```\r\n\r\n2. **Generate the HTML report:**\r\n\r\n    ```bash\r\n    docker compose exec app python -m src.scripts.visualize\r\n    ```\r\n\r\n3. **Result:** Open the file `data/final_advanced_dashboard_ru.html`.\r\n\r\n---\r\n\r\n## 📂 Project structure\r\n\r\n* `src/database.py`: data schemas (`Document`, `Interaction`) and the database connection\r\n* `src/search.py`: core logic, the mathematical re-ranking formulas\r\n* `src/ml.py`: ML model initialization\r\n* `src/main.py`: API routes and configuration\r\n* `src/scripts/`: ETL and analytics scripts\r\n\r\n---\r\n\r\n## ✅ Testing\r\n\r\nRun the integration tests (they check the Feedback Loop):\r\n\r\n```bash\r\nmake test\r\n```\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyramidheadshark%2Fadaptive_search_pg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpyramidheadshark%2Fadaptive_search_pg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpyramidheadshark%2Fadaptive_search_pg/lists"}