Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with information-retrieval

A curated list of projects in awesome lists tagged with information-retrieval.

https://github.com/kuutsav/information-retrieval

Neural information retrieval / semantic-search / Bi-Encoders

information-retrieval machine-learning nlp semantic-search

Last synced: 03 Aug 2024

https://github.com/project-miracl/miracl

A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages.

benchmark dataset information-retrieval multilingual

Last synced: 02 Aug 2024

https://github.com/rth/vtext

Simple NLP in Rust with Python bindings

bag-of-words information-retrieval nlp tf-idf tokenization

Last synced: 03 Oct 2024

https://github.com/ot/ds2i

A library of inverted index data structures

information-retrieval inverted-index search

Last synced: 31 Jul 2024

https://github.com/arian-askari/ChatGPT-RetrievalQA

A dataset for training and evaluating question-answering retrieval models on ChatGPT responses, with the option of training and evaluating on real human responses.

ai chatgpt chatgpt-information-retrieval chatgpt-ir data-augmentation dataset deep-learning gpt-3 gpt2 gpt3 information-retrieval information-retrieval-chatgpt ir ir-chatgpt machine-learning nlp openai python sequence-to-sequence text-retrieval

Last synced: 31 Jul 2024

https://github.com/microsoft/rag-experiment-accelerator

The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate experiments and evaluations using Azure Cognitive Search and the RAG pattern.

acs azure chunking dense embedding evaluation experiment genai indexing information-retrieval llm openai rag sparse vectors

Last synced: 01 Aug 2024

https://github.com/jingtaozhan/DRhard

SIGIR'21: Optimizing DR with hard negatives and achieving SOTA first-stage retrieval performance on TREC DL Track.

information-retrieval pytorch web-search

Last synced: 03 Aug 2024

https://github.com/nicholasmamo/multiplex-plot

Multiplex: visualizations that tell stories—A Python library to create and annotate beautiful network graph visualizations, text visualizations and more.

data-science data-visualisation graph-visualization graphs information-retrieval matplotlib natural-language-processing network-visualization python text-mining text-visualisation text-visualization visualisation visualizations viz vizualisation

Last synced: 01 Oct 2024

https://github.com/YadaYuki/omochi

Full-text search engine written from scratch in Go ʕ◔ϖ◔ʔ (just a toy) 😊

ddd ent go golang information-retrieval search search-engine

Last synced: 02 Aug 2024

https://github.com/CurrySoftware/rust-stemmers

A Rust implementation of some popular Snowball stemming algorithms

information-retrieval nlp-stemming snowball

Last synced: 04 Aug 2024

https://github.com/capreolus-ir/capreolus

A toolkit for end-to-end neural ad hoc retrieval

deep-learning information-retrieval

Last synced: 02 Aug 2024

https://github.com/huangtinglin/MixGCF

MixGCF: An Improved Training Method for Graph Neural Network-based Recommender Systems, KDD2021

graph-neural-network information-retrieval negative-sampling network-embedding pytorch recommender-system

Last synced: 08 Aug 2024

https://github.com/uutils/platform-info

A cross-platform way to get information about your machine

cross-platform information-retrieval rust uname

Last synced: 20 Aug 2024

https://github.com/felladrin/MiniSearch

Minimalist web-searching app with an AI assistant that runs directly from your browser. Uses Web-LLM, Ratchet-ML, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space

ai artificial-intelligence generative-ai gpu-accelerated information-retrieval llm llm-inference machine-learning nlp question-answering ratchet-ml retrieval-augmented-generation search search-engine searxng typescript web-llm webapp wllama

Last synced: 31 Jul 2024

https://github.com/dheeraj7596/SCDV

Text classification with Sparse Composite Document Vectors.

document-vector emnlp emnlp2017 information-retrieval natural-language-processing text-classification

Last synced: 05 Aug 2024

https://github.com/xyntopia/pydoxtools

Effortlessly extract information from unstructured data with this library, utilizing advanced AI techniques. Compose AI in customizable pipelines and diverse sources for your projects.

chatgpt document-analysis document-extraction extraction information-retrieval llm nlp pdf python

Last synced: 03 Aug 2024

https://github.com/arosh/BM25Transformer

(Python) transform a document-term matrix to an Okapi/BM25 representation

information-retrieval machine-learning natural-language-processing python scikit-learn

Last synced: 05 Aug 2024

https://github.com/Ryanglambert/3d_model_retriever

Experimenting with a newly published deep learning paper and how it can be used for content-based 3D model retrieval. (info retrieval for CAD)

cad-models capsnets deep-learning information-retrieval neural-network python

Last synced: 31 Jul 2024

https://github.com/rahulrajpl/netizenship

A command-line OSINT tool that finds the online presence of a username on popular social media websites such as Facebook, Instagram, and Twitter.

cybersecurity information-gathering information-retrieval information-security infosec osint-python websec websecurity

Last synced: 02 Aug 2024

https://github.com/allenai/aspire

Repo for Aspire - A scientific document similarity model based on matching fine-grained aspects of scientific papers.

document-similarity information-retrieval machine-learning natural-language-processing

Last synced: 02 Aug 2024

https://github.com/Phate6660/nixinfo

A Rust library crate for gathering system info such as CPU, distro, environment, kernel, etc.

information-retrieval lib linux rust

Last synced: 01 Aug 2024

https://github.com/hical/HiCAL

HiCAL is a system for efficient high-recall retrieval with an adaptable assessing interface.

active-learning cal document-assessment high-recall information-retrieval machine-learning search-engine test-collection

Last synced: 02 Aug 2024

https://github.com/AnonCatalyst/Coeus-OSINT-ToolBox

Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform for seamless OSINT investigations.

data-science data-visualization database forensic-analysis forensics forensics-tools framework information-retrieval infosec osint osint-framework osint-python osint-resources osint-tool osint-toolkit people-search reconnaissance

Last synced: 02 Aug 2024

https://github.com/ucasir/NPRF

NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval

information-retrieval neural-network pseudo-relevance-feedback

Last synced: 30 Jul 2024

https://github.com/castorini/hf-spacerini

Plug-and-play Search Interfaces with Pyserini and Hugging Face

information-retrieval search-interface

Last synced: 02 Aug 2024

https://github.com/mathetake/intergo

A package for interleaving / multileaving ranking generation in Go

ab-testing go golang information-retrieval interleaving multileaving ranking ranking-algorithm recommendation-system

Last synced: 02 Aug 2024

https://github.com/shrebox/Personified-Chatbot

A personified chatbot responding to a query based on the answering pattern of Dr. APJ Abdul Kalam using Information Retrieval, Natural Language Processing, and Deep Learning techniques.

apj-abdul-kalam chatbot deep-learning information-retrieval lstm natural-language-processing nlp ranking-algorithm seq2seq-chatbot seq2seq-model summarization word2vec

Last synced: 02 Aug 2024

https://github.com/krishpranav/emailosint

emailosint is a tool for gathering information about email accounts (IP, hostname, country, etc.)

email-osint hacking hacking-tool information-extraction information-gathering information-retrieval osint python python3

Last synced: 01 Oct 2024

https://github.com/mikemajesty/github-scrap-api

👾 Project: getting information from your GitHub is now easy.

api git-hub-api github github-api github-scraping githubscrapapi information-retrieval scrap-github

Last synced: 02 Oct 2024

https://github.com/thisisbhavin/graphicalForest

Uses the adjacency matrix and a random forest to extract the name, address, items, prices, and grand total from all kinds of invoices.

adjacency-matrix graph graph-convolution graph-neural-networks information-retrieval invoice-parser random-forest

Last synced: 03 Sep 2024

https://github.com/hrs/docsim

A simple, fast command-line tool for searching and comparing text documents.

go information-retrieval markdown note-taking org-mode similarity tf-idf zettelkasten

Last synced: 30 Sep 2024

https://github.com/krishpranav/infogather

A simple Go web app and CLI tool for gathering information

go golang golang-library hacking information-extraction information-gathering information-retrieval osint

Last synced: 01 Oct 2024

https://github.com/RiccardoAncarani/python_recommender_system

A simple user-based collaborative filtering recommender system, built with Python and Flask

collaborative-filtering flask information-retrieval machine-learning python recommendation-engine recommender-system

Last synced: 31 Jul 2024

https://github.com/reblox01/ip-tracer-v3.0.2

Track any IP address with IP-Tracer. IP-Tracer v3.0.2 is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.

api debian-linux hacking-tools hacktoberfest information-gathering information-retrieval ip ip-address ip-location ip-tracer linear-regression linux linux-tools terminal termux-api termux-tool tools

Last synced: 29 Sep 2024

https://github.com/luca-software-developer/g03mysimpleirtool

G03MySimpleIRTool is a Java application that, given a corpus of text documents in a directory, lists the documents most relevant to a query consisting of one or more words.

information-retrieval java java-8 javafx javafx-application javafx-desktop-apps javafx-gui javafx-project search-engine

Last synced: 29 Sep 2024

https://github.com/elmiraghorbani/dynamic_prompting

Dynamic Few-Shot Prompting is a Python package that dynamically selects N samples that are contextually close to the user's task or query from a knowledge base (similar to RAG) to include in the prompt.

few-shot-classifcation few-shot-prompting generative-ai gpt information-retrieval knowledge-base llama llama3 llama3-prompts llm nlp prompt prompt-engineering prompt-tuning python rag retreival retrieval-augmented-generation

Last synced: 27 Sep 2024

https://github.com/gcarreno/testlinuxdistinfo

Retrieve Linux Distribution Information using various methods

fpc free-pascal freepascal information information-retrieval lazarus lazarus-ide linux-distribution

Last synced: 28 Sep 2024

https://github.com/AnonCatalyst/WebHound

WebHound is your Python-powered command-line assistant for sharp and efficient web searches! It sniffs out data from major search engines and detects social platforms, helping you uncover valuable insights and stay ahead of the game. 🌐🔍✨

awareness data-analysis data-visualization information-retrieval osint osint-python osint-reconnaissance osint-resources osint-tool osint-tools osinttools web-osint webscraping

Last synced: 13 Aug 2024

https://github.com/antdragiotis/embedding-models-vs-languages-performance

This repository uses OpenAI and LangChain to assess the accuracy of embedding models across multiple languages. As benchmarks, it uses a set of Q&A documents related to EU regulations, written in different European languages.

banking embeddings finance information-retrieval langchain openai regulations

Last synced: 01 Oct 2024

https://github.com/phantom0004/elk-stack-tools

A comprehensive collection of tools, scripts, and documentation for managing and utilizing the ELK (Elasticsearch, Logstash, Kibana) stack effectively. This repository compiles information and best practices from several authoritative sources, providing a centralized resource for deploying and maintaining the ELK stack.

cybersecurity educational elasticsearch elk elk-configuration elk-stack information-retrieval kibana linux logstash monitoring networking operation security setup-script siem

Last synced: 28 Sep 2024

https://github.com/krishpranav/xspear

xspear is an XSS vulnerability scanner written in Ruby

information-retrieval information-security ruby xspear xss xss-scanner xss-vulnerability

Last synced: 01 Oct 2024

https://github.com/blackwinter/birds

Experimental information retrieval system for bibliographic data.

bibliographic-references information-retrieval ruby rubynlp

Last synced: 03 Oct 2024

https://github.com/bnvulpe/paperslab

The project aims to automate content classification and knowledge retrieval, and to analyze the temporal and thematic impact of research over a time period. It also considers offering users network analysis of communication within the community.

api-extractor big-data big-data-and-ml big-data-infrastructure docker elasticsearch etl-pipeline information-retrieval knowledge-discovery mysql neo4j network-analysis spark temporal-analysis

Last synced: 26 Sep 2024

https://github.com/komosny/geo-web

GeoWeb - A Method for Online Information Retrieval Related to a Geographical Area

data-extraction geographical-information-system geography gps-location information-retrieval internet-archive openstreetmap

Last synced: 29 Sep 2024

https://github.com/AndMastro/EmergencyAwareness

Repository for the WIR project on emergency awareness on social media.

clustering information-retrieval machine-learning nlp twitter

Last synced: 29 Jul 2024

https://github.com/ingsintitulo/visitante

NPM package for obtaining data about whoever runs our JavaScript code

browser data-mining deno information-retrieval nodejs user-agent-parser

Last synced: 28 Sep 2024

https://github.com/mistekko/ccfinancialadvisor

Some scripts to help you spend your hard-earned cookies as wisely as possible.

cookie-clicker information-retrieval script userscript

Last synced: 01 Oct 2024

https://github.com/hungreeee/q-rag

(In progress) Q-RAG: Learning from Feedback to Improve Context Retrieval

information-retrieval langchain llm neo4j rag

Last synced: 26 Sep 2024

https://github.com/rtmigo/skifts_py

Search for the most relevant documents containing words from a query. Uses scikit-learn and NumPy.

cosine-similarity information-retrieval numpy python scikit-learn text-mining tf-idf

Last synced: 30 Sep 2024

https://github.com/rjurney/lovecraft

This is an eldritch project to resurrect the man himself using retrieval augmented generation on his letters

eldritch horror information-retrieval large-language-model llm lovecraft lovecraftian machine-learning natural-language-processing nlp nlu rag vector-search

Last synced: 02 Oct 2024