https://github.com/adwaith-rajesh/calsen
A search engine for local files.
- Host: GitHub
- URL: https://github.com/adwaith-rajesh/calsen
- Owner: Adwaith-Rajesh
- License: gpl-3.0
- Created: 2023-05-11T17:00:04.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-04T17:57:09.000Z (7 months ago)
- Last Synced: 2024-05-04T18:42:12.391Z (7 months ago)
- Topics: c, file-search, search, search-engine, tf-idf
- Language: C
- Homepage:
- Size: 168 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
# loCAL Search ENgine
---
A search engine to search for local files based on their contents and not just their file names.

---

### Road Map
- [ ] parsers:
  - [ ] docx
  - [ ] pptx
  - [ ] and other files
- [ ] more docs

---

### Why?
- I've never done a file I/O intensive project.
- I'm really bad at organizing files, so I have files scattered everywhere on my system.
- With `Calsen`, my aim is to search for a function name that I remember from some file and have `Calsen` 'seamlessly' find the source file that contains it.
- It's fun to do something like this.
- And I can learn a lot of things from this.

---

### Getting started
#### Installation
- Clone the repo and `cd` into `calsen`.
```console
git clone --depth=1 https://github.com/Adwaith-Rajesh/calsen.git
cd calsen
```

- Install the dependencies (I plan to make this optional, see [#2](https://github.com/Adwaith-Rajesh/calsen/issues/2)).
```console
apt install libmagic-dev
```

#### Compiling
[`Calsen`](https://github.com/Adwaith-Rajesh/calsen/) makes use of [nobuild](https://github.com/tsoding/nobuild) as its build system. To compile, run the following commands:
```console
gcc -o nobuild ./nobuild.c
./nobuild --release
ln -s ./build/bin/calsen ./calsen
```

#### Indexing
To index the required directories, run:
```console
./calsen reindex --dir path/to/dir/1 --dir path/to/dir/2 -o sample.index
```

> Use `--verbose` to get additional output

This will create a `.index` file that _Calsen_ will use during the search process.
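
The layout of the `.index` file isn't described here, so the following is only a rough sketch of the kind of per-file data a TF-IDF index needs: how many words a file has and how often a given term appears. The `TermStats` struct and the `term_stats` function are hypothetical names made up for this illustration and are not taken from Calsen's source.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-document statistics; not Calsen's real index layout. */
typedef struct {
    size_t total_words; /* number of words seen in the text */
    size_t term_hits;   /* how many of those words equal the term */
} TermStats;

/* A "word" here is a run of letters, digits, or underscores, so that
 * identifiers such as parse_file stay in one piece. */
static int is_word_char(unsigned char c) {
    return isalnum(c) || c == '_';
}

/* Compare `len` bytes of `a` and `b`, ignoring ASCII case. */
static int equal_ignore_case(const char *a, const char *b, size_t len) {
    for (size_t i = 0; i < len; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}

/* Scan `text` once, counting all words and the words equal to `term`. */
TermStats term_stats(const char *text, const char *term) {
    TermStats stats = {0, 0};
    size_t term_len = strlen(term);
    const char *p = text;

    while (*p != '\0') {
        /* Skip separators (spaces, punctuation, ...). */
        while (*p != '\0' && !is_word_char((unsigned char)*p)) {
            p++;
        }
        const char *start = p;
        while (*p != '\0' && is_word_char((unsigned char)*p)) {
            p++;
        }
        size_t len = (size_t)(p - start);
        if (len == 0) {
            break; /* reached the end of the text */
        }
        stats.total_words++;
        if (len == term_len && equal_ignore_case(start, term, len)) {
            stats.term_hits++;
        }
    }
    return stats;
}
```

Calling `term_stats(file_contents, "parse_file")` on the text of a source file would give the two counts needed for that file's term frequency. A real indexer would store counts for every term at once rather than rescanning per query.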
#### Searching
In order to search through the indexed files, use the following command:
```console
./calsen search -i sample.index -q 'search query'
```

_Calsen_ will find all the files that match the _"search query"_ and arrange them in descending order of relevance.
- To get the top N files:
```console
./calsen search -i sample.index -q 'search query' -n 10
```

> Use `--verbose` to get the calculated TF-IDF score for each file
### Bye...