https://github.com/david-allison/manx-corpus-search
- Host: GitHub
- URL: https://github.com/david-allison/manx-corpus-search
- Owner: david-allison
- License: mit
- Created: 2021-05-09T21:39:35.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2024-07-13T19:15:04.000Z (7 months ago)
- Last Synced: 2024-07-14T00:22:18.504Z (7 months ago)
- Language: C#
- Size: 3.09 MB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 31
Metadata Files:
- Readme: README.md
- License: LICENSE
# manx-corpus-search
A corpus search tool for primarily bilingual Manx-to-English texts.
Deployed at https://corpus.gaelg.im/
To add/modify documents, see: [manx-search-data](https://github.com/david-allison/manx-search-data)
## Installation
1. Clone the source
2. Copy the `OpenData` folder from [manx-search-data](https://github.com/david-allison/manx-search-data/) into `CorpusSearch/OpenData` folder
3. `dotnet run`
## Tech Stack
* React
* C# (ASP.NET Core, both WebAPI and content server)
* Document Searching: [Apache Lucene.NET](https://github.com/apache/lucenenet)
* Query Search Syntax: [csly](https://github.com/b3b00/csly)
* CSV: [CsvHelper](https://github.com/JoshClose/CsvHelper)
* JSON: Newtonsoft.Json
## Aims
* Run in RAM on a cheap (<$20/m) droplet
* No expectation of scaling up for a large number of users
* Expected corpus size is unlikely to exceed 10MM words of Manx (and 10MM words of English)
* Stateless
## Deployment
Deployable on a $5 DigitalOcean droplet. See the repository's GitHub Actions workflows.
## Analytics
* Uses https://app.segment.com/ anonymously to track the count of searches
## Server requirements
- git
- dotnet-sdk-6.0
- TODO
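
With `git` and the .NET 6 SDK installed, step 2 of the installation can be sketched as a small shell helper. This is a minimal sketch, not part of the repository: the function name `install_open_data` is hypothetical, and it assumes both repositories have already been cloned locally.

```shell
#!/bin/sh
# Sketch only: install_open_data is a hypothetical helper, not part of the repository.
# It copies the OpenData folder from a manx-search-data checkout ($1)
# into CorpusSearch/OpenData inside a manx-corpus-search checkout ($2).
install_open_data() {
    data_repo="$1"
    app_repo="$2"
    if [ ! -d "$data_repo/OpenData" ]; then
        echo "OpenData folder not found in $data_repo" >&2
        return 1
    fi
    mkdir -p "$app_repo/CorpusSearch/OpenData"
    # The trailing "/." copies the folder's contents rather than nesting it
    cp -R "$data_repo/OpenData/." "$app_repo/CorpusSearch/OpenData/"
}
```

After cloning both repositories side by side, calling `install_open_data manx-search-data manx-corpus-search` would put the data in place, after which `dotnet run` starts the server. The README does not say which directory `dotnet run` should be run from, so that part is left out of the sketch.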