https://github.com/david-allison/manx-corpus-search



# manx-corpus-search

A corpus search tool for primarily bilingual Manx-to-English texts.

Deployed at https://corpus.gaelg.im/

To add/modify documents, see: [manx-search-data](https://github.com/david-allison/manx-search-data)

## Installation

1. Clone the source
2. Copy the `OpenData` folder from [manx-search-data](https://github.com/david-allison/manx-search-data/) into the `CorpusSearch/OpenData` folder
3. `dotnet run`
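
The steps above can be sketched as a small shell script. This is a minimal sketch, not the project's own tooling: the function names `copy_open_data` and `install_and_run` are hypothetical, both repositories are assumed to be cloned side by side, and `dotnet run` is assumed to be run from the `CorpusSearch` directory.

```shell
# copy_open_data DATA_CHECKOUT APP_CHECKOUT
# Copies DATA_CHECKOUT/OpenData into APP_CHECKOUT/CorpusSearch/OpenData.
# Hypothetical helper; safe to re-run (it does not nest OpenData/OpenData).
copy_open_data() {
    data_checkout=$1
    app_checkout=$2
    dest="$app_checkout/CorpusSearch/OpenData"

    if [ ! -d "$data_checkout/OpenData" ]; then
        echo "OpenData folder not found in $data_checkout" >&2
        return 1
    fi

    # Copy the folder contents (the trailing /. includes hidden files).
    mkdir -p "$dest" && cp -R "$data_checkout/OpenData/." "$dest/"
}

# install_and_run
# Runs the three installation steps in the current directory.
install_and_run() {
    # 1. Clone the source (and the data repository next to it).
    git clone https://github.com/david-allison/manx-corpus-search || return 1
    git clone https://github.com/david-allison/manx-search-data || return 1

    # 2. Copy the OpenData folder into CorpusSearch/OpenData.
    copy_open_data manx-search-data manx-corpus-search || return 1

    # 3. Run the server. ASSUMPTION: the project lives in CorpusSearch.
    cd manx-corpus-search/CorpusSearch && dotnet run
}
```

Calling `install_and_run` in an empty working directory performs all three steps; `copy_open_data` can be re-run on its own after pulling new data.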

## Tech Stack

* React
* C# (ASP.NET Core, both WebAPI and content server)
* Document Searching: [Apache Lucene.NET](https://github.com/apache/lucenenet)
* Query Search Syntax: [csly](https://github.com/b3b00/csly)
* CSV: [CsvHelper](https://github.com/JoshClose/CsvHelper)
* JSON: Newtonsoft.Json

## Aims

* Run in RAM on a cheap (under $20/month) droplet
* No expectation of scaling up to a large number of users
* Expected corpus size is unlikely to exceed 10 million words of Manx (and 10 million words of English)
* Stateless
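
As a rough sanity check on the RAM aim, the raw text size can be estimated from the word count. The sketch below assumes an average of 7 bytes per word (letters plus a separator), which is an illustrative figure rather than a measurement of this corpus, and it ignores the overhead of the search index. The helper name `raw_text_mib` is hypothetical.

```shell
# raw_text_mib WORDS BYTES_PER_WORD
# Prints the approximate raw text size in MiB (integer division).
raw_text_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# 10 million Manx words plus 10 million English words,
# at an ASSUMED average of 7 bytes per word.
raw_text_mib 20000000 7   # prints 133
```

Under that assumption the raw text is on the order of a hundred or so MiB; the real footprint depends on the index and the runtime.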

## Deployment

Deployable on a $5 DigitalOcean droplet. See the GitHub Actions workflows.

## Analytics

* Uses [Segment](https://app.segment.com/) anonymously to track the number of searches

## Server requirements

* git
* dotnet-sdk-6.0
* TODO
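
One way to check a server against the requirements listed so far is a short shell helper. This is a sketch under assumptions: the function names `missing_tools` and `has_dotnet_sdk_6` are hypothetical, and the SDK check relies on `dotnet --list-sdks` printing one installed version per line.

```shell
# missing_tools TOOL...
# Prints the name of each given command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# has_dotnet_sdk_6
# Succeeds if an installed .NET SDK version starts with "6.0.".
has_dotnet_sdk_6() {
    dotnet --list-sdks 2>/dev/null | grep -q '^6\.0\.'
}
```

For example, `missing_tools git dotnet` prints nothing when both commands are available, and `has_dotnet_sdk_6 || echo "install dotnet-sdk-6.0"` flags a missing SDK.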