https://github.com/addok/addok-trigrams
Trigrams based index for Addok.
https://github.com/addok/addok-trigrams
Last synced: 11 months ago
JSON representation
Trigrams based index for Addok.
- Host: GitHub
- URL: https://github.com/addok/addok-trigrams
- Owner: addok
- License: mit
- Created: 2016-01-26T09:53:23.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2022-03-13T17:54:03.000Z (about 4 years ago)
- Last Synced: 2025-03-23T22:41:51.920Z (about 1 year ago)
- Language: Python
- Size: 20.5 KB
- Stars: 0
- Watchers: 5
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Addok-trigrams
Alternative indexation pattern for Addok, based on trigrams.
## Installation
pip install addok-trigrams
## Configuration
In your local configuration file:
- remove unwanted RESULTS_COLLECTORS_PYPATHS:
from addok.config.default import RESULTS_COLLECTORS_PYPATHS
RESULTS_COLLECTORS_PYPATHS.remove('addok.helpers.collectors.extend_results_reducing_tokens')
RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_but_geohash_try_autocomplete_collector')
RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.no_meaningful_but_common_try_autocomplete_collector')
RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.only_commons_try_autocomplete_collector')
RESULTS_COLLECTORS_PYPATHS.remove('addok.autocomplete.autocomplete_meaningful_collector')
RESULTS_COLLECTORS_PYPATHS.remove('addok.fuzzy.fuzzy_collector')
- remove all `autocomplete` and `fuzzy` RESULTS_COLLECTORS_PYPATHS, add new ones:
RESULTS_COLLECTORS_PYPATHS += [
'addok_trigrams.extend_results_removing_numbers',
'addok_trigrams.extend_results_removing_one_whole_word',
'addok_trigrams.extend_results_removing_successive_trigrams',
]
- add `trigramize` to PROCESSORS_PYPATHS:
from addok.config.default import PROCESSORS_PYPATHS
PROCESSORS_PYPATHS += [
'addok_trigrams.trigramize',
]
- remove pairs and autocomplete indexers from `INDEXERS_PYPATHS`:
from addok.config.default import INDEXERS_PYPATHS
INDEXERS_PYPATHS.remove('addok.pairs.PairsIndexer')
INDEXERS_PYPATHS.remove('addok.autocomplete.EdgeNgramIndexer')
By default, digit only words are not turned into trigrams. To prevent this,
set `TRIGRAM_SKIP_DIGIT=False`.
## Usage
Use `addok batch` just like with genuine addok for importing documents, but no
need for running `addok ngrams`, given they are already part of the index
strategy.