https://github.com/entitizer/concepts-parser-js
Node.js module for extracting concepts from text.
concept concepts extracting-concepts find-concepts
- Host: GitHub
- URL: https://github.com/entitizer/concepts-parser-js
- Owner: entitizer
- Created: 2015-09-15T21:00:02.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2023-07-12T11:05:55.000Z (almost 3 years ago)
- Last Synced: 2025-09-22T02:19:42.141Z (9 months ago)
- Topics: concept, concepts, extracting-concepts, find-concepts
- Language: TypeScript
- Size: 4.34 MB
- Stars: 10
- Watchers: 1
- Forks: 5
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# concepts-parser
Node.js module for extracting concepts from text.
A Concept is a part of a text that may be a [Named entity](https://en.wikipedia.org/wiki/Named_entity). We use Concepts for learning new named entities, for searching for known entities, for identifying entity names (synonyms, abbreviations), etc.
## Usage
JavaScript:
```js
const parser = require('concepts-parser');
const concepts = parser.parse({ text: 'Some text', lang: 'ru', country: 'ru' });
```
TypeScript:
```ts
import { parse } from 'concepts-parser';
const concepts = parse({ text: 'Some text', lang: 'ru', country: 'ru' });
```
## API
### parse(context, options)
Finds concepts in a context.
- `context` (Object) **required** - The context to parse:
  + `text` (String) **required** - Text to search for concepts;
  + `lang` (String) **required** - Text language as a two-letter code: `en`, `ru`;
  + `country` (String) **optional** - Context country: `ru`, `it`;
- `options` (Object) **optional**:
  + `mode` (String) **optional** - Can be **identity** or **collect**. Default: **identity**. The **identity** mode excludes the filters `start_word`, `duplicate` and `partial`;
  + `filters` (String[]) **optional** - Ordered list of filters;
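The argument shapes described above can be checked before calling `parse`. The following is only a minimal sketch: `buildParseArgs` and `MODES` are hypothetical names that are not part of `concepts-parser`, and the checks simply mirror the parameter list as documented here. The call to `parse` itself is left as a comment so that the snippet runs without the package installed.

```js
// Modes named in the parameter list above.
const MODES = ['identity', 'collect'];

// Hypothetical helper (not part of concepts-parser): validates the documented
// argument shapes and returns them as a [context, options] pair.
function buildParseArgs(context, options = {}) {
  if (!context || typeof context.text !== 'string') {
    throw new TypeError('context.text is required and must be a string');
  }
  if (typeof context.lang !== 'string' || context.lang.length !== 2) {
    throw new TypeError('context.lang is required and must be a two-letter code');
  }
  if (context.country !== undefined && typeof context.country !== 'string') {
    throw new TypeError('context.country must be a string when provided');
  }
  if (options.mode !== undefined && !MODES.includes(options.mode)) {
    throw new RangeError(`options.mode must be one of: ${MODES.join(', ')}`);
  }
  if (options.filters !== undefined && !Array.isArray(options.filters)) {
    throw new TypeError('options.filters must be an array of filter names');
  }
  return [context, options];
}

const [context, options] = buildParseArgs(
  { text: 'Some text', lang: 'ru', country: 'ru' },
  { mode: 'collect' }
);

// With the package installed, the validated arguments are passed straight on:
// const { parse } = require('concepts-parser');
// const concepts = parse(context, options);
```

Omitting `options` entirely is also accepted by the sketch, which corresponds to the documented default mode, **identity**.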
#### Valid filters
1. `invalid_prefix` - removes invalid prefixes;
2. `invalid` - excludes invalid concepts;
3. `partial` - excludes partial concepts;
4. `prefix` - adds prefixes to concepts;
5. `suffix` - adds suffixes to concepts;
6. `start_word` - excludes sentence start words;
7. `abbr` - finds concept abbreviations;
8. `known` - finds known concepts;
9. `duplicate` - excludes duplicates;
10. `quote` - concatenates concepts in quotes: `Teatrul Național "Mihai Eminescu"`;
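
Because `filters` is an ordered list of names, a typo in a name is easy to miss. The sketch below checks a custom list against the names listed above before it is passed to `parse`. `VALID_FILTERS` and `unknownFilters` are hypothetical names introduced for illustration, not exports of `concepts-parser`, and how the library itself treats an unknown name is not documented here.

```js
// Filter names copied from the list above.
const VALID_FILTERS = [
  'invalid_prefix', 'invalid', 'partial', 'prefix', 'suffix',
  'start_word', 'abbr', 'known', 'duplicate', 'quote'
];

// Hypothetical helper (not part of concepts-parser): returns the names that
// do not appear in the documented list, preserving the input order.
function unknownFilters(filters) {
  return filters.filter((name) => !VALID_FILTERS.includes(name));
}

// An ordered selection of filters chosen by the caller.
const filters = ['invalid_prefix', 'invalid', 'quote'];

const unknown = unknownFilters(filters);
if (unknown.length > 0) {
  throw new RangeError(`Unknown filters: ${unknown.join(', ')}`);
}

// With the package installed, the list is passed through `options.filters`:
// const { parse } = require('concepts-parser');
// const concepts = parse({ text: 'Some text', lang: 'ru' }, { filters });
```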