https://github.com/alephdata/countrytagger
Extract names of places from text and determine which country they may refer to
- Host: GitHub
- URL: https://github.com/alephdata/countrytagger
- Owner: alephdata
- License: mit
- Created: 2020-02-05T09:50:32.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2021-05-05T08:14:17.000Z (about 4 years ago)
- Last Synced: 2025-05-07T03:03:29.673Z (about 2 months ago)
- Language: Python
- Size: 679 KB
- Stars: 8
- Watchers: 6
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# countrytagger
This library finds the names of places in a string of text and tries to associate
them with countries. The goal is to tag a piece (or set) of text with country
metadata. The place names are derived from the GeoNames database, and they include
names of countries, major administrative areas and large cities. Place names that
are used in several countries are not used.

## Usage
```python
import countrytagger

# match in a string using sequential matching:
text = 'I am in Berlin'
for (code, score, country) in countrytagger.tag_text_countries(text):
    print(score, country)

# find precise matches:
code, score, country = countrytagger.tag_place('Berlin')
```

## Building the data
You can re-generate the place database like this:
```bash
$ make generate
```

This will download GeoNames and parse it into the format used by this library.
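
The README states that place names used in several countries are not used. The snippet below is a minimal sketch of what such a filtering step could look like; it is not the library's actual generator. The function names `normalize` and `build_place_index` are hypothetical, and the input rows are made-up toy data, not real GeoNames records.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase and collapse whitespace so lookups are case-insensitive."""
    return " ".join(name.lower().split())


def build_place_index(rows):
    """Map each place name to a country code, dropping ambiguous names.

    `rows` is an iterable of (name, country_code) pairs. A name that
    appears with more than one country code is left out of the index.
    """
    countries_by_name = defaultdict(set)
    for name, code in rows:
        countries_by_name[normalize(name)].add(code)
    return {
        name: next(iter(codes))
        for name, codes in countries_by_name.items()
        if len(codes) == 1
    }


# Toy rows for illustration only; these are not taken from GeoNames.
rows = [
    ("Berlin", "de"),
    ("Bavaria", "de"),
    ("Paris", "fr"),
    ("Georgia", "ge"),
    ("Georgia", "us"),
]
index = build_place_index(rows)
print(index.get("berlin"))   # unambiguous name, kept
print(index.get("georgia"))  # shared by two countries, dropped
```

Dropping ambiguous names trades recall for precision: a text mentioning only such a name gets no tag, but every tag that is produced points to a single country.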