Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mandarancio/tag-extractor
Python flickr and instagram tag extractor (by location)
flickr gis hashtag instagram nlp ontology semantic splitter tag tokenizer twitter
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/mandarancio/tag-extractor
- Owner: Mandarancio
- Created: 2017-03-25T13:10:42.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2017-06-06T15:11:30.000Z (over 7 years ago)
- Last Synced: 2024-07-17T14:45:30.223Z (6 months ago)
- Topics: flickr, gis, hashtag, instagram, nlp, ontology, semantic, splitter, tag, tokenizer, twitter
- Language: Web Ontology Language
- Homepage:
- Size: 1.61 MB
- Stars: 1
- Watchers: 5
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
[![Build][travis-image]][travis-url]
# tag-extractor
A *Flickr* and *Instagram* tag extractor (by location), written for **Python 3**.

## How to Run
First, install the package in development mode:
```bash
python3 setup.py develop
```
After setup, install the required nltk modules by running the following ```python3``` script:
```python
#! /usr/bin/python3
import nltk
nltk.download("omw")
nltk.download("brown")
```

Then:
```bash
tagextractor --config YOURCONFIG.yml
```

## Configuration
Example configuration:
```yaml
--- # Configuration
extraction:
  enabled: true
  api: instagram
  api_cfg:
    ACCESS_TOKEN : YOUR_TOKEN
    ACCESS_SECRET : YOUR_SECRET_KEY
    CONSUMER_KEY : CONSUMER_KEY
    CONSUMER_SECRET : CONSUMER_SECRET
  frequency: ../resources/frequs.json
  location:
    lat: 46.205850
    lon: 6.157521
    radius: 1
  number: 1000
  pipeline:
    Babel: false
    WordNet: true
  storage:
    module: DB
    module_cfg:
      path: sqlite:///database/output.db
classification:
  enabled: true
  inputdb: sqlite:///database/output.db
  ontology_path: resources
  ontology: kr-owlxml.owl
  ontology_name: http://tagis.kr.com
  outputdb: sqlite:///database/output-classified.db
```

## Architecture
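The example configuration has two top-level sections, `extraction` and `classification`, each with its own `enabled` flag, which matches the two stages described in this section. The following is a minimal sketch of how such a file could be read and the active stages selected. The function names `load_config` and `enabled_stages` are hypothetical and are not taken from the project's code; only `yaml.safe_load` (from PyYAML, listed under Dependencies) is a real library call.

```python
from typing import Any, List


def load_config(path: str) -> dict:
    """Read a YAML configuration file into a plain dictionary.

    Hypothetical helper: the project's real loader may differ.
    """
    import yaml  # PyYAML, imported lazily so the rest works without it

    with open(path, encoding="utf-8") as handle:
        return yaml.safe_load(handle) or {}


def enabled_stages(config: dict) -> List[str]:
    """Return the top-level stages whose `enabled` flag is true."""
    stages = []
    for name in ("extraction", "classification"):
        section: Any = config.get(name) or {}
        if section.get("enabled", False):
            stages.append(name)
    return stages


# A dictionary mirroring part of the example configuration above.
example = {
    "extraction": {"enabled": True, "api": "instagram", "number": 1000},
    "classification": {"enabled": True, "ontology": "kr-owlxml.owl"},
}

print(enabled_stages(example))
```

With a file on disk, the same check would be `enabled_stages(load_config("YOURCONFIG.yml"))`. Setting `enabled: false` in either section leaves that stage out of the returned list.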
### Extraction
![extraction](documentation/Extraction_Diagram.png)
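
The configuration exposes a `pipeline` with switchable stages, and the Links section points to an article on generator pipelines. As an illustration of that general technique only, here is a small sketch of chaining generator stages over a stream of raw tags. All stage names (`strip_hash`, `lowercase`, `drop_empty_and_duplicates`) and the sample tags are assumptions made for this example; the project's actual `Babel` and `WordNet` stages are not reproduced here.

```python
from typing import Callable, Iterable, Iterator, List

Stage = Callable[[Iterable[str]], Iterator[str]]


def strip_hash(tags: Iterable[str]) -> Iterator[str]:
    """Remove surrounding whitespace and leading '#' characters."""
    for tag in tags:
        yield tag.strip().lstrip("#")


def lowercase(tags: Iterable[str]) -> Iterator[str]:
    """Normalize every tag to lower case."""
    for tag in tags:
        yield tag.lower()


def drop_empty_and_duplicates(tags: Iterable[str]) -> Iterator[str]:
    """Skip empty tags and tags that were already seen."""
    seen = set()
    for tag in tags:
        if tag and tag not in seen:
            seen.add(tag)
            yield tag


def run_pipeline(raw_tags: Iterable[str], stages: List[Stage]) -> Iterable[str]:
    """Feed the output of each stage into the next one, lazily."""
    stream = raw_tags
    for stage in stages:
        stream = stage(stream)
    return stream


raw = ["#Geneva", " #geneva ", "#LakeLeman", "#"]
cleaned = list(run_pipeline(raw, [strip_hash, lowercase, drop_empty_and_duplicates]))
print(cleaned)  # ['geneva', 'lakeleman']
```

Because each stage is a generator, tags flow through one at a time, and a stage can be switched on or off by adding or removing it from the list passed to `run_pipeline`.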
### Classification
## References and Links
### Dependencies
The project uses several third-party Python libraries:
- [nltk](http://www.nltk.org/howto/wordnet.html) (and the modules ```omw``` and ```brown```)
- [unidecode](https://pypi.python.org/pypi/Unidecode)
- [SQLAlchemy](http://docs.sqlalchemy.org/en/latest/)
- [flickrapi](https://stuvel.eu/flickrapi-doc/)
- [twitter](https://pypi.python.org/pypi/twitter)
- [requests](http://docs.python-requests.org/en/master/)
- [PyYAML](https://pypi.python.org/pypi/PyYAML)
- [Owlready](https://pypi.python.org/pypi/Owlready)

### References
- [A methodology for mapping Instagram hashtags](http://firstmonday.org/article/view/5563/4195)

### Links
- [PyBabelfy](https://github.com/aghie/pybabelfy)
- [python twitter examples](https://github.com/ideoforms/python-twitter-examples)
- [Pipelines in Python](https://brett.is/writing/about/generator-pipelines-in-python/)
- [FlickrAPI bug fix](https://github.com/sybrenstuvel/flickrapi/issues/75)

[travis-image]:https://travis-ci.org/Mandarancio/tag-extractor.svg?branch=master
[travis-url]:https://travis-ci.org/Mandarancio/tag-extractor