{"id":21971567,"url":"https://github.com/mandarancio/tag-extractor","last_synced_at":"2025-03-22T23:10:58.570Z","repository":{"id":75095795,"uuid":"86158202","full_name":"Mandarancio/tag-extractor","owner":"Mandarancio","description":"Python flickr and instagram tag extractor (by location) ","archived":false,"fork":false,"pushed_at":"2017-06-06T15:11:30.000Z","size":1687,"stargazers_count":1,"open_issues_count":0,"forks_count":2,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-15T00:14:02.420Z","etag":null,"topics":["flickr","gis","hashtag","instagram","nlp","ontology","semantic","splitter","tag","tokenizer","twitter"],"latest_commit_sha":null,"homepage":"","language":"Web Ontology Language","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Mandarancio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-03-25T13:10:42.000Z","updated_at":"2017-12-02T15:40:46.000Z","dependencies_parsed_at":"2023-10-01T07:32:51.379Z","dependency_job_id":null,"html_url":"https://github.com/Mandarancio/tag-extractor","commit_stats":{"total_commits":190,"total_committers":5,"mean_commits":38.0,"dds":"0.42105263157894735","last_synced_commit":"228c26bc694b2bd0dde5b3140d6c031824670bd0"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mandarancio%2Ftag-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mandarancio%2Ftag-extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mandarancio%2Ftag-extractor/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/Mandarancio%2Ftag-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Mandarancio","download_url":"https://codeload.github.com/Mandarancio/tag-extractor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245031509,"owners_count":20549925,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["flickr","gis","hashtag","instagram","nlp","ontology","semantic","splitter","tag","tokenizer","twitter"],"created_at":"2024-11-29T14:52:02.617Z","updated_at":"2025-03-22T23:10:58.549Z","avatar_url":"https://github.com/Mandarancio.png","language":"Web Ontology Language","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build][travis-image]][travis-url]\n\n# tag-extractor\nPython *Flickr* and *Instagram* tag extractor (by location) using **Python 3**\n\n## How to Run\n\nSimply:\n```bash\npython3 setup.py develop\n```\nAfter setup, install the required nltk modules by running the following ```python3``` script:\n```python\n#! 
/usr/bin/python3\nimport nltk\nnltk.download(\"omw\")\nnltk.download(\"brown\")\n```\n\nThen:\n```bash\ntagextractor --config YOURCONFIG.yml\n```\n\n## Configuration\n\nExample configuration:\n\n```yaml\n--- # Configuration\nextraction:\n  enabled: true\n  api: instagram\n  api_cfg:\n    ACCESS_TOKEN : YOUR_TOKEN\n    ACCESS_SECRET : YOUR_SECRET_KEY\n    CONSUMER_KEY : CONSUMER_KEY\n    CONSUMER_SECRET : CONSUMER_SECRET\n    frequency: ../resources/frequs.json\n  location:\n    lat: 46.205850\n    lon: 6.157521\n    radius: 1\n  number: 1000\n  pipeline:\n    Babel: false\n    WordNet: true\n  storage:\n    module: DB\n    module_cfg:\n      path: sqlite:///database/output.db\nclassification:\n  enabled: true\n  inputdb: sqlite:///database/output.db\n  ontology_path: resources\n  ontology: kr-owlxml.owl\n  ontology_name: http://tagis.kr.com\n  outputdb: sqlite:///database/output-classified.db\n```\n\n\n\n## Architecture\n\n### Extraction\n\n![extraction](documentation/Extraction_Diagram.png)\n\n### Classification\n\n## References and Links\n\n### Dependencies\n\nThe project uses the following Python libraries:\n - [nltk](http://www.nltk.org/howto/wordnet.html) (and the modules ```omw``` and ```brown```)\n - [unidecode](https://pypi.python.org/pypi/Unidecode)\n - [SQLAlchemy](http://docs.sqlalchemy.org/en/latest/)\n - [flickrapi](https://stuvel.eu/flickrapi-doc/)\n - [twitter](https://pypi.python.org/pypi/twitter)\n - [requests](http://docs.python-requests.org/en/master/)\n - [PyYAML](https://pypi.python.org/pypi/PyYAML)\n - [Owlready](https://pypi.python.org/pypi/Owlready)\n\n### References\n - [A methodology for mapping Instagram hashtags](http://firstmonday.org/article/view/5563/4195)\n\n### Links\n\n - [PyBabelfy](https://github.com/aghie/pybabelfy)\n - [python twitter examples](https://github.com/ideoforms/python-twitter-examples)\n - [Pipelines in Python](https://brett.is/writing/about/generator-pipelines-in-python/)\n - [FlickrAPI bug 
fix](https://github.com/sybrenstuvel/flickrapi/issues/75)\n\n[travis-image]:https://travis-ci.org/Mandarancio/tag-extractor.svg?branch=master\n[travis-url]:https://travis-ci.org/Mandarancio/tag-extractor\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmandarancio%2Ftag-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmandarancio%2Ftag-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmandarancio%2Ftag-extractor/lists"}