https://github.com/docnow/twarc-hashtags
Report on hashtags in tweet data.
- Host: GitHub
- URL: https://github.com/docnow/twarc-hashtags
- Owner: DocNow
- License: mit
- Created: 2021-08-19T23:58:12.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2022-02-05T18:02:07.000Z (over 4 years ago)
- Last Synced: 2025-03-21T05:32:46.892Z (about 1 year ago)
- Language: Python
- Size: 8.52 MB
- Stars: 3
- Watchers: 6
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# twarc-hashtags
This module extends [twarc] with a `hashtags` command that extracts and
counts the hashtags in a tweet dataset.
## Install
    pip install twarc-hashtags
Collect some Twitter data, for example:
    twarc2 search blacklivesmatter tweets.jsonl
Because you installed the plugin you have a new subcommand `hashtags`:
    twarc2 hashtags tweets.jsonl hashtags.csv
Then open `hashtags.csv` in your favourite spreadsheet program or
DataFrame library.
Behind the scenes, twarc-hashtags uses Python's built-in SQLite support to
create a database, insert into it, and query it. After the program finishes,
the database remains as `hashtags.db` in your current working directory.
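To illustrate the insert-then-query approach, here is a minimal sketch using Python's `sqlite3` module. The table layout, the `count_hashtags` helper, the lowercasing of tags, and the sample data are all assumptions made for illustration; the actual schema of `hashtags.db` may differ.

```python
import sqlite3


def count_hashtags(tweets):
    """Count hashtags with an in-memory SQLite database.

    `tweets` is an iterable of (created, [hashtag, ...]) pairs.
    The schema below is hypothetical, not the plugin's real one.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hashtags (created TEXT, hashtag TEXT)")

    # One row per hashtag occurrence; lowercasing is an assumption
    # so that "BLM" and "blm" are counted together.
    rows = [
        (created, tag.lower())
        for created, tags in tweets
        for tag in tags
    ]
    db.executemany("INSERT INTO hashtags VALUES (?, ?)", rows)

    # Let SQLite do the counting and ordering.
    result = db.execute(
        """
        SELECT hashtag, COUNT(*) AS total
        FROM hashtags
        GROUP BY hashtag
        ORDER BY total DESC, hashtag ASC
        """
    ).fetchall()
    db.close()
    return result


# Made-up sample data for the sketch.
sample = [
    ("2021-08-19 23:58:12", ["BlackLivesMatter", "BLM"]),
    ("2021-08-20 01:00:00", ["blacklivesmatter"]),
]
print(count_hashtags(sample))
```

Keeping the rows in a database file rather than in memory is presumably what allows the data to be queried again later without re-reading the tweets.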
## Options
**--group**: group results by day, week, month, or year
**--limit**: limit to this number of hashtags (per group if --group is used)
**--db**: name the database something other than `hashtags.db`
**--no-insert**: use an existing database instead of inserting (useful for
large numbers of tweets)
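As a rough illustration of how grouping and a per-group limit could be expressed against a SQLite table, consider the sketch below. The `hashtags` table, the `PERIOD_FORMATS` mapping, and the `top_hashtags` helper are hypothetical names chosen for this example; they are not taken from the plugin's source.

```python
import sqlite3
from collections import defaultdict

# Assumed mapping from a group name to an SQLite strftime format.
PERIOD_FORMATS = {
    "day": "%Y-%m-%d",
    "week": "%Y-%W",
    "month": "%Y-%m",
    "year": "%Y",
}


def top_hashtags(db, group, limit):
    """Return {period: [(hashtag, total), ...]} with at most `limit` per period."""
    rows = db.execute(
        """
        SELECT strftime(?, created) AS period, hashtag, COUNT(*) AS total
        FROM hashtags
        GROUP BY period, hashtag
        ORDER BY period ASC, total DESC, hashtag ASC
        """,
        (PERIOD_FORMATS[group],),
    )
    kept = defaultdict(list)
    for period, hashtag, total in rows:
        # Rows arrive sorted by count within each period,
        # so keeping the first `limit` rows keeps the top ones.
        if len(kept[period]) < limit:
            kept[period].append((hashtag, total))
    return dict(kept)


# Made-up data in a hypothetical table layout.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hashtags (created TEXT, hashtag TEXT)")
db.executemany(
    "INSERT INTO hashtags VALUES (?, ?)",
    [
        ("2021-08-19 10:00:00", "blm"),
        ("2021-08-19 11:00:00", "blm"),
        ("2021-08-19 12:00:00", "justice"),
        ("2021-08-20 09:00:00", "justice"),
    ],
)
print(top_hashtags(db, "day", 1))
print(top_hashtags(db, "month", 2))
```

The limit is applied in Python rather than in SQL here only to keep the sketch independent of the SQLite version; a window function would be another way to do it.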
[twarc]: https://github.com/docnow/twarc