https://github.com/pratikpakhale/tweets-scrapper
These are just simple scripts to scrape tweets and then do some analysis. Here, we try to search for IPL tweets and then analyse them using Gemini LLM. You can use the js snippets for scraping and further do your own analysis.
- Host: GitHub
- URL: https://github.com/pratikpakhale/tweets-scrapper
- Owner: pratikpakhale
- Created: 2024-05-03T21:06:50.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-03T21:07:22.000Z (8 months ago)
- Last Synced: 2024-05-03T22:25:23.792Z (8 months ago)
- Topics: js, scrape, tweet, twitter
- Language: Python
- Homepage:
- Size: 3.24 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Manual Tweets Scrapper
## Description
These are simple scripts for scraping tweets and then analysing them. In this example, we search for IPL tweets and analyse them with the Gemini LLM. You can use the JS snippets for scraping and then run your own analysis.
## Steps
1. Go to the Twitter [Explore](https://twitter.com/explore) section.
2. Add `scrape.js` and `auto_scroll.js` as snippets in Chrome DevTools, under the `Sources` panel.
3. Search for your query. Using Twitter advanced search to filter out spam tweets and apply other filters is highly recommended.
4. Run the `scrape` snippet.
5. Run the `auto_scroll` snippet.
6. Wait until you are satisfied with the number of tweets scraped. You can follow the logs in the console.
7. Once you get rate limited or the search stops returning new tweets, log the variable `tweets` in the console. You can then right-click the logged object and choose "Copy object".
8. In the `data/` folder, create a new JSON file and paste the copied object into it.
9. Merge all the files into one by running the `merge.py` script.
10. Enter your Gemini API key in `run_genai.py`, then run it. It goes through the tweets and creates the file `analysed.json` in the `results/` directory.
11. Use `preprocess.py` to make sure the results data is in a consistent format.
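
The merge in step 9 can be sketched roughly as follows. This is not the repository's `merge.py`, only a minimal illustration under stated assumptions: each file in `data/` holds a JSON list of tweet objects (a lone object is treated as a single tweet), and exact duplicates can be dropped. The function name `merge_tweet_files` and the output path are hypothetical.

```python
import json
from pathlib import Path


def merge_tweet_files(data_dir, out_path):
    """Combine every JSON file in data_dir into one list written to out_path.

    Hypothetical helper: the real merge.py may behave differently.
    """
    merged = []
    seen = set()
    for path in sorted(Path(data_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            loaded = json.load(fh)
        # Assumption: a file holds a list of tweets; a lone object counts as one tweet.
        items = loaded if isinstance(loaded, list) else [loaded]
        for item in items:
            # Serialise with sorted keys so identical tweets compare equal.
            key = json.dumps(item, sort_keys=True, ensure_ascii=False)
            if key not in seen:
                seen.add(key)
                merged.append(item)
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    with out.open("w", encoding="utf-8") as fh:
        json.dump(merged, fh, ensure_ascii=False, indent=2)
    return merged
```

For example, `merge_tweet_files("data", "merged.json")` would write the combined list to `merged.json`. Keeping the output file outside `data/` avoids re-reading it on the next run. Deduplication matters because overlapping scroll sessions are likely to capture the same tweet more than once.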