https://github.com/jacobpclouse/webscraper
NYS Job Site Web Scraper with Python
- Host: GitHub
- URL: https://github.com/jacobpclouse/webscraper
- Owner: jacobpclouse
- Created: 2022-07-03T22:06:35.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-08-08T00:45:50.000Z (over 2 years ago)
- Last Synced: 2023-04-29T21:51:26.338Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 414 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# WebScraper
Web scraper that finds jobs on the NYS job site and writes them to CSV and JSON files
(with Python, Beautiful Soup, lxml, and Selenium)

## Original Article Source:
https://oxylabs.io/blog/python-web-scraping

## Sources & Helpful Links
- Installing requests with pip: https://itsmycode.com/importerror-no-module-named-requests/
- Save response to file: https://stackoverflow.com/questions/31126596/saving-response-from-requests-to-file
- Related Beautiful Soup Tutorial: https://oxylabs.io/blog/beautiful-soup-parsing-tutorial
- Path to Driver (Selenium) from StackOverflow: https://stackoverflow.com/questions/18674092/how-to-implement-chromedriver-in-selenium-in-linux-platform
- Printing to File w/ Indentation from StackOverflow: https://stackoverflow.com/questions/38192148/how-to-indent-entire-strings-of-text-in-text-file
- Write pandas DataFrame to CSV File in Python: https://statisticsglobe.com/write-pandas-dataframe-csv-file-python
- Find the second instance of a particular class using BeautifulSoup: https://stackoverflow.com/questions/64182786/find-the-second-instance-of-a-particular-class-using-beautifulsoup
- Getting first (or a specific) td in BeautifulSoup with no class: https://stackoverflow.com/questions/64542519/getting-first-or-a-specific-td-in-beautifulsoup-with-no-class
- Selecting second child using BeautifulSoup: https://stackoverflow.com/questions/38233838/selecting-second-child-using-beautifulsoup
- How to get two tags in findall using BeautifulSoup: https://www.edureka.co/community/42701/how-to-get-two-tags-in-findall-using-beautifulsoup
- Creating Key Value Pairs in Python: https://www.geeksforgeeks.org/add-a-keyvalue-pair-to-dictionary-in-python/
- Multiple Values per Key in Python: https://thispointer.com/python-dictionary-with-multiple-values-per-key/
- Python Tutorial: Working with JSON Data using the json Module: https://www.youtube.com/watch?v=9N6a-VLBa2I
- Creating JSON from Dictionaries and Lists: https://pythonexamples.org/python-create-json/
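
## Example: Dictionary to CSV and JSON (sketch)
The last few links cover collecting values under dictionary keys and writing them out. Below is a minimal sketch of that output step, not the repository's actual code. The `jobs` data, the column names, the helper functions, and the output filenames are all hypothetical. It uses the standard-library `csv` module in place of the pandas approach linked above, so it runs without extra installs.

```python
import csv
import io
import json

# Hypothetical sample data standing in for scraped job postings:
# one key per column, with multiple values (one per posting) under each key.
jobs = {
    "title": ["Example Title A", "Example Title B"],
    "agency": ["Example Agency A", "Example Agency B"],
    "county": ["Albany", "Albany"],
}


def to_rows(columns):
    """Turn a dict of equal-length lists into a list of per-record dicts."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


def to_csv_text(columns):
    """Render the records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_rows(columns))
    return buffer.getvalue()


def to_json_text(columns):
    """Render the records as an indented JSON array of objects."""
    return json.dumps(to_rows(columns), indent=2)


if __name__ == "__main__":
    # Hypothetical output filenames.
    with open("jobs.csv", "w", newline="", encoding="utf-8") as csv_file:
        csv_file.write(to_csv_text(jobs))
    with open("jobs.json", "w", encoding="utf-8") as json_file:
        json_file.write(to_json_text(jobs))
```

Converting the column-oriented dictionary into one dictionary per posting first keeps the CSV and JSON outputs in step, since both are generated from the same list of records.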