https://github.com/kalebu/link-scraper-in-python
A Python script to scrape all links in a given website using requests and Beautiful Soup
link-scraper-python python python-bs4 python-requests python-script python-webscraping-application
Last synced: 7 months ago
- Host: GitHub
- URL: https://github.com/kalebu/link-scraper-in-python
- Owner: Kalebu
- License: mit
- Created: 2021-01-26T07:56:12.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2023-02-13T14:22:42.000Z (almost 3 years ago)
- Last Synced: 2025-05-08T00:06:42.236Z (7 months ago)
- Topics: link-scraper-python, python, python-bs4, python-requests, python-script, python-webscraping-application
- Language: Python
- Homepage: https://kalebujordan.com/learn-how-to-extract-all-links-from-a-website-in-python/
- Size: 4.88 KB
- Stars: 12
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Link-scraper-in-python
A Python script to scrape all links in a given website using requests and Beautiful Soup
Detailed article
-----------------
A detailed article on how to scrape all links in a given website is available on [my blog](https://kalebujordan.com), titled [How to extract all website link in Python](https://kalebujordan.com/learn-how-to-extract-all-links-from-a-website-in-python/).
Getting started
----------------
To start exploring this code, clone or download the repository as shown below:
```bash
-> git clone https://github.com/Kalebu/Link-scraper-in-python
-> cd Link-scraper-in-python
```
Dependencies
------------
To run this code, you need the requests and Beautiful Soup libraries installed on your machine:
```bash
-> pip install requests
-> pip install beautifulsoup4
```
Running
--------
Now that everything is set up, run the script as shown below:
```bash
Link-scraper-in-python -> python link_spider.py
Enter URL of the site : https://kalebujordan.com/
['#content', 'https://kalebujordan.com/', 'https://kalebujordan.com/', 'https://kalebujordan.com/category/projects/...]
```
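The source of `link_spider.py` is not reproduced in this README, so the following is only a minimal sketch of how a script like it might collect links with requests and Beautiful Soup. The function names `extract_links` and `fetch_links` are hypothetical and not taken from the repository.

```python
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href of every <a> tag in the HTML, in document order.

    Hypothetical helper, not the repository's actual code.
    """
    soup = BeautifulSoup(html, "html.parser")
    # href=True skips anchors that carry no href attribute
    return [a["href"] for a in soup.find_all("a", href=True)]


def fetch_links(url):
    """Download a page and return the links found in it (hypothetical helper)."""
    import requests  # imported here so extract_links can be used offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return extract_links(response.text)


# Offline demonstration on a small, made-up HTML snippet
sample_html = '<a href="#content">Skip</a> <a href="https://kalebujordan.com/">Home</a>'
print(extract_links(sample_html))
```

Calling `fetch_links("https://kalebujordan.com/")` would perform the same extraction on a live page. As in the output above, the hrefs are returned exactly as they appear in the HTML, so fragments such as `#content` and repeated URLs are kept.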
Explore it
-----------
Keep exploring by testing the script with various input URLs to see which links it scrapes.
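When comparing results across sites, it can help to clean up the raw list first, since the sample output contains a fragment-only link (`#content`) and repeated URLs. The sketch below is one possible post-processing step using only the standard library. The helper `normalize_links` is hypothetical and not part of the repository, and the `/category/projects/` entry in the sample is made up for illustration.

```python
from urllib.parse import urldefrag, urljoin


def normalize_links(base_url, links):
    """Resolve links against base_url, strip fragments, and drop duplicates.

    Hypothetical helper: keeps the first occurrence of each URL, in order.
    """
    seen = set()
    result = []
    for link in links:
        # urljoin makes relative links absolute; urldefrag removes "#..." parts
        absolute, _fragment = urldefrag(urljoin(base_url, link))
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Made-up sample resembling the script's output
sample = [
    "#content",
    "https://kalebujordan.com/",
    "https://kalebujordan.com/",
    "/category/projects/",
]
print(normalize_links("https://kalebujordan.com/", sample))
```

Stripping fragments is a design choice: it treats `#content` as the page itself, which suits link discovery but would not suit a tool that needs to track in-page anchors.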
Give it a star
--------------
If you found this information useful, give it a star.
Credits
-----------
All credit goes to [kalebu](https://github.com/kalebu).