https://github.com/anubhav4sachan/bing-scraper

bingscraper is a Python 3 package that extracts text and image content from the search engine 'bing.com'.

Host: GitHub
URL: https://github.com/anubhav4sachan/bing-scraper
Owner: anubhav4sachan
License: gpl-3.0
Created: 2018-06-26T17:24:57.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2021-06-30T01:46:08.000Z (almost 5 years ago)
Last Synced: 2026-01-03T13:12:55.785Z (6 months ago)
Topics: html, package, parser, pypi, python, scraper, scraping-websites
Language: Python
Homepage:
Size: 38.1 KB
Stars: 9
Watchers: 0
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Bing Scraper

bingscraper is a Python 3 package that extracts text and image content from the search engine `bing.com`.

It gives the user only the meaningful results and images for a search query. It does not download ad content, which saves data for the user.

Working in the background, the script sends a request for the search term and creates a directory (if one does not already exist) in the script's root directory, where all content for that search is stored. It downloads the hypertext of each result together with its hyperlink and saves them to a .txt file inside that directory. The same directory holds both the text content and the images downloaded by the script.
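The storage step described above can be sketched as follows. This is a minimal illustration, not the package's own code: the function name `save_results`, the file name `results.txt`, and the rule for turning a search term into a directory name are all assumptions made for the example.

```python
import os


def save_results(query, results, root="."):
    """Write (text, link) pairs for `query` to a .txt file in its own directory.

    `results` is a list of (text, url) tuples. The directory and file naming
    used here are illustrative assumptions, not the package's actual layout.
    """
    # One directory per search term, created only if it does not exist yet.
    folder = os.path.join(root, query.replace(" ", "_"))
    os.makedirs(folder, exist_ok=True)

    out_path = os.path.join(folder, "results.txt")
    with open(out_path, "w", encoding="utf-8") as fh:
        for text, url in results:
            # Store the visible text followed by the link it points to.
            fh.write(f"{text}\n{url}\n\n")
    return out_path
```

Calling `save_results("python docs", [("Python", "https://www.python.org/")])` would create a `python_docs` directory next to the script and write one text/link pair into it. Using `exist_ok=True` means repeated searches for the same term reuse the directory instead of failing.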

## Requirements

1.	Modules:

    a. `requests`: For requesting content over HTTP(S). Of the `GET` and `POST` methods, this package uses `GET`.

    b. `BeautifulSoup`: For parsing the HTML of a page into a searchable tree using the HTML parser. The package uses `bs4`.

    c. `os`: For checking and making directories.

    d. `PIL.Image`: Part of the `Pillow` module. For extracting image content.

    e. `io.BytesIO`: For saving the extracted image using `PIL.Image`.

2.	Internet Connection: A continuous, high-speed internet connection is required for the package to work properly, as it continuously copies images to the local machine.

3.  Python: Version 3.6.4 or above. This package is written in Python 3.6.4.
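The requirements above can be checked before importing the package. The following is a small sketch using only the standard library; the helper names `missing_modules` and `python_is_supported` are hypothetical, and the list uses the top-level import names (`bs4` for BeautifulSoup, `PIL` for Pillow).

```python
import sys
from importlib.util import find_spec

# Top-level import names for the modules listed in the requirements.
REQUIRED_MODULES = ("requests", "bs4", "PIL", "os", "io")
MIN_PYTHON = (3, 6, 4)


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (default: the running interpreter) meets `minimum`."""
    if version is None:
        version = sys.version_info[:3]
    return tuple(version) >= minimum


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("Missing modules:", missing_modules())
```

An empty list from `missing_modules()` suggests the third-party modules are installed; any names it returns would need to be installed with `pip` first.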

## Installation

For Python installation:

`pip install bingscraper`

or 

`python -m pip install bingscraper`

For Anaconda installation:

`conda install bingscraper`

## How to use

Install the modules listed above first. `bingscraper` can be imported successfully only when those modules are available.

Sample code in Python:

`import bingscraper as bs`

`search = input()`

`bs.scrape(search).text()    #For Text Scraping.`

`bs.scrape(search).image()   #For Image Scraping.`

OR

`from bingscraper import scrape`

`search = input()`

`scrape(search).text()    #For Text Scraping.`

`scrape(search).image()   #For Image Scraping.`

###### `scrape()` takes a string argument, and `.text()` or `.image()` does the scraping work.
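The call shape described above (an object built from a search string, with separate methods for text and images) can be illustrated with a small stand-in class. This is not the package's code: `ScrapeSketch`, `text_url`, and `image_url` are hypothetical names, and the Bing URL patterns are assumptions used only to show how one query string might feed two different requests.

```python
from urllib.parse import urlencode


class ScrapeSketch:
    """Hypothetical stand-in mirroring the `scrape(search).text()` call shape."""

    def __init__(self, search):
        # Like `scrape()`, the constructor expects a string search term.
        if not isinstance(search, str):
            raise TypeError("search must be a string")
        self.search = search

    def text_url(self):
        # Assumed URL pattern for a Bing web search.
        return "https://www.bing.com/search?" + urlencode({"q": self.search})

    def image_url(self):
        # Assumed URL pattern for a Bing image search.
        return "https://www.bing.com/images/search?" + urlencode({"q": self.search})
```

Keeping the query on the object and splitting the work into two methods lets a caller run only the kind of scraping that is needed, which matches the separation of `.text()` and `.image()`.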

# How to cite the project?

If the tool has been helpful to you and you wish to cite it, please cite it as follows:

```
@misc{sachan2018bingscraper,
      title={bingscraper • pypi},
      author={Sachan, Anubhav},
      year={2018},
      url={https://pypi.org/project/bingscraper/}
}
```

For other formats, [cite as per Google Scholar](https://scholar.google.com/scholar?cluster=12184846436547447644&hl=en&oi=scholarr#d=gs_cit&u=%2Fscholar%3Fq%3Dinfo%3AXEsYpUZFGakJ%3Ascholar.google.com%2F%26output%3Dcite%26scirp%3D0%26scfhb%3D1%26hl%3Den)

### Change Log

#### Version 2.0: 

Separated `.text()` and `.image()`. Use as per requirement.

#### Version 3.0:

Minor Changes.
