Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ranahaani/GNews
A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response.
https://github.com/ranahaani/GNews
article gnews google-news google-news-api google-news-homepage google-news-rss google-news-scraper hacktoberfest hacktoberfest-accepted hacktoberfest2023 news newspaper3k python rss-feed
Last synced: 3 months ago
JSON representation
A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response.
- Host: GitHub
- URL: https://github.com/ranahaani/GNews
- Owner: ranahaani
- License: mit
- Created: 2021-02-17T12:42:51.000Z (almost 4 years ago)
- Default Branch: master
- Last Pushed: 2024-05-29T07:32:19.000Z (8 months ago)
- Last Synced: 2024-05-29T20:15:18.467Z (8 months ago)
- Topics: article, gnews, google-news, google-news-api, google-news-homepage, google-news-rss, google-news-scraper, hacktoberfest, hacktoberfest-accepted, hacktoberfest2023, news, newspaper3k, python, rss-feed
- Language: Python
- Homepage: https://pypi.org/project/gnews/
- Size: 402 KB
- Stars: 560
- Watchers: 18
- Forks: 95
- Open Issues: 23
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Code of conduct: CODE_OF_CONDUCT.md
Awesome Lists containing this project
- stars - ranahaani/GNews - A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response. (Python)
- stars - ranahaani/GNews - A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a JSON response. (Python)
README
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]
[![Download][download-sheild]][download-url]
[![LinkedIn][linkedin-shield]][linkedin-url]
GNews π°
A Happy and lightweight Python Package that Provides an API to search for articles on Google News and returns a usable JSON response! π
If you like β€οΈ GNews or find it useful π, support the project by buying me a coffee β.
π View Demo
Β·
π Report Bug
Β·
π Request Feature
Table of Contents π
About π©
Getting Started π
Usage π§©
- Top News π
- News by Keywords π
- News by Major Topics π
- News by GEO Location π
- News by Site π°
- Results π
- Supported Countries π
- Supported Languages π
- Article Properties π
- Getting Full Article π°
- To Do π
- Roadmap π£οΈ
- Contributing π€
- License βοΈ
- Contact π¬
- Acknowledgements π
## About GNews
π© GNews is A Happy and lightweight Python Package that searches Google News RSS Feed and returns a usable JSON
response \
π© As well as you can fetch full article (**No need to write scrappers for articles fetching anymore**)Google News cover across **141+ countries** with **41+ languages**. On the bottom left side of the Google News page you
may find a `Language & region` section where you can find all of the supported combinations.### Demo
[![GNews Demo][demo-gif]](https://github.com/ranahaani/GNews)
## Getting Started
This section provides instructions for two different use cases:
1. **Installing the GNews package** for immediate use.
2. **Setting up the GNews project** for local development.### 1. Installing the GNews package
To install the package and start using it in your own projects, follow these steps:
``` shell
pip install gnews
```
### 2. Setting Up GNews for Local DevelopmentIf you want to make modifications locally, follow these steps to set up the development environment.
#### Option 1: Setup with Docker
1. Install [docker and docker-compose](https://docs.docker.com/get-docker/).
2. Configure the `.env` file by placing your MongoDB credentials.
3. Run the following command to build and start the Docker containers:``` shell
docker-compose up --build
```#### Option 2: Install Using Git Clone
1. Clone this repository:
``` shell
git clone https://github.com/ranahaani/GNews.git
```2. Set up a virtual environment:
```shell
virtualenv venv
source venv/bin/activate # MacOS/Linux
.\venv\Scripts\activate # Windows
```3. Install the required dependencies:
```shell
pip install -r requirements.txt
```### Example usage
```python
from gnews import GNewsgoogle_news = GNews()
pakistan_news = google_news.get_news('Pakistan')
print(pakistan_news[0])
``````
[{
'publisher': 'Aljazeera.com',
'description': 'Pakistan accuses India of stoking conflict in Indian Ocean '
'Aljazeera.com',
'published date': 'Tue, 16 Feb 2021 11:50:43 GMT',
'title': 'Pakistan accuses India of stoking conflict in Indian Ocean - '
'Aljazeera.com',
'url': 'https://www.aljazeera.com/news/2021/2/16/pakistan-accuses-india-of-nuclearizing-indian-ocean'
},
...]
```### Get top news
* `GNews.get_top_news()`
### Get news by keyword
* `GNews.get_news(keyword)`
### Get news by major topic
* `GNews.get_news_by_topic(topic)`
* Available topics:` WORLD, NATION, BUSINESS, TECHNOLOGY, ENTERTAINMENT, SPORTS, SCIENCE, HEALTH, POLITICS, CELEBRITIES, TV, MUSIC, MOVIES, THEATER, SOCCER, CYCLING, MOTOR SPORTS, TENNIS, COMBAT SPORTS, BASKETBALL, BASEBALL, FOOTBALL, SPORTS BETTING, WATER SPORTS, HOCKEY, GOLF,
CRICKET, RUGBY, ECONOMY, PERSONAL FINANCE, FINANCE, DIGITAL CURRENCIES, MOBILE, ENERGY, GAMING, INTERNET SECURITY, GADGETS, VIRTUAL REALITY, ROBOTICS, NUTRITION, PUBLIC HEALTH, MENTAL HEALTH, MEDICINE, SPACE, WILDLIFE, ENVIRONMENT, NEUROSCIENCE, PHYSICS, GEOLOGY, PALEONTOLOGY, SOCIAL SCIENCES, EDUCATION, JOBS, ONLINE EDUCATION, HIGHER EDUCATION, VEHICLES, ARTS-DESIGN, BEAUTY, FOOD, TRAVEL, SHOPPING, HOME, OUTDOORS, FASHION.`### Get news by geo location
* `GNews.get_news_by_location(location)`
* location can be name of city/state/country### Get news by site
* `GNews.get_news_by_site(site)`
* site should be in the format of: `"cnn.com"`### Results specification
All parameters are optional and can be passed during initialization. Hereβs a list of the available parameters:- **language**: The language in which to return results (default: 'en').
- **country**: The country code for the headlines (default: 'US').
- **period**: The time period for which you want news.
- **start_date**: Date after which results must have been published.
- **end_date**: Date before which results must have been published.
- **max_results**: The maximum number of results to return (default: 100).
- **exclude_websites**: A list of websites to exclude from results.
- **proxy**: A dictionary specifying the proxy settings used to route requests. The dictionary should contain a single key-value pair where the key is the protocol (`http` or `https`) and the value is the proxy address. Example:
```python
# Example with only HTTP proxy
proxy = {
'http': 'http://your_proxy_address',
}
# Example with only HTTPS proxy
proxy = {
'https': 'http://your_proxy_address',
}
```
#### Example Initialization
```python
from gnews import GNews# Initialize GNews with various parameters, including proxy
google_news = GNews(
language='en',
country='US',
period='7d',
start_date=None,
end_date=None,
max_results=10,
exclude_websites=['yahoo.com', 'cnn.com'],
proxy={
'https': 'https://your_proxy_address'
}
)
```* Or change it to an existing object
```python
google_news.period = '7d' # News from last 7 days
google_news.max_results = 10 # number of responses across a keyword
google_news.country = 'United States' # News from a specific country
google_news.language = 'english' # News in a specific language
google_news.exclude_websites = ['yahoo.com', 'cnn.com'] # Exclude news from specific website i.e Yahoo.com and CNN.com
google_news.start_date = (2020, 1, 1) # Search from 1st Jan 2020
google_news.end_date = (2020, 3, 1) # Search until 1st March 2020
```The format of the timeframe is a string comprised of a number, followed by a letter representing the time operator. For
example 1y would signify 1 year. Full list of operators below:```
- h = hours (eg: 12h)
- d = days (eg: 7d)
- m = months (eg: 6m)
- y = years (eg: 1y)
```
Setting the start and end dates can be done by passing in either a datetime or a tuple in the form (YYYY, MM, DD).### Supported Countries
```python
print(google_news.AVAILABLE_COUNTRIES){'Australia': 'AU', 'Botswana': 'BW', 'Canada ': 'CA', 'Ethiopia': 'ET', 'Ghana': 'GH', 'India ': 'IN',
'Indonesia': 'ID', 'Ireland': 'IE', 'Israel ': 'IL', 'Kenya': 'KE', 'Latvia': 'LV', 'Malaysia': 'MY', 'Namibia': 'NA',
'New Zealand': 'NZ', 'Nigeria': 'NG', 'Pakistan': 'PK', 'Philippines': 'PH', 'Singapore': 'SG', 'South Africa': 'ZA',
'Tanzania': 'TZ', 'Uganda': 'UG', 'United Kingdom': 'GB', 'United States': 'US', 'Zimbabwe': 'ZW',
'Czech Republic': 'CZ', 'Germany': 'DE', 'Austria': 'AT', 'Switzerland': 'CH', 'Argentina': 'AR', 'Chile': 'CL',
'Colombia': 'CO', 'Cuba': 'CU', 'Mexico': 'MX', 'Peru': 'PE', 'Venezuela': 'VE', 'Belgium ': 'BE', 'France': 'FR',
'Morocco': 'MA', 'Senegal': 'SN', 'Italy': 'IT', 'Lithuania': 'LT', 'Hungary': 'HU', 'Netherlands': 'NL',
'Norway': 'NO', 'Poland': 'PL', 'Brazil': 'BR', 'Portugal': 'PT', 'Romania': 'RO', 'Slovakia': 'SK', 'Slovenia': 'SI',
'Sweden': 'SE', 'Vietnam': 'VN', 'Turkey': 'TR', 'Greece': 'GR', 'Bulgaria': 'BG', 'Russia': 'RU', 'Ukraine ': 'UA',
'Serbia': 'RS', 'United Arab Emirates': 'AE', 'Saudi Arabia': 'SA', 'Lebanon': 'LB', 'Egypt': 'EG',
'Bangladesh': 'BD', 'Thailand': 'TH', 'China': 'CN', 'Taiwan': 'TW', 'Hong Kong': 'HK', 'Japan': 'JP',
'Republic of Korea': 'KR'}
```### Supported Languages
```python
print(google_news.AVAILABLE_LANGUAGES){'english': 'en', 'indonesian': 'id', 'czech': 'cs', 'german': 'de', 'spanish': 'es-419', 'french': 'fr',
'italian': 'it', 'latvian': 'lv', 'lithuanian': 'lt', 'hungarian': 'hu', 'dutch': 'nl', 'norwegian': 'no',
'polish': 'pl', 'portuguese brasil': 'pt-419', 'portuguese portugal': 'pt-150', 'romanian': 'ro', 'slovak': 'sk',
'slovenian': 'sl', 'swedish': 'sv', 'vietnamese': 'vi', 'turkish': 'tr', 'greek': 'el', 'bulgarian': 'bg',
'russian': 'ru', 'serbian': 'sr', 'ukrainian': 'uk', 'hebrew': 'he', 'arabic': 'ar', 'marathi': 'mr', 'hindi': 'hi',
'bengali': 'bn', 'tamil': 'ta', 'telugu': 'te', 'malyalam': 'ml', 'thai': 'th', 'chinese simplified': 'zh-Hans',
'chinese traditional': 'zh-Hant', 'japanese': 'ja', 'korean': 'ko'}
```### Article Properties
- Get news returns the list with following keys: `title`, `published_date`, `description`, `url`, `publisher`.
| Properties | Description | Example |
|--------------|------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| title | Title of the article | IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility |
| url | Google news link to article | [Article Link](http://news.google.com/news/url?sa=t&fd=R&ct2=us&usg=AFQjCNGNR4Qg8LGbjszT1yt2s2lMXvvufQ&clid=c3a7d30bb8a4878e06b80cf16b898331&cid=52779522121279&ei=VQU7WYjiFoLEhQHIs4HQCQ&url=https://www.theguardian.com/commentisfree/2017/jun/07/why-dont-unicorns-exist-google) |
| published date | Published date | Wed, 07 Jun 2017 07:01:30 GMT |
| description | Short description of article | IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility ... |
| publisher | Publisher of article | The Guardian | |## Getting full article
* To read a full article you can either:
* Navigate to the url directly in your browser, or
* Use `newspaper3k` library to scrape the article
* The article url, needed for both methods, is accessed as `article['url']`.#### Using newspaper3k
1. Install the library - `pip3 install newspaper3k`.
2. Use `get_full_article` method from `GNews`, that creates an `newspaper.article.Article` object from the url.```python
from gnews import GNewsgoogle_news = GNews()
json_resp = google_news.get_news('Pakistan')
article = google_news.get_full_article(
json_resp[0]['url']) # newspaper3k instance, you can access newspaper3k all attributes in article
```This new object contains `title`, `text` (full article) or `images` attributes. Examples:
```python
article.title
```> IMF Staff and Pakistan Reach Staff-Level Agreement on the Pending Reviews Under the Extended Fund Facility'
```python
article.text
```> End-of-Mission press releases include statements of IMF staff teams that convey preliminary findings after a mission. The views expressed are those of the IMF staff and do not necessarily represent the views of the IMFβs Executive Board.\n\nIMF staff and the Pakistani authorities have reached an agreement on a package of measures to complete second to fifth reviews of the authoritiesβ reform program supported by the IMF Extended Fund Facility (EFF) ..... (full article)
```python
article.images
```> `{'https://www.imf.org/~/media/Images/IMF/Live-Page/imf-live-rgb-h.ashx?la=en', 'https://www.imf.org/-/media/Images/IMF/Data/imf-logo-eng-sep2019-update.ashx', 'https://www.imf.org/-/media/Images/IMF/Data/imf-seal-shadow-sep2019-update.ashx', 'https://www.imf.org/-/media/Images/IMF/Social/TW-Thumb/twitter-seal.ashx', 'https://www.imf.org/assets/imf/images/footer/IMF_seal.png'}
````python
article.authors
```> `[]`
Read full documentation for `newspaper3k`
[newspaper3k](https://newspaper.readthedocs.io/en/latest/user_guide/quickstart.html#parsing-an-article)## Todo
- Save to MongoDB
- Save to SQLite
- Save to JSON
- Save to .CSV file
- More than 100 articles## Roadmap
See the [open issues](https://github.com/ranahaani/GNews/issues) for a list of proposed features (and known issues).
## Contributing
Contributions are what make the open source community such an amazing place to be learn, inspire, and create. Any
contributions you make are **greatly appreciated**.1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request## License
Distributed under the MIT License. See `LICENSE` for more information.
## Contact
Muhammad Abdullah - [@ranahaani](https://twitter.com/ranahaani) - [email protected]
Project Link: [https://github.com/ranahaani/GNews](https://github.com/ranahaani/GNews)
[!["Buy Me A Coffee"](https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png)](https://www.buymeacoffee.com/ranahaani)
[contributors-shield]: https://img.shields.io/github/contributors/ranahaani/GNews.svg?style=for-the-badge
[contributors-url]: https://github.com/ranahaani/GNews/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/ranahaani/GNews.svg?style=for-the-badge
[forks-url]: https://github.com/ranahaani/GNews/network/members
[stars-shield]: https://img.shields.io/github/stars/ranahaani/GNews.svg?style=for-the-badge
[stars-url]: https://github.com/ranahaani/GNews/stargazers
[issues-shield]: https://img.shields.io/github/issues/ranahaani/GNews.svg?style=for-the-badge
[issues-url]: https://github.com/ranahaani/GNews/issues
[license-shield]: https://img.shields.io/github/license/ranahaani/GNews.svg?style=for-the-badge
[license-url]: https://github.com/ranahaani/GNews/blob/master/LICENSE.txt
[download-sheild]: https://img.shields.io/pypi/dm/GNews.svg?style=for-the-badge
[download-url]: https://pypistats.org/packages/gnews
[linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
[linkedin-url]: https://linkedin.com/in/ranahaani
[demo-gif]: https://github.com/ranahaani/GNews/raw/master/imgs/gnews.gif