Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pdehaan/alexa-top-sites
Scrape Alexa 'Top' pages for some popular sites for URL testing.
- Host: GitHub
- URL: https://github.com/pdehaan/alexa-top-sites
- Owner: pdehaan
- Created: 2016-09-07T00:43:59.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-08-03T18:51:53.000Z (over 7 years ago)
- Last Synced: 2024-09-25T09:25:41.267Z (3 months ago)
- Language: JavaScript
- Homepage: http://npm.im/alexa-top-sites
- Size: 7.81 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# alexa-top-sites
Scrape Alexa "Top" pages for some popular sites for URL testing.
## Why?
Because you want to test some top sites, and don't want to go through an API because you only need a few URLs.
**NOTE:** Each call will probably return at most 25 results, which seems to be the default page size of Alexa pages.
I wouldn't recommend using this garbage API for anything serious (like production sites) since it could easily break if Alexa ever changes their HTML/CSS. You're much better off using their APIs directly. I quickly built this :poop: for testing.
## Installation:
```sh
$ npm i alexa-top-sites -S
```
## Usage:
### By Categories:
Note that the category name is case-sensitive. You should probably browse the Alexa site, find the URL of the category you want (for example, http://www.alexa.com/topsites/category/Top/Sports), and use the path after `Top/` as the category name.
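To illustrate why the casing matters, here is a minimal sketch of how a category name might map onto an Alexa URL, based only on the URL pattern shown in this README. `categoryUrl` and `CATEGORY_BASE` are hypothetical names used for illustration; they are not part of the `alexa-top-sites` API, and the library's actual URL construction may differ.

```js
// Hypothetical helper for illustration; not part of alexa-top-sites.
// Base path taken from the example category URL in this README.
const CATEGORY_BASE = 'http://www.alexa.com/topsites/category/Top/';

// Join a category path such as 'Computers/Internet' onto the base URL,
// escaping each segment but keeping the '/' separators and the casing.
function categoryUrl(category) {
  const path = category
    .split('/')
    .map((segment) => encodeURIComponent(segment))
    .join('/');
  return CATEGORY_BASE + path;
}

console.log(categoryUrl('Sports'));
console.log(categoryUrl('Computers/Internet'));
// 'sports' yields a different URL than 'Sports'; nothing normalizes the casing.
console.log(categoryUrl('sports') === categoryUrl('Sports')); // false
```

Under this assumption, a wrongly cased name simply points at a different page, which is why copying the name from a URL you have already visited is the safer route.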
```js
const { byCategory } = require('alexa-top-sites');

byCategory('Sports')
  .then((res) => console.log(JSON.stringify(res, null, 2)))
  .catch((err) => console.error(err.message));
```
Output:
```json
{
  "category": "Sports",
  "url": "http://www.alexa.com/topsites/category/Top/Sports",
  "sites": [
    "http://sports.yahoo.com",
    "http://www.nbcolympics.com/",
    "http://www.espncricinfo.com/",
    "http://www.goal.com/",
    "http://www.nfl.com/",
    "http://www.cbssports.com/",
    "http://bleacherreport.com",
    "https://www.premierleague.com/",
    "http://www.espn.com/",
    "http://football.fantasysports.yahoo.com",
    "http://www.livescore.com/",
    "http://www.skysports.com/",
    "http://www.cricbuzz.com/",
    "http://deadspin.com",
    "https://www.strava.com/",
    "http://mlb.mlb.com/home",
    "http://www.nbcsports.com/",
    "http://www.bbc.com/sport/olympics",
    "http://www.sbnation.com/",
    "http://www.foxsports.com/",
    "https://www.rei.com/",
    "http://www.skysports.com/football",
    "http://baseball.fantasysports.yahoo.com",
    "http://www.flashscore.com/",
    "http://www.si.com/"
  ]
}
```
### By Countries:
```js
const { byCountry } = require('alexa-top-sites');

byCountry('CA') // Canada
  .then((res) => console.log(JSON.stringify(res, null, 2)))
  .catch((err) => console.error(err));
```
Output:
```json
{
  "country": "CA",
  "url": "http://www.alexa.com/topsites/countries/CA",
  "sites": [
    "http://google.ca",
    "http://youtube.com",
    "http://facebook.com",
    "http://google.com",
    "http://yahoo.com",
    "http://live.com",
    "http://msn.com",
    "http://wikipedia.org",
    "http://amazon.ca",
    "http://kijiji.ca",
    "http://bing.com",
    "http://twitter.com",
    "http://reddit.com",
    "http://netflix.com",
    "http://cbc.ca",
    "http://amazon.com",
    "http://linkedin.com",
    "http://royalbank.com",
    "http://instagram.com",
    "http://diply.com",
    "http://td.com",
    "http://pinterest.com",
    "http://imgur.com",
    "http://ebay.ca",
    "http://tumblr.com"
  ]
}
```
### Global:
```js
const alexa = require('alexa-top-sites');

alexa.global()
  .then((res) => console.log(JSON.stringify(res, null, 2)))
  .catch((err) => console.error(err));
```
Output:
```json
{
  "url": "http://www.alexa.com/topsites",
  "sites": [
    "http://google.com",
    "http://youtube.com",
    "http://facebook.com",
    "http://baidu.com",
    "http://yahoo.com",
    "http://amazon.com",
    "http://wikipedia.org",
    "http://qq.com",
    "http://google.co.in",
    "http://twitter.com",
    "http://live.com",
    "http://taobao.com",
    "http://google.co.jp",
    "http://bing.com",
    "http://weibo.com",
    "http://instagram.com",
    "http://sina.com.cn",
    "http://vk.com",
    "http://yahoo.co.jp",
    "http://msn.com",
    "http://linkedin.com",
    "http://yandex.ru",
    "http://google.de",
    "http://hao123.com",
    "http://google.co.uk"
  ]
}
```
### Paged results:
Retrieve the first X pages of results for a category or country:
```js
const { getPages, byCategory } = require('alexa-top-sites');

// Get the first 10 pages (250 results) of http://www.alexa.com/topsites/category/Top/Computers/Internet
getPages(byCategory, 'Computers/Internet', 10)
  .then((res) => console.log(JSON.stringify(res, null, 2)))
  .catch((err) => console.error(err.message));
```
Output:
```js
[
  "http://google.com",
  "https://www.youtube.com/",
  "https://www.facebook.com/",
  "https://mail.google.com/",
  "http://yahoo.com",
  "https://twitter.com/",
  "http://mail.yahoo.com",
  "https://www.bing.com/",
  "http://search.yahoo.com",
  "https://www.linkedin.com/",
  "http://msn.com",
  "https://www.pinterest.com/",
  "http://wordpress.com",
  "http://tumblr.com",
  "http://imgur.com",
  ...
]
```
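For URL testing, it can help to collapse these lists to one entry per site, since the outputs above mix bare hosts, trailing slashes, and deep paths (the Sports list contains both `http://www.skysports.com/` and `http://www.skysports.com/football`). Here is a minimal sketch, assuming Node's built-in `URL` class. `uniqueOrigins` is a hypothetical helper for illustration, not something this package exports.

```js
// Hypothetical helper for illustration; not part of alexa-top-sites.
// Reduce a list of URLs to unique origins (scheme + host), keeping first-seen order.
function uniqueOrigins(urls) {
  const seen = new Set();
  for (const raw of urls) {
    // new URL() throws a TypeError on strings it cannot parse.
    seen.add(new URL(raw).origin);
  }
  return [...seen];
}

// Sample entries copied from the Sports output above.
const sites = [
  'http://sports.yahoo.com',
  'http://www.skysports.com/',
  'http://www.bbc.com/sport/olympics',
  'http://www.skysports.com/football',
];

console.log(uniqueOrigins(sites));
```

With this sample, the two `skysports.com` entries collapse into one origin. Since the scraped HTML could change at any time, wrapping the `new URL()` call in a `try`/`catch` may be worthwhile if malformed entries are a concern.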