https://github.com/edsu/wplinks
utility to get a list of Wikipedia articles that point at a particular website
- Host: GitHub
- URL: https://github.com/edsu/wplinks
- Owner: edsu
- Created: 2014-01-27T18:09:54.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2014-01-30T21:34:25.000Z (over 12 years ago)
- Last Synced: 2025-05-08T22:15:28.197Z (about 1 year ago)
- Language: Python
- Size: 176 KB
- Stars: 7
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
wplinks
=======
[Build Status](http://travis-ci.org/edsu/wplinks)
wplinks provides a generator function called `extlinks` that lets you iterate
through links from Wikipedia articles to a particular website, or a portion
of a website. It also provides `links`, which lets you iterate through other
Wikipedia URLs that are linked from a given Wikipedia URL.
So, for example, to see which Wikipedia articles point at interviews on
The Paris Review website:
```python
from wplinks import extlinks
for src, target in extlinks('http://www.theparisreview.org/interviews'):
    # src is the Wikipedia article, target is the external URL it links to
    print("%s %s" % (src, target))
```
By default you get links for English Wikipedia, but if you'd like results for
the French Wikipedia instead, use the `lang` parameter:
```python
from wplinks import extlinks
for src, target in extlinks('http://www.theparisreview.org/interviews', lang='fr'):
    print("%s %s" % (src, target))
```
If you'd like to see what other Wikipedia articles a given Wikipedia article
links to, use the `links` function. For example, let's say you want to see what
articles the James Joyce article points to:
```python
from wplinks import links
for url in links('http://en.wikipedia.org/wiki/James_Joyce'):
    print(url)
```
Why?
----
wplinks used to be somewhat involved since it scraped the
[External links search][1] page. It became quite a bit simpler once I
discovered the `exturlusage` [API][2] call. You might want to make this
API call yourself and page through the results, without including wplinks
as a dependency. But I left it here just in case you'd rather not.
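If you do want to go the do-it-yourself route, here is a minimal sketch of paging through `exturlusage` results with only the Python 3 standard library. It is not how wplinks itself is implemented: the names `exturlusage_pages` and `fetch_json` and the User-Agent string are made up for illustration, and it assumes the API's standard `continue` mechanism and that `euquery` takes the search string without the protocol. Unlike `extlinks`, it yields article titles rather than article URLs.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def fetch_json(url, params):
    """GET url with the given query parameters and decode the JSON body."""
    # The User-Agent value is a placeholder; use one that identifies you.
    req = Request(url + "?" + urlencode(params),
                  headers={"User-Agent": "exturlusage-sketch/0.1"})
    with urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def exturlusage_pages(query, lang="en", fetch=fetch_json):
    """Yield (article title, external URL) pairs for links matching query.

    query is assumed to be given without a protocol,
    e.g. 'www.theparisreview.org/interviews'.
    """
    api = "https://%s.wikipedia.org/w/api.php" % lang
    params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": query,
        "eulimit": "500",
        "format": "json",
    }
    while True:
        data = fetch(api, params)
        for hit in data.get("query", {}).get("exturlusage", []):
            yield hit["title"], hit["url"]
        # The API signals more results with a "continue" object whose
        # keys are merged into the next request as-is.
        if "continue" not in data:
            break
        params = dict(params, **data["continue"])
```

Calling `exturlusage_pages('www.theparisreview.org/interviews', lang='fr')` and looping over the result would be the rough equivalent of the `extlinks` examples above. The `fetch` argument exists only so the paging logic can be exercised without network access.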
License
-------
* CC0
[1]: https://en.wikipedia.org/wiki/Special:LinkSearch
[2]: https://en.wikipedia.org/w/api.php