https://github.com/intracer/scalawiki
scalawiki is a MediaWiki client in Scala
- Host: GitHub
- URL: https://github.com/intracer/scalawiki
- Owner: intracer
- License: apache-2.0
- Created: 2015-03-01T18:43:06.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2024-12-29T10:50:42.000Z (4 months ago)
- Last Synced: 2025-01-30T18:03:01.975Z (3 months ago)
- Topics: mediawiki, mediawiki-api, mediawiki-client, scala, wikipedia-api, wikipedia-bot
- Language: Scala
- Size: 6.63 MB
- Stars: 31
- Watchers: 7
- Forks: 12
- Open Issues: 49
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# scalawiki
scalawiki is an experimental MediaWiki client in Scala, in the early stages of development.

[Travis CI](https://travis-ci.com/intracer/scalawiki?branch=master) [AppVeyor](https://ci.appveyor.com/project/intracer/scalawiki/branch/master) [Codecov](http://codecov.io/github/intracer/scalawiki?branch=master) [Gitter](https://gitter.im/intracer/scalawiki?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) [Bintray](https://bintray.com/intracer/maven/scalawiki/_latestVersion)

Why [another client library for MediaWiki](https://www.mediawiki.org/wiki/API:Client_code)?

I did not know of any Java client that supported [generators](https://www.mediawiki.org/wiki/API:Query#Generators) (fetching properties of the articles listed by a list query in a single request). JWBF [only recently](https://github.com/eldur/jwbf/issues/21) gained the ability to query more than one page at a time.

Wikipedia sites are real Big Data, so this is a show stopper. Fetching information about Wiki Loves Monuments uploads one page at a time takes almost a day even for a single country, whereas batched requests can do it in several minutes.

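
To illustrate what a generator query looks like on the wire, here is a minimal sketch that builds the request URL by hand with the standard library only. It does not use scalawiki's own API. The object name `GeneratorQuery`, its helper methods, and the category title in the usage note are assumptions made for this example. The query parameters themselves (`generator=categorymembers`, `gcmtitle`, `gcmtype`, `gcmlimit`, `prop=imageinfo`, `iiprop`) are standard MediaWiki API parameters.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper for illustration; not part of scalawiki.
object GeneratorQuery {

  // Encode one query-string component as UTF-8.
  private def enc(s: String): String =
    URLEncoder.encode(s, StandardCharsets.UTF_8.name())

  // Join the endpoint and the parameters, keeping the parameter order.
  def queryUrl(endpoint: String, params: Seq[(String, String)]): String =
    params
      .map { case (k, v) => s"${enc(k)}=${enc(v)}" }
      .mkString(endpoint + "?", "&", "")

  // generator=categorymembers lists the files in a category, and
  // prop=imageinfo returns properties for every listed file in the same response.
  def categoryImageInfo(category: String, limit: Int): Seq[(String, String)] = Seq(
    "action"    -> "query",
    "format"    -> "json",
    "generator" -> "categorymembers",
    "gcmtitle"  -> category,
    "gcmtype"   -> "file",
    "gcmlimit"  -> limit.toString,
    "prop"      -> "imageinfo",
    "iiprop"    -> "timestamp|user|size"
  )
}
```

Calling `GeneratorQuery.queryUrl("https://commons.wikimedia.org/w/api.php", GeneratorQuery.categoryImageInfo("Category:Example", 500))` yields one URL whose response carries image info for up to 500 files, instead of one request per file.
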
This library uses [Scala Futures](http://docs.scala-lang.org/overviews/core/futures.html) for easy job parallelization.
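
As a rough sketch of how Futures help with batching, the example below splits a list of page titles into batches, runs a fetch function on every batch concurrently, and merges the results. It uses only the Scala standard library. `BatchedFetch`, `fetchAll`, and the stub fetch function are hypothetical names for this illustration and are not scalawiki's API. A real fetch function would issue one HTTP request per batch.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper for illustration; not part of scalawiki.
object BatchedFetch {

  // Split `titles` into batches of `batchSize`, start `fetchBatch` for each
  // batch, and merge the per-batch maps once every Future has completed.
  def fetchAll[A](titles: Seq[String], batchSize: Int)(
      fetchBatch: Seq[String] => Future[Map[String, A]]
  )(implicit ec: ExecutionContext): Future[Map[String, A]] = {
    val batches = titles.grouped(batchSize).toSeq
    Future
      .sequence(batches.map(fetchBatch))
      .map(_.foldLeft(Map.empty[String, A])(_ ++ _))
  }

  def main(args: Array[String]): Unit = {
    import scala.concurrent.ExecutionContext.Implicits.global

    // Stub standing in for a real API call: maps each title to its length.
    val stub: Seq[String] => Future[Map[String, Int]] =
      batch => Future(batch.map(t => t -> t.length).toMap)

    val titles = Seq("Kyiv", "Lviv", "Odesa", "Kharkiv", "Dnipro")
    val result = Await.result(fetchAll[Int](titles, 2)(stub), 10.seconds)
    println(result)
  }
}
```

All batches start at once here, which suits a sketch. Against a live wiki it would be sensible to cap the number of requests in flight.
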
# Goals
* Fully support [MediaWiki API](https://www.mediawiki.org/wiki/API:Main_page)
* Support different backends: MediaWiki API, [XML dumps](https://meta.wikimedia.org/wiki/Data_dumps), [MediaWiki database](https://www.mediawiki.org/wiki/Manual:Database_layout). Support copying data between backends (importing and exporting XML dumps to and from a database, storing data retrieved through the MediaWiki API in XML dumps or a database).
* Good test coverage