https://github.com/catalystcode/streaming-bing
A library for reading public web news results from Bing Custom Search using Spark Streaming.
- Host: GitHub
- URL: https://github.com/catalystcode/streaming-bing
- Owner: CatalystCode
- License: apache-2.0
- Created: 2017-05-31T03:45:47.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-06-26T23:40:06.000Z (over 8 years ago)
- Last Synced: 2025-01-22T15:48:00.443Z (11 months ago)
- Language: Scala
- Size: 21.5 KB
- Stars: 1
- Watchers: 14
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# streaming-bing
[Build status](https://travis-ci.org/CatalystCode/streaming-bing)
A library for reading public web news results from [Bing Custom Search](https://customsearch.ai/) using Spark Streaming.

## Usage example ##
Run a demo via:
```sh
# set up all the requisite environment variables
export BING_SEARCH_INSTANCE_ID="..."
export BING_AUTH_TOKEN="..."
# compile scala, run tests, build fat jar
sbt assembly
# run locally
java -cp target/scala-2.11/streaming-bing-assembly-0.0.7.jar BingDemo standalone
# run on spark
spark-submit --class BingDemo --master local[2] target/scala-2.11/streaming-bing-assembly-0.0.7.jar spark
```
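The demo depends on both environment variables above being set. As a minimal sketch (not code from this repository), a pre-flight check could report which of them are missing before anything is started; the `DemoEnv` object and its `missing` helper are hypothetical names introduced here for illustration.

```scala
// Hypothetical helper, not part of streaming-bing: lists required
// environment variables that are unset or blank.
object DemoEnv {
  // The two variable names come from the usage example above.
  val RequiredVars: Seq[String] =
    Seq("BING_SEARCH_INSTANCE_ID", "BING_AUTH_TOKEN")

  // Returns the names of required variables that are absent or empty in `env`.
  def missing(env: Map[String, String]): Seq[String] =
    RequiredVars.filterNot(name => env.get(name).exists(_.trim.nonEmpty))
}
```

Calling `DemoEnv.missing(sys.env)` at startup and exiting when the result is non-empty would surface a configuration mistake earlier than a failed API request would.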
## How does it work? ##
Bing Custom Search does not offer a streaming API for web results, so the library polls the service at a set polling interval. The `BingReceiver` queries the Bing Search API every few seconds and pushes any newly indexed web results into Spark Streaming for further processing.
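The poll-and-forward idea can be sketched without any Spark dependency. The code below is an illustration under stated assumptions, not the actual `BingReceiver` implementation: `WebResult`, `NewResultFilter`, and `PollingSketch` are hypothetical names, and using the result URL to decide what counts as "new" is an assumption.

```scala
import scala.collection.mutable

// Hypothetical result type; the library's real model class may differ.
final case class WebResult(url: String, title: String)

// Remembers URLs already emitted so each poll forwards only unseen results.
// Assumption: the URL is a good enough identity for a web result.
final class NewResultFilter {
  private val seen = mutable.Set.empty[String]

  // Returns, in order, the results whose URL has not been seen before.
  // `seen.add` returns true only when the URL was not yet in the set.
  def newOnly(batch: Seq[WebResult]): Seq[WebResult] =
    batch.filter(r => seen.add(r.url))
}

object PollingSketch {
  // Calls `fetch` `rounds` times, sleeping `intervalMillis` between calls,
  // and passes every previously unseen result to `emit`.
  def poll(
      fetch: () => Seq[WebResult],
      emit: WebResult => Unit,
      intervalMillis: Long,
      rounds: Int
  ): Unit = {
    val filter = new NewResultFilter
    for (i <- 0 until rounds) {
      filter.newOnly(fetch()).foreach(emit)
      if (i < rounds - 1) Thread.sleep(intervalMillis)
    }
  }
}
```

Inside a custom Spark Streaming receiver, `emit` would typically be the receiver's `store` method and the loop would run on a background thread started from `onStart`. The bounded `rounds` parameter only keeps this sketch easy to test; a real receiver would loop until it is stopped.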
## Release process ##
1. Configure your credentials via the `SONATYPE_USER` and `SONATYPE_PASSWORD` environment variables.
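2. Update `version.sbt` with the new version number.
3. Run `sbt 'sonatypeOpen "enter staging description here"'`. The outer quotes keep the description attached to the command when sbt is invoked from a shell.
4. Run `sbt publishSigned`.
5. Run `sbt sonatypeRelease`.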
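
For step 1, a common way to pass environment variables to sbt is a `credentials` setting. The fragment below is a sketch of that pattern, assuming the artifacts are published to `oss.sonatype.org`; it is not copied from this repository's build, which may wire the variables differently.

```scala
// Sketch of an sbt setting (e.g. in build.sbt); the host is an assumption.
// Falls back to empty strings when the variables are unset, so the build
// still loads for tasks that do not publish.
credentials += Credentials(
  "Sonatype Nexus Repository Manager",
  "oss.sonatype.org",
  sys.env.getOrElse("SONATYPE_USER", ""),
  sys.env.getOrElse("SONATYPE_PASSWORD", "")
)
```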