{"id":18400704,"url":"https://github.com/databricks/spark-integration-tests","last_synced_at":"2025-04-07T06:33:42.938Z","repository":{"id":21093899,"uuid":"24393917","full_name":"databricks/spark-integration-tests","owner":"databricks","description":"Integration tests for Spark","archived":false,"fork":false,"pushed_at":"2023-05-20T21:05:07.000Z","size":379,"stargazers_count":68,"open_issues_count":8,"forks_count":23,"subscribers_count":355,"default_branch":"master","last_synced_at":"2025-04-03T00:59:02.625Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/databricks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-09-23T23:58:51.000Z","updated_at":"2024-11-20T10:54:25.000Z","dependencies_parsed_at":"2022-09-02T15:31:45.495Z","dependency_job_id":null,"html_url":"https://github.com/databricks/spark-integration-tests","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fspark-integration-tests","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fspark-integration-tests/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fspark-integration-tests/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/databricks%2Fspark-integration-tests/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/databricks","download_url":"https://codeload.github.com/databricks/spark-integration-te
sts/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247607782,"owners_count":20965945,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T02:36:13.076Z","updated_at":"2025-04-07T06:33:39.545Z","avatar_url":"https://github.com/databricks.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Spark Integration Tests\n\nThis project contains [Docker](http://docker.com)-based integration tests for Spark, including fault-tolerance tests for Spark's standalone cluster manager.\n\n## Installation / Setup\n\n### Install Docker\n\nThis project depends on Docker \u003e= 1.3.0 (it may work with earlier versions, but this hasn't been tested).\n\n#### On Linux\n\nInstall Docker.  This test suite requires that Docker can run without `sudo` (see http://docs.docker.io/en/latest/use/basics/).\n\n#### On OSX\n\nOn OSX, these integration tests can be run using [boot2docker](https://github.com/boot2docker/boot2docker).\nFirst, [download `boot2docker`](https://github.com/boot2docker/osx-installer/releases/tag/v1.3.2), run the installer, then run `~/Applications/boot2docker` to perform some one-time setup (create the VM, etc.).  This project has been tested with `boot2docker` 1.3.0+.\n\nWith `boot2docker`, the Docker containers will be run inside of a VirtualBox VM, which creates some difficulties for communication between the Mac host and the containers.  
Follow these instructions to work around those issues:\n   \n- **Network access**:  Our tests currently run the SparkContext from outside of the containers, so we need both host \u003c-\u003e container and container \u003c-\u003e container networking to work properly.  This is complicated by the fact that `boot2docker` runs the containers behind a NAT in VirtualBox.\n\n  [One workaround](https://github.com/boot2docker/boot2docker/issues/528) is to add a routing table entry that routes traffic to containers to the VirtualBox VM's IP address:\n  \n  ```\n  sudo route -n add 172.17.0.0/16 `boot2docker ip`    \n  ```\n  \n  You'll have to re-run this command if you restart your computer or assign a new IP to the VirtualBox VM.\n  \n  \n### Install Docker images\n\nThe integration tests depend on several Docker images.  To set them up, run\n\n```\n./docker/build.sh\n```\n\nto build our custom Docker images and download other images from the Docker repositories.  This needs to download a fair amount of stuff, so make sure that you're on a fast internet connection (or be prepared to wait a while).\n\n### Configure your environment\n\n**Quickstart**: Running `./init.sh` will perform environment sanity checking and tell you which shell exports to perform.\n\n**Details**:\n\n- The `SPARK_HOME` environment variable should point to a Spark source checkout where an assembly has been built.  This directory will be shared with Docker containers; Spark workers and masters will use `SPARK_HOME/work` as their work directory.  This effectively treats the host machine's `SPARK_HOME` directory as a directory on a network-mounted filesystem.\n\n  Additionally, the Spark sbt project will be added as a dependency of this sbt project, so the integration test code will be compiled against that Spark version.\n\n\n### Test-specific requirements\n\n#### Mesos\n\nThe Mesos integration tests require `MESOS_NATIVE_LIBRARY` to be set.  
For Mac users, the easiest way to install Mesos is through Homebrew:\n\n```\nbrew install mesos\n```\n\nthen\n\n```\nexport MESOS_NATIVE_LIBRARY=$(brew --repository)/lib/libmesos.dylib\n```\n\nSpark on Mesos requires a Spark binary distribution `.tgz` file.  To build this, run `./make-distribution.sh --tgz` in your Spark checkout.\n\n## Running the tests\n\nThese integration tests are implemented as ScalaTest suites and can be run through sbt.  Note that you will probably need to give sbt extra memory; with newer versions of the sbt launcher script, this can be done with the `-mem` option, e.g.\n\n```\nsbt -mem 2048 test:package \"test-only org.apache.spark.integrationtests.MesosSuite\"\n```\n\n*Note:* Although our Docker-based test suites attempt to clean up the containers that they create, this cleanup may not be performed if the test runner's JVM exits abruptly.  To kill **all** Docker containers (including ones that may not have been launched by our tests), you can run `docker kill $(docker ps -q)`.\n\n## License\n\nThis project is licensed under the Apache 2.0 License. See LICENSE for full license text.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fspark-integration-tests","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatabricks%2Fspark-integration-tests","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatabricks%2Fspark-integration-tests/lists"}