{"id":15934614,"url":"https://github.com/datitran/pyspark-app-cf","last_synced_at":"2025-08-15T19:11:18.092Z","repository":{"id":107998016,"uuid":"80026545","full_name":"datitran/PySpark-App-CF","owner":"datitran","description":null,"archived":false,"fork":false,"pushed_at":"2017-02-06T14:23:26.000Z","size":128,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-10-29T08:04:39.440Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datitran.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-01-25T15:20:05.000Z","updated_at":"2020-03-24T11:54:30.000Z","dependencies_parsed_at":null,"dependency_job_id":"4667ba24-2eb6-4ca8-b12c-50270850524b","html_url":"https://github.com/datitran/PySpark-App-CF","commit_stats":{"total_commits":30,"total_committers":1,"mean_commits":30.0,"dds":0.0,"last_synced_commit":"9270b5974c0a1b5d2a22a5fb08e935e0644789f3"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2FPySpark-App-CF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2FPySpark-App-CF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2FPySpark-App-CF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datitran%2FPySpark-App-CF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datitran","download_url":"https://codeload.github.com/datitran/PySpark-App-CF/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247024152,"owners_count":20870940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-07T03:20:20.848Z","updated_at":"2025-04-03T15:16:40.171Z","avatar_url":"https://github.com/datitran.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PySpark-App-CF\n\nA simple example which uses the [PySpark buildpack](https://github.com/andreasf/pyspark-buildpack) to deploy an [Apache Spark](http://spark.apache.org/) application, particularly using its Python API, on Cloud Foundry.\n\n### Getting Started\n\n- Use `cf push` to deploy the application\n\n### Testing\nA local `apache-spark` instance and `nosetests` need to be installed before running the tests. For more information see [here](https://github.com/datitran/spark-tdd-example).\n\n- Run tests with: `nosetests -vs tests/`\n\n### CI/CD\n[Concourse](https://concourse.ci/) is used as our CI tool due to its seamless integration with Cloud Foundry. The fastest way to use Concourse is with [Vagrant](https://www.vagrantup.com/):\n \n- Install Vagrant and run `vagrant init concourse/lite \u0026\u0026 vagrant up`\n- Connect to the CI: `fly -t pyspark-app-cf login -c http://192.168.100.4:8080`\n- Fill in the credential details in `credentials.yml.example` and rename the file to `credentials.yml`\n- Register the pipeline: `fly -t pyspark-app-cf set-pipeline -p pyspark-app-ci -c pipeline.yml -l credentials.yml`\n- Unpause the pipeline: `fly -t pyspark-app-cf unpause-pipeline -p pyspark-app-ci`\n\n## Dependencies\n- [Apache Spark 2.1.0](http://spark.apache.org/)\n- OpenJDK 1.8.0_91\n- [Anaconda](https://www.continuum.io/downloads) Python 3.5.0\n- Python conda environment (install with `conda env create --file environment.yml`)\n\n## Copyright\n\nSee [LICENSE](LICENSE) for details.\nCopyright (c) 2017 [Dat Tran](http://www.dat-tran.com/), [Andreas Fleig](https://github.com/andreasf).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatitran%2Fpyspark-app-cf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatitran%2Fpyspark-app-cf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatitran%2Fpyspark-app-cf/lists"}