{"id":283,"url":"https://github.com/awesome-spark/awesome-spark","last_synced_at":"2025-10-03T01:31:07.485Z","repository":{"id":43309084,"uuid":"50860011","full_name":"awesome-spark/awesome-spark","owner":"awesome-spark","description":"A curated list of awesome Apache Spark packages and resources.","archived":false,"fork":false,"pushed_at":"2024-04-08T14:17:29.000Z","size":214,"stargazers_count":1629,"open_issues_count":17,"forks_count":323,"subscribers_count":83,"default_branch":"main","last_synced_at":"2024-05-20T05:08:41.315Z","etag":null,"topics":["apache-spark","awesome","pyspark","sparkr"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc0-1.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/awesome-spark.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"contributing.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2016-02-01T18:15:42.000Z","updated_at":"2024-05-20T03:30:14.000Z","dependencies_parsed_at":"2024-04-24T04:48:01.855Z","dependency_job_id":"22870cad-0025-4553-bef8-90a72bfce2d9","html_url":"https://github.com/awesome-spark/awesome-spark","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awesome-spark%2Fawesome-spark","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awesome-spark%2Fawesome-spark/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awesome-spark%2Fawesome-spark/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/awesome-spark%2Fawesome-spark/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/awesome-spark","download_url":"https://codeload.github.com/awesome-spark/awesome-spark/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234931017,"owners_count":18909038,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-spark","awesome","pyspark","sparkr"],"created_at":"2024-01-05T20:12:50.929Z","updated_at":"2025-10-03T01:31:02.448Z","avatar_url":"https://github.com/awesome-spark.png","language":"Shell","funding_links":[],"categories":["Data Engineering","Big Data","Technical","Uncategorized","Related resources","Shell","Technology","Live Site:   [searchAwesome](https://search-awesome.vercel.app/)","大数据","Other Lists","What to hire for:","Themed Directories","awesome-repos"],"sub_categories":["awesome-*","Uncategorized","Domain specific formats","Apache Spark","TeX Lists","Updated in the last 6 months"],"readme":"[\u003cimg src=\"https://cdn.rawgit.com/awesome-spark/awesome-spark/f78a16db/spark-logo-trademark.svg\" align=\"right\"\u003e](https://spark.apache.org/)\n\n# Awesome Spark [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)\n\nA curated list of awesome [Apache Spark](https://spark.apache.org/) packages and resources.\n\n_Apache Spark is an open-source cluster-computing framework. 
Originally developed at the [University of California](https://www.universityofcalifornia.edu/), [Berkeley's AMPLab](https://amplab.cs.berkeley.edu/), the Spark codebase was later donated to the [Apache Software Foundation](https://www.apache.org/), which has maintained it since. Spark provides an interface for programming entire clusters with implicit data parallelism and fault-tolerance_  ([Wikipedia 2017](#wikipedia-2017)).\n\nUsers of Apache Spark may choose between the Python, R, Scala, and Java programming languages to interface with the Apache Spark APIs.\n\n## Packages\n\n### Language Bindings\n\n* [Kotlin for Apache Spark](https://github.com/Kotlin/kotlin-spark-api) \u003cimg src=\"https://img.shields.io/github/last-commit/Kotlin/kotlin-spark-api.svg\"\u003e - Kotlin API bindings and extensions.\n* [.NET for Apache Spark](https://github.com/dotnet/spark) \u003cimg src=\"https://img.shields.io/github/last-commit/dotnet/spark.svg\"\u003e - .NET bindings.\n* [sparklyr](https://github.com/rstudio/sparklyr) \u003cimg src=\"https://img.shields.io/github/last-commit/rstudio/sparklyr.svg\"\u003e - An alternative R backend, using [`dplyr`](https://github.com/hadley/dplyr).\n* [sparkle](https://github.com/tweag/sparkle) \u003cimg src=\"https://img.shields.io/github/last-commit/tweag/sparkle.svg\"\u003e - Haskell on Apache Spark.\n* [spark-connect-rs](https://github.com/sjrusso8/spark-connect-rs) \u003cimg src=\"https://img.shields.io/github/last-commit/sjrusso8/spark-connect-rs.svg\"\u003e - Rust bindings.\n* [spark-connect-go](https://github.com/apache/spark-connect-go) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/spark-connect-go.svg\"\u003e - Golang bindings.\n* [spark-connect-csharp](https://github.com/mdrakiburrahman/spark-connect-csharp) \u003cimg src=\"https://img.shields.io/github/last-commit/mdrakiburrahman/spark-connect-csharp.svg\"\u003e - C# bindings.\n\n### Notebooks and IDEs\n* [almond](https://almond.sh/) \u003cimg 
src=\"https://img.shields.io/github/last-commit/almond-sh/almond.svg\"\u003e - A Scala kernel for [Jupyter](https://jupyter.org/).\n* [Apache Zeppelin](https://zeppelin.incubator.apache.org/) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/zeppelin.svg\"\u003e - Web-based notebook that enables interactive data analytics with pluggable backends, integrated plotting, and extensive Spark support out of the box.\n* [Polynote](https://polynote.org/)  \u003cimg src=\"https://img.shields.io/github/last-commit/polynote/polynote.svg\"\u003e - An IDE-inspired polyglot notebook. It supports mixing multiple languages in one notebook and sharing data between them seamlessly. It encourages reproducible notebooks with its immutable data model. Originated at [Netflix](https://medium.com/netflix-techblog/open-sourcing-polynote-an-ide-inspired-polyglot-notebook-7f929d3f447).\n* [sparkmagic](https://github.com/jupyter-incubator/sparkmagic) \u003cimg src=\"https://img.shields.io/github/last-commit/jupyter-incubator/sparkmagic.svg\"\u003e - [Jupyter](https://jupyter.org/) magics and kernels for interactively working with remote Spark clusters through [Livy](https://github.com/cloudera/livy) in Jupyter notebooks.\n\n### General Purpose Libraries\n\n* [itachi](https://github.com/yaooqinn/itachi) \u003cimg src=\"https://img.shields.io/github/last-commit/yaooqinn/itachi.svg\"\u003e - A library that brings useful functions from modern database management systems to Apache Spark.\n* [spark-daria](https://github.com/mrpowers-io/spark-daria) \u003cimg src=\"https://img.shields.io/github/last-commit/mrpowers-io/spark-daria.svg\"\u003e - A Scala library with essential Spark functions and extensions to make you more productive.\n* [quinn](https://github.com/mrpowers-io/quinn) \u003cimg src=\"https://img.shields.io/github/last-commit/mrpowers-io/quinn.svg\"\u003e - A native PySpark implementation of spark-daria.\n* [Apache 
DataFu](https://github.com/apache/datafu/tree/master/datafu-spark) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/datafu.svg\"\u003e - A library of general-purpose functions and UDFs.\n* [Joblib Apache Spark Backend](https://github.com/joblib/joblib-spark) \u003cimg src=\"https://img.shields.io/github/last-commit/joblib/joblib-spark.svg\"\u003e - [`joblib`](https://github.com/joblib/joblib) backend for running tasks on Spark clusters.\n\n### SQL Data Sources\n\nSpark SQL has [several built-in Data Sources](https://spark.apache.org/docs/latest/sql-data-sources-load-save-functions.html#manually-specifying-options) for files. These include `csv`, `json`, `parquet`, `orc`, and `avro`. It also supports JDBC databases as well as Apache Hive. Additional data sources can be added by including the packages listed below, or by writing your own.\n\n* [Spark XML](https://github.com/databricks/spark-xml) \u003cimg src=\"https://img.shields.io/github/last-commit/databricks/spark-xml.svg\"\u003e - XML parser and writer.\n* [Spark Cassandra Connector](https://github.com/datastax/spark-cassandra-connector) \u003cimg src=\"https://img.shields.io/github/last-commit/datastax/spark-cassandra-connector.svg\"\u003e - Cassandra support, including a data source, an API, and support for arbitrary queries.\n* [Mongo-Spark](https://github.com/mongodb/mongo-spark) \u003cimg src=\"https://img.shields.io/github/last-commit/mongodb/mongo-spark.svg\"\u003e - Official MongoDB connector.\n\n### Storage\n\n* [Delta Lake](https://github.com/delta-io/delta) \u003cimg src=\"https://img.shields.io/github/last-commit/delta-io/delta.svg\"\u003e - Storage layer with ACID transactions.\n* [Apache Hudi](https://github.com/apache/hudi) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/hudi.svg\"\u003e - Upserts, deletes, and incremental processing on big data.\n* [Apache Iceberg](https://github.com/apache/iceberg) \u003cimg 
src=\"https://img.shields.io/github/last-commit/apache/iceberg.svg\"\u003e - Open table format for large analytic datasets.\n* [lakeFS](https://docs.lakefs.io/integrations/spark.html) \u003cimg src=\"https://img.shields.io/github/last-commit/treeverse/lakefs.svg\"\u003e - Integration with the lakeFS atomic versioned storage layer.\n\n### Bioinformatics\n\n* [ADAM](https://github.com/bigdatagenomics/adam) \u003cimg src=\"https://img.shields.io/github/last-commit/bigdatagenomics/adam.svg\"\u003e - Set of tools designed to analyse genomics data.\n* [Hail](https://github.com/hail-is/hail) \u003cimg src=\"https://img.shields.io/github/last-commit/hail-is/hail.svg\"\u003e - Genetic analysis framework.\n\n### GIS\n\n* [Apache Sedona](https://github.com/apache/incubator-sedona) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/incubator-sedona.svg\"\u003e - Cluster computing system for processing large-scale spatial data.\n\n### Graph Processing\n\n* [GraphFrames](https://github.com/graphframes/graphframes) \u003cimg src=\"https://img.shields.io/github/last-commit/graphframes/graphframes.svg\"\u003e - DataFrame-based graph API.\n* [neo4j-spark-connector](https://github.com/neo4j-contrib/neo4j-spark-connector) \u003cimg src=\"https://img.shields.io/github/last-commit/neo4j-contrib/neo4j-spark-connector.svg\"\u003e - Bolt protocol-based Neo4j connector with RDD, DataFrame, and GraphX / GraphFrames support.\n\n### Machine Learning Extension\n\n* [Apache SystemML](https://systemml.apache.org/) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/systemml.svg\"\u003e - Declarative machine learning framework on top of Spark.\n* [Mahout Spark Bindings](https://mahout.apache.org/users/sparkbindings/home.html) \\[status unknown\\] - Linear algebra DSL and optimizer with R-like syntax.\n* [KeystoneML](http://keystone-ml.org/) - Type-safe machine learning pipelines with RDDs.\n* [JPMML-Spark](https://github.com/jpmml/jpmml-spark) \u003cimg 
src=\"https://img.shields.io/github/last-commit/jpmml/jpmml-spark.svg\"\u003e - PMML transformer library for Spark ML.\n* [ModelDB](https://mitdbg.github.io/modeldb) \u003cimg src=\"https://img.shields.io/github/last-commit/mitdbg/modeldb.svg\"\u003e - A system to manage machine learning models for `spark.ml` and [`scikit-learn`](https://github.com/scikit-learn/scikit-learn) \u003cimg src=\"https://img.shields.io/github/last-commit/scikit-learn/scikit-learn.svg\"\u003e.\n* [Sparkling Water](https://github.com/h2oai/sparkling-water) \u003cimg src=\"https://img.shields.io/github/last-commit/h2oai/sparkling-water.svg\"\u003e -  [H2O](http://www.h2o.ai/) interoperability layer.\n* [BigDL](https://github.com/intel-analytics/BigDL) \u003cimg src=\"https://img.shields.io/github/last-commit/intel-analytics/BigDL.svg\"\u003e - Distributed Deep Learning library.\n* [MLeap](https://github.com/combust/mleap) \u003cimg src=\"https://img.shields.io/github/last-commit/combust/mleap.svg\"\u003e - Execution engine and serialization format which supports deployment of `o.a.s.ml` models without dependency on `SparkSession`.\n* [Microsoft ML for Apache Spark](https://github.com/Azure/mmlspark) \u003cimg src=\"https://img.shields.io/github/last-commit/Azure/mmlspark.svg\"\u003e - A distributed ml library with support for LightGBM, Vowpal Wabbit, OpenCV, Deep Learning, Cognitive Services, and Model Deployment.\n* [MLflow](https://mlflow.org/docs/latest/python_api/mlflow.spark.html#module-mlflow.spark) \u003cimg src=\"https://img.shields.io/github/last-commit/mlflow/mlflow.svg\"\u003e - Machine learning orchestration platform. 
\n\n### Middleware\n\n* [Livy](https://github.com/apache/incubator-livy) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/incubator-livy.svg\"\u003e - REST server with extensive language support (Python, R, Scala), interactive sessions, and object sharing.\n* [spark-jobserver](https://github.com/spark-jobserver/spark-jobserver) \u003cimg src=\"https://img.shields.io/github/last-commit/spark-jobserver/spark-jobserver.svg\"\u003e - Simple Spark-as-a-Service that supports object sharing using so-called named objects. JVM only.\n* [Apache Toree](https://github.com/apache/incubator-toree) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/incubator-toree.svg\"\u003e - IPython protocol-based middleware for interactive applications.\n* [Apache Kyuubi](https://github.com/apache/kyuubi) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/kyuubi.svg\"\u003e - A distributed multi-tenant JDBC server for large-scale data processing and analytics, built on top of Apache Spark.\n\n### Monitoring\n\n* [Data Mechanics Delight](https://github.com/datamechanics/delight) \u003cimg src=\"https://img.shields.io/github/last-commit/datamechanics/delight.svg\"\u003e - Cross-platform monitoring tool (Spark UI / Spark History Server replacement).\n\n### Utilities\n\n* [sparkly](https://github.com/Tubular/sparkly) \u003cimg src=\"https://img.shields.io/github/last-commit/Tubular/sparkly.svg\"\u003e - Helpers \u0026 syntactic sugar for PySpark.\n* [Flintrock](https://github.com/nchammas/flintrock) \u003cimg src=\"https://img.shields.io/github/last-commit/nchammas/flintrock.svg\"\u003e - A command-line tool for launching Spark clusters on EC2.\n* [Optimus](https://github.com/ironmussa/Optimus/) \u003cimg src=\"https://img.shields.io/github/last-commit/ironmussa/Optimus.svg\"\u003e - Data cleansing and exploration utilities with the goal of simplifying data cleaning.\n\n### Natural Language Processing\n\n* 
[spark-nlp](https://github.com/JohnSnowLabs/spark-nlp) \u003cimg src=\"https://img.shields.io/github/last-commit/JohnSnowLabs/spark-nlp.svg\"\u003e - Natural language processing library built on top of Apache Spark ML.\n\n### Streaming\n\n* [Apache Bahir](https://bahir.apache.org/) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/bahir.svg\"\u003e - Collection of streaming connectors excluded from Spark 2.0 (Akka, MQTT, Twitter, ZeroMQ).\n\n### Interfaces\n\n* [Apache Beam](https://beam.apache.org/) \u003cimg src=\"https://img.shields.io/github/last-commit/apache/beam.svg\"\u003e - Unified data processing engine supporting both batch and streaming applications. Apache Spark is one of the supported execution environments.\n* [Koalas](https://github.com/databricks/koalas) \u003cimg src=\"https://img.shields.io/github/last-commit/databricks/koalas.svg\"\u003e - Pandas DataFrame API on top of Apache Spark.\n\n### Data quality\n\n* [deequ](https://github.com/awslabs/deequ) \u003cimg src=\"https://img.shields.io/github/last-commit/awslabs/deequ.svg\"\u003e - A library built on top of Apache Spark for defining \"unit tests for data\", which measure data quality in large datasets.\n* [python-deequ](https://github.com/awslabs/python-deequ) \u003cimg src=\"https://img.shields.io/github/last-commit/awslabs/python-deequ.svg\"\u003e - Python API for Deequ.\n\n### Testing\n\n* [spark-testing-base](https://github.com/holdenk/spark-testing-base) \u003cimg src=\"https://img.shields.io/github/last-commit/holdenk/spark-testing-base.svg\"\u003e - Collection of base test classes.\n* [spark-fast-tests](https://github.com/mrpowers-io/spark-fast-tests) \u003cimg src=\"https://img.shields.io/github/last-commit/mrpowers-io/spark-fast-tests.svg\"\u003e - A lightweight and fast testing framework.\n* [chispa](https://github.com/MrPowers/chispa) \u003cimg src=\"https://img.shields.io/github/last-commit/MrPowers/chispa.svg\"\u003e - PySpark test helpers with 
beautiful error messages.\n\n### Web Archives\n\n* [Archives Unleashed Toolkit](https://github.com/archivesunleashed/aut) \u003cimg src=\"https://img.shields.io/github/last-commit/archivesunleashed/aut.svg\"\u003e -  Open-source toolkit for analyzing web archives.\n\n### Workflow Management\n\n* [Cromwell](https://github.com/broadinstitute/cromwell#spark-backend) \u003cimg src=\"https://img.shields.io/github/last-commit/broadinstitute/cromwell.svg\"\u003e - Workflow management system with [Spark backend](https://github.com/broadinstitute/cromwell#spark-backend).\n\n## Resources\n\n### Books\n\n* [Learning Spark, 2nd Edition](https://www.oreilly.com/library/view/learning-spark-2nd/9781492050032/) - Introduction to the Spark API, covering Spark 3.0. Good source of knowledge about basic concepts.\n* [Advanced Analytics with Spark](http://shop.oreilly.com/product/0636920035091.do) - Useful collection of Spark processing patterns. Accompanying GitHub repository: [sryza/aas](https://github.com/sryza/aas).\n* [Mastering Apache Spark](https://jaceklaskowski.gitbooks.io/mastering-apache-spark/) - Interesting compilation of notes by [Jacek Laskowski](https://github.com/jaceklaskowski). Focused on different aspects of Spark internals.\n* [Spark in Action](https://www.manning.com/books/spark-in-action) - A book in Manning's \"in action\" family with 400+ pages. Starts gently, step by step, and covers a large number of topics. Free excerpt on how to [set up Eclipse for Spark application development](http://freecontent.manning.com/how-to-start-developing-spark-applications-in-eclipse/) and how to bootstrap a new application using the provided Maven Archetype. 
You can find the accompanying GitHub repo [here](https://github.com/spark-in-action/first-edition).\n\n### Papers\n\n* [Large-Scale Intelligent Microservices](https://arxiv.org/pdf/2009.08044.pdf) - Microsoft paper that presents an Apache Spark-based micro-service orchestration framework that extends database operations to include web service primitives.\n* [Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing](https://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf) - Paper introducing a core distributed memory abstraction.\n* [Spark SQL: Relational Data Processing in Spark](https://amplab.cs.berkeley.edu/wp-content/uploads/2015/03/SparkSQLSigmod2015.pdf) - Paper introducing relational underpinnings, code generation and Catalyst optimizer.\n* [Structured Streaming: A Declarative API for Real-Time Applications in Apache Spark](https://cs.stanford.edu/~matei/papers/2018/sigmod_structured_streaming.pdf) - Paper introducing Structured Streaming, a high-level declarative streaming API based on automatically incrementalizing a static relational query.\n\n### MOOCs\n\n* [Data Science and Engineering with Apache Spark (edX XSeries)](https://www.edx.org/xseries/data-science-engineering-apache-spark) - Series of five courses ([Introduction to Apache Spark](https://www.edx.org/course/introduction-apache-spark-uc-berkeleyx-cs105x), [Distributed Machine Learning with Apache Spark](https://www.edx.org/course/distributed-machine-learning-apache-uc-berkeleyx-cs120x), [Big Data Analysis with Apache Spark](https://www.edx.org/course/big-data-analysis-apache-spark-uc-berkeleyx-cs110x), [Advanced Apache Spark for Data Science and Data Engineering](https://www.edx.org/course/advanced-apache-spark-data-science-data-uc-berkeleyx-cs115x), [Advanced Distributed Machine Learning with Apache Spark](https://www.edx.org/course/advanced-distributed-machine-learning-uc-berkeleyx-cs125x)) covering different aspects of software engineering and data 
science. Python-oriented.\n* [Big Data Analysis with Scala and Spark (Coursera)](https://www.coursera.org/learn/big-data-analysys) - Scala-oriented introductory course. Part of [Functional Programming in Scala Specialization](https://www.coursera.org/specializations/scala).\n\n### Workshops\n\n* [AMP Camp](http://ampcamp.berkeley.edu) - Periodic training event organized by the [UC Berkeley AMPLab](https://amplab.cs.berkeley.edu/). A source of useful exercises and recorded workshops covering different tools from the [Berkeley Data Analytics Stack](https://amplab.cs.berkeley.edu/software/).\n\n### Projects Using Spark\n\n* [Oryx 2](https://github.com/OryxProject/oryx) - [Lambda architecture](http://lambda-architecture.net/) platform built on Apache Spark and [Apache Kafka](http://kafka.apache.org/) with specialization for real-time, large-scale machine learning.\n* [Photon ML](https://github.com/linkedin/photon-ml) - A machine learning library supporting classical Generalized Mixed Model and Generalized Additive Mixed Effect Model.\n* [PredictionIO](https://prediction.io/) - Machine Learning server for developers and data scientists to build and deploy predictive applications in a fraction of the time.\n* [Crossdata](https://github.com/Stratio/Crossdata) - Data integration platform with extended DataSource API and multi-user environment.\n\n### Docker Images\n\n- [apache/spark](https://hub.docker.com/r/apache/spark) - Official Apache Spark Docker images.\n- [jupyter/docker-stacks/pyspark-notebook](https://github.com/jupyter/docker-stacks/tree/master/pyspark-notebook) - PySpark with Jupyter Notebook and Mesos client.\n- [sequenceiq/docker-spark](https://github.com/sequenceiq/docker-spark) - YARN images from [SequenceIQ](http://www.sequenceiq.com/).\n- [datamechanics/spark](https://hub.docker.com/r/datamechanics/spark) - An easy-to-set-up Docker image for Apache Spark from [Data Mechanics](https://www.datamechanics.co/).\n\n### Miscellaneous\n\n- [Spark with Scala 
Gitter channel](https://gitter.im/spark-scala/Lobby) - \"_A place to discuss and ask questions about using Scala for Spark programming_\" started by [@deanwampler](https://github.com/deanwampler).\n- [Apache Spark User List](http://apache-spark-user-list.1001560.n3.nabble.com/) and [Apache Spark Developers List](http://apache-spark-developers-list.1001551.n3.nabble.com/) - Mailing lists dedicated to usage questions and development topics respectively.\n\n## References\n\n\u003cp id=\"wikipedia-2017\"\u003eWikipedia. 2017. “Apache Spark — Wikipedia, the Free Encyclopedia.” \u003ca href=\"https://en.wikipedia.org/w/index.php?title=Apache_Spark\u0026amp;oldid=781182753\" class=\"uri\"\u003ehttps://en.wikipedia.org/w/index.php?title=Apache_Spark\u0026amp;oldid=781182753\u003c/a\u003e.\u003c/p\u003e\n\n## License\n\n\u003cp xmlns:dct=\"http://purl.org/dc/terms/\"\u003e\n\u003ca rel=\"license\" href=\"http://creativecommons.org/publicdomain/mark/1.0/\"\u003e\n\u003cimg src=\"https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/publicdomain.svg\"\n     style=\"border-style: none;\" alt=\"Public Domain Mark\" /\u003e\n\u003c/a\u003e\n\u003cbr /\u003e\nThis work (\u003cspan property=\"dct:title\"\u003eAwesome Spark\u003c/span\u003e, by \u003ca href=\"https://github.com/awesome-spark/awesome-spark\" rel=\"dct:creator\"\u003ehttps://github.com/awesome-spark/awesome-spark\u003c/a\u003e), identified by \u003ca href=\"https://github.com/zero323\" rel=\"dct:publisher\"\u003e\u003cspan property=\"dct:title\"\u003eMaciej Szymkiewicz\u003c/span\u003e\u003c/a\u003e, is free of known copyright restrictions.\n\u003c/p\u003e\n\nApache Spark, Spark, Apache, and the Spark logo are \u003ca href=\"https://www.apache.org/foundation/marks/\"\u003etrademarks\u003c/a\u003e of\n  \u003ca href=\"http://www.apache.org\"\u003eThe Apache Software Foundation\u003c/a\u003e. 
This compilation is not endorsed by The Apache Software Foundation.\n\nInspired by [sindresorhus/awesome](https://github.com/sindresorhus/awesome).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawesome-spark%2Fawesome-spark","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fawesome-spark%2Fawesome-spark","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fawesome-spark%2Fawesome-spark/lists"}