{"id":24946393,"url":"https://github.com/machinezone/sparkml-dag","last_synced_at":"2025-04-10T05:14:43.983Z","repository":{"id":76807846,"uuid":"139511235","full_name":"machinezone/SparkML-DAG","owner":"machinezone","description":null,"archived":false,"fork":false,"pushed_at":"2020-10-13T06:49:51.000Z","size":17,"stargazers_count":4,"open_issues_count":1,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-10T05:14:39.609Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/machinezone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-03T01:09:45.000Z","updated_at":"2019-05-21T01:00:13.000Z","dependencies_parsed_at":"2023-07-07T12:17:02.842Z","dependency_job_id":null,"html_url":"https://github.com/machinezone/SparkML-DAG","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/machinezone%2FSparkML-DAG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/machinezone%2FSparkML-DAG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/machinezone%2FSparkML-DAG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/machinezone%2FSparkML-DAG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/machinezone","download_url":"https://codeload.github.com/machinezone/SparkML-DAG/tar.gz/refs/hea
ds/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161278,"owners_count":21057555,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-02T20:24:20.072Z","updated_at":"2025-04-10T05:14:43.976Z","avatar_url":"https://github.com/machinezone.png","language":"Scala","readme":"\n# SparkML-DAG\n\nImplementation of a DAG Pipeline for SparkML.\n\n# Motivation\n\nThis library extends SparkML to support DAG-based Pipelines.\nThat is, multiple input datasets can be manipulated to create complex models.\nOne such example can be seen in the [test](src/test/scala/org/apache/spark/ml/feature/dag/DAGPipelineTest.scala).\n\n# Development\n\nClone this repository and run `mvn clean test`.\n\nTo build for a custom version of Spark/Scala, run:\n`mvn clean package \\\n-Dscala.major.version=\u003cSCALA_MAJOR\u003e \\\n-Dscala.minor.version=\u003cSCALA_MINOR\u003e \\\n-Dspark.version=\u003cSPARK_VERSION\u003e`\n\ne.g. 
\n```bash\nmvn clean package \\\n-Dscala.major.version=2.11 \\\n-Dscala.minor.version=2.11.8 \\\n-Dspark.version=2.3.0\n```\n\n## Build profiles\n\nAlternatively, one can build against a limited number of pre-defined profiles.\nSee the [pom](pom.xml) for a list of the profiles.\n\nExample builds with profiles:\n\n`mvn clean package -Pspark_2.3,scala_2.11`\n\n`mvn clean package -Pspark_2.0,scala_2.10`\n\n\n# Support\n\nHere is a handy table of supported build version combinations:\n\n| Apache Spark | Scala |\n|:------------:|:-----:|\n| 2.0.x        | 2.10  |\n| 2.0.x        | 2.11  |\n| 2.1.x        | 2.10  |\n| 2.1.x        | 2.11  |\n| 2.2.x        | 2.10  |\n| 2.2.x        | 2.11  |\n| 2.3.x        | 2.11  |\n\n# License\n\nSee the [license](LICENSE) for license information.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmachinezone%2Fsparkml-dag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmachinezone%2Fsparkml-dag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmachinezone%2Fsparkml-dag/lists"}