{"id":23779515,"url":"https://github.com/saagie/proto-job-integration","last_synced_at":"2026-04-10T11:30:16.787Z","repository":{"id":39940773,"uuid":"254054517","full_name":"saagie/proto-job-integration","owner":"saagie","description":null,"archived":false,"fork":false,"pushed_at":"2022-05-20T21:32:24.000Z","size":92,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-01-01T10:19:01.027Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/saagie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-04-08T10:19:31.000Z","updated_at":"2020-04-08T10:20:37.000Z","dependencies_parsed_at":"2022-08-19T14:31:56.834Z","dependency_job_id":null,"html_url":"https://github.com/saagie/proto-job-integration","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saagie%2Fproto-job-integration","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saagie%2Fproto-job-integration/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saagie%2Fproto-job-integration/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saagie%2Fproto-job-integration/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/saagie","download_url":"https://codeload.github.com/saagie/proto-job-integration/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","rep
ositories_count":239979372,"owners_count":19728507,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-01T10:18:59.841Z","updated_at":"2026-04-10T11:30:16.754Z","avatar_url":"https://github.com/saagie.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# POC - Job management \n\nThis POC consists of a simple library that provides elementary functionality for managing your jobs.    \nManagement is delegated to one of our partners' software products :    \n- Knime    \n- Trifacta    \n- Dataiku    \n- Nifi  \n- AWS Glue\n    \n## Content \n### Modules \nThis project can be logically split into three parts :  \n  \n1) *The business logic* (`domain`) : Defines all the information required to describe what a job is (`Job` = name, project, id, status), and    \nhow we can manage it (`JobManager`).    \n    \n2) *The implementation parts* (`infra.right`) : Contains all API requests and DTOs needed to run against our job managers.    \n    \n3) *The demo apps* (`infra.left`) : Simple apps to exercise all available commands.    \n    \n### Concepts    \n- **Jobs** : A job is a specific action which produces an output result from a given data input.    \n- **Dataset** : A dataset is used to store a specified quantity of data. A dataset is usually created from an import process,    \nor as the result of a given job's execution. 
*(In some apps, jobs are related to datasets as their source.)* \n- **Project** : A project is a group of datasets and jobs, and describes how they're related to one another, in order to produce    \nthe solution to a given problem.    \n    \n### Compatibility \n#### Concepts \nDescribes the mapping between a given app's concepts and ours.    \n    \n|Concept|Dataiku|Trifacta|Knime|Nifi|AWS Glue\n|:-:|:-:|:-:|:-:|:-:|:-:\n|Project|Project| ---| Workflow| ProcessGroup| ---    \n|Dataset|Dataset|WrangledDataset|---|---|---    \n|Job|Job|JobGroup|Job| Processor | Job\n    \n#### Methods \nDescribes which methods are currently available for each app.    \n    \n|Functionality|Dataiku|Trifacta|Knime|Nifi|AWS Glue  \n|:-|:-:|:-:|:-:|:-:|:-:\n|Retrieve all projects | OK | --| OK | OK | --\n|Retrieve all datasets for a given project | OK | OK | -- | -- | --\n|Retrieve all jobs for a given project | OK | OK | OK | OK | OK \n|Retrieve a job with a specific ID | OK | OK | OK | OK | OK \n|Retrieve a job's current status | OK | OK | OK | OK | OK \n|Start a specific job| OK | OK | OK | OK | OK \n|Stop a given job| OK | -- | -- | OK | OK \n|Import a job| -- | -- | OK | OK | -- \n|Export a job| -- | OK | OK | OK | OK \n|Import a project| -- | -- | -- | OK | --\n|Export a project| -- | -- | -- | OK | --\n|Security profile|`basic`|`basic`|`basic`|`none`|`none`*\n\n*Security is based on a specific system, not on an option selected by profile at runtime.\n    \n## Demonstration tools \n\nBy using the appropriate Spring profile, you can select which demonstration tool to use for a quick test of the functionalities :    \n- `demo` : An interactive demo whose commands are described below.    \n- `starter` : An automatic execution of all the library's methods, with a simple display of the results.    \nNote that it requires two additional parameters as environment variables (`PREDEFINED_PROJECT` and `PREDIFINED_JOB`) to function.    
\n    \nTo select your app, add one (and only one) of : `dataiku`, `trifacta` or `knime`.    \n    \nTo set the profiles at launch, use a command like :    \n`java -Dspring.profiles.active=dataiku,demo -jar {YOUR_JAR}` *(In this example, we'd use the `demo` and `dataiku` profiles)*    \n    \n### Interactive demo    \n*In this demo, every job or project will be displayed with a specific number like : `$ID : $OBJECT`.    \nThis number defines a local ID, which the demo tool requires to perform some of the other functionalities.*    \n    \n**At any time**    \n- `projects` : Displays a list of all projects registered on the selected platform. (In Trifacta, as the *'project'* notion doesn't exist, only the value DEFAULT will be displayed.)    \n  \n**After setting the project**    \n- `use $PROJECT_ID` : Sets which project to use for the next commands.    \n- `download $PROJECT_ID` : Retrieves the selected project as a JSON file ready to be imported elsewhere.  \n- `upload $PROJECT_ID` : Creates a new project based on the previously downloaded project. *Only usable after a first download !*  \n- `datasets` :  Displays a list of all datasets.    \n- `jobs` : Displays a list of all jobs and stores their information so they can be manipulated later.     \n    \n**After setting the project and getting all jobs**    \n- `status $JOB_ID` : Gives the current status of the selected job.    \n- `start $JOB_ID` : Starts the specified job.    \n- `stop $JOB_ID` : Stops the given job if it is currently running. *Has no effect if the job is already done.*  \n- `export $JOB_ID` : Retrieves the selected job as a JSON file ready to be imported into another project.  \n- `import $JOB_ID` : Creates a new job based on the previously exported job. 
*Only usable after a first job export !*  \n  \n## Dev' setup **KISS**  \n  \n1) In **IntelliJ**, open `Run` \u003e `Edit configurations`    \n2) Add a new Spring Boot configuration (+), and specify the main class of the app `io.saagie.poc.infra.AppKt`    \n3) *(Optional, if specified at launch)* Change the current app's action by changing the `Active profiles` attribute.    \n    \n4) Update your environment variables as needed by your app in the `Override parameters` menu :  \n- a)   \n*Most of the time, you'll only need to change the service's URL   \n(Please check the `src/main/resources/application.yml` for the exact syntax).*  \n  \n- b) Select with a profile argument (`spring.profiles.active`) which kind of security to use :  \n  - `none` : No security, I hope you know what you're doing :)  \n  - `basic` : You'll have to provide your username and password under the `common` section of the `application.yml`  \n  - `token` : You'll have to provide your URL under the `common` section of the `application.yml`,  \nand possibly your (username, password) if the token's request is secured with basic auth.  \n    \n5) Run it through your IDE.  \n    \n**Build and run**  \n  \nBuild the app by using `mvn clean package`, and then edit the provided utility script `start.sh` by adding or updating all environment variables to override before executing the jar.   \n  \nThen, you'll be able to launch the app by using `./start.sh $PATH_TO_YOUR_JAR`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaagie%2Fproto-job-integration","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsaagie%2Fproto-job-integration","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaagie%2Fproto-job-integration/lists"}