{"id":21930734,"url":"https://github.com/previsionio/prevision-component-template","last_synced_at":"2026-05-13T05:35:38.567Z","repository":{"id":86604787,"uuid":"395631350","full_name":"previsionio/prevision-component-template","owner":"previsionio","description":"Template to create custom pipeline components","archived":false,"fork":false,"pushed_at":"2021-08-31T08:37:39.000Z","size":338,"stargazers_count":1,"open_issues_count":0,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-27T12:21:17.833Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/previsionio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-13T11:36:49.000Z","updated_at":"2021-10-08T07:38:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"c9f45c4a-a662-4cab-983f-8d25658ce08e","html_url":"https://github.com/previsionio/prevision-component-template","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/previsionio%2Fprevision-component-template","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/previsionio%2Fprevision-component-template/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/previsionio%2Fprevision-component-template/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/previsionio%2Fprevision-component-template/manifests","ow
ner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/previsionio","download_url":"https://codeload.github.com/previsionio/prevision-component-template/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244959443,"owners_count":20538626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-28T23:11:15.190Z","updated_at":"2026-05-13T05:35:38.523Z","avatar_url":"https://github.com/previsionio.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Component Tutorial\n\n## Why am I here?\n\nYou want to create a custom component to be used in a Prevision.io pipeline.\n\nA custom component is a piece of code mostly used for transforming datasets.\n\n## General setup\n\nTo build a component and use it as a custom component in a pipeline, you need 3 files:\n\n- A Dockerfile\n- A yaml description file\n- A Python source file that takes arguments\n\nNote that only the Dockerfile and the yaml file are mandatory.\n\nA default Dockerfile is provided and you do not need to modify it for a simple component. 
For more complex components, with many modules, you may have to adapt it.\n\n## What are you going to do?\n\nBuilding a custom component takes 6 steps:\n\n- Set up the env\n- Write a script with arguments\n- Test it locally\n- Write the associated yaml file\n- Upload your project to a GitLab or GitHub repo\n- Install it into your pipeline component library from your repo\n\n\n### Requirements\n\n* You need a GitLab or GitHub account to install your own component\n* You need to set it up in your profile page (under the \"Federated Identity\" tab)\n\n\n![Accessing your profile](profile.png)\n\n## Process\n\nNote that most of the component process is the same as for [kubeflow components](https://www.kubeflow.org/docs/components/pipelines/sdk/component-development/).\n\n### Set up the env\n\nClone this repo:\n\n```\ngit clone  https://github.com/previsionio/prevision-component-template.git my-component\ncd my-component\n```\n\nBe sure to work in an isolated environment so that you do not forget any dependencies:\n\n```\npython3 -m venv env\nsource env/bin/activate\npip install -r requirements.txt\n```\n\nYou need a requirements.txt file to list all your Python modules. A default one is provided.\n\n**All components must always have a requirements.txt file, even if there are no requirements. If you have none, either update the Dockerfile to remove the install phase:**\n\n`RUN python -m pip install -r /requirements.txt`\n\n**or keep an empty requirements.txt file.**\n\nThen test that the requirements meet your needs:\n\n```\npython src/main.py\n```\n\n### Write the script\n\nWhen your environment is ready, you can write your script in main.py. 
You can change the name of this script, but then you need to update it in the component.yaml file.\n\nYour script must take its inputs as command-line arguments (see [this excellent tutorial about Command Line Interfaces](https://realpython.com/command-line-interfaces-python-argparse/)), and each argument of your script has to be declared in the component.yaml.\n\n### Test your script\n\nThere are two tests to run before committing your work.\n\nFirst, check that running your code from the command line works with some data:\n\n```\npython src/main.py --arg-1 value1 --src ~/Documents/Dataset/Tabular/data.csv  --dst  output/resultat.csv\n```\n\nCheck that the command you used for the test (`python src/main.py`) is the one provided in the yaml file, and check the names of your args and which ones are mandatory:\n\n```\nimplementation:\n  container:\n    command: [python, /src/main.py]\n    args: [\n      --src, {inputPath: src},\n      --dst, {outputPath: dst},\n      --arg-1, {inputValue: arg_1}\n      ]\n```\n\nAll arguments of your script must be in the implementation section of the component.yaml AND they must be mapped to the inputs section.\n\nThe inputs section is what the user is going to see when using your component, and the implementation section is the way your component is going to be run when used in a scheduled pipeline.\n\n\nSecond, if you are a power user, you can [install Docker](https://docs.docker.com/engine/install/ubuntu/) and check that your Dockerfile is ok with the following commands:\n\n```sh\ndocker build -t my-component .\ndocker images\ndocker run -d  my-component\n```\n\nCheck that a basic command line can run your transformation on test data.\n\n### List your args in the yaml file\n\nA component.yaml file is provided as an example.\n\nThe yaml file should reflect how to use your script:\n- how to run it: implementation -\u003e container -\u003e command\n- how to pass the parameters: implementation -\u003e container -\u003e command\n- how to map the parameters:  
implementation -\u003e container -\u003e args\n\nThe `name`, `label` and `description` are what your users will see in the component library.\n\nThe `inputs` are the description of your variables.\n\n![Mapping your variables](screenshot.png)\n\nThe `implementation` field is how the script will be run. The `args` subfield maps the user parameters to your script args. It is a list of elements separated by `,`. The command is composed by inlining them (and replacing those in {} with the corresponding input):\n\n`--args-name, {type: input_name},`\n\n\n- args-name: the name of the argument parsed by your script\n- type: one of `inputPath, outputPath, inputValue`.\n  - inputValue will be shown in the UX as a user-defined parameter.\n  - inputPath is the mapping of the node input to an argument name. For example, ` --src, {inputPath: src}` means that the input of your node will be mapped to the `src` argument.\n  - outputPath is the mapping of your output.\n\ninputPath and outputPath will be filled in automatically by the Scheduler when a pipeline Template is run.\n\n\n### Optional: write your Dockerfile\n\nA default Dockerfile is provided. It works in most cases, but you are free to adapt it.\n\n### Commit and push your component\n\nCommit all your work and push it to your git repo (GitHub or GitLab). Note the name of the repo and the branch: you will be asked for them when installing the component in your Prevision library.\n\n### Install it in your account\n\nGo to your Prevision account:\n\n- pipelines \u003e pipeline Components \u003e New Pipeline Components\n\nAnd enter your repo information.\n\nIf everything is ok, the component status will be listed as `done` in about 5 minutes.\n\nNote that the component is not run when it is installed. 
The installation only checks that the Dockerfile works and that all parameters in component.yaml are fine.\n\nThe component will be run only when a pipeline template using the component is scheduled to run.\n\n## Troubleshooting\n\nIf the status is not `done` after a few minutes (~10 min), something went wrong (unless you have a lot of modules to install and set up in your Dockerfile).\n\n- check that the variable names in the yaml file and in the argparser of the Python source code are the same.\n- check your hyphens and underscores ('-' and '_') in input names and argument names\n- check that all files are at the right place and that the command in the yaml file uses the path built in the Dockerfile\n\n```\n# in yaml file\ncommand: [python, /src/main.py]\n```\n\nmust use the same path as the one set up in the Dockerfile:\n\n```\nCOPY requirements.txt /requirements.txt\nRUN python -m pip install -r /requirements.txt\n\nCOPY . /\n```\n- you cannot use positional parameters in args. Check that your script does not require positional arguments.\n- IMPORTANT: if your component saves a file (output path), it must create the directory structure. For example:\n\n```\nfrom pathlib import Path\n# create the parent directory of the output file before writing to it\nPath(args.output_dataset_path).parent.mkdir(parents=True, exist_ok=True)\ndf.to_csv(args.output_dataset_path, index=False)\n```\n\nIf a component fails while executing, a log is provided in the \"Schedule run\" section for each component in the pipeline.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprevisionio%2Fprevision-component-template","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprevisionio%2Fprevision-component-template","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprevisionio%2Fprevision-component-template/lists"}