{"id":46442501,"url":"https://github.com/keboola/python-component","last_synced_at":"2026-03-05T22:10:11.096Z","repository":{"id":39612102,"uuid":"310528226","full_name":"keboola/python-component","owner":"keboola","description":"General library for Python applications running in Keboola Connection environment","archived":false,"fork":false,"pushed_at":"2026-02-24T16:50:57.000Z","size":1801,"stargazers_count":7,"open_issues_count":9,"forks_count":2,"subscribers_count":17,"default_branch":"main","last_synced_at":"2026-02-24T20:50:08.127Z","etag":null,"topics":["component","kbc"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/keboola.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2020-11-06T07:49:47.000Z","updated_at":"2026-01-28T17:19:13.000Z","dependencies_parsed_at":"2026-02-26T05:02:22.513Z","dependency_job_id":null,"html_url":"https://github.com/keboola/python-component","commit_stats":{"total_commits":136,"total_committers":7,"mean_commits":"19.428571428571427","dds":0.3161764705882353,"last_synced_commit":"541f455c4014192d2502a9b84a70faab26e16609"},"previous_names":[],"tags_count":44,"template":false,"template_full_name":null,"purl":"pkg:github/keboola/python-component","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keboola%2Fpython-component","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keboola%2Fpython-component
/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keboola%2Fpython-component/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keboola%2Fpython-component/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/keboola","download_url":"https://codeload.github.com/keboola/python-component/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/keboola%2Fpython-component/sbom","scorecard":{"id":355018,"data":{"date":"2025-08-11","repo":{"name":"github.com/keboola/python-component","commit":"b313e2a09bf76ede114e634501d57a2ee70c6bda"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":5.2,"checks":[{"name":"Code-Review","score":10,"reason":"all changesets reviewed","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":4,"reason":"5 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 4","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no 
topLevel permission defined: .github/workflows/deploy.yml:1","Warn: no topLevel permission defined: .github/workflows/deploy_to_test.yml:1","Warn: no topLevel permission defined: .github/workflows/push_dev.yml:1","Warn: no topLevel permission defined: .github/workflows/push_main.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy.yml:13: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/deploy.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy.yml:15: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/deploy.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy_to_test.yml:9: update your workflow using 
https://app.stepsecurity.io/secureworkflow/keboola/python-component/deploy_to_test.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/deploy_to_test.yml:11: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/deploy_to_test.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/push_dev.yml:17: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/push_dev.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/push_dev.yml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/push_dev.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/push_main.yml:12: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/push_main.yml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/push_main.yml:14: update your workflow using https://app.stepsecurity.io/secureworkflow/keboola/python-component/push_main.yml/main?enable=pin","Warn: pipCommand not pinned by hash: .github/workflows/deploy.yml:20","Warn: pipCommand not pinned by hash: .github/workflows/deploy.yml:21","Warn: pipCommand not pinned by hash: .github/workflows/deploy.yml:22","Warn: pipCommand not pinned by hash: .github/workflows/deploy.yml:23","Warn: pipCommand not pinned by hash: .github/workflows/deploy_to_test.yml:17","Warn: pipCommand not pinned by hash: .github/workflows/deploy_to_test.yml:18","Warn: pipCommand not pinned by hash: .github/workflows/deploy_to_test.yml:19","Warn: pipCommand not pinned by hash: .github/workflows/deploy_to_test.yml:20","Warn: pipCommand not pinned by hash: .github/workflows/push_dev.yml:24","Warn: pipCommand not pinned by hash: .github/workflows/push_dev.yml:25","Warn: pipCommand not pinned by hash: 
.github/workflows/push_dev.yml:26","Warn: pipCommand not pinned by hash: .github/workflows/push_main.yml:19","Info:   0 out of   8 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of  12 pipCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: 
LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 30 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-18T09:19:36.987Z","repository_id":39612102,"created_at":"2025-08-18T09:19:36.987Z","updated_at":"2025-08-18T09:19:36.987Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30152112,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T21:15:50.531Z","status":"ssl_error","status_checked_at":"2026-03-05T21:15:11.173Z","response_time":93,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["component","kbc"],"created_at":"2026-03-05T22:10:06.767Z","updated_at":"2026-03-05T22:10:11.073Z","avatar_url":"https://github.com/keboola.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Keboola Python Component library\n\n## Table of Contents\n\n- [Keboola Python Component library](#keboola-python-component-library)\n  - [Table of Contents](#table-of-contents)\n  - [Introduction](#introduction)\n    - [Links](#links)\n- [Quick start](#quick-start)\n  - [Installation](#installation)\n  - [For Developers](#for-developers)\n  - [Core structure \\\u0026 functionality](#core-structure--functionality)\n  - [CommonInterface](#commoninterface)\n  - [Initialization](#initialization)\n  - [Loading configuration parameters:](#loading-configuration-parameters)\n  - [Processing input tables - Manifest vs I/O mapping](#processing-input-tables---manifest-vs-io-mapping)\n    - [Manifest \\\u0026 input folder content](#manifest--input-folder-content)\n    - [Using I/O mapping](#using-io-mapping)\n  - [I/O table manifests and processing results](#io-table-manifests-and-processing-results)\n    - [Get input table by name](#get-input-table-by-name)\n  - [Working with Input/Output Mapping](#working-with-inputoutput-mapping)\n    - [Accessing Input Tables from Mapping](#accessing-input-tables-from-mapping)\n    - [Creating Output Tables based on Output Mapping](#creating-output-tables-based-on-output-mapping)\n    - [Combining Input and Output 
Mapping](#combining-input-and-output-mapping)\n  - [Processing input files](#processing-input-files)\n    - [Grouping Files by Tags](#grouping-files-by-tags)\n    - [Creating Output Files](#creating-output-files)\n  - [Processing state files](#processing-state-files)\n  - [Logging](#logging)\n- [ComponentBase](#componentbase)\n  - [Table Schemas in ComponentBase](#table-schemas-in-componentbase)\n    - [JSON Table Schema example file](#json-table-schema-example-file)\n    - [Out table definition from schema example](#out-table-definition-from-schema-example)\n- [Sync Actions](#sync-actions)\n  - [Creating Sync Actions](#creating-sync-actions)\n  - [Returning Data from Sync Actions](#returning-data-from-sync-actions)\n  - [Validation Message Action](#validation-message-action)\n      - [No output](#no-output)\n  - [License](#license)\n\n\n## Introduction\n\n![Build \u0026 Test](https://github.com/keboola/python-component/workflows/Build%20\u0026%20Test/badge.svg)\n[![Code Climate](https://codeclimate.com/github/keboola/python-component/badges/gpa.svg)](https://codeclimate.com/github/keboola/python-component)\n[![PyPI version](https://badge.fury.io/py/keboola.component.svg)](https://badge.fury.io/py/keboola.component)\n\nThis library provides a Python wrapper over the\n[Keboola Common Interface](https://developers.keboola.com/extend/common-interface/). It simplifies all tasks related to\nthe communication of the [Docker component](https://developers.keboola.com/extend/component/) with the Keboola\nConnection that is defined by the Common Interface. Such tasks are config manipulation, validation, component state, I/O\nhandling, I/O metadata and manifest files, logging, etc.\n\nIt is being developed by the Keboola Data Services team and officially supported by Keboola. 
It aims to simplify the\nKeboola Component creation process by removing the need to write boilerplate code for working with the Common\nInterface.\n\nAnother use case is within the Keboola [Python Transformations](https://help.keboola.com/transformations/python/)\nto simplify the I/O handling.\n\n### Links\n\n- [PyPI](https://pypi.org/project/keboola.component/) \u0026 [TestPyPI](https://test.pypi.org/project/keboola.component/)\n- [API Documentation](https://keboola.github.io/python-component/interface.html)\n- [Keboola Components for developers](https://developers.keboola.com/extend/component/)\n- [Python Component Cookiecutter template project](https://github.com/keboola/cookiecutter-python-component)\n\n# Quick start\n\n## Installation\n\nThe package may be installed via uv 💜:\n\n```\nuv add keboola.component\n```\n\n## For Developers\n\n\u003e **Note for contributors:** Before creating a pull request, make sure to manually run the\n\u003e [Generate Documentation](https://github.com/keboola/python-component/actions/workflows/generate-docs.yml)\n\u003e workflow on your branch. This ensures the API documentation in the `docs/` folder is up-to-date\n\u003e with your code changes. The workflow can be triggered from the Actions tab on GitHub.\n\n## Core structure \u0026 functionality\n\nThe package contains three core modules:\n\n- `keboola.component.interface` - Core methods and class to initialize and handle\n  the [Keboola Common Interface](https://developers.keboola.com/extend/common-interface/) tasks\n- `keboola.component.dao` - Data classes and containers for objects defined by the Common Interface such as manifest\n  files, metadata, environment variables, etc.\n- `keboola.component.base` - Base classes to build the Keboola Component applications from.\n\n## CommonInterface\n\nCore class that serves to initialize the docker environment. 
It handles the following tasks:\n\n- Environment initialization\n    - Loading\n      all [environment variables](https://developers.keboola.com/extend/common-interface/environment/#environment-variables)\n    - Loading the [configuration file](https://developers.keboola.com/extend/common-interface/config-file/) and\n      initialization of the [data folder](https://developers.keboola.com/extend/common-interface/folders/)\n    - [State file](https://developers.keboola.com/extend/common-interface/config-file/#state-file) processing.\n    - [Logging](https://developers.keboola.com/extend/common-interface/logging/)\n- [Data folder](https://developers.keboola.com/extend/common-interface/folders/) manipulation\n    - [Manifest file](https://developers.keboola.com/extend/common-interface/manifest-files/) processing\n    - Config validation\n    - Metadata manipulation\n    - [OAuth](https://developers.keboola.com/extend/common-interface/oauth/) configuration handling.\n\n## Initialization\n\nThe core class is `keboola.component.interface.CommonInterface`. Upon its initialization the environment is created,\ne.g.:\n\n- data folder initialized (either from the Environment Variable or manually)\n- config.json is loaded\n- All Environment variables are loaded\n\nThe optional parameter `data_folder_path` of the constructor is the path to the data directory. If not\nprovided, it will be determined in this order:\n1. [`KBC_DATADIR` environment variable](https://developers.keboola.com/extend/common-interface/environment/#environment-variables) if present\n2. `-d` / `--data` argument from the command line if present\n3. data folder inside the current working directory if present\n4. data folder inside the parent directory of the current working directory if present\n\nThe class can be either extended or just instantiated and manipulated like any other object. 
The `CommonInterface` class is\nexposed in the `keboola.component` namespace:\n\n```python\nfrom keboola.component import CommonInterface\n\n# init the interface\n# A ValueError is raised if the KBC_DATADIR does not exist or points to a non-existent path.\nci = CommonInterface()\n```\n\nTo specify the data folder path manually, use this code:\n\n```python\nfrom keboola.component import CommonInterface\n\n# init the interface\n# A ValueError is raised if the data folder path does not exist.\nci = CommonInterface(data_folder_path='/data')\n```\n\n## Loading configuration parameters:\n\nThe example below initializes the common interface class and automatically loads config.json from the\n[data folder](https://developers.keboola.com/extend/common-interface/folders/), which is defined by the environment\nvariable `KBC_DATADIR`. If the variable is not present, an error is raised. To override the data folder location,\npass the `data_folder_path` parameter to the constructor.\n\n**NOTE:** The `configuration` object is initialized upon access and a ValueError is thrown if the `config.json` does not\nexist in the data folder, e.g. 
`cfg = ci.configuration` may throw a ValueError even though the data folder exists and\nci (CommonInterface)\nis properly initialized.\n\n```python\nfrom keboola.component import CommonInterface\n# Logger is automatically set up based on the component setup (GELF or STDOUT)\nimport logging\n\nSOME_PARAMETER = 'some_user_parameter'\nREQUIRED_PARAMETERS = [SOME_PARAMETER]\n\n# init the interface\n# A ValueError is raised if the KBC_DATADIR does not exist or points to a non-existent path.\nci = CommonInterface()\n\n# A ValueError is raised if the config.json file does not exist in the data dir.\n# Checks for required parameters and throws ValueError if any is missing.\nci.validate_configuration(REQUIRED_PARAMETERS)\n\n# print KBC Project ID from the environment variable if present:\nlogging.info(ci.environment_variables.project_id)\n\n# load particular configuration parameter\nlogging.info(ci.configuration.parameters[SOME_PARAMETER])\n```\n\n## Processing input tables - Manifest vs I/O mapping\n\nInput and output tables specified by the user are listed in the [configuration file](https://developers.keboola.com/extend/common-interface/config-file/).\nApart from that, all input tables provided by the user also include a manifest file with additional metadata.\n\nTables and their manifest files are represented by the `keboola.component.dao.TableDefinition` object and may be loaded\nusing the convenience method `get_input_tables_definitions()`. 
The result object contains all metadata about the table,\nsuch as manifest file representations, system path and name.\n\n### Manifest \u0026 input folder content\n\n```python\nfrom keboola.component import CommonInterface\nimport logging\n\n# init the interface\nci = CommonInterface()\n\ninput_tables = ci.get_input_tables_definitions()\n\n# print path of the first table (random order)\nfirst_table = input_tables[0]\nlogging.info(f'The first table named: \"{first_table.name}\" is at path: {first_table.full_path}')\n\n# get information from table manifest\nlogging.info(f'The first table has the following columns defined in the manifest: {first_table.column_names}')\n\n```\n\n### Using I/O mapping\n\n```python\nimport csv\nfrom keboola.component import CommonInterface\n\n# initialize the library\nci = CommonInterface()\n\n# get list of input tables from the input mapping\ntables = ci.configuration.tables_input_mapping\n# pair each input table with the output mapping at the same index\nfor j, table in enumerate(tables):\n    # get csv file name\n    inName = table.destination\n\n    # read input table manifest and get its physical representation\n    table_def = ci.get_input_table_definition_by_name(table.destination)\n\n    # get csv file name with full path from output mapping\n    outName = ci.configuration.tables_output_mapping[j].full_path\n\n    # get file name from output mapping\n    outDestination = ci.configuration.tables_output_mapping[j]['destination']\n```\n\n## I/O table manifests and processing results\n\nThe component may define\noutput [manifest files](https://developers.keboola.com/extend/common-interface/manifest-files/#dataouttables-manifests)\nthat define options on storing the results back to the Keboola Connection Storage. 
This library provides methods that\nsimplify the manifest file creation and allow defining the export options and metadata of the result table using\nhelper objects `TableDefinition`\nand `TableMetadata`.\n\nThe `TableDefinition` object serves as a result container holding all the information needed to store the table in\nStorage. It contains the manifest file representation and initializes all attributes available in the manifest.\n\nThis object represents both Input and Output manifests. All output manifest attributes are exposed in the class.\n\nThere is a convenience method for manifest creation, `CommonInterface.write_manifest`. It is also\npossible to create the container for the output table using `CommonInterface.create_out_table_definition()`.\n\n![TableDefinition dependencies](docs/imgs/TableDefinition_class.png)\n\n**Table schema examples:**\n\n**Example 1: Creating table with predefined schema**\n\n```python\nfrom keboola.component import CommonInterface\nfrom collections import OrderedDict\nfrom keboola.component.dao import ColumnDefinition, DataType, SupportedDataTypes, BaseType\n\n# init the interface\nci = CommonInterface()\n\n# Define complete schema upfront\nschema = OrderedDict({\n    \"id\": ColumnDefinition(\n        data_types=BaseType.integer(),\n        primary_key=True\n    ),\n    \"created_at\": ColumnDefinition(\n        data_types=BaseType(dtype=SupportedDataTypes.TIMESTAMP)\n    ),\n    \"status\": ColumnDefinition(),\n    \"value\": ColumnDefinition(\n        data_types=BaseType.numeric(length=\"38,2\")\n    )\n})\n\n# Create table definition with predefined schema\nout_table = ci.create_out_table_definition(\n    name=\"results.csv\",                 # File name for the output\n    destination=\"out.c-data.results\",   # Destination table in Storage\n    schema=schema,                      # Predefined schema\n    incremental=True                    # Enable incremental loading\n)\n\n# Write some data 
to the output file\nimport csv\nwith open(out_table.full_path, 'w', newline='') as f:\n    writer = csv.DictWriter(f, fieldnames=out_table.column_names)\n    writer.writeheader()\n    writer.writerow({\n        \"id\": \"1\",\n        \"created_at\": \"2023-01-15T14:30:00Z\",\n        \"status\": \"completed\",\n        \"value\": \"123.45\"\n    })\n\n# Write manifest\nci.write_manifest(out_table)\n```\n\n**Example 2: Creating table with empty schema and adding columns dynamically**\n\n```python\nfrom keboola.component import CommonInterface\nfrom keboola.component.dao import ColumnDefinition, DataType, SupportedDataTypes, BaseType\nimport csv\n\n# init the interface\nci = CommonInterface()\n\n# Create table definition with empty schema\nout_table = ci.create_out_table_definition(\n    name=\"dynamic_results.csv\",\n    destination=\"out.c-data.dynamic_results\",\n    incremental=True\n)\n\n# Add columns using different data type methods\n# Method 1: Using BaseType helper\nout_table.add_column(\"id\",\n    ColumnDefinition(\n        primary_key=True,\n        data_types=BaseType.integer()\n    )\n)\n\n# Method 2: Using SupportedDataTypes enum\nout_table.add_column(\"created_at\",\n    ColumnDefinition(\n        data_types=BaseType(dtype=SupportedDataTypes.TIMESTAMP)\n    )\n)\n\n# Method 3: Simple column without specific data type\nout_table.add_column(\"status\", ColumnDefinition())\n\n# Method 4: Using BaseType with parameters\nout_table.add_column(\"price\",\n    ColumnDefinition(\n        data_types=BaseType.numeric(length=\"10,2\"),\n        description=\"Product price with 2 decimal places\"\n    )\n)\n\n# Method 5: Backend-specific data types\nout_table.add_column(\"metadata\",\n    ColumnDefinition(\n        data_types={\n            \"snowflake\": DataType(dtype=\"VARIANT\"),\n            \"bigquery\": DataType(dtype=\"JSON\"),\n            \"base\": DataType(dtype=SupportedDataTypes.STRING, length=\"65535\")\n        },\n        description=\"JSON 
metadata column\"\n    )\n)\n\n# Update existing column (example of column modification)\nout_table.update_column(\"price\",\n    ColumnDefinition(\n        data_types={\n            \"snowflake\": DataType(dtype=\"NUMBER\", length=\"15,4\"),\n            \"bigquery\": DataType(dtype=\"NUMERIC\", length=\"15,4\"),\n            \"base\": DataType(dtype=SupportedDataTypes.NUMERIC, length=\"15,4\")\n        },\n        description=\"Updated price with 4 decimal places for higher precision\"\n    )\n)\n\n# Write some data to the output file\nwith open(out_table.full_path, 'w', newline='') as f:\n    writer = csv.DictWriter(f, fieldnames=out_table.column_names)\n    writer.writeheader()\n    writer.writerow({\n        \"id\": \"1\",\n        \"created_at\": \"2023-01-15T14:30:00Z\",\n        \"status\": \"active\",\n        \"price\": \"99.9999\",\n        \"metadata\": '{\"category\": \"electronics\", \"brand\": \"TechCorp\"}'\n    })\n\n# Write manifest\nci.write_manifest(out_table)\n```\n\n**Simple Example for Basic Use Cases:**\n\n```python\nfrom keboola.component import CommonInterface\nimport csv\n\n# Initialize the component\nci = CommonInterface()\n\n# Create output table\nresult_table = ci.create_out_table_definition(\n    'output.csv',\n    primary_key=['id'],\n    incremental=True,\n    description='Data processed by my component'\n)\n\n# Write data to CSV\nwith open(result_table.full_path, 'w', newline='') as f:\n    writer = csv.DictWriter(f, fieldnames=['id', 'name', 'value'])\n    writer.writeheader()\n    writer.writerow({\"id\": \"1\", \"name\": \"Test\", \"value\": \"100\"})\n    writer.writerow({\"id\": \"2\", \"name\": \"Example\", \"value\": \"200\"})\n\n# Write manifest file\nci.write_manifest(result_table)\n```\n\n### Get input table by name\n\n```python\nfrom keboola.component import CommonInterface\n\n# init the interface\nci = CommonInterface()\ntable_def = ci.get_input_table_definition_by_name('input.csv')\n\n```\n\n## Working with 
Input/Output Mapping\n\nKeboola Connection provides input/output mappings that define which tables are loaded into your component and which tables should be stored back. These mappings are defined in the configuration file and can be accessed programmatically.\n\n### Accessing Input Tables from Mapping\n\n```python\nfrom keboola.component import CommonInterface\nimport csv\n\n# Initialize the component\nci = CommonInterface()\n\n# Access input mapping configuration\ninput_tables = ci.configuration.tables_input_mapping\n\n# Process each input table\nfor table in input_tables:\n    # Get the destination (filename in the /data/in/tables directory)\n    table_name = table.destination\n\n    # Load table definition from manifest\n    table_def = ci.get_input_table_definition_by_name(table_name)\n\n    # Print information about the table\n    print(f\"Processing table: {table_name}\")\n    print(f\"  - Source: {table.source}\")\n    print(f\"  - Full path: {table_def.full_path}\")\n    print(f\"  - Columns: {table_def.column_names}\")\n\n    # Read data from the CSV file\n    with open(table_def.full_path, 'r') as input_file:\n        csv_reader = csv.DictReader(input_file)\n        for row in csv_reader:\n            # Process each row\n            print(f\"  - Row: {row}\")\n```\n\n### Creating Output Tables based on Output Mapping\n\n```python\nfrom keboola.component import CommonInterface\nimport csv\n\n# Initialize the component\nci = CommonInterface()\n\n# Access output mapping configuration\noutput_tables = ci.configuration.tables_output_mapping\n\n# Process each output table mapping\nfor i, table_mapping in enumerate(output_tables):\n    # Get source (filename that should be created) and destination (where it will be stored in KBC)\n    source = table_mapping.source\n    destination = table_mapping.destination\n\n    # Create output table definition\n    out_table = ci.create_out_table_definition(\n        name=source,\n        destination=destination,\n        
incremental=table_mapping.incremental\n    )\n\n    # Add some sample data (in a real component, this would be your processed data)\n    with open(out_table.full_path, 'w', newline='') as out_file:\n        writer = csv.DictWriter(out_file, fieldnames=['id', 'data'])\n        writer.writeheader()\n        writer.writerow({'id': f'{i+1}', 'data': f'Data for {destination}'})\n\n    # Write manifest file\n    ci.write_manifest(out_table)\n```\n\n### Combining Input and Output Mapping\n\nHere's a complete example that reads data from input tables and creates output tables:\n\n```python\nfrom keboola.component import CommonInterface\nimport csv\n\n# Initialize the component\nci = CommonInterface()\n\n# Get input tables\ninput_tables = ci.configuration.tables_input_mapping\noutput_tables = ci.configuration.tables_output_mapping\n\n# Process each output table based on input\nfor i, out_mapping in enumerate(output_tables):\n    # Find corresponding input table if possible (matching by index for simplicity)\n    in_mapping = input_tables[i] if i \u003c len(input_tables) else None\n\n    # Create output table\n    out_table = ci.create_out_table_definition(\n        name=out_mapping.source,\n        destination=out_mapping.destination,\n        incremental=out_mapping.incremental\n    )\n\n    # If we have an input table, transform its data\n    if in_mapping:\n        in_table = ci.get_input_table_definition_by_name(in_mapping.destination)\n\n        # Read input and write to output with transformation\n        with open(in_table.full_path, 'r') as in_file, open(out_table.full_path, 'w', newline='') as out_file:\n            reader = csv.DictReader(in_file)\n\n            # Create writer with same field names\n            fieldnames = reader.fieldnames\n            writer = csv.DictWriter(out_file, fieldnames=fieldnames)\n            writer.writeheader()\n\n            # Transform each row and write to output\n            for row in reader:\n                # Simple 
transformation example - uppercase all values\n                transformed_row = {k: v.upper() if isinstance(v, str) else v for k, v in row.items()}\n                writer.writerow(transformed_row)\n    else:\n        # No input table, create sample output\n        with open(out_table.full_path, 'w', newline='') as out_file:\n            writer = csv.DictWriter(out_file, fieldnames=['id', 'data'])\n            writer.writeheader()\n            writer.writerow({'id': f'{i+1}', 'data': f'Sample data for {out_mapping.destination}'})\n\n    # Write manifest\n    ci.write_manifest(out_table)\n```\n\n## Processing input files\n\nLike tables, files and their manifest files are represented by the `keboola.component.dao.FileDefinition` object\nand can be loaded using the convenience method `get_input_files_definitions()`. The result object contains all metadata\nabout the file, such as the manifest file representation, system path, and name.\n\nThe `get_input_files_definitions()` method supports filter parameters to select only files with a specific tag or to retrieve only\nthe latest version of each file. This is especially useful because the KBC input mapping by default includes all versions of\nfiles matching a specific tag. 
By default, the method returns only the latest file of each.\n\n```python\nfrom keboola.component import CommonInterface\nimport logging\n\n# Initialize the interface\nci = CommonInterface()\n\n# Get input files with specific tags (only latest versions)\ninput_files = ci.get_input_files_definitions(tags=['images', 'documents'], only_latest_files=True)\n\n# Process each file\nfor file in input_files:\n    print(f\"Processing file: {file.name}\")\n    print(f\"  - Full path: {file.full_path}\")\n    print(f\"  - Tags: {file.tags}\")\n\n    # Example: Process image files\n    if 'images' in file.tags:\n        # Process image using appropriate library\n        print(f\"  - Processing image: {file.name}\")\n        # image = Image.open(file.full_path)\n        # ... process image ...\n\n    # Example: Process document files\n    if 'documents' in file.tags:\n        print(f\"  - Processing document: {file.name}\")\n        # ... process document ...\n```\n\n### Grouping Files by Tags\n\nWhen working with files it may be useful to retrieve them in a dictionary structure grouped by tag:\n\n```python\nfrom keboola.component import CommonInterface\n\n# Initialize the interface\nci = CommonInterface()\n\n# Group files by tag\nfiles_by_tag = ci.get_input_file_definitions_grouped_by_tag_group(only_latest_files=True)\n\n# Process files for each tag\nfor tag, files in files_by_tag.items():\n    print(f\"Processing tag group: {tag}\")\n    for file in files:\n        print(f\"  - File: {file.name}\")\n        # Process file based on its tag\n```\n\n### Creating Output Files\n\nSimilar to tables, you can create output files with appropriate manifests:\n\n```python\nfrom keboola.component import CommonInterface\n\n# Initialize the interface\nci = CommonInterface()\n\n# Create output file definition\noutput_file = ci.create_out_file_definition(\n    name=\"results.json\",\n    tags=[\"processed\", \"results\"],\n    is_public=False,\n    is_permanent=True\n)\n\n# Write content to 
the file\nwith open(output_file.full_path, 'w') as f:\n    f.write('{\"status\": \"success\", \"processed_records\": 42}')\n\n# Write manifest file\nci.write_manifest(output_file)\n```\n\n## Processing state files\n\n[State files](https://developers.keboola.com/extend/common-interface/config-file/#state-file) allow your component to store and retrieve information between runs. This is especially useful for incremental processing or tracking the last processed data.\n\n```python\nfrom keboola.component import CommonInterface\nfrom datetime import datetime\nimport json\n\n# Initialize the interface\nci = CommonInterface()\n\n# Load state from previous run\nstate = ci.get_state_file()\n\n# Get the last processed timestamp (or use default if this is the first run)\nlast_updated = state.get(\"last_updated\", \"1970-01-01T00:00:00Z\")\nprint(f\"Last processed data up to: {last_updated}\")\n\n# Process data (only data newer than last_updated)\n# In a real component, this would involve your business logic\nprocessed_items = [\n    {\"id\": 1, \"timestamp\": \"2023-05-15T10:30:00Z\"},\n    {\"id\": 2, \"timestamp\": \"2023-05-16T14:45:00Z\"}\n]\n\n# Get the latest timestamp for the next run\nif processed_items:\n    # Sort items by timestamp to find the latest one\n    processed_items.sort(key=lambda x: x[\"timestamp\"])\n    new_last_updated = processed_items[-1][\"timestamp\"]\nelse:\n    # No new items, keep the previous timestamp\n    new_last_updated = last_updated\n\n# Store the new state for the next run\nci.write_state_file({\n    \"last_updated\": new_last_updated,\n    \"processed_count\": len(processed_items),\n    \"last_run\": datetime.now().isoformat()\n})\n\nprint(f\"State updated, next run will process data from: {new_last_updated}\")\n```\n\nState files can contain any serializable JSON structure, so you can store complex information:\n\n```python\n# More complex state example\nstate = {\n    \"last_run\": datetime.now().isoformat(),\n    
\"api_pagination\": {\n        \"next_page_token\": \"abc123xyz\",\n        \"page_size\": 100,\n        \"total_pages_retrieved\": 5\n    },\n    \"processed_ids\": [1001, 1002, 1003, 1004],\n    \"statistics\": {\n        \"success_count\": 1000,\n        \"error_count\": 5,\n        \"skipped_count\": 10\n    }\n}\n\nci.write_state_file(state)\n```\n\n## Logging\n\nThe library automatically initializes a STDOUT or GELF logger based on the presence of the `KBC_LOGGER_PORT/HOST`\nenvironment variable upon the `CommonInterface` initialization. To use the GELF logger, just enable the logger for your\napplication in the Developer Portal. More info in\nthe [dedicated article](https://developers.keboola.com/extend/common-interface/logging/#examples).\n\nOnce it is enabled, you may just log your messages using the logging library:\n\n```python\nfrom keboola.component import CommonInterface\nimport logging\n\n# init the interface\nci = CommonInterface()\n\nlogging.info(\"Info message\")\n```\n\n**TIP:** When the logger verbosity is set to `verbose`, you can use `extra` fields to include additional information in\nthe detail of the log event by adding them to your messages:\n\n```python\nlogging.error(f'{error}. See log detail for full query. ',\n              extra={\"failed_query\": json.dumps(query)})\n```\n\nYou may also choose to override the settings by enabling the GELF or STDOUT logger explicitly and specifying the host/port\nparameters:\n\n```python\nfrom keboola.component import CommonInterface\nimport os\nimport logging\n\n# init the interface\nci = CommonInterface()\nos.environ['KBC_LOGGER_ADDR'] = 'localhost'\n# environment variable values must be strings\nos.environ['KBC_LOGGER_PORT'] = '12201'\nci.set_gelf_logger(log_level=logging.INFO, transport_layer='UDP')\n\nlogging.info(\"Info message\")\n```\n\n# ComponentBase\n\n[Base class](https://keboola.github.io/python-component/base.html)\nfor general Python components. 
Base your components on this class for simpler debugging.\n\nIt performs the following tasks by default:\n\n- Initializes the CommonInterface.\n- For easier debugging, the data folder is picked up by default from the `../data` path, relative to the working directory, when\n  the `KBC_DATADIR` env variable is not specified.\n- If the `debug` parameter is present in the `config.json`, the default logger is set to verbose DEBUG mode.\n- Executes the configured action (including sync actions), `run` by default. See the sync actions section.\n\n**Constructor arguments**:\n\n- data_path_override: optional path to the data folder that overrides the default behaviour\n  (`KBC_DATADIR` environment variable). May also be specified by the `-d` or `--data` command-line argument\n\nRaises: `UserException` - on config validation errors.\n\n**Example usage**:\n\n```python\nimport csv\nimport logging\nfrom datetime import datetime\n\nfrom keboola.component.base import ComponentBase, sync_action\nfrom keboola.component import UserException\n\n# configuration variables\nKEY_PRINT_HELLO = 'print_hello'\n\n# list of mandatory parameters =\u003e if any is missing,\n# component will fail with readable message on initialization.\nREQUIRED_PARAMETERS = [KEY_PRINT_HELLO]\nREQUIRED_IMAGE_PARS = []\n\n\nclass Component(ComponentBase):\n\n    def run(self):\n        '''\n        Main execution code\n        '''\n\n        # ####### EXAMPLE TO REMOVE\n        # check for missing configuration parameters\n        self.validate_configuration_parameters(REQUIRED_PARAMETERS)\n        self.validate_image_parameters(REQUIRED_IMAGE_PARS)\n\n        params = self.configuration.parameters\n        # Access parameters in data/config.json\n        if params.get(KEY_PRINT_HELLO):\n            logging.info(\"Hello World\")\n\n        # get last state data/in/state.json from previous run\n        previous_state = self.get_state_file()\n        logging.info(previous_state.get('some_state_parameter'))\n\n        # Create output table (TableDefinition - just 
metadata)\n        table = self.create_out_table_definition('output.csv', incremental=True, primary_key=['timestamp'])\n\n        # get file path of the table (data/out/tables/output.csv)\n        out_table_path = table.full_path\n        logging.info(out_table_path)\n\n        # DO whatever and save into out_table_path\n        with open(table.full_path, mode='wt', encoding='utf-8', newline='') as out_file:\n            writer = csv.DictWriter(out_file, fieldnames=['timestamp'])\n            writer.writeheader()\n            writer.writerow({\"timestamp\": datetime.now().isoformat()})\n\n        # Save table manifest (output.csv.manifest) from the TableDefinition\n        self.write_manifest(table)\n\n        # Write new state - will be available next run\n        self.write_state_file({\"some_state_parameter\": \"value\"})\n\n        # ####### EXAMPLE TO REMOVE END\n\n    # sync action that is executed when the config.json \"action\":\"testConnection\" parameter is present.\n    @sync_action('testConnection')\n    def test_connection(self):\n        connection = self.configuration.parameters.get('test_connection')\n        if connection == \"fail\":\n            raise UserException(\"failed\")\n        elif connection == \"succeed\":\n            # this is ignored when run as sync action.\n            logging.info(\"succeed\")\n\n\n\"\"\"\n        Main entrypoint\n\"\"\"\nif __name__ == \"__main__\":\n    try:\n        comp = Component()\n        # this triggers the run method by default and is controlled by the configuration.action parameter\n        comp.execute_action()\n    except UserException as exc:\n        logging.exception(exc)\n        exit(1)\n    except Exception as exc:\n        logging.exception(exc)\n        exit(2)\n```\n\n## Table Schemas in ComponentBase\n\nWhen the schemas of output/input tables are static, they can be defined using a JSON Table Schema. 
For output\nmapping these JSON schemas can be automatically turned into out table definitions.\n\n### JSON Table Schema example file\n\n```json\n{\n  \"name\": \"product\",\n  \"description\": \"this table holds data on products\",\n  \"parent_tables\": [],\n  \"primary_keys\": [\n    \"id\"\n  ],\n  \"fields\": [\n    {\n      \"name\": \"id\",\n      \"base_type\": \"string\",\n      \"description\": \"ID of the product\",\n      \"length\": \"100\",\n      \"nullable\": false\n    },\n    {\n      \"name\": \"name\",\n      \"base_type\": \"string\",\n      \"description\": \"Plain-text name of the product\",\n      \"length\": \"1000\",\n      \"default\": \"Default Name\"\n    }\n  ]\n}\n```\n\n### Out table definition from schema example\n\nThe example below shows how a table definition can be created from a JSON schema using the ComponentBase. The schema is\nlocated in the 'src/schemas' directory.\n\n```python\nimport csv\nfrom keboola.component.base import ComponentBase\n\nDUMMY_PRODUCT_DATA = [{\"id\": \"P0001\",\n                       \"name\": \"juice\"},\n                      {\"id\": \"P0002\",\n                       \"name\": \"chocolate bar\"},\n                      {\"id\": \"P0003\",\n                       \"name\": \"Stylish Pants\"},\n                      ]\n\n\nclass Component(ComponentBase):\n\n    def __init__(self):\n        super().__init__()\n\n    def run(self):\n        product_schema = self.get_table_schema_by_name('product')\n        product_table = self.create_out_table_definition_from_schema(product_schema)\n        with open(product_table.full_path, 'w', newline='') as outfile:\n            writer = csv.DictWriter(outfile, fieldnames=product_table.column_names)\n            writer.writerows(DUMMY_PRODUCT_DATA)\n        self.write_manifest(product_table)\n```\n\n# Sync Actions\n\n[Sync actions](https://developers.keboola.com/extend/common-interface/actions/) provide a way to execute quick, synchronous tasks within a component. 
Unlike the default `run` action (which executes asynchronously as a background job), sync actions execute immediately and return results directly to the UI.\n\nCommon use cases for sync actions:\n- Testing connections to external services\n- Fetching dynamic dropdown options for UI configuration\n- Validating user input\n- Listing available resources (tables, schemas, etc.)\n\n## Creating Sync Actions\n\nTo create a sync action, add a method to your component class and decorate it with `@sync_action('action_name')`. The framework handles all the details of proper response formatting and error handling.\n\n```python\nfrom keboola.component.base import ComponentBase, sync_action\nfrom keboola.component import UserException\n\nclass Component(ComponentBase):\n    def run(self):\n        # Main component logic\n        pass\n\n    @sync_action('testConnection')\n    def test_connection(self):\n        \"\"\"\n        Tests database connection credentials\n        \"\"\"\n        params = self.configuration.parameters\n        connection = params.get('connection', {})\n\n        # Validate connection parameters\n        if not connection.get('host') or not connection.get('username'):\n            raise UserException(\"Connection failed: Missing host or username\")\n\n        # If no exception is raised, the connection test is considered successful\n        # The framework automatically returns {\"status\": \"success\"}\n```\n\n## Returning Data from Sync Actions\n\nSync actions can return data that is used by the UI, such as dropdown options:\n\n```python\nfrom keboola.component.base import ComponentBase, sync_action\nfrom keboola.component.sync_actions import SelectElement\n\nclass Component(ComponentBase):\n    @sync_action('listTables')\n    def list_tables(self):\n        \"\"\"\n        Returns list of available tables for configuration dropdown\n        \"\"\"\n        # In a real scenario, you would fetch this from a database or API\n        available_tables = 
[\n            {\"id\": \"customers\", \"name\": \"Customer Data\"},\n            {\"id\": \"orders\", \"name\": \"Order History\"},\n            {\"id\": \"products\", \"name\": \"Product Catalog\"}\n        ]\n\n        # Return as list of SelectElement objects for UI dropdown\n        return [\n            SelectElement(value=table[\"id\"], label=table[\"name\"])\n            for table in available_tables\n        ]\n```\n\n## Validation Message Action\n\nYou can provide validation feedback to the UI:\n\n```python\nfrom keboola.component.base import ComponentBase, sync_action\nfrom keboola.component.sync_actions import ValidationResult, MessageType\n\nclass Component(ComponentBase):\n    @sync_action('validateConfiguration')\n    def validate_config(self):\n        \"\"\"\n        Validates the component configuration\n        \"\"\"\n        params = self.configuration.parameters\n\n        # Check configuration parameters\n        if params.get('extraction_type') == 'incremental' and not params.get('incremental_key'):\n            # Return warning message that will be displayed in UI\n            return ValidationResult(\n                \"Incremental extraction requires specifying an incremental key column.\",\n                MessageType.WARNING\n            )\n\n        # Check for potential issues\n        if params.get('row_limit') and int(params.get('row_limit')) \u003e 1000000:\n            # Return info message\n            return ValidationResult(\n                \"Large row limit may cause performance issues.\",\n                MessageType.INFO\n            )\n\n        # Success with no message\n        return None\n```\n\n#### No output\n\nSome actions like test connection button expect only success / failure type of result with no return value.\n\n```python\nfrom keboola.component.base import ComponentBase, sync_action\nfrom keboola.component import UserException\nimport logging\n\n\nclass Component(ComponentBase):\n\n    def __init__(self):\n   
     super().__init__()\n\n    @sync_action('testConnection')\n    def test_connection(self):\n        # this is ignored when run as sync action.\n        logging.info(\"Testing Connection\")\n        print(\"test print\")\n        params = self.configuration.parameters\n        connection = params.get('test_connection')\n        if connection == \"fail\":\n            raise UserException(\"failed\")\n        elif connection == \"succeed\":\n            # this is ignored when run as sync action.\n            logging.info(\"succeed\")\n```\n\n## License\n\nMIT licensed, see [LICENSE](./LICENSE) file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeboola%2Fpython-component","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkeboola%2Fpython-component","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkeboola%2Fpython-component/lists"}