{"id":34102925,"url":"https://github.com/opendatablend/opendatablend-py","last_synced_at":"2025-12-14T17:04:44.337Z","repository":{"id":44620003,"uuid":"382406853","full_name":"opendatablend/opendatablend-py","owner":"opendatablend","description":"The fastest way to get data from the Open Data Blend Dataset API","archived":false,"fork":false,"pushed_at":"2024-01-03T09:57:36.000Z","size":61,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-09-14T15:06:11.238Z","etag":null,"topics":["data","data-engineering","data-science","dataset","frictionless-data","frictionlessdata","koalas","pandas","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opendatablend.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2021-07-02T16:43:05.000Z","updated_at":"2025-02-15T13:45:30.000Z","dependencies_parsed_at":"2022-09-14T07:22:12.916Z","dependency_job_id":"2d85332e-3f8e-4f4a-8bf8-61f859efbdff","html_url":"https://github.com/opendatablend/opendatablend-py","commit_stats":{"total_commits":28,"total_committers":2,"mean_commits":14.0,"dds":0.0357142857142857,"last_synced_commit":"a2238a38a5e696a7fb47b93c14c19d872605dfb5"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"purl":"pkg:github/opendatablend/opendatablend-py","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatablend%2Fopendatablend-py","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatablend%2Fopendatablend-py/tags","releases_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatablend%2Fopendatablend-py/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatablend%2Fopendatablend-py/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opendatablend","download_url":"https://codeload.github.com/opendatablend/opendatablend-py/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opendatablend%2Fopendatablend-py/sbom","scorecard":{"id":709163,"data":{"date":"2025-08-11","repo":{"name":"github.com/opendatablend/opendatablend-py","commit":"43843c5be036f1090d58fb04e011b439e7956a63"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.5,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively 
maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no 
security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-22T07:37:50.521Z","repository_id":44620003,"created_at":"2025-08-22T07:37:50.521Z","updated_at":"2025-08-22T07:37:50.521Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27732135,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-14T02:00:11.348Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","data-engineering","data-science","dataset","frictionless-data","frictionlessdata","koalas","pandas","python"],"created_at":"2025-12-14T17:04:43.813Z","updated_at":"2025-12-14T17:04:44.328Z","avatar_url":"https://github.com/opendatablend.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![alt text](https://raw.githubusercontent.com/opendatablend/opendatablend-py/master/images/odblogo.png \"Open Data Blend\")\n\n# Open Data Blend for Python\n\nOpen Data Blend for Python is the fastest way to get data from the Open Data Blend Dataset API. It is a lightweight, easy-to-use extract and load (EL) tool.\n\nYou can use the `get_data` function to download any data file belonging to an Open Data Blend dataset. 
Alternatively, the `get_data_files` function can be used to download a collection of data files from an Open Data Blend dataset. The functions transparently download and cache the data locally or in cloud storage, mirroring the same folder hierarchy as on the remote server. They also cache a copy of the dataset metadata file (datapackage.json) at the point that they are called. The cache is persistent which means the files will be kept until they are deleted.\n\nThe versioned dataset metadata can be used to re-download a specific version of a data file (sometimes referred to as 'time travel'). You can learn more about how we version our datasets in the [Open Data Blend Docs](https://docs.opendatablend.io/open-data-blend-datasets/dataset-snapshots).\n\nIn addition to downloading the data and metadata files, `get_data` returns an object called `Output` which includes the locations of the downloaded files. Similarly, `get_data_files` returns an object called `OutputSet` which includes the locations of the files that are downloaded and the associated metadata. 
From there, you can query and analyse the data directly using something light like [Pandas](https://pandas.pydata.org/) or, for more resource intensive processing, a data lakehouse platform like [Databricks](https://databricks.com/), or a scalable in-memory OLAP library like [Polars](https://www.pola.rs/).\n\n# Installation\n\nInstall the latest version of `opendatablend` from [PyPI](https://pypi.org/):\n\n```Python\npip install opendatablend\n```\n\n# Usage Examples\n\n---\n**NOTE**\n\nIf you want to run the examples, be sure to replace placeholder values such as  `\u003cACCESS_KEY\u003e` with appropriate string literals or variables.\n\n---\n\nSome of the following examples require the `pandas` and `pyarrow` packages to be installed:\n\n```Python\npip install pandas\npip install pyarrow\n```\n\n## Making Public API Requests\n\n---\n**NOTE**\n\nPublic API requests have a [monthly limit](https://docs.opendatablend.io/open-data-blend-datasets/dataset-api#usage-limits).\n\n---\n\n### Get the Data\n\n```python\nimport opendatablend as odb\nimport pandas as pd\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\n\n# Specify the resource name of the data file. 
In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object\noutput = odb.get_data(dataset_path, resource_name)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n### Use the Data\n\n```python\n# Read a subset of the columns into a dataframe\ndf_date = pd.read_parquet(output.data_file_name, columns=['drv_date_key', 'drv_date', 'drv_month_name', 'drv_month_number', 'drv_quarter_name', 'drv_quarter_number', 'drv_year'])\n\n# Check the contents of the dataframe\ndf_date\n```\n\n## Making Authenticated API Requests\n\n### Get the Data\n\n```python\nimport opendatablend as odb\nimport pandas as pd\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e'\n\n# Specify the resource name of the data file. In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object\noutput = odb.get_data(dataset_path, resource_name, access_key=access_key)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n### Use the Data\n\n```python\n# Read a subset of the columns into a dataframe\ndf_date = pd.read_parquet(output.data_file_name, columns=['drv_date_key', 'drv_date', 'drv_month_name', 'drv_month_number', 'drv_quarter_name', 'drv_quarter_number', 'drv_year'])\n\n# Check the contents of the dataframe\ndf_date\n```\n\n## Downloading Multiple Data Files\n\nThe `get_data_files` function can be used to download a set of data files by providing their resource names as a list.\n\n### Get the Data\n\n```python\nimport opendatablend as odb\nimport pandas as pd\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e'\n\n# Specify the resource names of 
the data files. In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Get the data files and store the output object\noutput = odb.get_data_files(dataset_path, resource_names, access_key=access_key)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n\n## Ingesting Data Directly into Cloud Storage Services\n\n### Azure Blob Storage\n\n#### Using `get_data`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource name of the data file. In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object using the Azure Blob Storage file system\nconfiguration = {\n    \"connection_string\" : \"DefaultEndpointsProtocol=https;AccountName=\u003cAZURE_BLOB_STORAGE_ACCOUNT_NAME\u003e;AccountKey=\u003cAZURE_BLOB_STORAGE_ACCOUNT_KEY\u003e;EndpointSuffix=core.windows.net\",\n    \"container_name\" : \"\u003cAZURE_BLOB_STORAGE_CONTAINER_NAME\u003e\" # e.g. 
odbp-integration\n    }\noutput = odb.get_data(dataset_path, resource_name, access_key=access_key, file_system=\"azure_blob_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n#### Using `get_data_files`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource names of the data files. In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Get the data and store the output object using the Azure Blob Storage file system\nconfiguration = {\n    \"connection_string\" : \"DefaultEndpointsProtocol=https;AccountName=\u003cAZURE_BLOB_STORAGE_ACCOUNT_NAME\u003e;AccountKey=\u003cAZURE_BLOB_STORAGE_ACCOUNT_KEY\u003e;EndpointSuffix=core.windows.net\",\n    \"container_name\" : \"\u003cAZURE_BLOB_STORAGE_CONTAINER_NAME\u003e\" # e.g. odbp-integration\n    }\noutput = odb.get_data_files(dataset_path, resource_names, access_key=access_key, file_system=\"azure_blob_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n### Azure Data Lake Storage (ADLS) Gen2\n\n#### Using `get_data`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource name of the data file. 
In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object using the Azure Data Lake Storage Gen2 file system\nconfiguration = {\n    \"connection_string\" : \"DefaultEndpointsProtocol=https;AccountName=\u003cADLS_GEN2_ACCOUNT_NAME\u003e;AccountKey=\u003cADLS_GEN2_ACCOUNT_KEY\u003e;EndpointSuffix=core.windows.net\",\n    \"container_name\" : \"\u003cADLS_GEN2_CONTAINER_NAME\u003e\" # e.g. odbp-integration\n    }\noutput = odb.get_data(dataset_path, resource_name, access_key=access_key, file_system=\"azure_blob_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n#### Using `get_data_files`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource names of the data files. In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Get the data and store the output object using the Azure Data Lake Storage Gen2 file system\nconfiguration = {\n    \"connection_string\" : \"DefaultEndpointsProtocol=https;AccountName=\u003cADLS_GEN2_ACCOUNT_NAME\u003e;AccountKey=\u003cADLS_GEN2_ACCOUNT_KEY\u003e;EndpointSuffix=core.windows.net\",\n    \"container_name\" : \"\u003cADLS_GEN2_CONTAINER_NAME\u003e\" # e.g. 
odbp-integration\n    }\noutput = odb.get_data_files(dataset_path, resource_names, access_key=access_key, file_system=\"azure_blob_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n### Amazon S3\n\n#### Using `get_data`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource name of the data file. In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object using the Amazon S3 file system\nconfiguration = {\n    \"aws_access_key_id\" : \"\u003cAWS_ACCESS_KEY_ID\u003e\",\n    \"aws_secret_access_key\" : \"\u003cAWS_SECRET_ACCESS_KEY\u003e\",\n    \"bucket_name\" : \"\u003cBUCKET_NAME\u003e\", # e.g. odbp-integration\n    \"bucket_region\" : \"\u003cBUCKET_REGION\u003e\" # e.g. eu-west-2\n    }\n\noutput = odb.get_data(dataset_path, resource_name, access_key=access_key, file_system=\"amazon_s3\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n#### Using `get_data_files`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource names of the data files. 
In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Get the data files and store the output object using the Amazon S3 file system\nconfiguration = {\n    \"aws_access_key_id\" : \"\u003cAWS_ACCESS_KEY_ID\u003e\",\n    \"aws_secret_access_key\" : \"\u003cAWS_SECRET_ACCESS_KEY\u003e\",\n    \"bucket_name\" : \"\u003cBUCKET_NAME\u003e\", # e.g. odbp-integration\n    \"bucket_region\" : \"\u003cBUCKET_REGION\u003e\" # e.g. eu-west-2\n    }\n\noutput = odb.get_data_files(dataset_path, resource_names, access_key=access_key, file_system=\"amazon_s3\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n### Google Cloud Storage\n\n#### Using `get_data`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource name of the data file. In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Get the data and store the output object using the Google Cloud Storage file system\nconfiguration = {\n    \"service_account_private_key_file\" : \"\u003cPATH_TO_SERVICE_ACCOUNT_PRIVATE_KEY_FILE\u003e\",\n    \"bucket_name\" : \"\u003cBUCKET_NAME\u003e\", # e.g. odbp-integration\n    \"bucket_location\" : \"\u003cBUCKET_LOCATION\u003e\" # e.g. 
europe-west2\n    }\n\noutput = odb.get_data(dataset_path, resource_name, access_key=access_key, file_system=\"google_cloud_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n#### Using `get_data_files`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource names of the data files. In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Get the data files and store the output object using the Google Cloud Storage file system\nconfiguration = {\n    \"service_account_private_key_file\" : \"\u003cPATH_TO_SERVICE_ACCOUNT_PRIVATE_KEY_FILE\u003e\",\n    \"bucket_name\" : \"\u003cBUCKET_NAME\u003e\", # e.g. odbp-integration\n    \"bucket_location\" : \"\u003cBUCKET_LOCATION\u003e\" # e.g. europe-west2\n    }\n\noutput = odb.get_data_files(dataset_path, resource_names, access_key=access_key, file_system=\"google_cloud_storage\", configuration=configuration)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n### OneLake in Microsoft Fabric\n\nYou can use Open Data Blend for Python to ingest data directly into OneLake in Microsoft Fabric using a Fabric Notebook.\n\n#### Prerequisites\n\nBefore attempting to ingest the data into OneLake using this method, you need to:\n\n1. Create a [Microsoft Fabric Lakehouse](https://learn.microsoft.com/en-us/fabric/onelake/create-lakehouse-onelake)\n2. 
Create a [Microsoft Fabric Notebook](https://learn.microsoft.com/en-us/fabric/data-engineering/how-to-use-notebook)\n3. Set a [default lakehouse](https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-notebook-explore#switch-lakehouses-and-set-a-default) for your Microsoft Fabric Notebook\n4. Install the 'opendatablend' library through the [Public Libraries](https://learn.microsoft.com/en-us/fabric/data-engineering/environment-manage-library) section of a new or existing [Microsoft Fabric Environment](https://learn.microsoft.com/en-us/fabric/data-engineering/create-and-use-environment)\n5. [Attach](https://learn.microsoft.com/en-us/fabric/data-engineering/create-and-use-environment#attach-an-environment) the environment to your workspace or notebook\n\nYou can then use the following methods to ingest the data. Pay special attention to the `base_path` value because it controls where the data will be stored within OneLake. The `base_path` **must** point to the 'Files' location or a subfolder within it.\n\n#### Using `get_data`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource name of the data file. In this example, the 'date' data file will be requested in Parquet format.\nresource_name = 'date-parquet'\n\n# Specify the base path for where the data files should be landed. 
In this example, we want them to be stored in the root of the 'Files' folder in the Fabric lakehouse that has been attached to the Fabric notebook.\nbase_path = '/lakehouse/default/Files/'\n\noutput = odb.get_data(dataset_path, resource_name, base_path=base_path, access_key=access_key)\n\n# Print the file locations\nprint(output.data_file_name)\nprint(output.metadata_file_name)\n```\n\n#### Using `get_data_files`\n\n```python\nimport opendatablend as odb\n\ndataset_path = 'https://packages.opendatablend.io/v1/open-data-blend-road-safety/datapackage.json'\naccess_key = '\u003cACCESS_KEY\u003e' # The access key can be set to an empty string if you are making a public API request\n\n# Specify the resource names of the data files. In this example, a subset of the available data files will be requested in Parquet format.\nresource_names = [\n    'date-parquet',\n    'time-of-day-parquet',\n    'geolocation-parquet',\n    'road-safety-accident-info-parquet',\n    'road-safety-accident-location-parquet',\n    'road-safety-accident-2021-parquet'\n    ]\n\n# Specify the base path for where the data files should be landed. In this example, we want them to be stored in the root of the 'Files' folder in the Fabric lakehouse that has been attached to the Fabric notebook. \nbase_path = '/lakehouse/default/Files/'\n\noutput = odb.get_data_files(dataset_path, resource_names, base_path=base_path, access_key=access_key)\n\n# Print the file locations\nprint(output.data_file_names)\nprint(output.metadata_file_name)\n```\n\n## Additional Examples\n\nFor more in-depth examples, see the [examples](https://github.com/opendatablend/opendatablend-py/tree/master/examples) folder.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendatablend%2Fopendatablend-py","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopendatablend%2Fopendatablend-py","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopendatablend%2Fopendatablend-py/lists"}