{"id":35060924,"url":"https://github.com/openspending/datapackage-pipelines-fiscal","last_synced_at":"2025-12-27T10:31:58.592Z","repository":{"id":62566658,"uuid":"71356214","full_name":"openspending/datapackage-pipelines-fiscal","owner":"openspending","description":"Fiscal Data Package extensions to Datapackage Pipelines","archived":false,"fork":false,"pushed_at":"2019-11-20T16:14:46.000Z","size":10452,"stargazers_count":3,"open_issues_count":4,"forks_count":6,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-10-01T16:21:46.923Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openspending.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-10-19T12:45:54.000Z","updated_at":"2019-11-20T16:14:43.000Z","dependencies_parsed_at":"2022-11-03T16:15:56.093Z","dependency_job_id":null,"html_url":"https://github.com/openspending/datapackage-pipelines-fiscal","commit_stats":null,"previous_names":[],"tags_count":42,"template":false,"template_full_name":null,"purl":"pkg:github/openspending/datapackage-pipelines-fiscal","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openspending%2Fdatapackage-pipelines-fiscal","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openspending%2Fdatapackage-pipelines-fiscal/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openspending%2Fdatapackage-pipelines-fiscal/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openspending%2Fdatapackage-pipelines-fiscal/manifests",
"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openspending","download_url":"https://codeload.github.com/openspending/datapackage-pipelines-fiscal/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openspending%2Fdatapackage-pipelines-fiscal/sbom","scorecard":{"id":710343,"data":{"date":"2025-08-11","repo":{"name":"github.com/openspending/datapackage-pipelines-fiscal","commit":"7b061371056ef19e1b28b50cabcfd9bcf8968f4b"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively 
maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE.md:0","Info: FSF or OSI recognized license: MIT License: LICENSE.md:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared 
and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}}]},"last_synced_at":"2025-08-22T07:59:54.582Z","repository_id":62566658,"created_at":"2025-08-22T07:59:54.582Z","updated_at":"2025-08-22T07:59:54.582Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28077503,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-27T02:00:05.897Z","response_time":58,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-12-27T10:31:57.920Z","updated_at":"2025-12-27T10:31:58.587Z","avatar_url":"https://github.com/openspending.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# datapackage-pipelines-fiscal\n\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/datapackage-pipelines-fiscal.svg)](https://pypi.org/project/datapackage-pipelines-fiscal/)\n[![Travis](https://travis-ci.org/openspending/datapackage-pipelines-fiscal.svg?branch=master)](https://travis-ci.org/openspending/datapackage-pipelines-fiscal)\n\nExtension for datapackage-pipelines used for loading Fiscal Data Packages into:\n- S3 (or compatible) storage, in a denormalized form\n- a database in normalized form.\n- Metadata will be stored in an elasticsearch instance (if available) via 
[os-package-registry](https://github.com/openspending/os-package-registry)\n- A `babbage` model will also be generated and written to the datapackage for querying the database using its API\n\nThis extension works with a custom source spec and a set of processors. The generator will convert the source spec into a set of inter-dependent pipelines, which when run in order will perform data processing and loading to selected endpoints (based on environment variables).\n\nThis extension is used by [os-conductor](https://github.com/openspending/os-conductor) and [os-data-importers](https://github.com/openspending/os-data-importers).\n\n## Environment variables\n\n`DPP_DB_ENGINE` - connection string for an SQL database to dump data into\n\n`ELASTICSEARCH_ADDRESS` [OPTIONAL] - connection string for an elasticsearch instance (used for package registry updating)\n\n`S3_BUCKET_NAME` [OPTIONAL] - S3 bucket for uploading data. If not provided, local ZIP files will be created instead.\n\n`AWS_ACCESS_KEY_ID` - S3 credentials (required if S3 bucket was specified)\n\n`AWS_SECRET_ACCESS_KEY` - S3 credentials (required if S3 bucket was specified)\n\n## Dependencies\n\nIn order to fully run the fiscal datapackage flow you need to have `os-types` installed, using npm:\n\n`$ npm install -g os-types`\n\nThis external node.js utility is used to perform fiscal modelling for the processed datapackage.\n\n## fiscal.source-spec.yaml\n\nEach source-spec contains information regarding a single Fiscal Data Package.\n\nTop level properties are:\n\n#### `title`\nTitle, or Display name, of the data package\n\n#### `dataset-name` [OPTIONAL]\nA slug to be used as the data package's name.\n\nIf not provided, a slugified version of the title will be used.\n\n#### `resource-name` [OPTIONAL]\n\nA slug to be used as the main resource's name in the final data package.\n\nIf not provided, the dataset name will be used.\n\n#### `owner-id`\n\nThe id of the owner of this datapackage.\n\nThis identifier is 
used to generate various paths and storage names.\n\n#### `sources`\n\nContains a non-empty list of data sources for the fiscal data package.\n\nEach data source has these properties:\n- `url`: The location of the data\n- `name`: [OPTIONAL] A name for this source (will later be used as an intermediate resource name)\n\nOther `tabulator` parameters can also be added as properties here, e.g. `sheet`, `encoding`, `compression` etc.\n\n#### `fields`\n\nContains a non-empty list of fields for the fiscal data package.\n\nEach field definition has these properties:\n- `header`: The `name` of the field in the resulting resource\n- `title` [OPTIONAL]: The display name of the field in the resulting resource\n- `columnType`: The _ColumnType_ of the field\n- `options`: Extra options to be added to the field, e.g. json-table-schema properties such as `decimalChar` etc.\n\n#### `measures` [OPTIONAL]\n\nExtra information for measure normalization processing.\n(Measure normalization is the process of reducing the number of measures to one while multiplying the number of rows and adding extra columns to contain values for identifying the original measure).\n\nContains the following sub-properties:\n- `currency`: The currency code of the output measure column\n- `title` [OPTIONAL]: The title for the output measure column\n- `mapping`: Unpivoting map.\n\nThe unpivoting map is a map from a measure's name to its unpivoting data.\n\n\"Unpivoting data\" is a map from an extra column's name to a value\n\nExample:\n```yaml\nmeasures:\n  currency: GTQ\n  mapping:\n    APPROVED:\n      PHASE_ID: \"0\"\n      PHASE: Inicial\n    RELEASED:\n      PHASE_ID: \"1\"\n      PHASE: Vigente\n    COMMITTED:\n      PHASE_ID: \"2\"\n      PHASE: Comprometido\n```\n\n- `currencies` [OPTIONAL]: List of currency codes to convert to ('USD' by default).\n  See next section for details\n\n\n#### `currency-conversion` [OPTIONAL]\n\nInstructions for adding an extra column or columns with measure values in 
another currency.\n\n- `date_measure` [OPTIONAL]: Column name from which a date can be extracted.\n  If not provided, a guess will be made according to the _ColumnType_.\n\n- `title` [OPTIONAL]: Title for the currency-converted measure columns.\n\n#### `datapackage-url` [OPTIONAL]\n\nContains the URL for a source datapackage from which this data came.\nIf provided, metadata for this datapackage will be loaded from this URL.\n\n#### `deduplicate` [OPTIONAL]\n\nIf `true`, then the source data will be processed to remove duplicate rows (i.e. rows which have the same values in the primary key). Measure values for these rows will be summed in order to generate a single output row.\n\n#### `postprocessing` [OPTIONAL]\n\nA list of extra processors (and parameters) that will be applied to the data.\nFormat is as in any `pipeline-spec.yaml`\n\n#### `suppress-os` [OPTIONAL, default is `False`]\n\nIf `False`, an OpenSpending compatible datapackage is created on the datastore. This basic datapackage ensures a basic FDP is available for editing with OpenSpending. Packages created with `os-conductor` already create this artefact, so would use `suppress-os: True`, to prevent another being created unnecessarily.\n\n#### `keep-artifacts` [OPTIONAL, default is `False`]\n\nBy default, pipeline artifacts (temporary directories and files created during pipeline execution) will be removed after all pipelines have successfully been run. 
To keep the artifact, set this option to `True`.\n\n\n## Generated Pipelines\n\n#### ./denormalized_flow\n\n- Loads external metadata\n- Collects all data from all sources\n- Combines different sources onto one unified stream\n- Does measure normalization\n- Does currency conversion\n- Does row deduplication\n- Does extra processing steps\n\nOutputs:\n- Denormalized data (local file)\n- List of fiscal years in a separate resource (local file)\n- Updates os package registry (if configured)\n\n#### ./finalize_datapackage_flow_splitter\n_(depends on ` ./denormalized_flow`)_\n\n- Loads denormalized package\n- Writes separate per-year filtered copies of the data\n\n#### ./finalize_datapackage_flow\n_(depends on ` ./finalize_datapackage_flow_splitter`)_\n\n- Loads all resources from the `splitter` pipeline as well as the full denormalized dataset\n\nOutputs:\n- Stores results in S3 bucket\n- Zip file with the datapackage (in case an S3 bucket is not configured)\n- Updates os package registry (if configured)\n\n#### ./dimension_flow_{hierarchy}\n_(depends on ` ./denormalized_flow`)_\n\n- Loads denormalized data\n- Picks only _hierarchy_ columns\n- Add auto-incrementing id column\n- Remove duplicates\n\nOutputs:\n- Normalized hierarchy data (local file)\n\n#### ./normalized_flow\n_(depends on ` ./denormalized_flow` and all `./dimension_flow_{hierarchy}`)_\n\n- Loads denormalized data as fact table\n- Loads all normalized hierarchy data\n- Creates babbage model\n- Replaces all hierarchy columns in fact table with corresponding ids from normalized hierarchy tables\n\nOutputs:\n- Normalized fact table (local file)\n- Updates os package registry (if configured)\n\n#### ./dumper_flow_{hierarchy}\n_(depends on corresponding `./dimension_flow_{hierarchy}`)_\n\n- Loads normalized hierarchy data\n- Fixes nulls in primary key (replacing them with empty strings)\n\nOutputs\n- Saves data as a single table in an SQL database\n\n#### ./dumper_flow\n_(depends on 
`./normalized_flow`)_\n\n- Loads normalized fact table data\n- Fixes nulls in primary key (replacing them with empty strings)\n\nOutputs\n- Saves data as a single table in an SQL database\n\n#### ./dumper_flow_update_status\n_(depends on `./dumper_flow`)_\n\nOutputs\n- Updates os package registry (if configured) that the package was loaded successfully\n\n## Contributing\n\nPlease read the contribution guideline:\n\n[How to Contribute](CONTRIBUTING.md)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenspending%2Fdatapackage-pipelines-fiscal","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenspending%2Fdatapackage-pipelines-fiscal","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenspending%2Fdatapackage-pipelines-fiscal/lists"}