{"id":15976661,"url":"https://github.com/wamuir/fpds-conversion-utility","last_synced_at":"2026-01-19T14:32:17.006Z","repository":{"id":55877784,"uuid":"123047587","full_name":"wamuir/fpds-conversion-utility","owner":"wamuir","description":"A replacement for the FPDS XML conversion utility: converts one or more FPDS data archives to a SQLite3 database.","archived":false,"fork":false,"pushed_at":"2021-07-25T16:50:02.000Z","size":654,"stargazers_count":3,"open_issues_count":1,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-02-10T02:37:42.197Z","etag":null,"topics":["fpds","government","spend-analysis","sql","xml"],"latest_commit_sha":null,"homepage":"","language":"XSLT","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wamuir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-02-27T00:08:23.000Z","updated_at":"2023-05-09T14:01:04.000Z","dependencies_parsed_at":"2022-08-15T08:30:27.081Z","dependency_job_id":null,"html_url":"https://github.com/wamuir/fpds-conversion-utility","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wamuir%2Ffpds-conversion-utility","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wamuir%2Ffpds-conversion-utility/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wamuir%2Ffpds-conversion-utility/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wamuir%2Ffpds-conversion-utility/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wa
muir","download_url":"https://codeload.github.com/wamuir/fpds-conversion-utility/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247217220,"owners_count":20903009,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fpds","government","spend-analysis","sql","xml"],"created_at":"2024-10-07T22:40:27.134Z","updated_at":"2026-01-19T14:32:16.999Z","avatar_url":"https://github.com/wamuir.png","language":"XSLT","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://github.com/wamuir/fpds-conversion-utility/actions/workflows/build.yml/badge.svg?branch=master\u0026event=push)](https://github.com/wamuir/fpds-conversion-utility/actions/workflows/build.yml?query=event%3Apush+branch%3Amaster)\n\n# FPDS XML Conversion Utility\n\nAn unofficial replacement for the (now defunct) official FPDS XML conversion\nutility.  Converts one or more FPDS data archives to a SQLite3 database.\n\n## About this project\n\nThis project intends to provide an unofficial replacement to the FPDS XML\narchive conversion utility.  It addresses several of the issues/limitations\n(identified below) with the existing, but now discontinued, utility. This\nconversion utility will convert one or more FPDS XML archives to a SQLite3\ndatabase and provides support for FPDS Specification Versions 1.4 and 1.5.\n\n#### Background\n\nThe [Federal Procurement Data System (FPDS)](https://www.fpds.gov/) houses\nprocurement/spend data for the U.S. Government.  
Archives of annual spend data\nare available in XML format for each Federal agency.  Previously, the General\nServices Administration (GSA; the Federal agency that manages FPDS) published an\nXML conversion utility for converting an FPDS data archive into a\npipe-separated [flat] file. The conversion utility was useful since the\nconverted data could easily be imported into spreadsheet software and\nstatistical packages, which increased the accessibility of Federal procurement\ndata to taxpayers, agencies, suppliers and other parties.\u003csup\nid=\"a1\"\u003e[1](#f1)\u003c/sup\u003e\n\nSupport for the GSA's XML conversion utility was discontinued in 2009/2010 and\nno official replacement has been published.  While the utility is still\navailable for download, several issues preclude its use:\n\n- Data complexity has increased such that a single flat file may no longer\n  properly and efficiently represent relationships between data elements\n  (_e.g._, in cardinality).\n\n- For several agencies, the quantity of data may exceed the limits of\n  commonly-used spreadsheet software (_e.g._, Microsoft Excel, LibreOffice\n  Calc) and/or preclude usability of pivot tables and other tools for\n  summarizing/aggregating data.\n\n- Support for conversion of XML archives ended with FPDS Specification Version\n  1.3.  Archives are no longer posted in this version.  Version 1.3 was\n  deprecated on December 31, 2010, and replaced with Version 1.4.  Version 1.4\n  was deprecated on September 30, 2017, and replaced with Version 1.5.\n\nAgency archives can be obtained from [fpds.gov](https://www.fpds.gov).\n\n\n## Compiling and running the utility\n\n#### Compiling\n\nThis utility can be built using CMake \u003e= 3.14. 
Obtain the necessary dependencies,\nfor example on **Debian/Ubuntu**:\n\n```shell\n$ sudo apt-get -y install \\\n        build-essential \\\n        cmake \\\n        liblzma-dev \\\n        libncurses-dev \\\n        libsqlite3-dev \\\n        libxml2 \\\n        libxslt1-dev \\\n        uuid-dev \\\n        xxd \\\n        zlib1g-dev\n```\n\nVerify that you have CMake \u003e= 3.14 using `cmake --version` and then build:\n\n```shell\n$ git clone --recurse-submodules https://github.com/wamuir/fpds-conversion-utility\n$ mkdir build \u0026\u0026 cmake -S fpds-conversion-utility -B build \u0026\u0026 cmake --build build\n```\n\nThe compiled executable will be at `build/app/conversion-utility`.\n\n#### Running\n\nGiven an FPDS XML archive `archive.xml`, the utility can be run as:\n\n```shell\nconversion-utility xml_archive sqlite3_target\n```\n\nMultiple XML archives can be combined into a single SQLite database by invoking\nthe append (`-a`) flag:\n\n```shell\n./conversion-utility archive1.xml bundle.sqlite3\n./conversion-utility -a archive2.xml bundle.sqlite3\n./conversion-utility -a archive3.xml bundle.sqlite3\n```\n\nAn existing database can be overwritten by invoking the overwrite (`-o`)\nflag:\n\n```shell\n./conversion-utility -o archive.xml db.sqlite3\n```\n\n#### Performance\n\nThis utility implements a streaming XML parser to limit memory usage, which is\nespecially useful for converting large archives.  The conversion rate is\ngenerally greater than 100 records per second (machine dependent).\n\nTo make use of multiple threads, pass the `-t` option with the desired number of\nthreads, such as:\n\n```shell\n./conversion-utility -t 4 archive.xml target.sqlite3\n```\n\nBy default only a single thread is used (equivalent to `-t 1`).  
The optimal value\nfor `-t` depends on machine characteristics such as the processor and I/O.\n\n\n## Database schema\n\n#### Tables\n\n| SQLite3 Table               | Cardinality | FPDS Element Group                  | FPDS Elements         |\n| --------------------------- | ----------- | ----------------------------------- | --------------------- |\n| additionalReporting         | one-to-many | Legislative Mandates                | 7G                    |\n| documentID                  | one-to-one  | Contract Identification Information | 1A-1D 1F-1H           |\n| competitionInformation      | one-to-one  | Competition Information             | 10*                   |\n| contractInformation         | one-to-one  | Contract Information                | 6A-6H 6J-6N 6P-6R 6T  |\n| contractMarketingData       | one-to-one  | Contract Marketing Data             | 5*                    |\n| contractorDataA             | one-to-one  | Contractor Data                     | 9*                    |\n| contractorDataB             | one-to-one  | Contractor Data                     | 13*                   |\n| dates                       | one-to-one  | Dates                               | 2*                    |\n| dollarValues                | one-to-one  | Dollar Values, Total Dollar Values  | 3* 3T*                |\n| legislativeMandates         | one-to-one  | Legislative Mandates                | 7A-7F                 |\n| preferencePrograms          | one-to-one  | Preference Programs                 | 11*                   |\n| productOrServiceInformation | one-to-one  | Product or Service Information      | 8*                    |\n| purchaserInformation        | one-to-one  | Purchaser Information               | 4*                    |\n| solicitationID              | one-to-one  | Contract Identification Information | 1E                    |\n| transactionInformation      | one-to-one  | Transaction Information             | 12*                   |\n| 
treasuryAccount             | one-to-many | Contract Information                | 6SC, 6SG, 6SH, 6SI    |\n\n#### Data elements\n\n- Information on data elements can be found within the FPDS data dictionary, \navailable at [fpds.gov](https://www.fpds.gov).  For each element, the \ncorresponding column name in the Sqlite3 database is identical to the `XML Tag \nName` within the data dictionary.\n\n\n#### Views\n\n- Two views are provided for ease of working with the data\n\n###### `documentID` view\n\n- Identifies document id (integer and primary key), document type (award, IDV) \nand contract identifiers (Agency ID, PIID, Modification Number, Transaction \nNumber, etc.)  \n\n- This view is created by the conversion utility, as:\n\n```sql\nCREATE VIEW IF NOT EXISTS documentID AS\n SELECT record.id AS id, record.docType as docType,\n    awardContractID.agencyID AS awardContractAgencyID,\n    awardContractID.PIID AS awardContractPIID,\n    awardContractID.modNumber AS awardContractModNumber,\n    awardContractID.transactionNumber AS awardContractTransactionNumber,\n    IDVID.agencyID AS IDVAgencyID,\n    IDVID.PIID AS IDVPIID,\n    IDVID.modNumber AS IDVModNumber,\n    referencedIDVID.agencyID AS referencedIDVagencyID,\n    referencedIDVID.PIID AS referencedIDVPIID,\n    referencedIDVID.modNumber AS referencedIDVmodNumber\n  FROM record\n  LEFT JOIN awardContractID ON record.id = awardContractID.id\n  LEFT JOIN IDVID ON record.id = IDVID.id\n  LEFT JOIN referencedIDVID ON record.id = referencedIDVID.id;\n```\n\n###### `fact` view\n\n- For potential use when \n[importing](#importing-archive-data-into-a-statistical-package-from-sqlite3), \n[exporting](#exporting-archive-data-from-sqlite3-to-a-flat-file) or other \ninstances where a fact table might come in handy\n\n- This view is created by the conversion utility, as:\n\n```sql\nCREATE VIEW fact AS\n  SELECT *\n  FROM documentID\n  LEFT JOIN competitionInformation on documentID.id = competitionInformation.id\n  LEFT 
JOIN contractInformation on documentID.id = contractInformation.id\n  LEFT JOIN contractMarketingData on documentID.id = contractMarketingdata.id\n  LEFT JOIN contractorDataA on documentID.id = contractorDataA.id\n  LEFT JOIN contractorDataB on documentID.id = contractorDataB.id\n  LEFT JOIN dates on documentID.id = dates.id\n  LEFT JOIN dollarValues on documentID.id = dollarValues.id\n  LEFT JOIN legislativeMandates on documentID.id = legislativeMandates.id\n  LEFT JOIN preferencePrograms on documentID.id = preferencePrograms.id\n  LEFT JOIN productOrServiceInformation on documentID.id = productOrServiceInformation.id\n  LEFT JOIN purchaserInformation on documentID.id = purchaserInformation.id\n  LEFT JOIN solicitationID on documentID.id = solicitationID.id\n  LEFT JOIN transactionInformation on documentID.id = transactionInformation.id;\n```\n\n\n## Importing and exporting data\n\n#### Importing archive data into a statistical package from SQLite3\n\nA simple example is given below for importing data into _R_:\n\n```r\n#!/usr/bin/env Rscript\n\nconn \u003c-  DBI::dbConnect(RSQLite::SQLite(), dbname=\"archive.sqlite3\")\nquery \u003c- DBI::dbSendQuery(conn, \"SELECT documentID.*, dollarValues.obligatedAmount\n                                 FROM documentID\n                                 LEFT JOIN dollarValues ON documentID.id = dollarValues.id\n                                 WHERE documentID.docType = 'award';\")\ndf \u003c- DBI::dbFetch(query, n = -1)\nDBI::dbClearResult(query)\nDBI::dbDisconnect(conn)\n```\n\nAnd, for importing the same data into Python:\n\n```python\n#!/usr/bin/env python3\n\nimport sqlite3\n\nconn = sqlite3.connect(\"archive.sqlite3\")\nc = conn.cursor()\nc.execute('''SELECT documentID.*, dollarValues.obligatedAmount\n             FROM documentID\n             LEFT JOIN dollarValues ON documentID.id = dollarValues.id\n             WHERE documentID.docType = 'award';''')\ndf = c.fetchall()\nconn.close()\n```\n\n#### Exporting archive data 
from SQLite3 to a flat file\n\nIdeally, don't do this. If you wish to flatten and export data, the `fact` view\nflattens the one-to-one relationships and can be exported as follows:\n\n```sql\n.open archive.sqlite3\n.headers on\n.mode csv\n.output exported.csv\nSELECT * FROM fact;\n```\n\n## Limitations\n\n- ~~Currently, no support for `other transactions`~~ (support for Other Transactions added 2019-05-19)\n\n- Currently, no support for agency-specific (_e.g._, NASA) data elements\n\n\n\u003cbr/\u003e\u003cbr/\u003e \u003cb id=\"f1\"\u003e1\u003c/b\u003e Specifically, this refers to the accessibility of\nsets of data for analyses. Individual transactions can be searched/queried at\n[fpds.gov](https://www.fpds.gov/). Data is also available via ATOM Feed as well\nas aggregator sites (_e.g._, [usaspending.gov](https://www.usaspending.gov)),\nbut these do not resolve one or more of the issues identified, or they present\nadditional issues.[↩](#a1)\n \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwamuir%2Ffpds-conversion-utility","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwamuir%2Ffpds-conversion-utility","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwamuir%2Ffpds-conversion-utility/lists"}