{"id":24782107,"url":"https://github.com/fabsdevx/file-format-converter-handout","last_synced_at":"2026-05-06T22:08:41.544Z","repository":{"id":274727190,"uuid":"923876829","full_name":"FabSDevX/File-Format-Converter-Handout","owner":"FabSDevX","description":"Data Engineering project for learning purposes. Credits to itversity","archived":false,"fork":false,"pushed_at":"2025-01-29T22:08:02.000Z","size":6144,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-24T05:31:18.730Z","etag":null,"topics":["csv","csv-import","data","data-engineering","database","pandas","python"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/FabSDevX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-29T01:24:36.000Z","updated_at":"2025-01-29T22:08:06.000Z","dependencies_parsed_at":"2025-01-29T02:27:21.530Z","dependency_job_id":"0d1cd155-760f-45c7-9296-5042a195ea23","html_url":"https://github.com/FabSDevX/File-Format-Converter-Handout","commit_stats":null,"previous_names":["fabsdevx/file-format-converter-handout"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/FabSDevX/File-Format-Converter-Handout","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabSDevX%2FFile-Format-Converter-Handout","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabSDevX%2FFile-Format-Converter-Handout/tags","releases_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/FabSDevX%2FFile-Format-Converter-Handout/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabSDevX%2FFile-Format-Converter-Handout/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/FabSDevX","download_url":"https://codeload.github.com/FabSDevX/File-Format-Converter-Handout/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/FabSDevX%2FFile-Format-Converter-Handout/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32713879,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-06T19:35:05.142Z","status":"ssl_error","status_checked_at":"2026-05-06T19:35:03.996Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","csv-import","data","data-engineering","database","pandas","python"],"created_at":"2025-01-29T11:16:06.281Z","updated_at":"2026-05-06T22:08:41.514Z","avatar_url":"https://github.com/FabSDevX.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# File Format Converter Handout\nThe objective of this project is to develop solutions based on the design provided. 
In this case, the source data was exported as CSV files from a MySQL database.\n\nTo improve the efficiency of our data engineering pipelines, we need to convert these CSV files into JSON files, since JSON is easier to consume in downstream applications than CSV. The scope of this project is limited to that conversion.\n\n### Data Model Details\n![image](https://github.com/user-attachments/assets/77882e8f-a9bb-462e-9715-0bbb3fdb0de8)\n\n### Design\n![image](https://github.com/user-attachments/assets/beead5ef-4fc2-4fa1-82a4-f4039bd8c8b4)\n\n### Setup Instructions\n1. Set up the project using VSCode.\n2. Make sure you have set up a virtual environment (creating venv, requirements.txt, etc.) and installed the project's dependencies.\n3. Deploy the application with the core logic.\n4. Run the project after setting all the environment variables.\n5. Take appropriate steps to handle exceptions.\n\n### Validation Steps\n\n* Check whether the data in the files has been converted properly.\n* Make sure the target folder has been created and populated with JSON files, and confirm that the schema structure accurately reflects the CSV file. 
(Hint: refer to schemas.json.)\n* Count the records in the CSV files and compare the totals with the number of records in the JSON files.\n``` Python\nimport pandas as pd\n\n# Read the orders JSON file (one JSON record per line) using pandas\norders_data_json = pd.read_json(\n    'data/retail_db/orders_json/part-00000',\n    lines=True\n)\n# Row count; DataFrame.count() would return non-null counts per column instead\nprint(len(orders_data_json))\n\n# Read the order_items JSON file using pandas\norder_items_data_json = pd.read_json(\n    'data/retail_db/order_items_json/part-00000',\n    lines=True\n)\n# Row count\nprint(len(order_items_data_json))\n```\n\n### Technologies Used\n* Programming Language – Python\n* Pandas – for converting CSV files to DataFrames and then DataFrames to JSON.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabsdevx%2Ffile-format-converter-handout","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffabsdevx%2Ffile-format-converter-handout","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabsdevx%2Ffile-format-converter-handout/lists"}