{"id":26903413,"url":"https://github.com/edserranoc/scripting-with-python-and-sql-for-data-engineering","last_synced_at":"2025-06-12T19:05:47.921Z","repository":{"id":285337216,"uuid":"957770742","full_name":"edserranoc/Scripting-with-Python-and-SQL-for-Data-Engineering","owner":"edserranoc","description":"Scripting with Python and SQL for Data Engineering Course Course - Duke University","archived":false,"fork":false,"pushed_at":"2025-04-24T23:40:21.000Z","size":3390,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-25T00:31:52.805Z","etag":null,"topics":["databases-course","json","mysql","python-programming","sql","web-scraping"],"latest_commit_sha":null,"homepage":"https://www.coursera.org/learn/scripting-with-python-sql-for-data-engineering-duke?specialization=python-bash-sql-data-engineering-duke","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/edserranoc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-03-31T05:17:12.000Z","updated_at":"2025-04-24T23:40:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"53234683-ceac-45ad-966d-b680ad5a5d9d","html_url":"https://github.com/edserranoc/Scripting-with-Python-and-SQL-for-Data-Engineering","commit_stats":null,"previous_names":["edserranoc/scripting-with-python-and-sql-for-data-engineering"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/edserranoc/Scripting-with-Python-and-SQL-for-Data-Engineering","repository_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edserranoc%2FScripting-with-Python-and-SQL-for-Data-Engineering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edserranoc%2FScripting-with-Python-and-SQL-for-Data-Engineering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edserranoc%2FScripting-with-Python-and-SQL-for-Data-Engineering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edserranoc%2FScripting-with-Python-and-SQL-for-Data-Engineering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/edserranoc","download_url":"https://codeload.github.com/edserranoc/Scripting-with-Python-and-SQL-for-Data-Engineering/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/edserranoc%2FScripting-with-Python-and-SQL-for-Data-Engineering/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259519775,"owners_count":22870364,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["databases-course","json","mysql","python-programming","sql","web-scraping"],"created_at":"2025-04-01T10:28:17.971Z","updated_at":"2025-06-12T19:05:47.913Z","avatar_url":"https://github.com/edserranoc.png","language":"Jupyter Notebook","readme":"# Scripting with Python and SQL for Data Engineering\n\n## Course Overview\nThis repository contains notes, code snippets, and exercises from the Coursera course **\"Scripting with Python and SQL for Data Engineering\"** offered by Duke University. 
The course focuses on using Python and SQL for automating data workflows, managing databases, and performing data transformations.\n\n## Topics Covered\n\n### Week 1: Working with Data in Python\n- Using built-in Python modules\n- File handling (reading/writing files)\n- Working with popular formats like JSON\n- Serialize from Python to a JSON file\n- Build a useful Python Decorator\n\n### Week 2: Python Scripting and SQL\n- Writing and running Python scripts\n- Create a database, store data, and query it with SQL from Python.\n\n### Week 3: Web Scraping using Python\n- Introduction to REST APIs\n- Making HTTP requests with the `requests` library\n- Parsing JSON data\n- Web scraping with `BeautifulSoup`\n\n### Week 4: Working with MySQL\n- Basics of SQL and relational databases\n- Writing SQL queries\n- Using `sqlite3` in Python\n- Creating and managing tables in a database\n- Performing CRUD operations (Create, Read, Update, Delete)\n\u003c!---\n\n### Week 1: Introduction to Python for Data Engineering\n- Understanding scripting vs. 
interactive programming\n- Writing and running Python scripts\n- Using built-in Python modules\n- File handling (reading/writing files)\n- Automating tasks with Python scripts\n\n### Week 2: Working with APIs and Web Scraping\n- Introduction to REST APIs\n- Making HTTP requests with the `requests` library\n- Parsing JSON data\n- Web scraping with `BeautifulSoup`\n\n### Week 3: SQL for Data Engineering\n- Basics of SQL and relational databases\n- Writing SQL queries\n- Using `sqlite3` in Python\n- Creating and managing tables in a database\n- Performing CRUD operations (Create, Read, Update, Delete)\n\n### Week 4: Advanced SQL and Python Integration\n- Writing complex SQL queries\n- Using joins and subqueries\n- Query optimization techniques\n- Connecting Python to external databases (PostgreSQL, MySQL)\n- Using `pandas` for SQL data analysis\n\n### Week 5: Automating Data Pipelines\n- Introduction to ETL (Extract, Transform, Load)\n- Automating ETL workflows using Python\n- Error handling in Python scripts\n- Scheduling and running scripts with `cron` (Linux/macOS) or Task Scheduler (Windows)\n\n### Week 6: Final Project\n- Building a data pipeline using Python and SQL\n- Automating data extraction, transformation, and loading\n- Logging and debugging workflows\n\n\n\n\n## Requirements\nTo follow along with the course materials, install the following dependencies (`sqlite3` ships with Python's standard library and needs no installation):\n```bash\npip install requests beautifulsoup4 pandas\n```\n----\u003e\n## Notes and Resources\n- Course link: [Scripting with Python and SQL for Data Engineering](https://www.coursera.org/learn/scripting-with-python-sql-for-data-engineering-duke)\n- [Python Official Documentation](https://docs.python.org/3/)\n- [SQLite Documentation](https://www.sqlite.org/docs.html)\n- [pandas Documentation](https://pandas.pydata.org/docs/)\n- [Microsoft Learn](https://learn.microsoft.com/en-us/training/browse/?WT.mc_id=academic-0000-alfredodeza\u0026subjects=artificial-intelligence)\n- [Alfredo Deza: 
python-scripting Repository](https://github.com/alfredodeza/python-scripting)\n- [Alfredo Deza: Parsing With HTMLParser](https://github.com/alfredodeza/scrapy-xpath)\n\n## Progress Tracking\n| Week | Topic | Status |\n|------|-------------------------------|--------|\n| 1    |Working with Data in Python | ✅  |\n| 2    | Python Scripting and SQL | ✅ |\n| 3    |Web Scraping using Python | ⏳ |\n| 4    | Working with MySQL| ❌ | \n\n✅ - Completed  | ⏳ - In Progress  | ❌ - Not Started\n\n## License\nThis project is licensed under the GPL-3.0 license. See the LICENSE file for details.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedserranoc%2Fscripting-with-python-and-sql-for-data-engineering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fedserranoc%2Fscripting-with-python-and-sql-for-data-engineering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fedserranoc%2Fscripting-with-python-and-sql-for-data-engineering/lists"}