{"id":21184566,"url":"https://github.com/yuhexiong/oracle-data-pipeline-spark-python","last_synced_at":"2026-05-21T10:06:38.970Z","repository":{"id":258119644,"uuid":"869409875","full_name":"yuhexiong/oracle-data-pipeline-spark-python","owner":"yuhexiong","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-24T09:22:20.000Z","size":8,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-14T19:52:37.600Z","etag":null,"topics":["apache-doris","apache-spark","doris","oracle","pipeline","spark"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yuhexiong.png","metadata":{"files":{"readme":"README-CH.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-08T08:48:39.000Z","updated_at":"2024-11-24T09:22:24.000Z","dependencies_parsed_at":"2024-10-17T18:44:46.340Z","dependency_job_id":"3ea11807-d1df-4066-8a06-42f0ddc7fbeb","html_url":"https://github.com/yuhexiong/oracle-data-pipeline-spark-python","commit_stats":null,"previous_names":["yuhexiong/oracle-data-pipeline-spark-python"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yuhexiong/oracle-data-pipeline-spark-python","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yuhexiong%2Foracle-data-pipeline-spark-python","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yuhexiong%2Foracle-data-pipeline-spark-python/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yuhexiong%2Foracle-dat
a-pipeline-spark-python/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yuhexiong%2Foracle-data-pipeline-spark-python/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yuhexiong","download_url":"https://codeload.github.com/yuhexiong/oracle-data-pipeline-spark-python/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yuhexiong%2Foracle-data-pipeline-spark-python/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33297213,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-21T02:57:32.698Z","status":"ssl_error","status_checked_at":"2026-05-21T02:57:31.990Z","response_time":62,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-doris","apache-spark","doris","oracle","pipeline","spark"],"created_at":"2024-11-20T18:09:22.042Z","updated_at":"2026-05-21T10:06:38.955Z","avatar_url":"https://github.com/yuhexiong.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Oracle Data Pipeline Spark\n\nA data pipeline written with Spark that transfers data from Oracle to Doris.\n\n## Overview\n\n- Language: Python\n- Data transformation framework: Spark v3.5.1\n\n\n## Run\n\n### Run Docker Container\n\nEdit the file name in docker-compose.yaml    \n```\ndocker compose up -d\n```\n\n\n## Entry\n\n### Oracle To Doris\n\nReplace newline characters with empty strings  \nFill NULL values in the `RATING` column with 0  \n\nCode references  \n(1) 
[oracle_to_doris.py](oracle_to_doris.py)  \n(2) Data format defined in YAML: [oracle_to_doris_yaml.py](oracle_to_doris_yaml.py) and [oracle_to_doris.yaml](oracle_to_doris.yaml)  \n\n\n- Oracle Table\n```\n| BOOKID  | TITLE               | AUTHOR              | PUBLICATIONYEAR  | GENRE           | RATING | STATUS      |\n|---------|---------------------|---------------------|------------------|-----------------|--------|-------------|\n| 1       | Dune                | Frank Herbert       | 1965             | SCIENCE FICTION | 4.5    | AVAILABLE   |\n| 2       | 1984                | George Orwell       | 1949             | DYSTOPIAN       | 4.7    | CHECKED OUT |\n| 3       | Pride and Prejudice | Jane Austen         | 1813             | ROMANCE         | 4.6    | AVAILABLE   |\n| 4       | The Great Gatsby    | F. Scott Fitzgerald | 1925             | CLASSIC         | 4.4    | AVAILABLE   |\n| 5       | The Hobbit          | J.R.R. Tolkien      | 1937             | FANTASY         | 4.8    | CHECKED OUT |\n```\n\n\n- Doris Table\n```\n| book_id | title               | author              | publication_year | genre           | rating | status      |\n|---------|---------------------|---------------------|------------------|-----------------|--------|-------------|\n| 1       | Dune                | Frank Herbert       | 1965             | SCIENCE FICTION | 4.5    | AVAILABLE   |\n| 2       | 1984                | George Orwell       | 1949             | DYSTOPIAN       | 4.7    | CHECKED OUT |\n| 3       | Pride and Prejudice | Jane Austen         | 1813             | ROMANCE         | 4.6    | AVAILABLE   |\n| 4       | The Great Gatsby    | F. Scott Fitzgerald | 1925             | CLASSIC         | 4.4    | AVAILABLE   |\n| 5       | The Hobbit          | J.R.R. 
Tolkien      | 1937             | FANTASY         | 4.8    | CHECKED OUT |\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyuhexiong%2Foracle-data-pipeline-spark-python","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyuhexiong%2Foracle-data-pipeline-spark-python","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyuhexiong%2Foracle-data-pipeline-spark-python/lists"}