{"id":27736784,"url":"https://github.com/atomjay/taiwan-stocks-crawler","last_synced_at":"2025-06-25T23:05:56.914Z","repository":{"id":289978539,"uuid":"973017816","full_name":"atomjay/taiwan-stocks-crawler","owner":"atomjay","description":"A Taiwan stock data crawler built on Domain-Driven Design (DDD), focused on crawling stock data from the Taiwan Stock Exchange and storing it in a PostgreSQL database","archived":false,"fork":false,"pushed_at":"2025-04-27T06:00:23.000Z","size":61,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-25T23:05:56.517Z","etag":null,"topics":["rust","rust-lang","rust-library"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/atomjay.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-26T04:46:43.000Z","updated_at":"2025-06-21T15:54:37.000Z","dependencies_parsed_at":"2025-04-26T05:44:33.860Z","dependency_job_id":"fba82532-0edf-4c4b-a038-8c4dfb21460a","html_url":"https://github.com/atomjay/taiwan-stocks-crawler","commit_stats":null,"previous_names":["atomjay/taiwan-stocks-crawler"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/atomjay/taiwan-stocks-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atomjay%2Ftaiwan-stocks-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atomjay%2Ftaiwan-stocks-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atomjay%2Ftaiwan-stocks-crawler/releases","manifests_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/repositories/atomjay%2Ftaiwan-stocks-crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/atomjay","download_url":"https://codeload.github.com/atomjay/taiwan-stocks-crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/atomjay%2Ftaiwan-stocks-crawler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261967132,"owners_count":23237663,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["rust","rust-lang","rust-library"],"created_at":"2025-04-28T14:32:02.840Z","updated_at":"2025-06-25T23:05:56.888Z","avatar_url":"https://github.com/atomjay.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# A personal project for learning Rust\n\n## Taiwan Stocks Crawler\n\nThis is a Taiwan stock data crawler built on Domain-Driven Design (DDD). It crawls stock data from the Taiwan Stock Exchange and Yahoo Finance and stores it in a PostgreSQL database. The system is written in Rust for performance and stability, and it supports LINE notifications that display Chinese characters correctly.\n\n## Features\n\n- **Stock data crawler**: crawls the list of companies listed on the Taiwan stock market along with their price data\n- **Price history**: retrieves historical prices for each stock, including open, high, low, close, and trading volume\n- **Financial metrics**: retrieves the P/E ratio, P/B ratio, dividend yield, market capitalization, and other metrics\n- **Institutional net buy/sell**: retrieves net buy/sell figures for the three major institutional investors (foreign investors, investment trusts, and dealers)\n- **Database storage**: stores crawled data in PostgreSQL, using ON CONFLICT to implement upserts\n- **Automated data updates**: periodically crawls the latest stock data\n- **RESTful API**: exposes API endpoints for accessing stock data\n- **LINE notifications**: sends stock price and daily summary notifications through a LINE Bot, with Chinese characters displayed correctly\n- **Precise numeric computation**: uses BigDecimal for financial data to preserve precision\n\n## Tech Stack\n\n- **Language**: Rust 2024 Edition\n- **Database**: PostgreSQL with SQLx\n- **HTTP client**: Reqwest\n- **HTML parsing**: Scraper\n- **Serialization/deserialization**: Serde\n- **Web framework**: Axum\n- **Logging**: Tracing\n- **Async runtime**: Tokio\n- **Date and time handling**: Time\n- 
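**Error handling**: Anyhow\n- **Precise numerics**: BigDecimal\n- **Character encoding**: Encoding_rs (handles BIG5 encoding)\n- **Notification service**: LINE Messaging API\n\n## System Architecture\n\nThe project follows a Domain-Driven Design (DDD) architecture with four main layers, and has been refactored to simplify its structure:\n\n```\nsrc/\n  ├── domain/                    # Domain layer - core business logic and rules\n  │   ├── models/                # Entity models - objects with a unique identity\n  │   │   ├── stock.rs           # Stock entity\n  │   │   └── stock_price.rs     # Stock price entity\n  │   ├── repositories/          # Repository interfaces - define data access methods\n  │   │   ├── stock_repository.rs\n  │   │   └── stock_price_repository.rs\n  │   └── value_objects/         # Value objects - objects without a unique identity\n  │       └── date_range.rs\n  │\n  ├── application/               # Application layer - coordinates domain objects to complete user tasks\n  │   ├── dtos/                  # Data transfer objects - carry data across layers\n  │   │   ├── stock_dto.rs\n  │   │   └── stock_price_dto.rs\n  │   └── services/              # Application services - implement use cases\n  │       ├── stock_service.rs\n  │       ├── stock_price_service.rs\n  │       └── notification_service.rs  # Notification service\n  │\n  ├── infra/                     # Infrastructure layer - technical implementation (shortened name)\n  │   ├── db/                    # Database - database operations (shortened name)\n  │   │   ├── database.rs\n  │   │   ├── postgres_stock_repository.rs\n  │   │   └── postgres_stock_price_repository.rs  # Implements upserts\n  │   └── external_services/     # External services - interaction with external systems\n  │       ├── stock_crawler_service.rs  # Handles BIG5 encoding\n  │       └── line_notification_service.rs  # LINE notification service, handles UTF-8 encoding\n  │\n  ├── api/                       # API layer - user-facing interface (shortened name)\n  │   ├── controllers/           # Controllers - handle requests and responses\n  │   │   ├── stock_controller.rs\n  │   │   └── stock_price_controller.rs\n  │   └── routes.rs              # API routes - define API endpoints (simplified structure)\n  │\n  └── main.rs                    # Program entry point\n```\n\n## System Flowcharts and UML Diagrams\n\n### System Flowchart (Mermaid Flowchart)\n\n```mermaid\nflowchart TD\n    A[Program start] --\u003e B[Initialize system]\n    B --\u003e B1[Set up logging]\n    B --\u003e B2[Load environment variables]\n    B --\u003e B3[Create database connection pool]\n    B --\u003e B4[Run database migrations]\n    B --\u003e B5[Initialize repositories]\n    B --\u003e B6[Initialize application services]\n    B --\u003e B7[Initialize controllers]\n    B --\u003e B8[Initialize notification service]\n    \n    B7 --\u003e 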
C[Create API routes]\n    B7 --\u003e D[Run crawler tasks]\n    \n    D --\u003e D1[Crawl stock list]\n    D1 --\u003e D2[Save stocks to database]\n    D2 --\u003e D3[Crawl stock prices]\n    D3 --\u003e D4[Save prices to database]\n    D4 --\u003e D5[Send notifications]\n    \n    C --\u003e E[Start web server]\n    \n    subgraph Crawler flow\n    D1\n    D3 --\u003e D3_1[Crawl basic stock info]\n    D3 --\u003e D3_2[Crawl institutional net buy and sell data]\n    end\n    \n    subgraph Notification flow\n    D5 --\u003e D5_1[Send initial notification]\n    D5 --\u003e D5_2[Send daily summary]\n    D5 --\u003e D5_3[Send price change notification]\n    end\n```\n\n### System Architecture Diagram (Mermaid Flowchart)\n\n```mermaid\nflowchart TB\n    subgraph Presentation Layer - API\n        A1[Stock Controller]\n        A2[Stock Price Controller]\n        A3[API routes]\n    end\n    \n    subgraph Application Layer\n        B1[Stock Service]\n        B2[Stock Price Service]\n        B3[DTOs]\n        B4[Notification Service]\n    end\n    \n    subgraph Domain Layer\n        C1[Stock entity]\n        C2[Stock Price entity]\n        C3[Repository interfaces]\n    end\n    \n    subgraph Infrastructure Layer\n        D1[Stock Crawler Service]\n        D2[Postgres Stock Repository]\n        D3[Postgres Stock Price Repository]\n        D4[Database connection]\n        D5[LINE Notification Service]\n    end\n    \n    A1 --\u003e B1\n    A2 --\u003e B2\n    A3 --\u003e A1\n    A3 --\u003e A2\n    \n    B1 --\u003e C1\n    B1 --\u003e C3\n    B2 --\u003e C2\n    B2 --\u003e C3\n    B4 --\u003e D5\n    \n    C3 --\u003e D2\n    C3 --\u003e D3\n    \n    D1 --\u003e C1\n    D1 --\u003e C2\n    D2 --\u003e D4\n    D3 --\u003e D4\n    D5 --\u003e B4\n```\n\n### Class Diagram (UML Class Diagram)\n\n```mermaid\nclassDiagram\n    class Stock {\n        +Uuid id\n        +String code\n        +String name\n        +OffsetDateTime last_updated\n        +new(code, name): Stock\n    }\n    \n    class StockPrice {\n        +Uuid id\n        +Uuid stock_id\n        +Date date\n        +BigDecimal open\n        +BigDecimal high\n        +BigDecimal low\n        +BigDecimal close\n        +u64 volume\n        +BigDecimal change\n        +BigDecimal change_percent\n        
+u64 turnover\n        +u64 transactions\n        +Option~BigDecimal~ pe_ratio\n        +Option~BigDecimal~ pb_ratio\n        +Option~BigDecimal~ dividend_yield\n        +Option~u64~ market_cap\n        +Option~i64~ foreign_buy\n        +Option~i64~ trust_buy\n        +Option~i64~ dealer_buy\n        +new(...): StockPrice\n        +with_details(...): StockPrice\n        +calculate_change(prev_close): void\n    }\n    \n    class StockCrawlerService {\n        +new(): StockCrawlerService\n        +crawl_stocks(): Result~Vec~Stock~~\n        +crawl_stock_prices(stock_code): Result~Vec~StockPrice~~\n        +crawl_stock_info(stock_code): Result~HashMap~String, f64~~\n        +crawl_institutional_investors(stock_code): Result~HashMap~String, (i64, i64, i64)~~\n        -parse_float_from_text(text): Option~f64~\n        -parse_bigdecimal_from_text(text): Option~BigDecimal~\n        -extract_value_from_document(document, label): Option~f64~\n    }\n    \n    class LineNotificationService {\n        -client: Client\n        -channel_access_token: String\n        -user_id: String\n        +new(channel_access_token, user_id): LineNotificationService\n        +send_stock_price_notification(stock, price): Result~()~\n        +send_daily_summary(date, stocks): Result~()~\n        +send_custom_message(text): Result~()~\n        -send_push_message(user_id, message): Result~()~\n        -build_stock_price_message(stock, price): Value\n        -build_daily_summary_message(date, stocks): Value\n        -utf8_encode(text): String  # Ensures Chinese characters are encoded correctly\n        -format_number(number): String\n    }\n    \n    class StockService {\n        -stock_repository: Arc~dyn StockRepository~\n        +new(stock_repository): StockService\n        +create_stock(dto): Result~Stock~\n        +get_all_stocks(): Result~Vec~StockDto~~\n        +get_stock_by_id(id): Result~Option~StockDto~~\n        +get_stock_by_code(code): Result~Option~StockDto~~\n        +update_stock(id, dto): Result~Stock~\n        
+delete_stock(id): Result~()~\n    }\n    \n    class StockPriceService {\n        -stock_price_repository: Arc~dyn StockPriceRepository~\n        -stock_repository: Arc~dyn StockRepository~\n        +new(stock_price_repository, stock_repository): StockPriceService\n        +create_stock_price(dto): Result~StockPrice~\n        +get_stock_price_by_id(id): Result~Option~StockPriceDto~~\n        +get_stock_prices_by_stock_id(stock_id): Result~Vec~StockPriceDto~~\n        +get_stock_prices_by_stock_code(code): Result~Vec~StockPriceDto~~\n        +get_stock_prices_by_date_range(stock_id, start_date, end_date): Result~Vec~StockPriceDto~~\n        +get_latest_stock_price(stock_id): Result~Option~StockPriceDto~~\n    }\n    \n    class NotificationService {\n        -line_service: Arc~LineNotificationService~\n        -stock_price_service: Arc~StockPriceService~\n        -stock_service: Arc~StockService~\n        +new(line_service, stock_price_service, stock_service): NotificationService\n        +send_stock_price_notification(stock_id): Result~()~\n        +send_daily_summary(): Result~()~\n        +send_custom_message(text): Result~()~\n    }\n    \n    StockPrice \"many\" --\u003e \"1\" Stock : belongs to\n    StockService --\u003e \"uses\" StockRepository\n    StockPriceService --\u003e \"uses\" StockPriceRepository\n    StockPriceService --\u003e \"uses\" StockRepository\n    NotificationService --\u003e \"uses\" LineNotificationService\n    NotificationService --\u003e \"uses\" StockPriceService\n    NotificationService --\u003e \"uses\" StockService\n```\n\n### Sequence Diagram\n\n```mermaid\nsequenceDiagram\n    participant Main as Main program\n    participant System as System initialization\n    participant Crawler as Stock crawler service\n    participant StockSvc as Stock service\n    participant PriceSvc as Price service\n    participant NotifSvc as Notification service\n    participant LineNotif as LINE notification service\n    participant DB as Database\n    \n    Main-\u003e\u003eSystem: Start system\n    System-\u003e\u003eDB: Initialize database connection\n    
System-\u003e\u003eCrawler: Initialize crawler service\n    System-\u003e\u003eStockSvc: Initialize stock service\n    System-\u003e\u003ePriceSvc: Initialize price service\n    System-\u003e\u003eNotifSvc: Initialize notification service\n    System-\u003e\u003eLineNotif: Initialize LINE notification service\n    \n    Main-\u003e\u003eCrawler: Run crawler tasks\n    Crawler-\u003e\u003eCrawler: Crawl stock list\n    Crawler-\u003e\u003eStockSvc: Save stock data\n    StockSvc-\u003e\u003eDB: Store stocks\n    \n    loop For each stock\n        Crawler-\u003e\u003eCrawler: Crawl stock prices\n        Crawler-\u003e\u003eCrawler: Crawl basic info\n        Crawler-\u003e\u003eCrawler: Crawl institutional net buy and sell data\n        Crawler-\u003e\u003ePriceSvc: Save price data\n        PriceSvc-\u003e\u003eDB: Store prices\n    end\n    \n    Main-\u003e\u003eNotifSvc: Send initial notification\n    NotifSvc-\u003e\u003eLineNotif: Send custom message\n    LineNotif--\u003e\u003eNotifSvc: Notification result\n    \n    Main-\u003e\u003eNotifSvc: Send daily summary\n    NotifSvc-\u003e\u003eStockSvc: Get all stocks\n    StockSvc-\u003e\u003eDB: Query stocks\n    DB--\u003e\u003eStockSvc: Stock list\n    StockSvc--\u003e\u003eNotifSvc: Stock list\n    NotifSvc-\u003e\u003ePriceSvc: Get latest prices\n    PriceSvc-\u003e\u003eDB: Query prices\n    DB--\u003e\u003ePriceSvc: Price data\n    PriceSvc--\u003e\u003eNotifSvc: Price data\n    NotifSvc-\u003e\u003eLineNotif: Send daily summary\n    LineNotif--\u003e\u003eNotifSvc: Notification result\n    \n    Main-\u003e\u003eSystem: Start API server\n```\n\n## Environment Setup\n\n### Prerequisites\n\n- Rust 2024 Edition\n- PostgreSQL 15+\n- LINE Messaging API account (for notifications)\n\n### Environment Variables\n\nCreate a `.env` file and set the following environment variables:\n\n```\nDATABASE_URL=postgres://username:password@localhost:5432/taiwan_stocks\nLINE_CHANNEL_ACCESS_TOKEN=your_line_channel_access_token\nLINE_USER_ID=your_line_user_id\n```\n\n### Database Setup\n\n1. Create the PostgreSQL database:\n\n```sql\nCREATE DATABASE taiwan_stocks;\n```\n\n2. Run the migration script:\n\n```bash\ncargo run --bin migrate\n```\n\n## Running\n\n```bash\ncargo run\n```\n\n## Main Features\n\n### Stock Crawler\n\nThe system automatically crawls the stock list and price data from the Taiwan Stock Exchange and stores them in the database. The crawler service decodes page content as BIG5 so that stock names display correctly.\n\n### LINE Notifications\n\nThe system can send the following types of notifications through a LINE Bot:\n\n1. **Initial notification**: sent when the system starts\n2. **Daily summary**: lists the top 5 gainers and the top 5 losers\n3. 
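**Price change notification**: sent when a stock's price changes significantly\n\nAll notifications are UTF-8 encoded so that Chinese characters (such as stock names) display correctly and do not appear garbled (for example `�x�d`).\n\n### Data Processing\n\nThe system uses BigDecimal for financial data to preserve precision and avoid floating-point errors. When interacting with the PostgreSQL database, it converts numeric types correctly.\n\n### Database Operations\n\nThe system uses the ON CONFLICT clause to implement upserts. When a stock price is saved, an existing record is updated instead of the insert failing on a unique constraint violation. This keeps the data consistent and complete.\n\n## FAQ\n\n1. **Database connection fails**\n   - Confirm that the PostgreSQL service is running\n   - Check that the connection string in the `.env` file is correct\n   - Confirm that the database user has appropriate permissions\n\n2. **Crawler fails**\n   - Check the network connection\n   - Check whether the target site has changed its HTML structure\n   - Adjust the selectors in the crawler service\n\n3. **LINE notification fails**\n   - Confirm that the LINE Channel Access Token is valid\n   - Confirm that the LINE User ID is correct\n   - Check the LINE Messaging API quota limits\n\n4. **Garbled Chinese text**\n   - The system uses BIG5 encoding to handle Chinese characters from Taiwanese websites\n   - Make sure the database uses UTF-8 encoding\n   - The notification service uses the utf8_encode method so that Chinese text displays correctly in LINE notifications\n   - If text is still garbled, check the character encoding logic in the crawler service\n\n## Future Plans\n\n- [ ] Add more stock data sources\n- [ ] Implement stock price prediction\n- [ ] Add more notification channels (Email, Telegram, etc.)\n- [ ] Improve crawler efficiency and stability\n- [ ] Add a user interface (Web UI)\n- [ ] Improve character encoding handling to support more languages and encodings\n\n## Contributing\n\nIssues and pull requests are welcome.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatomjay%2Ftaiwan-stocks-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fatomjay%2Ftaiwan-stocks-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fatomjay%2Ftaiwan-stocks-crawler/lists"}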