{"id":21822859,"url":"https://github.com/caspartse/python-baidu-tongji","last_synced_at":"2026-04-10T02:50:50.524Z","repository":{"id":141426804,"uuid":"608662187","full_name":"caspartse/python-baidu-tongji","owner":"caspartse","description":"A modern-style implementation of Baidu Analytics (Tongji) in Python.","archived":false,"fork":false,"pushed_at":"2023-03-30T00:19:37.000Z","size":19837,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-23T11:05:14.757Z","etag":null,"topics":["baidu","tongji","web-analytics","website-analytics"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/caspartse.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-03-02T13:40:04.000Z","updated_at":"2024-04-06T20:25:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"06c78b8e-fed4-4d06-a4a9-94c3cbbfbc67","html_url":"https://github.com/caspartse/python-baidu-tongji","commit_stats":null,"previous_names":[],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/caspartse/python-baidu-tongji","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caspartse%2Fpython-baidu-tongji","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caspartse%2Fpython-baidu-tongji/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caspartse%2Fpython-baidu-tongji/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/caspartse%2Fpython-baidu-tongji/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/caspartse","download_url":"https://codeload.github.com/caspartse/python-baidu-tongji/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caspartse%2Fpython-baidu-tongji/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271746755,"owners_count":24813580,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baidu","tongji","web-analytics","website-analytics"],"created_at":"2024-11-27T17:18:11.344Z","updated_at":"2026-04-10T02:50:45.506Z","avatar_url":"https://github.com/caspartse.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# python-baidu-tongji\n\nA modern-style implementation of Baidu Analytics (Tongji) in Python.\n\nUses the Baidu Tongji API to fetch real-time visitor data for a website, then parses it into three objects (Visitor, Session, and Event) for later aggregation and analysis.\n\n---\n\n### ✨ Highlights\n\n- Generates and automatically refreshes the Baidu Tongji API token\n- More precise visitor location data with three fields: country, province, and city (Baidu Tongji provides only an abbreviated city name)\n- Finer-grained traffic channel classification, including social media and on-site sources (Baidu Tongji has no “social media” category, and the free edition has discontinued the “on-site sources” feature)\n- Parses referrer domains and page paths/parameters to support grouped analysis, such as subdirectory analysis and tracking of specified ads (the free edition has discontinued the “subdirectory analysis” and “specified ad tracking” features)\n- Supports visitor source attribution, path analysis, and statistics on high-frequency IPs and regions\n\n![before_after](assets/screenshots/before_after.png)\n\n- 🤔 **Before**: [Raw response data 
raw_data.json](./tests/16847648_raw_data.json)\n- 🤩 **After**: [Parsed data result_data.json](./tests/16847648_result_data.json)\n\n\n## 🧩 Requirements\n\n- SQLite\n- Redis\n- MongoDB (optional, for storing intermediate data)\n- PostgreSQL (optional, for the demo)\n- Elasticsearch and Kibana (optional, for the demo)\n\n\n## 🏁 Preparation\n\n1. Use a **regular** Baidu account (not a commercial account) to enable the data API and obtain an `API Key` and a `Secret Key` (see the [Baidu Tongji API User Manual](https://tongji.baidu.com/api/manual/)).\n2. Follow the steps in the documentation to obtain a one-time authorization code, `CODE` (valid for 10 minutes; once it expires, you must authorize again to get a new one).\n\n    ```Bash\n    # CLIENT_ID = API Key\n    http://openapi.baidu.com/oauth/2.0/authorize?response_type=code\u0026client_id={CLIENT_ID}\u0026redirect_uri=oob\u0026scope=basic\u0026display=popup\n    ```\n\n\n## 🚀 Installation and Usage\n\n1. Clone this repository.\n\n    ```Bash\n    git clone https://github.com/caspartse/python-baidu-tongji.git\n    ```\n\n2. Install the dependencies.\n\n    ```Bash\n    cd python-baidu-tongji \u0026\u0026 \\\n    python3 -m pip install -r ./requirements.txt\n    ```\n\n3. Enter the `API Key`, `Secret Key`, and `CODE` in `package/config.yaml`.\n\n    ```YAML\n    # Baidu openapi, https://tongji.baidu.com/api/manual/Chapter2/openapi.html\n    baidu:\n      api_key: your_api_key\n      secret_key: your_secret_key\n      auth_code: your_auth_code\n    ```\n\n4. Adjust the other settings in `package/config.yaml` as needed, for example:\n\n    - Redis settings\n\n    ```YAML\n    # Redis\n    redis:\n      host: localhost\n      port: 6379\n      db: 0\n      password: ''\n    ```\n\n    - IP geolocation service settings (by default, the services from Taobao and Pacific Online (太平洋在线) are used)\n\n    ```YAML\n    # LBS, query ip location (optional)\n    # amap: https://lbs.amap.com/api/webservice/guide/api/ipconfig\n    # baidu: https://lbsyun.baidu.com/index.php?title=webapi/ip-api\n    # tencent: https://lbs.qq.com/service/webService/webServiceGuide/webServiceIp\n    lbs:\n      service: '' # if you want to use LBS, set service to 'amap', 'baidu' or 'tencent'\n      app_key: '' # your_app_key of amap or baidu or tencent\n    ```\n\n5. 
Adjust the dimension settings in `package/dimensions.yaml` as needed, such as `custom_tracking_params` (custom tracking parameters) and `onsite_search_params` (on-site search parameters):\n\n    ```YAML\n    # Custom tracking parameters\n    custom_tracking_params:\n      - activity_id\n      - channel_id\n\n    # On-Site search parameters\n    onsite_search_params:\n      - kw\n      - keyword\n    ```\n\n6. Call `package/baidu_tongji.py`. For examples, see `tests/test.py` and the `main.py` in each demo.\n   After each call, the raw data is stored temporarily in the `package/data` directory as `{site_id}_raw_data.json`.\n\n\n## 💡 Demos\n\n### PostgreSQL\n\nFetches data with `baidu_tongji.py` and stores it in a PostgreSQL database.\n\n1. Create a database named `website_traffic`.\n\n    ```SQL\n    CREATE DATABASE website_traffic;\n    COMMENT ON DATABASE website_traffic IS '网站流量';\n    ```\n\n2. Run `DDL/DDL_website_traffic.sql` to create the tables.\n3. Run `python3 main.py`.\n\n![PostgreSQL](assets/screenshots/demo_postgresql.png)\n\n### Elasticsearch\n\nFetches data with `baidu_tongji.py` and stores it in Elasticsearch.\n\n1. Use the JSON files in the `mappings` directory to create the `visitors`, `sessions`, and `events` indices, or run `mappings/create_indices.py` to create them directly.\n2. 
Run `python3 main.py`.\n\n![Elasticsearch](assets/screenshots/demo_elasticsearch.png)\n\n\n## 📚 References\n\n- [Baidu Tongji API User Manual - Real-Time Visitors](https://tongji.baidu.com/api/manual/Chapter1/trend_latest_a.html)\n- [Baidu Tongji User Guide - Real-Time Visitors](https://tongji.baidu.com/holmes/Analytics/%E4%BA%A7%E5%93%81%E4%BD%BF%E7%94%A8%E6%8C%87%E5%8D%97/%E6%A6%82%E8%A7%88/%E6%B5%81%E9%87%8F%E5%88%86%E6%9E%90/%E5%AE%9E%E6%97%B6%E8%AE%BF%E5%AE%A2/)\n- [Sensors Analytics - Preset Events and Preset Properties](https://manual.sensorsdata.cn/sa/latest/tech_sdk_all_preset_properties-89620676.html)\n- [GA4 - Automatically collected events](https://support.google.com/analytics/answer/9234069?hl=en\u0026ref_topic=13367566)\n- [GA4 - Default channel group](https://support.google.com/analytics/answer/9756891?hl=en\u0026ref_topic=11151952)\n\n\n## ❤️ Thanks\n[GitHub Copilot](https://github.com/features/copilot), [vscode-chatgpt](https://github.com/gencay/vscode-chatgpt), [Administrative-divisions-of-China](https://github.com/modood/Administrative-divisions-of-China)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaspartse%2Fpython-baidu-tongji","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcaspartse%2Fpython-baidu-tongji","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaspartse%2Fpython-baidu-tongji/lists"}