{"id":23593916,"url":"https://github.com/mthomason/collect","last_synced_at":"2026-04-18T07:31:55.458Z","repository":{"id":269544188,"uuid":"836798245","full_name":"mthomason/collect","owner":"mthomason","description":"Example shows how to batch create a Drudge Report type site using Open AI's Function Calling API.","archived":false,"fork":false,"pushed_at":"2024-12-24T09:36:42.000Z","size":777,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-16T04:18:28.858Z","etag":null,"topics":["data-collection","example","example-project","function-calling","html","javascript","mit-license","openai","python","python3","rss","starter-project","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mthomason.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-01T15:21:43.000Z","updated_at":"2024-12-24T09:36:45.000Z","dependencies_parsed_at":null,"dependency_job_id":"04ce79ae-4320-4002-8c8d-c1cca0ac1165","html_url":"https://github.com/mthomason/collect","commit_stats":null,"previous_names":["mthomason/collect"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mthomason/collect","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mthomason%2Fcollect","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mthomason%2Fcollect/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mthomason%2Fcollect/releas
es","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mthomason%2Fcollect/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mthomason","download_url":"https://codeload.github.com/mthomason/collect/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mthomason%2Fcollect/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31961112,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-collection","example","example-project","function-calling","html","javascript","mit-license","openai","python","python3","rss","starter-project","web-scraping"],"created_at":"2024-12-27T09:14:12.518Z","updated_at":"2026-04-18T07:31:55.452Z","avatar_url":"https://github.com/mthomason.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Hobby Report\n\n**Hobby Report** is an example Python project that demonstrates how to create a _Drudge Report_-style website for collectors, called **Hobby Report**, using APIs and automation.\n\nThis project is designed as an **example of using OpenAI's Function Calling API**, showcasing its ability to clean up and extract structured data from eBay headlines and parse auction details 
programmatically. It integrates eBay data, RSS feeds, and other APIs to generate and refresh a fully functional website.\n\n---\n\n## Features\n\n- **OpenAI Function Calling API**: Extracts and cleans structured auction data from eBay headlines.\n- **eBay Integration**: Retrieves and processes eBay auction data.\n- **RSS Feed Integration**: Aggregates content from RSS feeds.\n- **Site Automation**: Designed to run periodically to keep the site updated with the latest data.\n- **Developer Example**: A template for building automated aggregation sites.\n\n---\n\n## Getting Started\n\n### Prerequisites\n\n1. Python **3.12** or higher.\n2. A Python virtual environment is strongly recommended.\n3. API keys for:\n   - **OpenAI** Function Calling API\n   - **eBay** API (including eBay Partner Network)\n   - Optional: AWS or Reddit API keys for additional enhancements.\n\n---\n\n### Installation\n\n1. Clone the repository:\n\n   ```bash\n   git clone https://github.com/mthomason/collect.git hobby-report\n   cd hobby-report\n   ```\n\n2. Create and activate a Python virtual environment:\n\n   ```bash\n   python3.12 -m venv venv\n   source venv/bin/activate   # On Windows: venv\\Scripts\\activate\n   ```\n\n3. Install the required dependencies:\n\n   ```bash\n   pip install -r requirements.txt\n   ```\n\n---\n\n### Configuration\n\nCreate a `.env` file in the root directory and populate it with the following variables. 
Replace the placeholders with your actual API credentials:\n\n```plaintext\n################################################################################\n#AWS Keys\nAWS_ACCESS_KEY_ID = ''\nAWS_SECRET_ACCESS_KEY = ''\n\n#CloudFront Keys\nAWS_CF_DISTRIBUTION_ID = ''\n\n################################################################################\n#eBay API Keys\nEBAY_APPID = ''\nEBAY_DEVID = ''\nEBAY_CERTID = ''\n\n################################################################################\n#eBay Partner Network Keys\nEPN_TRACKING_ID = ''\n\n################################################################################\n#OpenAI API Keys\nOPENAI_API_KEY = ''\n\n################################################################################\n#Reddit API Keys\nREDDIT_CLIENT_ID = ''\nREDDIT_CLIENT_SECRET = ''\nREDDIT_USER_AGENT = ''\n```\n\nEdit the configuration files in the `config/` directory to customize sources and settings:\n\n- `rss-feeds.json`: List of RSS feeds to aggregate.\n- `auctions-ebay.json`: Filters and categories for eBay auctions.\n- `config.json`: General application settings.\n\n---\n\n### Running the Application\n\nRun the project from the command line with the following command:\n\n```bash\ncd /path/to/hobby-report/collect\n/usr/bin/env /path/to/hobby-report/venv/bin/python -m collect\n```\n\nThe application will fetch eBay data, read RSS feeds, and generate the website. To automate updates, schedule this command to run periodically (e.g., using `cron` or Task Scheduler).\n\n---\n\n### Debugging in Visual Studio Code\n\nThis project includes a preconfigured `launch.json` file for debugging with Visual Studio Code.\n\n1. Open the project in Visual Studio Code.\n2. Ensure you have the Python extension installed.\n3. 
Select one of the provided configurations:\n   - `Debug Python: __main__.py`: Runs and debugs the `collect` module to generate the site.\n   - `Python: Unittest`: Runs and debugs a test module such as `collect.tests.test_caching_robot_file_parser`.\n\nExample `launch.json` entry:\n```json\n{\n    \"version\": \"0.2.0\",\n    \"configurations\": [\n        {\n            \"name\": \"Debug Python: __main__.py\",\n            \"type\": \"python\",\n            \"request\": \"launch\",\n            \"module\": \"collect\",\n            \"console\": \"integratedTerminal\",\n            \"cwd\": \"${workspaceFolder}\",\n            \"stopOnEntry\": false,\n            \"justMyCode\": false\n        },\n        {\n            \"name\": \"Python: Unittest\",\n            \"type\": \"python\",\n            \"request\": \"launch\",\n            \"module\": \"collect.tests.test_caching_robot_file_parser\",\n            \"cwd\": \"${workspaceFolder}\",\n            \"console\": \"integratedTerminal\",\n            \"justMyCode\": false\n        }\n    ]\n}\n```\n\nUse the integrated terminal in Visual Studio Code to view debug output and logs.\n\n---\n\n## Screenshot Placeholder\n\n\u003cimg src=\"assets/hobbyrptscreenshot.png\" alt=\"Screenshot Hobby Report\" width=\"320\"\u003e\n\n---\n\n## Example Use Case: OpenAI Function Calling API\n\nThis project is a practical example of leveraging the **OpenAI Function Calling API** for structured data extraction. 
By processing raw eBay headlines, it demonstrates:\n\n- Cleaning up auction titles for better readability.\n- Parsing specific values, like item names or prices, programmatically.\n- Automating content aggregation tasks that traditionally required manual effort.\n\n---\n\n## Directory Structure\n\n- **`collect/`**: Core project files.\n  - `utility/`: Helper modules for caching, logging, API interaction, and data processing.\n  - `tests/`: Unit tests for the application.\n  - `core/`: Core functionality like RSS handling, HTML generation, and file management.\n- **`config/`**: Configuration files for API settings and RSS feeds.\n- **`templates/`**: HTML templates used to render the website.\n- **`logs/`**: Application logs for debugging.\n\n---\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmthomason%2Fcollect","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmthomason%2Fcollect","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmthomason%2Fcollect/lists"}