{"id":30697662,"url":"https://github.com/lizardcat/coda-extracter","last_synced_at":"2025-10-10T15:07:25.710Z","repository":{"id":312163729,"uuid":"1046459371","full_name":"lizardcat/coda-extracter","owner":"lizardcat","description":"Python script for extracting and processing timesheet data from Coda documents.","archived":false,"fork":false,"pushed_at":"2025-08-29T16:23:12.000Z","size":30,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-11T15:43:40.027Z","etag":null,"topics":["coda","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lizardcat.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-28T18:08:06.000Z","updated_at":"2025-08-29T16:23:16.000Z","dependencies_parsed_at":null,"dependency_job_id":"bf8c6844-d223-43e5-9dd6-5173a0767346","html_url":"https://github.com/lizardcat/coda-extracter","commit_stats":null,"previous_names":["lizardcat/coda-extracter"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/lizardcat/coda-extracter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizardcat%2Fcoda-extracter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizardcat%2Fcoda-extracter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizardcat%2Fcoda-extracter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizardcat%2Fcoda-extracter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lizardcat","download_url":"https://codeload.github.com/lizardcat/coda-extracter/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lizardcat%2Fcoda-extracter/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279004564,"owners_count":26083734,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["coda","python"],"created_at":"2025-09-02T09:37:49.576Z","updated_at":"2025-10-10T15:07:25.703Z","avatar_url":"https://github.com/lizardcat.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Coda Timesheet Extractor\n\nExtract and process timesheet data from Coda documents using Python.\n\n## Project Overview\n\nThis tool allows you to automatically extract timesheet data from your Coda documents using the official Coda API. It processes the raw data into clean CSV files and provides logging and configuration management.\n\n## File Structure\n\n```\ntimesheet_extractor/\n├── README.md\n├── requirements.txt\n├── config/\n│   ├── __init__.py\n│   └── config.py\n├── src/\n│   ├── __init__.py\n│   ├── coda_extractor.py\n│   └── data_processor.py\n├── scripts/\n│   └── extract_timesheet.py\n├── data/\n│   ├── raw/          # Raw JSON responses from Coda API\n│   └── processed/    # Cleaned CSV files\n├── logs/             # Extraction logs\n└── .env              # Your API credentials (keep private!)\n```\n\n## Setup Instructions\n\n### 1. Install Dependencies\n\n```bash\npip install -r requirements.txt\n```\n\nRequired packages:\n\n- `requests\u003e=2.31.0` - For API calls to Coda\n- `pandas\u003e=2.0.0` - For data processing\n- `python-dotenv\u003e=1.0.0` - For environment variable management\n\n### 2. Get Your Coda API Credentials\n\n1. **Get your API token:**\n\n   - Go to your Coda account settings\n   - Navigate to the \"API\" section\n   - Generate a new token\n   - Copy the token (you'll only see it once!)\n\n2. **Create your `.env` file:**\n   ```\n   CODA_API_TOKEN=your_api_token_here\n   CODA_DOC_ID=your_document_id_here\n   CODA_TABLE_ID=your_table_id_here\n   ```\n\n### 3. Find Your Document and Table IDs\n\n**Find your documents:**\n\n```bash\npython scripts/extract_timesheet.py --list-docs\n```\n\nThis will show you all documents you have access to with their IDs.\n\n**Find tables in your timesheet document:**\n\n```bash\npython scripts/extract_timesheet.py --list-tables YOUR_DOC_ID\n```\n\nReplace `YOUR_DOC_ID` with the document ID from the previous step.\n\n### 4. Update Your Configuration\n\nAdd the correct document and table IDs to your `.env` file:\n\n```\nCODA_API_TOKEN=your_actual_token\nCODA_DOC_ID=AbCdEfGhIj\nCODA_TABLE_ID=table-KlMnOpQr\n```\n\n## Usage\n\n### Basic Extraction\n\nExtract your timesheet data with default settings:\n\n```bash\npython scripts/extract_timesheet.py\n```\n\nThis will:\n\n- Extract data from your configured timesheet\n- Save raw JSON data to `data/raw/`\n- Process and clean the data\n- Export to CSV in `data/processed/`\n- Show a summary of the extracted data\n\n### Custom Output Filename\n\nSpecify a custom output filename:\n\n```bash\npython scripts/extract_timesheet.py --output my_timesheet_2024.csv\n```\n\n### List Available Resources\n\nList all your Coda documents:\n\n```bash\npython scripts/extract_timesheet.py --list-docs\n```\n\nList tables in a specific document:\n\n```bash\npython scripts/extract_timesheet.py --list-tables DOC_ID_HERE\n```\n\n## Output Files\n\n### Raw Data\n\n- **Location:** `data/raw/timesheet_raw_YYYYMMDD_HHMMSS.json`\n- **Content:** Unprocessed JSON response from Coda API\n- **Purpose:** Backup and debugging\n\n### Processed Data\n\n- **Location:** `data/processed/timesheet_processed_YYYYMMDD_HHMMSS.csv`\n- **Content:** Clean, structured CSV file\n- **Purpose:** Ready for analysis in Excel, Google Sheets, or other tools\n\n### Logs\n\n- **Location:** `logs/extraction_YYYYMMDD.log`\n- **Content:** Detailed extraction logs with timestamps\n- **Purpose:** Troubleshooting and audit trail\n\n## Data Processing Features\n\nThe tool automatically:\n\n- Extracts data from Coda's nested JSON format\n- Converts date columns to proper datetime format\n- Converts hours/duration columns to numeric values\n- Handles missing or malformed data gracefully\n- Provides summary statistics (total hours, date range, etc.)\n\n## Customization\n\n### Adding Custom Data Cleaning\n\nEdit `src/data_processor.py` in the `clean_timesheet_data()` method to add your own data cleaning rules:\n\n```python\ndef clean_timesheet_data(self, df):\n    cleaned_df = df.copy()\n\n    # Your custom cleaning rules here\n    if 'Project' in cleaned_df.columns:\n        cleaned_df['Project'] = cleaned_df['Project'].str.strip().str.title()\n\n    return cleaned_df\n```\n\n### Scheduling Automatic Extractions\n\nYou can set up automatic extractions using cron (Linux/Mac) or Task Scheduler (Windows):\n\n```bash\n# Run daily at 6 PM\n0 18 * * * cd /path/to/timesheet_extractor \u0026\u0026 python scripts/extract_timesheet.py\n```\n\n## Troubleshooting\n\n### Common Issues\n\n**\"Missing required environment variables\"**\n\n- Check that your `.env` file exists and has all required variables\n- Make sure there are no extra spaces around the `=` signs\n\n**\"Authentication failed\"**\n\n- Verify your API token is correct\n- Check that the token hasn't expired\n- Ensure you have access to the specified document\n\n**\"Table not found\"**\n\n- Use `--list-tables` to verify the table ID\n- Make sure you're using the table ID (starts with \"table-\"), not the table name\n\n**\"No data extracted\"**\n\n- Check that your timesheet table has data\n- Verify you have read permissions on the document\n- Look at the log files in the `logs/` directory for detailed error messages\n\n### Getting Help\n\n1. Check the log files in `logs/` directory\n2. Run with `--list-docs` and `--list-tables` to verify your IDs\n3. Test with a simple document first to verify your setup\n\n## Security Notes\n\n- Never commit your `.env` file to version control\n- Keep your API token secure and rotate it periodically\n- The tool only reads data from your Coda documents (no write access)\n- All data is stored locally on your machine\n\n## License\n\nThis project is for personal/internal use. Modify as needed for your specific timesheet format and requirements.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flizardcat%2Fcoda-extracter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flizardcat%2Fcoda-extracter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flizardcat%2Fcoda-extracter/lists"}