https://github.com/lizardcat/coda-extracter

Python script for extracting and processing timesheet data from Coda documents.

Host: GitHub
URL: https://github.com/lizardcat/coda-extracter
Owner: lizardcat
Created: 2025-08-28T18:08:06.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-08-29T16:23:12.000Z (5 months ago)
Last Synced: 2025-09-11T15:43:40.027Z (4 months ago)
Topics: coda, python
Language: Python
Homepage:
Size: 29.3 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Coda Timesheet Extractor

Extract and process timesheet data from Coda documents using Python.

## Project Overview

This tool extracts timesheet data from your Coda documents through the official Coda API, converts the raw data into clean CSV files, and handles logging and configuration management.

## File Structure

```
timesheet_extractor/
├── README.md
├── requirements.txt
├── config/
│   ├── __init__.py
│   └── config.py
├── src/
│   ├── __init__.py
│   ├── coda_extractor.py
│   └── data_processor.py
├── scripts/
│   └── extract_timesheet.py
├── data/
│   ├── raw/          # Raw JSON responses from Coda API
│   └── processed/    # Cleaned CSV files
├── logs/             # Extraction logs
└── .env              # Your API credentials (keep private!)
```

## Setup Instructions

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

Required packages:

- `requests>=2.31.0` - For API calls to Coda
- `pandas>=2.0.0` - For data processing
- `python-dotenv>=1.0.0` - For environment variable management

### 2. Get Your Coda API Credentials

1. **Get your API token:**

   - Go to your Coda account settings
   - Navigate to the "API" section
   - Generate a new token
   - Copy the token (you'll only see it once!)

2. **Create your `.env` file:**

   ```
   CODA_API_TOKEN=your_api_token_here
   CODA_DOC_ID=your_document_id_here
   CODA_TABLE_ID=your_table_id_here
   ```

### 3. Find Your Document and Table IDs

**Find your documents:**

```bash
python scripts/extract_timesheet.py --list-docs
```

This lists every document you have access to, along with its ID.

**Find tables in your timesheet document:**

```bash
python scripts/extract_timesheet.py --list-tables YOUR_DOC_ID
```

Replace `YOUR_DOC_ID` with the document ID from the previous step.

### 4. Update Your Configuration

Add the correct document and table IDs to your `.env` file:

```
CODA_API_TOKEN=your_actual_token
CODA_DOC_ID=AbCdEfGhIj
CODA_TABLE_ID=table-KlMnOpQr
```
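
As a rough illustration of how these three variables might be checked at startup, here is a minimal sketch. The function name `load_settings` and the exact error wording are assumptions for illustration, not the project's actual code in `config/config.py`. In the real tool, `python-dotenv`'s `load_dotenv()` would typically be called first to copy the `.env` entries into the environment.

```python
import os
from typing import Mapping

# Variable names taken from the .env example above
REQUIRED_VARS = ("CODA_API_TOKEN", "CODA_DOC_ID", "CODA_TABLE_ID")


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, or raise if any are missing or blank.

    Hypothetical helper: the project's own config module may differ.
    """
    # Strip stray whitespace, e.g. from spaces typed around the "=" sign
    settings = {name: env.get(name, "").strip() for name in REQUIRED_VARS}
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return settings
```

Passing the environment in as a parameter keeps the check easy to exercise with a plain dictionary, without touching the real process environment.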

## Usage

### Basic Extraction

Extract your timesheet data with default settings:

```bash
python scripts/extract_timesheet.py
```

This will:

- Extract data from your configured timesheet
- Save raw JSON data to `data/raw/`
- Process and clean the data
- Export to CSV in `data/processed/`
- Show a summary of the extracted data
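
The first step above, pulling rows from the Coda API, could look roughly like the sketch below. It assumes the public Coda API v1 rows endpoint, bearer-token authentication, and the `items` / `nextPageToken` fields of paginated responses. The helper names (`rows_url`, `iter_rows`, `make_requests_fetcher`) are hypothetical and are not taken from `src/coda_extractor.py`.

```python
from typing import Callable, Iterator, Optional

CODA_API_BASE = "https://coda.io/apis/v1"


def rows_url(doc_id: str, table_id: str) -> str:
    """Build the rows endpoint URL for one table."""
    return f"{CODA_API_BASE}/docs/{doc_id}/tables/{table_id}/rows"


def iter_rows(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every row, following nextPageToken until none is returned."""
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("items", [])
        page_token = page.get("nextPageToken")
        if not page_token:
            break


def make_requests_fetcher(token: str, doc_id: str, table_id: str):
    """Return a fetch_page callable backed by the requests library."""
    import requests  # imported lazily so iter_rows has no hard dependency

    def fetch_page(page_token: Optional[str]) -> dict:
        # useColumnNames asks Coda to key cell values by column name
        params = {"useColumnNames": "true"}
        if page_token:
            params["pageToken"] = page_token
        response = requests.get(
            rows_url(doc_id, table_id),
            headers={"Authorization": f"Bearer {token}"},
            params=params,
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    return fetch_page
```

Separating the pagination loop from the HTTP call means `iter_rows` can be tried against canned pages before any real token is involved; `list(iter_rows(make_requests_fetcher(token, doc_id, table_id)))` would then collect the live rows.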

### Custom Output Filename

Specify a custom output filename:

```bash
python scripts/extract_timesheet.py --output my_timesheet_2024.csv
```

### List Available Resources

List all your Coda documents:

```bash
python scripts/extract_timesheet.py --list-docs
```

List tables in a specific document:

```bash
python scripts/extract_timesheet.py --list-tables DOC_ID_HERE
```
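
For readers who want to adapt the command-line interface, the three options shown in this section could be declared with `argparse` along these lines. This is a sketch under the assumption that the script uses `argparse`; the function name `build_parser` and the help strings are illustrative only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options documented above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Extract timesheet data from a Coda table."
    )
    parser.add_argument(
        "--output",
        metavar="FILENAME",
        help="custom filename for the processed CSV",
    )
    parser.add_argument(
        "--list-docs",
        action="store_true",
        help="list the Coda documents you can access, then exit",
    )
    parser.add_argument(
        "--list-tables",
        metavar="DOC_ID",
        help="list the tables in the given document, then exit",
    )
    return parser
```

`argparse` converts the dashes to underscores, so the parsed values are available as `args.output`, `args.list_docs`, and `args.list_tables`.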

## Output Files

### Raw Data

- **Location:** `data/raw/timesheet_raw_YYYYMMDD_HHMMSS.json`
- **Content:** Unprocessed JSON response from Coda API
- **Purpose:** Backup and debugging

### Processed Data

- **Location:** `data/processed/timesheet_processed_YYYYMMDD_HHMMSS.csv`
- **Content:** Clean, structured CSV file
- **Purpose:** Ready for analysis in Excel, Google Sheets, or other tools

### Logs

- **Location:** `logs/extraction_YYYYMMDD.log`
- **Content:** Detailed extraction logs with timestamps
- **Purpose:** Troubleshooting and audit trail
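
The naming patterns above can be reproduced with `datetime.strftime`. The sketch below is one possible way to build the three paths and create their directories; `output_paths` is a hypothetical helper and not necessarily how the tool does it.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_paths(base_dir: Path, now: Optional[datetime] = None) -> dict:
    """Build timestamped raw, processed, and log paths under base_dir."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")  # YYYYMMDD_HHMMSS
    day = now.strftime("%Y%m%d")           # YYYYMMDD, one log file per day
    paths = {
        "raw": base_dir / "data" / "raw" / f"timesheet_raw_{stamp}.json",
        "processed": base_dir / "data" / "processed"
        / f"timesheet_processed_{stamp}.csv",
        "log": base_dir / "logs" / f"extraction_{day}.log",
    }
    # Create the parent directories so later writes do not fail
    for path in paths.values():
        path.parent.mkdir(parents=True, exist_ok=True)
    return paths
```

Because the raw and processed files share one timestamp, a CSV can always be traced back to the JSON response it was built from.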

## Data Processing Features

The tool automatically:

- Extracts data from Coda's nested JSON format
- Converts date columns to proper datetime format
- Converts hours/duration columns to numeric values
- Handles missing or malformed data gracefully
- Provides summary statistics (total hours, date range, etc.)
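
To make these steps concrete, here is a minimal pandas sketch of the flatten, convert, and summarize sequence. It assumes each Coda row keeps its cell data under a `values` key (as in the Coda API row format) and that the table has columns named `Date` and `Hours`; those column names and all three function names are assumptions for illustration, not the contents of `src/data_processor.py`.

```python
import pandas as pd


def rows_to_frame(rows: list) -> pd.DataFrame:
    """Flatten Coda row objects into one DataFrame row per timesheet entry."""
    return pd.DataFrame([row.get("values", {}) for row in rows])


def coerce_types(df: pd.DataFrame, date_col: str = "Date",
                 hours_col: str = "Hours") -> pd.DataFrame:
    """Convert date and hours columns; unparseable cells become NaT / NaN."""
    out = df.copy()
    if date_col in out.columns:
        out[date_col] = pd.to_datetime(out[date_col], errors="coerce")
    if hours_col in out.columns:
        out[hours_col] = pd.to_numeric(out[hours_col], errors="coerce")
    return out


def summarize(df: pd.DataFrame, date_col: str = "Date",
              hours_col: str = "Hours") -> dict:
    """Summary statistics; assumes both columns exist and were coerced."""
    return {
        "rows": len(df),
        "total_hours": float(df[hours_col].sum()),  # NaN cells are skipped
        "first_date": df[date_col].min(),
        "last_date": df[date_col].max(),
    }
```

Using `errors="coerce"` is what lets malformed cells pass through as missing values instead of aborting the whole extraction.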

## Customization

### Adding Custom Data Cleaning

Edit `src/data_processor.py` in the `clean_timesheet_data()` method to add your own data cleaning rules:

```python
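def clean_timesheet_data(self, df):
    # Work on a copy so the caller's DataFrame is left untouched
    cleaned_df = df.copy()

    # Your custom cleaning rules here
    if 'Project' in cleaned_df.columns:
        # Trim whitespace and normalize capitalization of project names
        cleaned_df['Project'] = cleaned_df['Project'].str.strip().str.title()

    return cleaned_df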
```

### Scheduling Automatic Extractions

You can schedule automatic extractions with cron (Linux/Mac) or Task Scheduler (Windows). For cron, add an entry like this one with `crontab -e`:

```bash
# Run daily at 6 PM
0 18 * * * cd /path/to/timesheet_extractor && python scripts/extract_timesheet.py
```

## Troubleshooting

### Common Issues

**"Missing required environment variables"**

- Check that your `.env` file exists and has all required variables
- Make sure there are no extra spaces around the `=` signs

**"Authentication failed"**

- Verify your API token is correct
- Check that the token hasn't expired
- Ensure you have access to the specified document

**"Table not found"**

- Use `--list-tables` to verify the table ID
- Make sure you're using the table ID (starts with "table-"), not the table name

**"No data extracted"**

- Check that your timesheet table has data
- Verify you have read permissions on the document
- Look at the log files in the `logs/` directory for detailed error messages
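
When debugging, it can help to relate these messages to the HTTP status codes the Coda API returns. The mapping below is an assumption made for illustration: the status codes (401, 403, 404, 429) are standard Coda API responses, but how the tool itself translates them into the messages above is not documented here, and `explain_status` is a hypothetical helper.

```python
# Hypothetical mapping from HTTP status codes to troubleshooting hints
STATUS_HINTS = {
    401: "Authentication failed: check that CODA_API_TOKEN is correct "
         "and has not expired.",
    403: "Access denied: make sure your account can read the document.",
    404: "Table not found: verify the IDs with --list-docs and --list-tables.",
    429: "Rate limited: wait a moment and try again.",
}


def explain_status(status_code: int) -> str:
    """Return a troubleshooting hint for a failed Coda API response."""
    return STATUS_HINTS.get(
        status_code,
        f"Unexpected HTTP status {status_code}; see logs/ for details.",
    )
```

For example, `explain_status(response.status_code)` could be logged whenever a request does not succeed.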

### Getting Help

1. Check the log files in `logs/` directory
2. Run with `--list-docs` and `--list-tables` to verify your IDs
3. Test with a simple document first to verify your setup

## Security Notes

- Never commit your `.env` file to version control
- Keep your API token secure and rotate it periodically
- The tool only reads data from your Coda documents (no write access)
- All data is stored locally on your machine

## License

This project is for personal/internal use. Modify as needed for your specific timesheet format and requirements.
