{"id":31582153,"url":"https://github.com/nao1215/filesql","last_synced_at":"2026-05-31T08:01:06.041Z","repository":{"id":311287736,"uuid":"1043269436","full_name":"nao1215/filesql","owner":"nao1215","description":"sql driver for CSV, TSV, LTSV, JSON, Parquet, Excel with gzip, bzip2, xz, zstd support.","archived":false,"fork":false,"pushed_at":"2026-05-30T05:38:12.000Z","size":10702,"stargazers_count":372,"open_issues_count":2,"forks_count":10,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-30T06:18:43.367Z","etag":null,"topics":["ach","bzip2","csv","excel","fedwire","go","golang","gzip","json","ltsv","parquet","sql","sql-driver","sql-query","sqlite3","tsv","xz","zstd"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nao1215.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":"nao1215"}},"created_at":"2025-08-23T13:53:01.000Z","updated_at":"2026-05-30T05:32:58.000Z","dependencies_parsed_at":"2025-08-24T00:21:02.309Z","dependency_job_id":"6989f763-1be4-4291-89bc-6ab730689920","html_url":"https://github.com/nao1215/filesql","commit_stats":null,"previous_names":["nao1215/filesql"],"tags_count":24,"template":false,"template_full_name":null,"purl":"pkg:github/nao1215/filesql","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nao1215%2Ffilesql","tags_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nao1215%2Ffilesql/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nao1215%2Ffilesql/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nao1215%2Ffilesql/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nao1215","download_url":"https://codeload.github.com/nao1215/filesql/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nao1215%2Ffilesql/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33723549,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ach","bzip2","csv","excel","fedwire","go","golang","gzip","json","ltsv","parquet","sql","sql-driver","sql-query","sqlite3","tsv","xz","zstd"],"created_at":"2025-10-05T22:32:00.321Z","updated_at":"2026-05-31T08:01:06.034Z","avatar_url":"https://github.com/nao1215.png","language":"Go","funding_links":["https://github.com/sponsors/nao1215"],"categories":[],"sub_categories":[],"readme":"# filesql\n\n[![Go Reference](https://pkg.go.dev/badge/github.com/nao1215/filesql.svg)](https://pkg.go.dev/github.com/nao1215/filesql)\n[![Go Report 
Card](https://goreportcard.com/badge/github.com/nao1215/filesql)](https://goreportcard.com/report/github.com/nao1215/filesql)\n[![MultiPlatformUnitTest](https://github.com/nao1215/filesql/actions/workflows/unit_test.yml/badge.svg)](https://github.com/nao1215/filesql/actions/workflows/unit_test.yml)\n![Coverage](https://raw.githubusercontent.com/nao1215/octocovs-central-repo/main/badges/nao1215/filesql/coverage.svg)\n\n[日本語](./doc/ja/README.md) | [Русский](./doc/ru/README.md) | [中文](./doc/zh-cn/README.md) | [한국어](./doc/ko/README.md) | [Español](./doc/es/README.md) | [Français](./doc/fr/README.md)\n\n![logo](./doc/image/filesql-logo.png)\n\n**filesql** is a Go SQL driver that enables you to query CSV, TSV, LTSV, JSON, JSONL, Parquet, and Excel (XLSX) files using SQLite3 SQL syntax. Query your data files directly without any imports or transformations!\n\n**Want to try filesql's capabilities?** Check out **[sqly](https://github.com/nao1215/sqly)** - a command-line tool that uses filesql to easily execute SQL queries against CSV, TSV, LTSV, and Excel files directly from your shell. It's the perfect way to experience the power of filesql in action!\n\n## Why filesql?\n\nThis library was born from the experience of maintaining two separate CLI tools - [sqly](https://github.com/nao1215/sqly) and [sqluv](https://github.com/nao1215/sqluv). Both tools shared a common feature: executing SQL queries against CSV, TSV, and other file formats.\n\nRather than maintaining duplicate code across both projects, we extracted the core functionality into this reusable SQL driver. 
Now, any Go developer can leverage this capability in their own applications!\n\n## Features\n\n- SQLite3 SQL Interface - Use SQLite3's powerful SQL dialect to query your files\n- Multiple File Formats - Support for CSV, TSV, LTSV, Parquet, and Excel (XLSX) files\n- Compression Support - Automatically handles .gz, .bz2, .xz, .zst, .z, .snappy, .s2, and .lz4 compressed files\n- Stream Processing - Efficiently handles large files through streaming with configurable chunk sizes\n- Flexible Input Sources - Support for file paths, directories, io.Reader, and embed.FS\n- Zero Setup - No database server required, everything runs in-memory\n- Auto-Save - Automatically persist changes back to files\n- Cross-Platform - Works seamlessly on Linux, macOS, and Windows\n- SQLite3 Powered - Built on the robust SQLite3 engine for reliable SQL processing\n\n## Supported File Formats\n\n| Extension | Format | Description |\n|-----------|--------|-------------|\n| `.csv` | CSV | Comma-separated values |\n| `.tsv` | TSV | Tab-separated values |\n| `.ltsv` | LTSV | Labeled Tab-separated Values |\n| `.parquet` | Parquet | Apache Parquet columnar format |\n| `.xlsx` | Excel XLSX | Microsoft Excel workbook format |\n| `.json` | JSON | JSON format (use `json_extract()` for field access) |\n| `.jsonl` | JSONL | JSON Lines format (one JSON object per line) |\n| `.csv.gz`, `.tsv.gz`, `.ltsv.gz`, `.parquet.gz`, `.xlsx.gz`, `.json.gz`, `.jsonl.gz` | Gzip compressed | Gzip compressed files |\n| `.csv.bz2`, `.tsv.bz2`, `.ltsv.bz2`, `.parquet.bz2`, `.xlsx.bz2`, `.json.bz2`, `.jsonl.bz2` | Bzip2 compressed | Bzip2 compressed files |\n| `.csv.xz`, `.tsv.xz`, `.ltsv.xz`, `.parquet.xz`, `.xlsx.xz`, `.json.xz`, `.jsonl.xz` | XZ compressed | XZ compressed files |\n| `.csv.zst`, `.tsv.zst`, `.ltsv.zst`, `.parquet.zst`, `.xlsx.zst`, `.json.zst`, `.jsonl.zst` | Zstandard compressed | Zstandard compressed files |\n| `.csv.z`, `.tsv.z`, `.ltsv.z`, `.parquet.z`, `.xlsx.z`, `.json.z`, `.jsonl.z` | Zlib 
compressed | Zlib compressed files |\n| `.csv.snappy`, `.tsv.snappy`, `.ltsv.snappy`, `.parquet.snappy`, `.xlsx.snappy`, `.json.snappy`, `.jsonl.snappy` | Snappy compressed | Snappy compressed files |\n| `.csv.s2`, `.tsv.s2`, `.ltsv.s2`, `.parquet.s2`, `.xlsx.s2`, `.json.s2`, `.jsonl.s2` | S2 compressed | S2 compressed files (Snappy compatible) |\n| `.csv.lz4`, `.tsv.lz4`, `.ltsv.lz4`, `.parquet.lz4`, `.xlsx.lz4`, `.json.lz4`, `.jsonl.lz4` | LZ4 compressed | LZ4 compressed files |\n| `.ach` | ACH (NACHA) | Automated Clearing House files (**Experimental**) |\n| `.fed` | Fedwire | Legacy Fedwire message files (**Experimental**) |\n\n## Installation\n\n```bash\ngo get github.com/nao1215/filesql\n```\n\n## Requirements\n\n- **Go Version**: 1.25 or later\n- **Operating Systems**:\n  - Linux\n  - macOS  \n  - Windows\n\n## Quick Start\n\n### Simple Usage\n\nThe recommended way to get started is with `OpenContext` for proper timeout handling:\n\n```go\npackage main\n\nimport (\n    \"context\"\n    \"fmt\"\n    \"log\"\n    \"time\"\n    \n    \"github.com/nao1215/filesql\"\n)\n\nfunc main() {\n    // Create context with timeout for large file operations\n    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)\n    defer cancel()\n    \n    // Open a CSV file as a database\n    db, err := filesql.OpenContext(ctx, \"data.csv\")\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer db.Close()\n    \n    // Query the data (table name = filename without extension)\n    rows, err := db.QueryContext(ctx, \"SELECT * FROM data WHERE age \u003e 25\")\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer rows.Close()\n    \n    // Process results\n    for rows.Next() {\n        var name string\n        var age int\n        if err := rows.Scan(\u0026name, \u0026age); err != nil {\n            log.Fatal(err)\n        }\n        fmt.Printf(\"Name: %s, Age: %d\\n\", name, age)\n    }\n}\n```\n\n### Multiple Files and Formats\n\n```go\nctx, cancel 
:= context.WithTimeout(context.Background(), 30*time.Second)\ndefer cancel()\n\n// Open multiple files at once (including Parquet)\ndb, err := filesql.OpenContext(ctx, \"users.csv\", \"orders.tsv\", \"logs.ltsv.gz\", \"analytics.parquet\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Join data across different file formats\nrows, err := db.QueryContext(ctx, `\n    SELECT u.name, o.order_date, l.event, a.metrics\n    FROM users u\n    JOIN orders o ON u.id = o.user_id\n    JOIN logs l ON u.id = l.user_id\n    JOIN analytics a ON u.id = a.user_id\n    WHERE o.order_date \u003e '2024-01-01'\n`)\n```\n\n### Working with Directories\n\n```go\nctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)\ndefer cancel()\n\n// Load all supported files from a directory (recursive)\ndb, err := filesql.OpenContext(ctx, \"/path/to/data/directory\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// See what tables are available\nrows, err := db.QueryContext(ctx, \"SELECT name FROM sqlite_master WHERE type='table'\")\n```\n\n### JSON / JSONL Support\n\nJSON and JSONL files are stored as raw JSON in a single `data` TEXT column. 
Use SQLite's `json_extract()` function to query fields:\n\n```go\n// Open a JSON file\ndb, err := filesql.OpenContext(ctx, \"users.json\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Query using json_extract()\nrows, err := db.QueryContext(ctx, `\n    SELECT json_extract(data, '$.name') AS name,\n           json_extract(data, '$.age') AS age\n    FROM users\n    WHERE json_extract(data, '$.age') \u003e 25\n`)\n\n// Nested fields work too\nrows, err = db.QueryContext(ctx, `\n    SELECT json_extract(data, '$.address.city') AS city\n    FROM users\n    WHERE json_extract(data, '$.address.country') = 'Japan'\n`)\n```\n\n## Advanced Usage\n\n### Builder Pattern\n\nFor advanced scenarios, use the builder pattern:\n\n```go\npackage main\n\nimport (\n    \"context\"\n    \"embed\"\n    \"log\"\n    \n    \"github.com/nao1215/filesql\"\n)\n\n//go:embed data/*.csv\nvar embeddedFiles embed.FS\n\nfunc main() {\n    ctx := context.Background()\n    \n    // Configure data sources with builder\n    validatedBuilder, err := filesql.NewBuilder().\n        AddPath(\"local_file.csv\").      // Local file\n        AddFS(embeddedFiles).           // Embedded files\n        SetDefaultChunkSize(5000). // 5000 rows per chunk\n        Build(ctx)\n    if err != nil {\n        log.Fatal(err)\n    }\n    \n    db, err := validatedBuilder.Open(ctx)\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer db.Close()\n    \n    // Query across all data sources\n    rows, err := db.Query(\"SELECT name FROM sqlite_master WHERE type='table'\")\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer rows.Close()\n}\n```\n\n### Loading into an Existing Database\n\n`Open`/`OpenContext` create a new in-memory database. When you already manage a database, use `LoadInto` to load files into it instead of copying through a second database. 
Use it for a long-lived session that imports files over time, or to join file data with tables you created yourself.\n\n```go\npackage main\n\nimport (\n    \"context\"\n    \"database/sql\"\n    \"log\"\n\n    \"github.com/nao1215/filesql\"\n    _ \"modernc.org/sqlite\"\n)\n\nfunc main() {\n    db, err := sql.Open(\"sqlite\", \":memory:\")\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer db.Close()\n    // \":memory:\" is private per connection; pin the pool so the loaded tables\n    // are visible to later queries on the same database.\n    db.SetMaxOpenConns(1)\n\n    // Load files into the database you own. LoadInto does not close db.\n    if err := filesql.LoadInto(context.Background(), db, \"users.csv\", \"orders.parquet\"); err != nil {\n        log.Fatal(err)\n    }\n\n    // Load more files later into the same database (last-wins on same names).\n    if err := filesql.LoadInto(context.Background(), db, \"more_users.csv\"); err != nil {\n        log.Fatal(err)\n    }\n}\n```\n\nA table whose name matches a loaded file is replaced, so reloading a file is idempotent; tables you created yourself are left untouched. For readers or filesystems, configure a builder and call its `LoadInto` method. Auto-save is not supported here because the caller owns the database lifecycle.\n\n### Auto-Save Features\n\n#### Auto-Save on Database Close\n\n```go\n// Auto-save changes when database is closed\nvalidatedBuilder, err := filesql.NewBuilder().\n    AddPath(\"data.csv\").\n    EnableAutoSave(\"./backup\"). 
// Save to backup directory\n    Build(ctx)\nif err != nil {\n    log.Fatal(err)\n}\n\ndb, err := validatedBuilder.Open(ctx)\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close() // Changes are automatically saved here\n\n// Make changes\ndb.Exec(\"UPDATE data SET status = 'processed' WHERE id = 1\")\ndb.Exec(\"INSERT INTO data (name, age) VALUES ('John', 30)\")\n```\n\n#### Auto-Save on Transaction Commit\n\n```go\n// Auto-save after each transaction\nvalidatedBuilder, err := filesql.NewBuilder().\n    AddPath(\"data.csv\").\n    EnableAutoSaveOnCommit(\"\"). // Empty = overwrite original files\n    Build(ctx)\nif err != nil {\n    log.Fatal(err)\n}\n\ndb, err := validatedBuilder.Open(ctx)\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Changes are saved after each commit\ntx, _ := db.Begin()\ntx.Exec(\"UPDATE data SET status = 'processed' WHERE id = 1\")\ntx.Commit() // Auto-save happens here\n```\n\n### Working with io.Reader and Network Data\n\n```go\nimport (\n    \"net/http\"\n    \"github.com/nao1215/filesql\"\n)\n\n// Load data from HTTP response\nresp, err := http.Get(\"https://example.com/data.csv\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer resp.Body.Close()\n\nvalidatedBuilder, err := filesql.NewBuilder().\n    AddReader(resp.Body, \"remote_data\", filesql.FileTypeCSV).\n    Build(ctx)\nif err != nil {\n    log.Fatal(err)\n}\n\ndb, err := validatedBuilder.Open(ctx)\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Query remote data\nrows, err := db.QueryContext(ctx, \"SELECT * FROM remote_data LIMIT 10\")\n```\n\n### Manual Data Export\n\nIf you prefer manual control over saving:\n\n```go\nctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)\ndefer cancel()\n\ndb, err := filesql.OpenContext(ctx, \"data.csv\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Make modifications\ndb.Exec(\"UPDATE data SET status = 'processed'\")\n\n// Manually export changes\nerr = 
filesql.DumpDatabase(db, \"./output\")\nif err != nil {\n    log.Fatal(err)\n}\n\n// Or with custom format and compression\noptions := filesql.NewDumpOptions().\n    WithFormat(filesql.OutputFormatTSV).\n    WithCompression(filesql.CompressionGZ)\nerr = filesql.DumpDatabase(db, \"./output\", options)\n\n// Export to Parquet format\nparquetOptions := filesql.NewDumpOptions().\n    WithFormat(filesql.OutputFormatParquet)\n// Note: Parquet export is implemented, but external compression is not supported (use Parquet's built-in compression)\n```\n\n### Custom Logger\n\nfilesql supports pluggable logging via the `Logger` interface. By default, a no-op logger is used with zero performance overhead. You can inject your own logger (e.g., `slog`) for debugging and monitoring.\n\n```go\nimport (\n    \"log/slog\"\n    \"os\"\n    \"github.com/nao1215/filesql\"\n)\n\n// Create a slog logger\nslogLogger := slog.New(slog.NewTextHandler(os.Stdout, \u0026slog.HandlerOptions{\n    Level: slog.LevelDebug,\n}))\n\n// Wrap it with SlogAdapter and pass to the builder\nlogger := filesql.NewSlogAdapter(slogLogger)\n\nvalidatedBuilder, err := filesql.NewBuilder().\n    WithLogger(logger).\n    AddPath(\"data.csv\").\n    Build(ctx)\n```\n\n#### Logger Interface\n\n```go\ntype Logger interface {\n    Debug(msg string, args ...any)\n    Info(msg string, args ...any)\n    Warn(msg string, args ...any)\n    Error(msg string, args ...any)\n    With(args ...any) Logger\n}\n```\n\n#### Context-Aware Logger\n\nFor context-aware logging, use `ContextLogger`:\n\n```go\ntype ContextLogger interface {\n    Logger\n    DebugContext(ctx context.Context, msg string, args ...any)\n    InfoContext(ctx context.Context, msg string, args ...any)\n    WarnContext(ctx context.Context, msg string, args ...any)\n    ErrorContext(ctx context.Context, msg string, args ...any)\n}\n\n// Use SlogContextAdapter for context-aware logging\nlogger := filesql.NewSlogContextAdapter(slogLogger)\n```\n\n#### 
Performance\n\n| Logger Type | Performance | Memory |\n|-------------|-------------|--------|\n| nopLogger (default) | ~0.2 ns/op | 0 B/op |\n| SlogAdapter | ~1000 ns/op | ~630 B/op |\n\nThe default no-op logger has virtually zero overhead, making it safe to leave logging calls in production code.\n\n## Table Naming Rules\n\nfilesql automatically derives table names from file paths:\n\n- `users.csv` → table `users`\n- `data.tsv.gz` → table `data`\n- `/path/to/sales.csv` → table `sales`\n- `products.ltsv.bz2` → table `products`\n- `analytics.parquet` → table `analytics`\n\n## Important Notes\n\n### SQL Syntax\nSince filesql uses SQLite3 as its underlying engine, all SQL syntax follows [SQLite3's SQL dialect](https://www.sqlite.org/lang.html). This includes:\n- Functions (e.g., `date()`, `substr()`, `json_extract()`)\n- Window functions\n- Common Table Expressions (CTEs)\n- Triggers and views\n\n### Data Modifications\n- `INSERT`, `UPDATE`, and `DELETE` operations affect the in-memory database\n- **Original files remain unchanged by default**\n- Use auto-save features or `DumpDatabase()` to persist changes\n- This makes it safe to experiment with data transformations\n\n### Performance Tips\n- Use `OpenContext()` with timeouts for large files\n- Configure chunk sizes (rows per chunk) with `SetDefaultChunkSize()` for memory optimization\n- Single SQLite connection works best for most scenarios\n- Use streaming for files larger than available memory\n\n## Benchmark\n\nPerformance with a **100,000-row CSV file**:\n\n| Metric | Value |\n|--------|-------|\n| Execution Time | ~430 ms |\n| Memory Usage | ~141 MB |\n\nRun benchmarks yourself:\n```bash\nmake benchmark\n```\n\n### Concurrency Limitations\n⚠️ **IMPORTANT**: This library is **NOT thread-safe** and has **concurrency limitations**:\n- **Do NOT** share database connections across goroutines\n- **Do NOT** perform concurrent operations on the same database instance\n- **Do NOT** call `db.Close()` while queries are 
active in other goroutines\n- Use separate database instances for concurrent operations if needed\n- Race conditions may cause segmentation faults or data corruption\n\n**Recommended pattern for concurrent access**:\n```go\n// ✅ GOOD: Separate database instances per goroutine\nfunc processFileConcurrently(filename string) error {\n    db, err := filesql.Open(filename)  // Each goroutine gets its own instance\n    if err != nil {\n        return err\n    }\n    defer db.Close()\n    \n    // Safe to use within this goroutine\n    return processData(db)\n}\n\n// ❌ BAD: Sharing database instance across goroutines\nvar sharedDB *sql.DB  // This will cause race conditions\n```\n\n### Parquet Support\n- **Reading**: Full support for Apache Parquet files with complex data types\n- **Writing**: Export functionality is implemented (external compression not supported, use Parquet's built-in compression)\n- **Type Mapping**: Parquet types are mapped to SQLite types\n- **Compression**: Parquet's built-in compression is used instead of external compression\n- **Large Data**: Parquet files are efficiently processed with Arrow's columnar format\n\n### Excel (XLSX) Support\n- **1-Sheet-1-Table Structure**: Each sheet in an Excel workbook becomes a separate SQL table\n- **Table Naming**: SQL table names follow the format `{filename}_{sheetname}` (e.g., \"sales_Q1\", \"sales_Q2\")\n- **Header Row Processing**: First row of each sheet becomes the column headers for that table\n- **Standard SQL Operations**: Query each sheet independently or use JOINs to combine data across sheets\n- **Memory Requirements**: XLSX files require full loading into memory due to the ZIP-based format structure, even during streaming operations\n- **Implementation Note**: XLSX files are fully loaded into memory due to ZIP structure and all sheets are processed (CSV/TSV streaming parsers are not applicable)\n- **Export Functionality**: When exporting to XLSX format, table names become sheet names 
automatically\n- **Compression Support**: Full support for compressed XLSX files (.xlsx.gz, .xlsx.bz2, .xlsx.xz, .xlsx.zst, .xlsx.z, .xlsx.snappy, .xlsx.s2, .xlsx.lz4)\n\n### ACH (NACHA) Support - Experimental\n\n\u003e **Warning**: ACH file support is **experimental**. The API may change in future versions.\n\nACH (Automated Clearing House) files following the NACHA format can be queried using SQL. Each ACH file is converted to multiple tables:\n\n| Table Name | Description |\n|------------|-------------|\n| `{filename}_file_header` | File header information |\n| `{filename}_batches` | Batch header and control information |\n| `{filename}_entries` | Entry detail records (transactions) |\n| `{filename}_addenda` | Standard addenda records |\n| `{filename}_iat_entries` | IAT entry details |\n| `{filename}_iat_addenda` | IAT addenda records |\n\n#### Limitations\n\n**Read-only fields**: The following fields are exported for viewing but changes are not written back:\n- IAT Addenda sequence numbers (`entry_detail_sequence_number`, `sequence_number`)\n\n**Addenda05 index behavior**: When an entry has multiple addenda types (e.g., Addenda02 + Addenda05), the `addenda_index` represents the position within all addenda for that entry, not the index within Addenda05 array. For updates targeting specific Addenda05 records, use `addenda_type = '05'` to filter correctly.\n\n**Validation**: Modifying ACH data via SQL may create invalid ACH files. 
Users should ensure data consistency (e.g., `AddendaRecordIndicator` matches actual addenda presence).\n\n**Compression**: ACH files do not support compression wrappers (`.ach.gz`, etc.).\n\n#### Example\n\n```go\nctx := context.Background()\ndb, err := filesql.OpenContext(ctx, \"payments.ach\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Query entry details\nrows, err := db.QueryContext(ctx, `\n    SELECT individual_name, amount, trace_number\n    FROM payments_entries\n    WHERE transaction_code IN (22, 32)\n`)\n\n// Query with batch information (rows and err already exist, so assign with =)\nrows, err = db.QueryContext(ctx, `\n    SELECT e.individual_name, e.amount, b.company_name\n    FROM payments_entries e\n    JOIN payments_batches b ON e.batch_index = b.batch_index\n`)\n```\n\n### Fedwire Support - Experimental\n\n\u003e **Warning**: Fedwire file support is **experimental**. The API may change in future versions.\n\nLegacy Fedwire message files (`.fed`) can be loaded, queried, modified, and exported back to Fedwire format. Each Fedwire file contains a single FEDWireMessage and is converted to a single flat table with approximately 326 columns.\n\n| Table Name | Description |\n|------------|-------------|\n| `{filename}_message` | Flat table with all FEDWireMessage fields (~326 columns, 1 row) |\n\nAll columns are TEXT type since the wire format stores all values as fixed-width strings.\n\n#### Limitations\n\n**UPDATE only**: Only UPDATE operations on existing rows are supported for round-trip editing. INSERT/DELETE operations in SQL are not reflected in the output wire file.\n\n**No new sections**: Optional message sections that were not present in the original file cannot be added via SQL modifications.\n\n**Compression**: Fedwire files do not support compression wrappers (`.fed.gz`, etc.).\n\n**Security**: Fedwire data contains sensitive banking information including routing numbers, account numbers, names, and transaction amounts. 
Do not log or export wire table data verbatim in production environments.\n\n#### Example\n\n```go\nctx := context.Background()\ndb, err := filesql.OpenContext(ctx, \"payment.fed\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Query sender and receiver information\nrows, err := db.QueryContext(ctx, `\n    SELECT sender_di_routing_number, receiver_di_routing_number, amount\n    FROM payment_message\n`)\n\n// Modify and export back to Fedwire format\ndb.ExecContext(ctx, \"UPDATE payment_message SET amount = '000005000000'\")\nfilesql.DumpFedWire(ctx, db, \"payment\", \"modified.fed\")\n```\n\n#### Excel File Structure Example\n```\nExcel File with Multiple Sheets:\n\n┌─────────────┐    ┌─────────────┐    ┌─────────────┐\n│ Sheet1      │    │ Sheet2      │    │ Sheet3      │\n│ Name   Age  │    │ Product     │    │ Region      │\n│ Alice   25  │    │ Laptop      │    │ North       │\n│ Bob     30  │    │ Mouse       │    │ South       │\n└─────────────┘    └─────────────┘    └─────────────┘\n\nResults in 3 separate SQL tables:\n\nsales_Sheet1:           sales_Sheet2:           sales_Sheet3:\n┌──────┬─────┐          ┌─────────┐             ┌────────┐\n│ Name │ Age │          │ Product │             │ Region │\n├──────┼─────┤          ├─────────┤             ├────────┤\n│ Alice│  25 │          │ Laptop  │             │ North  │\n│ Bob  │  30 │          │ Mouse   │             │ South  │\n└──────┴─────┘          └─────────┘             └────────┘\n\nSQL Examples:\nSELECT * FROM sales_Sheet1 WHERE Age \u003e 27;\nSELECT s1.Name, s2.Product FROM sales_Sheet1 s1 \n  JOIN sales_Sheet2 s2 ON s1.rowid = s2.rowid;\n```\n\n## Advanced Examples\n\n### Complex SQL Queries\n\n```go\nctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)\ndefer cancel()\n\ndb, err := filesql.OpenContext(ctx, \"employees.csv\", \"departments.csv\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Use advanced SQLite features\nquery := `\n    WITH 
dept_stats AS (\n        SELECT \n            department_id,\n            AVG(salary) as avg_salary,\n            COUNT(*) as emp_count\n        FROM employees\n        GROUP BY department_id\n    )\n    SELECT \n        e.name,\n        e.salary,\n        d.name as department,\n        ds.avg_salary as dept_avg,\n        RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank\n    FROM employees e\n    JOIN departments d ON e.department_id = d.id\n    JOIN dept_stats ds ON e.department_id = ds.department_id\n    WHERE e.salary \u003e ds.avg_salary * 0.8\n    ORDER BY d.name, salary_rank\n`\n\nrows, err := db.QueryContext(ctx, query)\n```\n\n### Context and Cancellation\n\n```go\nimport (\n    \"context\"\n    \"time\"\n)\n\n// Set timeout for large file operations\nctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)\ndefer cancel()\n\ndb, err := filesql.OpenContext(ctx, \"huge_dataset.csv.gz\")\nif err != nil {\n    log.Fatal(err)\n}\ndefer db.Close()\n\n// Query with context for cancellation support\nrows, err := db.QueryContext(ctx, \"SELECT * FROM huge_dataset WHERE status = 'active'\")\n```\n\n## Examples\n\nThe [examples](./examples) directory contains sample code demonstrating various filesql features:\n\n| Example | Description |\n|---------|-------------|\n| [basic](./examples/basic) | Basic CSV query operations |\n| [multi-format](./examples/multi-format) | Working with multiple file formats (CSV, TSV, LTSV, Parquet) |\n| [sqlc](./examples/sqlc) | Integration with [sqlc](https://sqlc.dev/) - type-safe SQL code generator |\n| [gorm](./examples/gorm) | Integration with [GORM](https://gorm.io/) - full-featured ORM |\n| [sqlx](./examples/sqlx) | Integration with [sqlx](https://github.com/jmoiron/sqlx) - extensions to database/sql |\n| [bun](./examples/bun) | Integration with [Bun](https://bun.uptrace.dev/) - SQL-first ORM |\n| [squirrel](./examples/squirrel) | Integration with 
[Squirrel](https://github.com/Masterminds/squirrel) - fluent SQL query builder |\n| [ent](./examples/ent) | Integration with [Ent](https://entgo.io/) - entity framework by Facebook |\n\n\n## Data Preprocessing with fileprep\n\nFor data validation and preprocessing before querying with filesql, we recommend using **[nao1215/fileprep](https://github.com/nao1215/fileprep)**.\n\nfileprep is a companion library that provides:\n- **Struct tag-based preprocessing** (`prep` tag): trim, lowercase, uppercase, default values, and more\n- **Struct tag-based validation** (`validate` tag): required fields, format validation, cross-field validation\n- **Seamless filesql integration**: Returns `io.Reader` for direct use with filesql's Builder pattern\n\n```go\n// Define struct with preprocessing and validation tags\ntype User struct {\n    // Name: trim whitespace, require non-empty\n    Name  string `prep:\"trim\" validate:\"required\"`\n    // Email: trim, convert to lowercase, validate email format\n    Email string `prep:\"trim,lowercase\" validate:\"required,email\"`\n    // Age: set default if empty, validate range 0-150\n    Age   string `prep:\"default=0\" validate:\"numeric,gte=0,lte=150\"`\n    // Role: trim, uppercase, must be one of the allowed values\n    Role  string `prep:\"trim,uppercase\" validate:\"oneof=ADMIN USER GUEST\"`\n}\n\nfunc main() {\n    // CSV data with messy input\n    csvData := `name,email,age,role\n  John Doe  ,JOHN@EXAMPLE.COM,25,admin\nAlice,alice@example.com,,user`\n\n    // Create processor and process the CSV\n    processor := fileprep.NewProcessor(fileprep.FileTypeCSV)\n    var users []User\n\n    reader, result, err := processor.Process(strings.NewReader(csvData), \u0026users)\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Check validation results\n    fmt.Printf(\"Processed: %d rows, Valid: %d rows\\n\", result.RowCount, result.ValidRowCount)\n    if result.HasErrors() {\n        for _, e := range result.ValidationErrors() 
{\n            log.Printf(\"Row %d, Column %s: %s\", e.Row, e.Column, e.Message)\n        }\n    }\n\n    // Pass preprocessed data to filesql\n    // The data is now cleaned: trimmed, lowercased emails, defaults applied\n    ctx := context.Background()\n    validatedBuilder, err := filesql.NewBuilder().\n        AddReader(reader, \"users\", filesql.FileTypeCSV).\n        Build(ctx)\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Build returns a validated builder; Open creates the database\n    db, err := validatedBuilder.Open(ctx)\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer db.Close()\n\n    // Query the clean data\n    rows, err := db.QueryContext(ctx, \"SELECT * FROM users WHERE role = 'ADMIN'\")\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer rows.Close()\n    // ...\n}\n```\n\nFor the complete list of preprocessing and validation options, see the [fileprep documentation](https://github.com/nao1215/fileprep).\n\n## Related Projects\n\nUsing filesql in your project? We'd love to hear about it! Please [open an issue](https://github.com/nao1215/filesql/issues) to let us know, and we'll add your project to the list below.\n\n### Related Libraries\n\n| Project | Description |\n|---------|-------------|\n| [nao1215/fileprep](https://github.com/nao1215/fileprep) | Data preprocessing library with struct tag validation. Clean and validate CSV/TSV data using Go struct tags before querying. |\n| [nao1215/fileframe](https://github.com/nao1215/fileframe) | DataFrame API for CSV/TSV/LTSV, Parquet, Excel. |\n\n\n### CLI Tools Using filesql\n\n| Project | Description |\n|---------|-------------|\n| [nao1215/sqly](https://github.com/nao1215/sqly) | Interactive shell for executing SQL queries against CSV, TSV, LTSV, JSON, and Excel files. Perfect for ad-hoc data analysis from the command line. |\n| [kanmu/gocon2025-ctf](https://github.com/kanmu/gocon2025-ctf) | Go Conference 2025 CTF repository (in Japanese) |\n\n## Contributing\n\nContributions are welcome! 
Please see the [Contributing Guide](./CONTRIBUTING.md) for more details.\n\n## Support\n\nIf you find this project useful, please consider:\n\n- Giving it a star on GitHub - it helps others discover the project\n- [Becoming a sponsor](https://github.com/sponsors/nao1215) - your support keeps the project alive and motivates continued development\n\nYour support, whether through stars, sponsorships, or contributions, is what drives this project forward. Thank you!\n\n### Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=nao1215/filesql\u0026type=date\u0026legend=top-left)](https://www.star-history.com/#nao1215/filesql\u0026Date)\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnao1215%2Ffilesql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnao1215%2Ffilesql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnao1215%2Ffilesql/lists"}