{"id":31661883,"url":"https://github.com/postgres-ai/pg_index_pilot","last_synced_at":"2025-10-07T19:03:04.855Z","repository":{"id":315426154,"uuid":"1058993106","full_name":"postgres-ai/pg_index_pilot","owner":"postgres-ai","description":"Autonomous index lifecycle management for Postgres","archived":false,"fork":false,"pushed_at":"2025-10-03T13:27:14.000Z","size":391,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-03T13:35:29.776Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"PLpgSQL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/postgres-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":"COPYRIGHT","agents":null,"dco":null,"cla":null}},"created_at":"2025-09-17T20:49:41.000Z","updated_at":"2025-10-02T09:56:37.000Z","dependencies_parsed_at":"2025-10-03T13:24:23.358Z","dependency_job_id":"7c9d1710-f806-42c3-b77b-475137ccd805","html_url":"https://github.com/postgres-ai/pg_index_pilot","commit_stats":null,"previous_names":["postgres-ai/pg_index_pilot"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/postgres-ai/pg_index_pilot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgres-ai%2Fpg_index_pilot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgres-ai%2Fpg_index_pilot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgres-ai%2Fpg
_index_pilot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgres-ai%2Fpg_index_pilot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/postgres-ai","download_url":"https://codeload.github.com/postgres-ai/pg_index_pilot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgres-ai%2Fpg_index_pilot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278830043,"owners_count":26053223,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-07T19:01:50.448Z","updated_at":"2025-10-07T19:03:04.840Z","avatar_url":"https://github.com/postgres-ai.png","language":"PLpgSQL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pg_index_pilot – autonomous index lifecycle management for Postgres\n\nThe purpose of `pg_index_pilot` is to provide all tools needed to manage indexes in Postgres in most automated fashion.\n\nThis project is in its very early stage. We start with most boring yet extremely important task: automatic reindexing (\"AR\") to mitigate index bloat, supporting any types of indexes, and then expand to other areas of index health. 
The next two big areas are automated index removal (\"AIR\") and, finally, automated index creation and optimization (\"AIC\u0026O\"). It is part of Self‑driving Postgres but can be used independently as a standalone tool.\n\nDocs: [Installation](docs/installation.md) | [Runbook](docs/runbook.md) | [FAQ](docs/faq.md) | [Function reference](docs/function_reference.md) | [Architecture](docs/architecture.md)\n\n## What it is for\n\n- Automated index lifecycle management for PostgreSQL, starting with automatic reindexing to keep index bloat under control without manual work.\n\n## Key principles\n\n- Simplicity and full embedding: implemented entirely inside PostgreSQL (PL/pgSQL), no external services required to rebuild indexes.\n- Works everywhere PostgreSQL runs (including managed platforms) — all logic lives in the database.\n- No superuser requirement for day‑to‑day operations; designed to run under owner/privileged roles in control DB and target DBs.\n- Scheduling inside the database via `pg_cron` — no EC2/Lambda or other external orchestrators needed.\n- Supports reindexing of all common index types (btree, hash, gin, gist, spgist); brin is currently excluded.\n- Control DB orchestrates multiple target databases via `postgres_fdw`/`dblink`; reindexing is executed with `reindex concurrently` to minimize locking.\n\nSee [Architecture](docs/architecture.md) for detailed design decisions and requirements.\n\n## Table of contents\n\n- [Roadmap](#roadmap)\n- [Automated reindexing](#automated-reindexing)\n- [Requirements](#requirements)\n- [Recommendations](#recommendations)\n- [Installation](#installation)\n  - [Control database setup (required)](#control-database-setup-required)\n  - [Self-hosted PostgreSQL Example](#self-hosted-postgresql-example)\n- [Initial launch](#initial-launch)\n- [Scheduling automated maintenance](#scheduling-automated-maintenance)\n  - [Choosing the right schedule](#choosing-the-right-schedule)\n  - [Using pg_cron 
(Recommended)](#using-pg_cron-recommended)\n  - [Using external cron](#using-external-cron)\n- [Uninstalling pg_index_pilot](#uninstalling-pg_index_pilot)\n- [Updating pg_index_pilot](#updating-pg_index_pilot)\n- [Monitoring and Analysis](#monitoring-and-analysis)\n  - [View reindexing history](#view-reindexing-history)\n  - [Check current bloat status](#check-current-bloat-status)\n\n## Roadmap\n\nThe roadmap covers three big areas:\n\n1. [ ] \"AR\": Automated Reindexing\n    1. [x] Maxim Boguk's bloat estimation formula – works with any type of index, not only btree\n        1. [x] original implementation (`pg_index_pilot`) – requires initial full reindex\n        2. [x] non-superuser mode for cloud databases (AWS RDS, Google Cloud SQL, Azure)\n        3. [x] flexible connection management for dblink\n        4. [ ] API for stats obtained on a clone (to avoid full reindex on prod primary)\n    2. [ ] Traditional bloat estimation (ioguix; btree only)\n    3. [ ] Exact bloat analysis (pgstattuple; analysis on clones)\n    4. [x] Tested on managed services\n        - [x] RDS and Aurora (see AWS specifics in Installation: docs/installation.md#aws-rds--aurora-specifics)\n        - [ ] CloudSQL\n        - [x] Supabase\n        - [ ] Crunchy Bridge\n        - [ ] Azure\n    5. [ ] Integration with postgres_ai monitoring\n    6. [ ] Resource-aware scheduling, predictive maintenance windows (when will load be lowest?)\n    7. [ ] Coordination with other ops (backups, vacuums, upgrades)\n    8. [ ] Parallelization and throttling (adaptive)\n    9. [ ] Predictive bloat modeling\n    10. [ ] Learning \u0026 Feedback Loops: learning from past actions, A/B testing and \"what-if\" simulation (DBLab)\n    11. [ ] Impact estimation before scheduling\n    12. [ ] RCA of fast degraded index health (why it gets bloated fast?) and mitigation (tune autovacuum, avoid xmin horizon getting stuck)\n    13. [ ] Self-adjusting thresholds\n2. [ ] \"AIR\": Automated Index Removal\n    1. 
[ ] Unused indexes\n    2. [ ] Redundant indexes\n    3. [ ] Invalid indexes (or, per configuration, rebuilding them)\n    4. [ ] Advanced scoring; suboptimal / rarely used indexes cleanup; self-adjusting thresholds\n    5. [ ] Forecasting of index usage; seasonal pattern recognition\n    6. [ ] Impact estimation before removal; \"what-if\" simulation (DBLab)\n3. [ ] \"AIC\u0026O\": Automated Index Creation \u0026 Optimization\n    1. [ ] Index recommendations (including multi-column, expression, partial, hybrid, and covering indexes)\n    2. [ ] Index optimization according to configured goals (latency, size, WAL, write/HOT overhead, read overhead)\n    3. [ ] Experimentation (hypothetical with HypoPG, real with DBLab)\n    4. [ ] Query pattern classification\n    5. [ ] Advanced scoring; cost/benefit analysis\n    6. [ ] Impact estimation before operations; \"what-if\" simulation (DBLab)\n\n## Automated reindexing\n\nThe reindexing framework is implemented entirely inside Postgres, using:\n- PL/pgSQL functions and stored procedures with transaction control\n- [dblink](https://www.postgresql.org/docs/current/contrib-dblink-function.html) to execute `REINDEX CONCURRENTLY` (because it cannot run inside a transaction block)\n- [pg_cron](https://github.com/citusdata/pg_cron) for scheduling\n\n---\n\n\n## Requirements\n\n- PostgreSQL version 13.0 or higher\n- **IMPORTANT:** Requires the ability to create a database (not supported on TigerData, formerly Timescale Cloud)\n- Separate control database (`index_pilot_control`) to manage target databases\n- `dblink` and `postgres_fdw` extensions installed in the control database\n- Database owner or user with appropriate permissions\n- Works with AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL (where database creation is allowed)\n- Manages multiple target databases from a single control database\n- Uses REINDEX CONCURRENTLY from the control database (avoids deadlocks)\n\n## Recommendations\n- If server resources allow, set 
a non-zero `max_parallel_maintenance_workers` (the exact value depends on server parameters).\n- Set `wal_keep_size` (which replaced `wal_keep_segments` in PostgreSQL 13) high enough to retain at least 5000 WAL segments (80000 MB with the default 16 MB segment size), unless a WAL archive is used to support streaming replication.\n\n## Installation\n\n### Quick install via index_pilot.sh\n\n```bash\n# Clone the repository\ngit clone https://gitlab.com/postgres-ai/pg_index_pilot\ncd pg_index_pilot\n\n# 1) Install into control database (auto-creates DB, installs extensions/objects)\nPGPASSWORD='your_password' \\\n  ./index_pilot.sh install-control \\\n  -H your_host -U your_user -C your_control_db_name\n\n# 2) Register a target database via FDW (secure user mapping)\nPGPASSWORD='your_password' \\\n  ./index_pilot.sh register-target \\\n  -H your_host -U your_user -C your_control_db_name \\\n  -T your_database --fdw-host your_host\n\n# 3) Verify installation and environment\nPGPASSWORD='your_password' \\\n  ./index_pilot.sh verify \\\n  -H your_host -U your_user -C your_control_db_name\n\n# (Optional) Uninstall\nPGPASSWORD='your_password' \\\n  ./index_pilot.sh uninstall \\\n  -H your_host -U your_user -C your_control_db_name --drop-servers\n```\n\nNotes:\n- Use `PGPASSWORD` to avoid echoing secrets; the script won’t print passwords.\n- `--fdw-host` should be reachable from the database server itself (in Docker/CI it might be `postgres`, `127.0.0.1`, or the container IP).\n- For self-hosted setups, replace the host with `127.0.0.1`. For managed services, ensure the admin user can `create database` and `create extension`.\n\n### Before you start (checklist)\n- PostgreSQL ≥ 13 and ability to create database/extensions (control DB).\n- Decide: CONTROL_DB name, TARGET_DB name, TARGET_HOST (reachable from Postgres server, not only from client).\n- If you plan to use pg_cron: ensure it’s in `shared_preload_libraries` (RDS: parameter group + reboot), and `create extension pg_cron` in `cron.database_name`.\n- The FDW user mapping is looked up for the `current_user` in the control DB session. 
Create mapping for that user.\n\n### Placeholders used below\n- CONTROL_DB, TARGET_DB, TARGET_HOST, SERVER_NAME (e.g. `target_\u003ctarget_db\u003e`)\n- CONTROL_USER/PASS (user running commands in control DB)\n- TARGET_USER/PASS (user in the target DB; typically an owner or a role with owner rights)\n\n### Key concepts\n- `target_\u003cdb\u003e`: FDW server that points to the target database. This name goes to `index_pilot.target_databases.fdw_server_name`.\n- A user mapping must exist for `current_user` (in the control DB) to each `target_\u003cdb\u003e` server you intend to use.\n\n### Security Note\n\n**CRITICAL**: Never use hardcoded passwords in production. The `setup_01_user.psql` script requires a secure password to be provided via psql variable:\n\n```bash\n# Generate secure random password\nRANDOM_PWD=$(openssl rand -base64 32)\n\n# Use the secure setup script (recommended)\n./setup_user_secure.sh\n\n# Or run manually with secure password\npsql -f setup_01_user.psql -v index_pilot_password=\"$RANDOM_PWD\"\necho \"Generated password: $RANDOM_PWD\"\n```\n\n### Manual installation\n\n#### Control database setup (Required)\n\n```bash\n# Clone the repository\ngit clone https://gitlab.com/postgres-ai/pg_index_pilot\ncd pg_index_pilot\n\n# 1. Create control database (as admin user)\npsql -h your-instance.region.rds.amazonaws.com -U postgres -c \"create database index_pilot_control;\"\n\n# 2. Install required extensions in control database\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -c \"CREATE EXTENSION IF NOT EXISTS postgres_fdw;\"\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -c \"CREATE EXTENSION IF NOT EXISTS dblink;\"\n\n# 3. 
Install schema and functions in control database\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_tables.sql\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_functions.sql\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -f index_pilot_fdw.sql\n\n# 4. Create FDW server and user mapping for the TARGET database\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \u003c\u003c'SQL'\ncreate server if not exists target_your_database foreign data wrapper postgres_fdw\n  options (host 'your-instance.region.rds.amazonaws.com', port '5432', dbname 'your_database');\n\n-- dblink_connect(server_name) uses current_user user mapping; create mapping for the user running control DB (often postgres or index_pilot)\ncreate user mapping if not exists for current_user server target_your_database\n  options (user 'remote_owner_or_role', password 'remote_password');\nSQL\n\n# 5. Register the TARGET database (links index_pilot.target_databases to your FDW server)\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control \u003c\u003c'SQL'\ninsert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)\nvalues ('your_database', 'your-instance.region.rds.amazonaws.com', 5432, 'target_your_database', true)\non conflict (database_name) do update\n  set host=excluded.host, port=excluded.port, fdw_server_name=excluded.fdw_server_name, enabled=true;\nSQL\n\n# 6. 
Verify FDW and environment\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -c \"select * from index_pilot.check_fdw_security_status();\"\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d index_pilot_control -c \"select * from index_pilot.check_environment();\"\n```\n\n#### Self-hosted PostgreSQL Example\n\n```bash\n# Clone the repository\ngit clone https://gitlab.com/postgres-ai/pg_index_pilot\ncd pg_index_pilot\n\n# 1. Create control database (as superuser)\npsql -U postgres -c \"create database index_pilot_control;\"\n\n# 2. Install required extensions in control database (as superuser)\npsql -U postgres -d index_pilot_control -c \"CREATE EXTENSION IF NOT EXISTS postgres_fdw;\"\npsql -U postgres -d index_pilot_control -c \"CREATE EXTENSION IF NOT EXISTS dblink;\"\n\n# 3. Install schema and functions in control database (as superuser)\npsql -U postgres -d index_pilot_control -f index_pilot_tables.sql\npsql -U postgres -d index_pilot_control -f index_pilot_functions.sql\npsql -U postgres -d index_pilot_control -f index_pilot_fdw.sql\n\n# 4. Create FDW server and user mapping for the TARGET database\npsql -U postgres -d index_pilot_control \u003c\u003c'SQL'\ncreate server if not exists target_your_database foreign data wrapper postgres_fdw\n  options (host '127.0.0.1', port '5432', dbname 'your_database');\n\ncreate user mapping if not exists for current_user server target_your_database\n  options (user 'remote_owner_or_role', password 'remote_password');\nSQL\n\n# 5. Register the TARGET database\npsql -U postgres -d index_pilot_control \u003c\u003c'SQL'\ninsert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)\nvalues ('your_database', '127.0.0.1', 5432, 'target_your_database', true)\non conflict (database_name) do update\n  set host=excluded.host, port=excluded.port, fdw_server_name=excluded.fdw_server_name, enabled=true;\nSQL\n\n# 6. 
Verify\npsql -U postgres -d index_pilot_control -c \"select * from index_pilot.check_fdw_security_status();\"\npsql -U postgres -d index_pilot_control -c \"select * from index_pilot.check_environment();\"\n```\n\n## Initial launch\n\n**⚠️ IMPORTANT:** During the first run, all indexes larger than `index_size_threshold` (default: 10MB) will be analyzed and potentially rebuilt. This process may take hours or days on large databases.\n\nFor a manual initial run:\n\n```bash\n# Set credentials\nexport PGSSLMODE=require\nexport PGPASSWORD='your_index_pilot_password'\n\n# Run initial analysis and reindexing\nnohup psql -h your_host -U index_pilot -d your_database \\\n  -qXt -c \"call index_pilot.periodic(true)\" \u003e\u003e index_pilot.log 2\u003e\u00261\n```\n\n## Scheduling automated maintenance\n\n### Choosing the right schedule\n\nThe optimal maintenance schedule depends on your database characteristics:\n\n**Daily maintenance (recommended for):**\n- High-traffic databases with frequent updates\n- Databases where index bloat accumulates quickly\n- Systems with sufficient maintenance windows each night\n- When you want to catch and fix bloat early\n\n**Weekly maintenance (recommended for):**\n- Stable databases with predictable workloads\n- Systems where index bloat accumulates slowly\n- Production systems where daily maintenance might be disruptive\n- Databases with limited maintenance windows\n\n### Using pg_cron (Recommended)\n\n**Step 1: Check where pg_cron is installed**\n```sql\n-- Find which database has pg_cron\nshow cron.database_name;\n```\n\n**Step 2: Schedule jobs from the pg_cron database**\n\n```sql\n-- Connect to the database shown in step 1 (replace postgres_ai with whatever cron.database_name shows)\n\\c postgres_ai\n\n-- Daily maintenance at 2 AM\nselect cron.schedule_in_database(\n    'pg_index_pilot_daily',\n    '0 2 * * *',\n    'call index_pilot.periodic(real_run := true);',\n    'index_pilot_control'  -- Run in control database\n);\n\n-- Monitoring every 6 hours (no 
actual reindex)\nselect cron.schedule_in_database(\n    'pg_index_pilot_monitor',\n    '0 */6 * * *',\n    'call index_pilot.periodic(real_run := false);',\n    'index_pilot_control'\n);\n\n-- OR weekly maintenance on Sunday at 2 AM\nselect cron.schedule_in_database(\n    'pg_index_pilot_weekly',\n    '0 2 * * 0',\n    'call index_pilot.periodic(real_run := true);',\n    'index_pilot_control'\n);\n```\n\n**Step 3: Verify and manage schedules**\n```sql\n-- View scheduled jobs\nselect jobname, schedule, command, database, active \nfrom cron.job \nwhere jobname like 'pg_index_pilot%';\n\n-- Disable a schedule\nselect cron.unschedule('pg_index_pilot_daily');\n\n-- Change schedule time\nselect cron.unschedule('pg_index_pilot_daily');\nselect cron.schedule_in_database(\n    'pg_index_pilot_daily', \n    '0 3 * * *',  -- New time: 3 AM\n    'call index_pilot.periodic(real_run := true);',\n    'index_pilot_control'\n);\n```\n\n### Using external cron\n\nCreate a maintenance script:\n```bash\n# Runs reindexing only on primary (all databases)\npsql -d postgres -AtqXc \"select not pg_is_in_recovery()\" | grep -qx t || exit; psql -d postgres -qt -c \"call index_pilot.periodic(true);\"\n```\n\nAdd to crontab:\n```cron\n# Runs reindexing daily at 2 AM (only on primary)\n0 2 * * * /usr/local/bin/index_maintenance.sh\n```\n\n**💡 Best Practices:**\n- Schedule during low-traffic periods\n- Avoid overlapping with backup or other IO-intensive operations\n- Consider hourly runs for high-write workloads\n- Monitor resource usage during initial runs (first of all, both disk IO and CPU usage)\n\n## Uninstalling pg_index_pilot\n\nTo completely remove pg_index_pilot from your database:\n\n```bash\n# Uninstall the tool (this will delete all collected statistics!)\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d your_database -f uninstall.sql\n\n# Check for any leftover invalid indexes from failed reindexes\npsql -h your-instance.region.rds.amazonaws.com -U postgres -d 
your_database \\\n  -c \"select format('drop index concurrently if exists %I.%I;', n.nspname, i.relname) \n      from pg_index idx\n      join pg_class i on i.oid = idx.indexrelid\n      join pg_namespace n on n.oid = i.relnamespace\n      where i.relname ~ '_ccnew[0-9]*$'\n      and not idx.indisvalid;\"\n\n# Run any drop index commands from the previous query manually\n```\n\n**Note:** The uninstall script will:\n- Remove the `index_pilot` schema and all its objects\n- Remove the FDW server configuration\n- List any invalid `_ccnew*` indexes that need manual cleanup\n- Preserve the `postgres_fdw` extension (may be used by other tools)\n\n## Updating pg_index_pilot\n\nTo update to the latest version:\n```bash\ncd pg_index_pilot\ngit pull\n\n# Reload the updated functions (or reinstall completely)\npsql -1 -d your_database -f index_pilot_functions.sql\npsql -1 -d your_database -f index_pilot_fdw.sql\n```\n\n## Monitoring and Analysis\n\n### View Reindexing History\n```sql\n-- Show recent reindexing operations with status\nselect \n    schemaname, relname, indexrelname,\n    pg_size_pretty(indexsize_before::bigint) as size_before,\n    pg_size_pretty(indexsize_after::bigint) as size_after,\n    reindex_duration,\n    status,\n    case when error_message is not null then left(error_message, 50) else null end as error,\n    entry_timestamp\nfrom index_pilot.reindex_history \norder by entry_timestamp desc \nlimit 20;\n\n-- Show only failed reindexes for debugging\nselect \n    schemaname, relname, indexrelname,\n    pg_size_pretty(indexsize_before::bigint) as size_before,\n    reindex_duration,\n    error_message,\n    entry_timestamp\nfrom index_pilot.reindex_history \nwhere status = 'failed'\norder by entry_timestamp desc;\n```\n\n**💡 Tip:** Use the convenient `index_pilot.history` view for formatted output:\n```sql\n-- View recent operations with formatted sizes and status\nselect * from index_pilot.history limit 20;\n\n-- View only failed operations\nselect * from 
index_pilot.history where status = 'failed';\n```\n\n### Check Current Bloat Status\n```sql\n-- Check bloat estimates for current database\nselect \n    indexrelname,\n    pg_size_pretty(indexsize::bigint) as current_size,\n    round(estimated_bloat::numeric, 1)||'x' as bloat_now\nfrom index_pilot.get_index_bloat_estimates(current_database()) \norder by estimated_bloat desc nulls last \nlimit 40;\n```\n\n### Baseline, candidates, and exclusions (quick reference)\n\n```sql\n-- Initialize baseline without reindex (sets best_ratio for large indexes)\nselect index_pilot.do_force_populate_index_stats('\u003cTARGET_DB\u003e', null, null, null);\n\n-- List what periodic(true) would take under current thresholds\nselect\n  schemaname, relname, indexrelname,\n  pg_size_pretty(indexsize) as size,\n  round(estimated_bloat::numeric, 2) as bloat_x\nfrom index_pilot.get_index_bloat_estimates('\u003cTARGET_DB\u003e')\nwhere indexsize \u003e= pg_size_bytes(index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'index_size_threshold'))\n  and coalesce(index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'skip')::boolean, false) = false\n  and (estimated_bloat is null\n       or estimated_bloat \u003e= index_pilot.get_setting(datname, schemaname, relname, indexrelname, 'index_rebuild_scale_factor')::float)\norder by estimated_bloat desc nulls first\nlimit 50;\n\n-- Exclude service schemas if desired\nselect index_pilot.set_or_replace_setting('\u003cTARGET_DB\u003e','pg_toast',null,null,'skip','true',null);\nselect index_pilot.set_or_replace_setting('\u003cTARGET_DB\u003e','_timescaledb_internal',null,null,'skip','true',null);\n```\n\nNotes:\n- Baseline sets best_ratio to current size/tuples; immediately after, bloat_x ≈ 1.0 and will grow as indexes bloat.\n- Small indexes (\u003c minimum_reliable_index_size, default 128kB) skip best_ratio to avoid noise; candidates are still gated by index_size_threshold (default 
10MB).\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgres-ai%2Fpg_index_pilot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpostgres-ai%2Fpg_index_pilot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgres-ai%2Fpg_index_pilot/lists"}