{"id":15405531,"url":"https://github.com/macbre/index-digest","last_synced_at":"2025-04-05T07:01:56.759Z","repository":{"id":38324418,"uuid":"105176212","full_name":"macbre/index-digest","owner":"macbre","description":"Analyses your database queries and schema and suggests indices and schema improvements","archived":false,"fork":false,"pushed_at":"2024-10-25T03:58:10.000Z","size":753,"stargazers_count":76,"open_issues_count":14,"forks_count":5,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-10-25T06:47:28.394Z","etag":null,"topics":["code-quality","database-perfomance","database-queries","dba","digest","docker-image","index","linter","mariadb","mysql","performance","python","query-digest","schema","slow-queries","sql","sql-logs","sqlcheck","sre","sustainable-software"],"latest_commit_sha":null,"homepage":"https://medium.com/legacy-systems-diary/linting-your-database-schema-cd8947835a52","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/macbre.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-28T17:05:09.000Z","updated_at":"2024-10-25T03:57:25.000Z","dependencies_parsed_at":"2023-02-18T02:40:18.658Z","dependency_job_id":"6e480b7c-0f49-4754-838f-3151ce85ae08","html_url":"https://github.com/macbre/index-digest","commit_stats":{"total_commits":790,"total_committers":7,"mean_commits":"112.85714285714286","dds":"0.32784810126582276","last_synced_commit":"fc15bce75272f0a2a09ccac34d6662cd4d325eda"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"rep
ository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macbre%2Findex-digest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macbre%2Findex-digest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macbre%2Findex-digest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macbre%2Findex-digest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/macbre","download_url":"https://codeload.github.com/macbre/index-digest/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247299829,"owners_count":20916190,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["code-quality","database-perfomance","database-queries","dba","digest","docker-image","index","linter","mariadb","mysql","performance","python","query-digest","schema","slow-queries","sql","sql-logs","sqlcheck","sre","sustainable-software"],"created_at":"2024-10-01T16:16:56.862Z","updated_at":"2025-04-05T07:01:56.716Z","avatar_url":"https://github.com/macbre.png","language":"Python","funding_links":[],"categories":["Database tools"],"sub_categories":[],"readme":"# index-digest\n\n[![PyPI](https://img.shields.io/pypi/v/indexdigest.svg)](https://pypi.python.org/pypi/indexdigest)\n[![Docker Hub](https://img.shields.io/docker/pulls/macbre/index-digest.svg)](https://hub.docker.com/r/macbre/index-digest/)\n[![Coverage 
Status](https://coveralls.io/repos/github/macbre/index-digest/badge.svg?branch=master)](https://coveralls.io/github/macbre/index-digest?branch=master)\n\nAnalyses your database queries and schema and suggests indices improvements. You can use `index-digest` as **your database linter**. The goal is to **provide the user with actionable reports** instead of just a list of statistics and schema details. Inspired by [Percona's `pt-index-usage`](https://www.percona.com/doc/percona-toolkit/LATEST/pt-index-usage.html).\n\n**NEW** You can also [use `index-digest` as GitHub's Action](https://github.com/marketplace/actions/index-digest).\n\n## What this tool does\n\n`index-digest` does the following:\n\n* it checks the schema of all tables in a given database and suggests improvements (e.g. removal of redundant indices, adding a primary key to ease replication, dropping tables with just a single column or no rows)\n* if provided with SQL queries log (via `--sql-log` option) it:\n  * checks if all tables, columns and indices are used by these queries\n  * reports text columns with character set different than `utf`\n  * reports queries that do not use indices\n  * reports queries that use filesort, temporary file or full table scan\n  * reports queries that are not quite kosher (e.g. 
`LIKE \"%foo%\"`, `INSERT IGNORE`, `SELECT *`, `HAVING` clause, high `OFFSET` in pagination queries)\n* if run with `--analyze-data` switch it:\n  * reports tables with old data (by querying for `MIN()` value of time column) where data retention can be reviewed\n  * reports tables with not up-to-date data (by querying for `MAX()` value of time column)\n* if run with `--check-empty-databases` switch it:\n  * reports empty databases on the current MySQL server\n\nThis tool **supports MySQL 5.7, 8.0, 8.1, [Percona Server](https://www.percona.com/software/mysql-database/percona-server) 8.0 and MariaDB 10.1, 10.2, 10.5, 10.6** and runs under **Python 3.8+**.\n\nResults can be reported in a human-readable form, as YAML or sent to syslog and later aggregated \u0026 processed using ELK stack.\n\n## Requirements \u0026 install\n\n### From `pypi`\n\n```\npip install indexdigest\n```\n\n### From git\n\n```\ngit clone git@github.com:macbre/index-digest.git \u0026\u0026 cd index-digest\nsudo apt-get install libmysqlclient-dev python3-dev virtualenv\n\nvirtualenv -ppython3 env\nsource env/bin/activate\nmake install\n```\n\nWhen using macOS, you should follow [these `mysql_config` installation steps](https://stackoverflow.com/a/25491082).\n\n#### Running tests\n\n**We assume that the test database is running locally on port 53306**. 
You can use the following to test your changes locally before pushing them (this one uses MySQL 8.0.22):\n\n```\ndocker run --rm -p 53306:3306 --health-cmd=\"mysqladmin ping\" --health-interval=10s --health-timeout=5s --health-retries=3 -e \"MYSQL_ALLOW_EMPTY_PASSWORD=yes\" -e \"MYSQL_DATABASE=index_digest\" --name=index_digest_mysql mysql:8.0.22 \"--default-authentication-plugin=mysql_native_password\"\n```\n\nWait until the server is up and running.\n\n```\nmysql --protocol=tcp --port=53306 -u root --password=\"\" -v \u003c setup.sql\n./sql/populate.sh\nmysql --protocol=tcp --port=53306 -uindex_digest -pqwerty index_digest -v -e '\\s; SHOW TABLES; SHOW DATABASES;'\n\nmake test\n```\n\n### Using Docker\n\n\u003e See https://hub.docker.com/r/macbre/index-digest/\n\n```\n$ docker run --network=host -t macbre/index-digest:latest mysql://index_digest:qwerty@debian/index_digest  | head -n 20\n------------------------------------------------------------\nFound 61 issue(s) to report for \"index_digest\" database\n------------------------------------------------------------\nMySQL v5.7.22 at debian\nindex-digest v1.2.0\n------------------------------------------------------------\nredundant_indices → table affected: 0004_id_foo\n\n✗ \"idx\" index can be removed as redundant (covered by \"PRIMARY\")\n\n  - redundant: UNIQUE KEY idx (item_id, foo)\n  - covered_by: PRIMARY KEY (item_id, foo)\n  - schema: CREATE TABLE `0004_id_foo` (\n      `item_id` int(9) NOT NULL AUTO_INCREMENT,\n      `foo` varbinary(16) NOT NULL DEFAULT '',\n      PRIMARY KEY (`item_id`,`foo`),\n      UNIQUE KEY `idx` (`item_id`,`foo`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n  - table_data_size_mb: 0.015625\n  - table_index_size_mb: 0.015625\n...\n```\n\n## How to run it?\n\n```\n$ index_digest -h\nindex_digest\n\nAnalyses your database queries and schema and suggests indices improvements.\n\nUsage:\n  index_digest DSN [--sql-log=\u003cfile\u003e] [--format=\u003cformatter\u003e] [--analyze-data] 
[--checks=\u003cchecks\u003e | --skip-checks=\u003cskip-checks\u003e] [--tables=\u003ctables\u003e | --skip-tables=\u003cskip-tables\u003e]\n  index_digest (-h | --help)\n  index_digest --version\n\nOptions:\n  DSN               Data Source Name of database to check\n  --sql-log=\u003cfile\u003e  Text file with SQL queries to check against the database\n  --format=\u003cformatter\u003e  Use a given results formatter (plain, syslog, yaml)\n  --analyze-data    Run additional checks that will query table data (can be slow!)\n  --checks=\u003clist\u003e   Comma-separated lists of checks to report\n  --skip-checks=\u003clist\u003e Comma-separated lists of checks to skip from report\n  --tables=\u003clist\u003e   Comma-separated lists of tables to report\n  --skip-tables=\u003clist\u003e Comma-separated lists of tables to skip from report\n  -h --help         Show this screen.\n  --version         Show version.\n\nExamples:\n  index_digest mysql://username:password@localhost/dbname\n  index_digest mysql://index_digest:qwerty@localhost/index_digest --sql-log=sql.log\n  index_digest mysql://index_digest:qwerty@localhost/index_digest --skip-checks=non_utf_columns\n  index_digest mysql://index_digest:qwerty@localhost/index_digest --analyze-data --checks=data_too_old,data_not_updated_recently\n  index_digest mysql://index_digest:qwerty@localhost/index_digest --analyze-data --skip-tables=DATABASECHANGELOG,DATABASECHANGELOGLOCK\n\nVisit \u003chttps://github.com/macbre/index-digest\u003e\n```\n\n## SQL query log\n\nIt's a text file with a single SQL query in each line (no line breaks are allowed). Lines that do start with `--` (SQL comment) are ignored. 
The file can be [generated using `query-digest` when `--sql-log` output mode is selected](https://github.com/macbre/query-digest#output-modes).\n\nAn example:\n\n```sql\n-- A comment\nselect * from 0002_not_used_indices order by id\nselect * from 0002_not_used_indices where foo = 'foo' and id = 2\nselect count(*) from 0002_not_used_indices where foo = 'foo'\n/* foo bar */ select * from 0002_not_used_indices where bar = 'foo'\nINSERT  IGNORE INTO `0070_insert_ignore` VALUES ('123', 9, '2017-01-01');\n```\n\n### From [MySQL slow query log](https://dev.mysql.com/doc/refman/8.0/en/slow-query-log.html)\n\nMySQL's slow query log needs to be pre-processed first (to remove comments and timestamps):\n\n```\ncat mysql-slow.log | egrep -v '^(SET timestamp|#|throttle: )' \u003e queries.log\n```\n\nThen you can run `index_digest --sql-log=queries.log ...`.\n\n## Formatters\n\n`index-digest` can return results in various formats (use `--format` to choose one).\n\n### plain\n\nEmits human-readable report to a console. 
You can disable colored and bold text by setting env variable `ANSI_COLORS_DISABLED=1`.\n\n### syslog\n\nPushes JSON-formatted messages via syslog, so they can be aggregated using ELK stack.\nUse `SYSLOG_IDENT` env variable to customize the syslog `ident` that messages are sent with (defaults to `index-digest`).\n\n```\nDec 28 15:59:58 debian index-digest[17485]: {\"meta\": {\"version\": \"index-digest v0.1.0\", \"database_name\": \"index_digest\", \"database_host\": \"debian\", \"database_version\": \"MySQL v5.7.20\"}, \"report\": {\"type\": \"redundant_indices\", \"table\": \"0004_id_foo\", \"message\": \"\\\"idx\\\" index can be removed as redundant (covered by \\\"PRIMARY\\\")\", \"context\": {\"redundant\": \"UNIQUE KEY idx (id, foo)\", \"covered_by\": \"PRIMARY KEY (id, foo)\", \"schema\": \"CREATE TABLE `0004_id_foo` (\\n  `id` int(9) NOT NULL AUTO_INCREMENT,\\n  `foo` varbinary(16) NOT NULL DEFAULT '',\\n  PRIMARY KEY (`id`,`foo`),\\n  UNIQUE KEY `idx` (`id`,`foo`)\\n) ENGINE=InnoDB DEFAULT CHARSET=latin1\", \"table_data_size_mb\": 0.015625, \"table_index_size_mb\": 0.015625}}}\n```\n\n### yaml\n\nOutputs a YAML file with results and metadata.\n\n## Checks\n\nYou can select which checks should be reported by the tool by using `--checks` command line option. Certain checks can also be skipped via `--skip-checks` option. Refer to `index_digest --help` for examples.\n\n\u003e **Number of checks**: 24\n\n* `redundant_indices`: reports indices that are redundant and covered by other indices\n* `non_utf_columns`: reports text columns that have character encoding set to `latin1` (utf is the way to go)\n* `missing_primary_index`: reports tables with no primary or unique key (see [MySQL bug #76252](https://bugs.mysql.com/bug.php?id=76252) and [Wikia/app#9863](https://github.com/Wikia/app/pull/9863)). 
[Primary keys can be enforced on MySQL config level](https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_require_primary_key) since 8.0.13 (via `sql_require_primary_key` variable).\n* `test_tables`: reports tables that seem to be test leftovers (e.g. `some_guy_test_table`)\n* `single_column`: reports tables with just a single column\n* `empty_tables`: reports tables with no rows\n* `generic_primary_key`: reports tables with [a primary key on `id` column](https://github.com/jarulraj/sqlcheck/blob/master/docs/logical/1004.md) (a more meaningful name should be used)\n* `use_innodb`: reports table using storage engines different than `InnoDB` (a default for MySQL 5.5+ and MariaDB 10.2+)\n* `low_cardinality_index`: reports [indices with low cardinality](https://github.com/macbre/index-digest/issues/31)\n\n### Additional checks performed on SQL log\n\n\u003e You need to provide SQL log file via `--sql-log` option\n\n* `not_used_columns`: checks which columns were not used by SELECT queries\n* `not_used_indices`: checks which indices are not used by SELECT queries\n* `not_used_tables`: checks which tables are not used by SELECT queries\n* `queries_not_using_index`: reports SELECT queries that do not use any index\n* `queries_using_filesort`: reports SELECT queries that require filesort ([a sort can’t be performed from an index and quicksort is used](https://www.percona.com/blog/2009/03/05/what-does-using-filesort-mean-in-mysql/))\n* `queries_using_temporary`: reports SELECT queries that require a temporary table to hold the result\n* `queries_using_full_table_scan`: reports SELECT queries that require a [full table scan](https://dev.mysql.com/doc/refman/5.7/en/table-scan-avoidance.html)\n* `selects_with_like`: reports SELECT queries that use `LIKE '%foo'` conditions (they can not use an index)\n* `insert_ignore`: reports [queries using `INSERT 
IGNORE`](https://medium.com/legacy-systems-diary/things-to-avoid-episode-1-insert-ignore-535b4c24406b)\n* `select_star`: reports [queries using `SELECT *`](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3001.md)\n* `having_clause`: reports [queries using `HAVING` clause](https://github.com/jarulraj/sqlcheck/blob/master/docs/query/3012.md)\n* `high_offset_selects`: reports [SELECT queries using high OFFSET](https://www.percona.com/blog/2008/09/24/four-ways-to-optimize-paginated-displays/)\n\n### Additional checks performed on tables data\n\n\u003e You need to use `--analyze-data` command line switch. Please note that these checks will query your tables. **These checks can take a while if queried columns are not indexed**.\n\n* `data_too_old`: reports tables that have really old data; it may be worth checking whether such long data retention is actually needed (**defaults to a three-month threshold**, can be customized via `INDEX_DIGEST_DATA_TOO_OLD_THRESHOLD_DAYS` env variable)\n* `data_not_updated_recently`: reports tables that were not updated recently; check whether they should be up-to-date (**defaults to a one-month threshold**, can be customized via `INDEX_DIGEST_DATA_NOT_UPDATED_RECENTLY_THRESHOLD_DAYS` env variable)\n\n### Additional checks performed across database on the current MySQL server\n\n\u003e You need to use `--check-empty-databases` command line switch.\n\n* `empty_database`: reports databases that have no `BASE TABLE` tables (as provided by `information_schema.TABLES`)\n\n## An example report\n\n```sql\n$ index_digest mysql://index_digest:qwerty@localhost/index_digest --sql-log sql/0002-not-used-indices-log \n------------------------------------------------------------\nFound 85 issue(s) to report for \"index_digest\" database\n------------------------------------------------------------\nMySQL v5.7.21 at debian\nindex-digest v1.0.0\n------------------------------------------------------------\nredundant_indices → table affected: 0004_id_foo\n\n✗ 
\"idx\" index can be removed as redundant (covered by \"PRIMARY\")\n\n  - redundant: UNIQUE KEY idx (id, foo)\n  - covered_by: PRIMARY KEY (id, foo)\n  - schema: CREATE TABLE `0004_id_foo` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      `foo` varbinary(16) NOT NULL DEFAULT '',\n      PRIMARY KEY (`id`,`foo`),\n      UNIQUE KEY `idx` (`id`,`foo`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n  - table_data_size_mb: 0.015625\n  - table_index_size_mb: 0.015625\n\n------------------------------------------------------------\nredundant_indices → table affected: 0004_id_foo_bar\n\n✗ \"idx_foo\" index can be removed as redundant (covered by \"idx_foo_bar\")\n\n  - redundant: KEY idx_foo (foo)\n  - covered_by: KEY idx_foo_bar (foo, bar)\n  - schema: CREATE TABLE `0004_id_foo_bar` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      `foo` varbinary(16) NOT NULL DEFAULT '',\n      `bar` varbinary(16) NOT NULL DEFAULT '',\n      PRIMARY KEY (`id`),\n      KEY `idx_foo` (`foo`),\n      KEY `idx_foo_bar` (`foo`,`bar`),\n      KEY `idx_id_foo` (`id`,`foo`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n  - table_data_size_mb: 0.015625\n  - table_index_size_mb: 0.046875\n\n------------------------------------------------------------\nmissing_primary_index → table affected: 0034_querycache\n\n✗ \"0034_querycache\" table does not have any primary or unique index\n\n  - schema: CREATE TABLE `0034_querycache` (\n      `qc_type` varbinary(32) NOT NULL,\n      `qc_value` int(10) unsigned NOT NULL DEFAULT '0',\n      `qc_namespace` int(11) NOT NULL DEFAULT '0',\n      `qc_title` varchar(255) CHARACTER SET latin1 COLLATE latin1_bin NOT NULL DEFAULT '',\n      KEY `qc_type` (`qc_type`,`qc_value`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=utf8\n\n------------------------------------------------------------\ntest_tables → table affected: 0075_some_guy_test_table\n\n✗ \"0075_some_guy_test_table\" seems to be a test table\n\n  - schema: CREATE TABLE `0075_some_guy_test_table` (\n      `id` 
int(9) NOT NULL AUTO_INCREMENT,\n      `name` varchar(255) NOT NULL,\n      PRIMARY KEY (`id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=utf8\n\n------------------------------------------------------------\nsingle_column → table affected: 0074_bag_of_ints\n\n✗ \"0074_bag_of_ints\" has just a single column\n\n  - schema: CREATE TABLE `0074_bag_of_ints` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      PRIMARY KEY (`id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=utf8\n\n------------------------------------------------------------\nempty_tables → table affected: 0089_empty_table\n\n✗ \"0089_empty_table\" table has no rows, is it really needed?\n\n  - schema: CREATE TABLE `0089_empty_table` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      PRIMARY KEY (`id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n\n------------------------------------------------------------\ngeneric_primary_key → table affected: 0094_generic_primary_key\n\n✗ \"0094_generic_primary_key\" has a primary key called id, use a more meaningful name\n\n  - schema: CREATE TABLE `0094_generic_primary_key` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      `foo` varchar(16) NOT NULL DEFAULT '',\n      PRIMARY KEY (`id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n\n------------------------------------------------------------\nuse_innodb → table affected: 0036_use_innodb_myisam\n\n✗ \"0036_use_innodb_myisam\" uses MyISAM storage engine\n\n  - schema: CREATE TABLE `0036_use_innodb_myisam` (\n      `item_id` int(9) NOT NULL AUTO_INCREMENT,\n      `foo` int(8) DEFAULT NULL,\n      PRIMARY KEY (`item_id`)\n    ) ENGINE=MyISAM DEFAULT CHARSET=latin1\n  - engine: MyISAM\n\n------------------------------------------------------------\nnot_used_indices → table affected: 0002_not_used_indices\n\n✗ \"test_id_idx\" index was not used by provided queries\n\n  - not_used_index: KEY test_id_idx (test, id)\n\n------------------------------------------------------------\nnot_used_tables → table affected: 0020_big_table\n\n✗ 
\"0020_big_table\" table was not used by provided queries\n\n  - schema: CREATE TABLE `0020_big_table` (\n      `id` int(9) NOT NULL AUTO_INCREMENT,\n      `val` int(9) NOT NULL,\n      `text` char(5) NOT NULL,\n      PRIMARY KEY (`id`),\n      KEY `text_idx` (`text`)\n    ) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8\n  - table_size_mb: 5.03125\n  - rows_estimated: 100405\n\n------------------------------------------------------------\ninsert_ignore → table affected: 0070_insert_ignore\n\n✗ \"INSERT IGNORE INTO `0070_insert_ignore` VALUES (9,...\" query uses a risky INSERT IGNORE\n\n  - query: INSERT IGNORE INTO `0070_insert_ignore` VALUES (9, '123', '2017-01-01');\n  - schema: CREATE TABLE `0070_insert_ignore` (\n      `id` int(9) NOT NULL,\n      `text` char(5) NOT NULL,\n      `time` datetime DEFAULT NULL,\n      UNIQUE KEY `id` (`id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=utf8\n\n------------------------------------------------------------\nnon_utf_columns → table affected: 0032_latin1_table\n\n✗ \"name\" text column has \"latin1\" character set defined\n\n  - column: name\n  - column_character_set: latin1\n  - column_collation: latin1_swedish_ci\n  - schema: CREATE TABLE `0032_latin1_table` (\n      `item_id` int(9) NOT NULL AUTO_INCREMENT,\n      `name` varchar(255) DEFAULT NULL,\n      `utf8_column` varchar(255) CHARACTER SET utf8 COLLATE utf8_polish_ci NOT NULL,\n      `ucs2_column` varchar(255) CHARACTER SET ucs2 DEFAULT NULL,\n      `utf8mb4_column` varchar(255) CHARACTER SET utf8mb4 DEFAULT NULL,\n      `utf16_column` varchar(255) CHARACTER SET utf16 DEFAULT NULL,\n      `utf32_column` varchar(255) CHARACTER SET utf32 DEFAULT NULL,\n      `binary_column` varbinary(255) DEFAULT NULL,\n      `latin_blob` blob,\n      PRIMARY KEY (`item_id`)\n    ) ENGINE=InnoDB DEFAULT CHARSET=latin1\n\n------------------------------------------------------------\n\n(...)\n\n------------------------------------------------------------\nqueries_using_filesort → 
table affected: 0020_big_table\n\n✗ \"SELECT val, count(*) FROM 0020_big_table WHERE id ...\" query used filesort\n\n  - query: SELECT val, count(*) FROM 0020_big_table WHERE id BETWEEN 10 AND 20 GROUP BY val\n  - explain_extra: Using where; Using temporary; Using filesort\n  - explain_rows: 11\n  - explain_filtered: None\n  - explain_key: PRIMARY\n\n------------------------------------------------------------\nqueries_using_temporary → table affected: 0020_big_table\n\n✗ \"SELECT val, count(*) FROM 0020_big_table WHERE id ...\" query used temporary\n\n  - query: SELECT val, count(*) FROM 0020_big_table WHERE id BETWEEN 10 AND 20 GROUP BY val\n  - explain_extra: Using where; Using temporary; Using filesort\n  - explain_rows: 11\n  - explain_filtered: None\n  - explain_key: PRIMARY\n\n------------------------------------------------------------\nqueries_using_full_table_scan → table affected: 0020_big_table\n\n✗ \"SELECT * FROM 0020_big_table\" query triggered full table scan\n\n  - query: SELECT * FROM 0020_big_table\n  - explain_rows: 9041\n\n------------------------------------------------------------\nselects_with_like → table affected: 0020_big_table\n\n✗ \"SELECT * FROM 0020_big_table WHERE text LIKE '%00'\" query uses LIKE with left-most wildcard\n\n  - query: SELECT * FROM 0020_big_table WHERE text LIKE '%00'\n  - explain_extra: Using where\n  - explain_rows: 100623\n\n------------------------------------------------------------\nselect_star → table affected: bar\n\n✗ \"SELECT t.* FROM bar AS t\" query uses SELECT *\n\n  - query: SELECT t.* FROM bar AS t;\n\n------------------------------------------------------------\nhaving_clause → table affected: sales\n\n✗ \"SELECT s.cust_id,count(s.cust_id) FROM SH.sales s ...\" query uses HAVING clause\n\n  - query: SELECT s.cust_id,count(s.cust_id) FROM SH.sales s GROUP BY s.cust_id HAVING s.cust_id != '1660' AND s.cust_id != 
'2'\n\n(...)\n\n------------------------------------------------------------\nlow_cardinality_index → table affected: 0020_big_table\n\n✗ \"num_idx\" index on \"num\" column has low cardinality, check if it is needed\n\n  - column_name: num\n  - index_name: num_idx\n  - index_cardinality: 2\n  - schema: CREATE TABLE `0020_big_table` (\n      `item_id` int(9) NOT NULL AUTO_INCREMENT,\n      `val` int(9) NOT NULL,\n      `text` char(5) NOT NULL,\n      `num` int(3) NOT NULL,\n      PRIMARY KEY (`item_id`),\n      KEY `text_idx` (`text`),\n      KEY `num_idx` (`num`)\n    ) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=utf8\n  - value_usage: 33.24788541334185\n\n(...)\n\n------------------------------------------------------------\ndata_too_old → table affected: 0028_data_too_old\n\n✗ \"0028_data_too_old\" has rows added 184 days ago, consider changing retention policy\n\n  - diff_days: 184\n  - data_since: 2017-08-17 12:03:44\n  - data_until: 2018-02-17 12:03:44\n  - date_column_name: timestamp\n  - schema: CREATE TABLE `0028_data_too_old` (\n      `item_id` int(8) unsigned NOT NULL AUTO_INCREMENT,\n      `cnt` int(8) unsigned NOT NULL,\n      `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n      PRIMARY KEY (`item_id`)\n    ) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1\n  - rows: 4\n  - table_size_mb: 0.015625\n\n------------------------------------------------------------\ndata_not_updated_recently → table affected: 0028_data_not_updated_recently\n\n✗ \"0028_data_not_updated_recently\" has the latest row added 40 days ago, consider checking if it should be up-to-date\n\n  - diff_days: 40\n  - data_since: 2017-12-29 12:03:44\n  - data_until: 2018-01-08 12:03:44\n  - date_column_name: timestamp\n  - schema: CREATE TABLE `0028_data_not_updated_recently` (\n      `item_id` int(8) unsigned NOT NULL AUTO_INCREMENT,\n      `cnt` int(8) unsigned NOT NULL,\n      `timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP 
ON UPDATE CURRENT_TIMESTAMP,\n      PRIMARY KEY (`item_id`)\n    ) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1\n  - rows: 3\n  - table_size_mb: 0.015625\n\n------------------------------------------------------------\nhigh_offset_selects → table affected: page\n\n✗ \"SELECT /* CategoryPaginationViewer::processSection...\" query uses too high offset impacting the performance\n\n  - query: SELECT /* CategoryPaginationViewer::processSection */  page_namespace,page_title,page_len,page_is_redirect,cl_sortkey_prefix  FROM `page` INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey) ON ((cl_from = page_id))  WHERE cl_type = 'page' AND cl_to = 'Spotify/Song'  ORDER BY cl_sortkey LIMIT 927600,200\n  - limit: 200\n  - offset: 927600\n\n------------------------------------------------------------\nempty_database → table affected: index_digest_empty\n\n✗ \"index_digest_empty\" database has no tables\n\n------------------------------------------------------------\nQueries performed: 100\n```\n\n## Success stories\n\n\u003e Want to add your entry here? 
Submit a pull request\n\n* By running `index-digest` at [Wikia](http://wikia.com) on shared database clusters (including tables storing ~450 million rows with 300+ GiB of data) we were able to [reclaim around 1.25 TiB of MySQL storage space across all replicas](https://medium.com/legacy-systems-diary/linting-your-database-schema-cd8947835a52).\n\n## Read more\n\n* [Percona Database Performance Blog](https://www.percona.com/blog/)\n* [High Performance MySQL, 3rd Edition by Vadim Tkachenko, Peter Zaitsev, Baron Schwartz](https://www.safaribooksonline.com/library/view/high-performance-mysql/9781449332471/ch05.html)\n* [Percona | Indexing 101: Optimizing MySQL queries on a single table](https://www.percona.com/blog/2015/04/27/indexing-101-optimizing-mysql-queries-on-a-single-table/)\n* [Percona | `pt-index-usage`](https://www.percona.com/doc/percona-toolkit/LATEST/pt-index-usage.html) / [find unused indexes](https://www.percona.com/blog/2012/06/30/find-unused-indexes/)\n\n### Slides\n\n* [Percona | MySQL Indexing: Best Practices](https://www.percona.com/files/presentations/WEBINAR-MySQL-Indexing-Best-Practices.pdf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmacbre%2Findex-digest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmacbre%2Findex-digest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmacbre%2Findex-digest/lists"}