Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/anse1/sqlsmith
A random SQL query generator
https://github.com/anse1/sqlsmith
fuzz-testing monetdb postgresql sqlite3
Last synced: 2 months ago
JSON representation
A random SQL query generator
- Host: GitHub
- URL: https://github.com/anse1/sqlsmith
- Owner: anse1
- License: gpl-3.0
- Created: 2015-06-01T10:41:29.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2024-01-05T17:43:08.000Z (about 1 year ago)
- Last Synced: 2024-10-02T07:10:01.452Z (4 months ago)
- Topics: fuzz-testing, monetdb, postgresql, sqlite3
- Language: C++
- Homepage:
- Size: 442 KB
- Stars: 744
- Watchers: 33
- Forks: 126
- Open Issues: 14
-
Metadata Files:
- Readme: README.org
- License: COPYING
Awesome Lists containing this project
- awesome-software-engineering-tools - SQLsmith
README
[[logo.png]]
* SQLsmith: "I love the smell of coredumps in the morning"
** Description
SQLsmith is a random SQL query generator. Its paragon is [[https://embed.cs.utah.edu/csmith/][Csmith]],
which proved valuable for quality assurance in C compilers.It currently supports generating queries for PostgreSQL, SQLite 3 and
MonetDB. To add support for another RDBMS, you need to implement two
classes providing schema information about and connectivity to the
device under test.Besides developers of the RDBMS products, users developing extensions
might also be interested in exposing their code to SQLsmith's random
workload.Since 2015, it found 118 bugs in alphas, betas and releases in the
aforementioned products, including security vulnerabilities in
released versions. Additional bugs were squashed in extensions and
libraries such as orafce and glibc.https://github.com/anse1/sqlsmith/wiki#score-list
** Dependencies
- C++11
- libpqxxoptional:
- boost::regex in case your std::regex is broken
- SQLite3
- monetdb_mapi** Building on Debian
: apt-get install build-essential autoconf autoconf-archive libpqxx-dev libboost-regex-dev libsqlite3-dev
: cd sqlsmith
: autoreconf -i # Avoid when building from a release tarball
: ./configure
: make** Building on OSX
In order to build on Mac OSX, assuming you use Homebrew, run the following
: brew install libpqxx automake libtool autoconf autoconf-archive pkg-config
: cd sqlsmith
: autoreconf -i # Avoid when building from a release tarball
: ./configure
: make** Usage
SQLsmith connects to the target database to retrieve the schema for
query generation and to send the generated queries to. Currently, all
generated statements are rolled back. Beware that SQLsmith does call
functions that could possibly have side-effects
(e.g. pg_terminate_backend). Use a suitably *underprivileged user*
for its connection to avoid this.Example invocations:
: # testing Postgres
: sqlsmith --verbose --target="host=/tmp port=65432 dbname=regression"
: # testing SQLite
: sqlsmith --verbose --sqlite="file:$HOME/.mozilla/firefox/places.sqlite?mode=ro"
: # testing MonetDB
: sqlsmith --verbose --monetdb="mapi:monetdb://localhost:50000/smith"The following options are currently supported:
| =--target=connstr= | target postgres database (default: libpq defaults) |
| =--sqlite=URI= | target SQLite3 database |
| =--monetdb=URI= | target MonetDB database |
| =--log-to=connstr= | postgres db for logging errors into (default: don't log) |
| =--verbose= | emit progress output |
| =--version= | show version information |
| =--seed=int= | seed RNG with specified integer instead of PID |
| =--dry-run= | print queries instead of executing them |
| =--max-queries=long= | terminate after generating this many queries |
| =--exclude-catalog= | don't generate queries using catalog relations |
| =--dump-all-queries= | dump queries as they are generated |
| =--dump-all-graphs= | dump generated ASTs for debugging |
| =--rng-state=string= | deserialize dumped rng state |Sample output:
=--verbose= makes sqlsmith emit some progress indication to stderr. A
symbol is output for each query sent to the server. Currently the
following ones are generated:| symbol | meaning | details |
|--------+-------------------+-----------------------------------------------|
| . | ok | Query generated and executed with ok sqlstate |
| S | syntax error | These are bugs in sqlsmith - please report |
| t | timeout | SQLsmith sets a statement timeout of 1s |
| C | broken connection | These happen when a query crashes the server |
| e | other error | |When you test against a RDBMS that doesn't support some of SQLsmith's
grammar, there will be a burst of syntax errors on startup. These
should disappear after some time as SQLsmith blacklists productions
that consistently lead to errors.=--verbose= will also periodically emit error reports. In the
following example, these are mostly caused by the primitive type
system.: queries: 39000 (202.399 gen/s, 298.942 exec/s)
: AST stats (avg): height = 5.599 nodes = 37.8489
: 82 ERROR: invalid regular expression: quantifier operand invalid
: 70 ERROR: canceling statement due to statement timeout
: 44 ERROR: operator does not exist: point = point
: 27 ERROR: operator does not exist: xml = xml
: 22 ERROR: cannot compare arrays of different element types
: 11 ERROR: could not determine which collation to use for string comparison
: 5 ERROR: invalid regular expression: nfa has too many states
: 4 ERROR: cache lookup failed for index 2619
: 4 ERROR: invalid regular expression: brackets [] not balanced
: 3 ERROR: operator does not exist: polygon = polygon
: 2 ERROR: invalid regular expression: parentheses () not balanced
: 1 ERROR: invalid regular expression: invalid character range
: error rate: 0.00705128The only one that looks interesting here is the cache lookup one.
Taking a closer look at it reveals that it happens when you query a
certain catalog view like this:: self=# select indexdef from pg_catalog.pg_indexes where indexdef is not NULL;
: FEHLER: cache lookup failed for index 2619This is because the planner then puts =pg_get_indexdef(oid)= in a
context where it sees non-index-oids, which causes it to croak:: QUERY PLAN
: ------------------------------------------------------------------------------------
: Hash Join (cost=17.60..30.65 rows=9 width=4)
: Hash Cond: (i.oid = x.indexrelid)
: -> Seq Scan on pg_class i (cost=0.00..12.52 rows=114 width=8)
: Filter: ((pg_get_indexdef(oid) IS NOT NULL) AND (relkind = 'i'::"char"))
: -> Hash (cost=17.31..17.31 rows=23 width=4)
: -> Hash Join (cost=12.52..17.31 rows=23 width=4)
: Hash Cond: (x.indrelid = c.oid)
: -> Seq Scan on pg_index x (cost=0.00..4.13 rows=113 width=8)
: -> Hash (cost=11.76..11.76 rows=61 width=8)
: -> Seq Scan on pg_class c (cost=0.00..11.76 rows=61 width=8)
: Filter: (relkind = ANY ('{r,m}'::"char"[]))Now this is more of a curiosity than a bug, but still illustrating how
debugging with the help of SQLsmith might look like.** Large-scale testing
=--log-to= allows logging of hundreds of sqlsmith instances into a
central PostgreSQL database. [[./log.sql]] contains the schema sqlsmith
expects and some additional views to generate reports on the logged
contents.It also contains a trigger to filter boring/known errors based on the
contents of the tables known and known_re. I periodically COPY my
filter tables for testing PostgreSQL into the files [[./known_re.txt]] and
[[./known.txt]] to serve as a starting point.** Resources
- [[https://www.postgresql.eu/events/pgconfeu2018/sessions/session/2221/slides/145/sqlsmith-talk.pdf][Slides from PGConf.EU 2018]]
- [[https://anse1.github.io/sqlsmith-doc/structsqltype.html][Doxygen output for SQLsmith]]** License
SQLsmith is available under GPLv3. Use it at your own risk. It may
*damage your database* (one of the purposes of this tool /is/ to try
and break things). See the file [[COPYING]] for details.** Authors
Andreas Seltenreich
Bo Tang
Sjoerd Mullender
[[ast.png]]