{"id":13467829,"url":"https://github.com/googlecreativelab/quickdraw-dataset","last_synced_at":"2025-04-08T12:00:15.557Z","repository":{"id":38705346,"uuid":"90778969","full_name":"googlecreativelab/quickdraw-dataset","owner":"googlecreativelab","description":"Documentation on how to access and use the Quick, Draw! Dataset.","archived":false,"fork":false,"pushed_at":"2023-08-17T18:02:37.000Z","size":159,"stargazers_count":5929,"open_issues_count":31,"forks_count":915,"subscribers_count":203,"default_branch":"master","last_synced_at":"2024-04-14T08:52:59.057Z","etag":null,"topics":["dataset","quickdraw-dataset"],"latest_commit_sha":null,"homepage":"https://quickdraw.withgoogle.com/data","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/googlecreativelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-05-09T18:28:32.000Z","updated_at":"2024-04-14T07:30:25.000Z","dependencies_parsed_at":"2024-01-03T03:53:38.230Z","dependency_job_id":"367c3702-cfe5-4664-9c7b-561061a42076","html_url":"https://github.com/googlecreativelab/quickdraw-dataset","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlecreativelab%2Fquickdraw-dataset","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlecreativelab%2Fquickdraw-dataset/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/googlecreativelab%2Fquickdraw-dataset/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/go
oglecreativelab%2Fquickdraw-dataset/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/googlecreativelab","download_url":"https://codeload.github.com/googlecreativelab/quickdraw-dataset/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247838388,"owners_count":21004576,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dataset","quickdraw-dataset"],"created_at":"2024-07-31T15:01:01.069Z","updated_at":"2025-04-08T12:00:15.224Z","avatar_url":"https://github.com/googlecreativelab.png","language":null,"funding_links":[],"categories":["Others","Neural Networks (NN) and Deep Neural Networks (DNN)","Text-to-Image:","Dataset"],"sub_categories":["NN/DNN Datasets","Web"],"readme":"# The Quick, Draw! Dataset\n![preview](preview.jpg)\n\nThe Quick Draw Dataset is a collection of 50 million drawings across [345 categories](categories.txt), contributed by players of the game [Quick, Draw!](https://quickdraw.withgoogle.com). The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on [quickdraw.withgoogle.com/data](https://quickdraw.withgoogle.com/data).\n\nWe're sharing them here for developers, researchers, and artists to explore, study, and learn from. If you create something with this dataset, please let us know [by e-mail](mailto:quickdraw-support@google.com) or at [A.I. 
Experiments](https://aiexperiments.withgoogle.com/submit).\n\nWe have also released a tutorial and model for training your own drawing classifier on [tensorflow.org](https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/sequences/recurrent_quickdraw.md).\n\nPlease keep in mind that while this collection of drawings was individually moderated, it may still contain inappropriate content.\n\n## Content\n- [The raw moderated dataset](#the-raw-moderated-dataset)\n- [Preprocessed dataset](#preprocessed-dataset)\n- [Get the data](#get-the-data)\n- [Projects using the dataset](#projects-using-the-dataset)\n- [Changes](#changes)\n- [License](#license)\n\n\n## The raw moderated dataset\nThe raw data is available as [`ndjson`](https://github.com/ndjson) files separated by category, in the following format: \n\n| Key          | Type                   | Description                                  |\n| ------------ | -----------------------| -------------------------------------------- |\n| key_id       | 64-bit unsigned integer| A unique identifier across all drawings.     |\n| word         | string                 | Category the player was prompted to draw.    |\n| recognized   | boolean                | Whether the word was recognized by the game. |\n| timestamp    | datetime               | When the drawing was created.                |\n| countrycode  | string                 | A two-letter country code ([ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)) of where the player was located. |\n| drawing      | string                 | A JSON array representing the vector drawing |  \n\n\nEach line contains one drawing. 
Here's an example of a single drawing:\n\n```javascript\n  { \n    \"key_id\":\"5891796615823360\",\n    \"word\":\"nose\",\n    \"countrycode\":\"AE\",\n    \"timestamp\":\"2017-03-01 20:41:36.70725 UTC\",\n    \"recognized\":true,\n    \"drawing\":[[[129,128,129,129,130,130,131,132,132,133,133,133,133,...]]]\n  }\n```\n\nThe format of the drawing array is as follows:\n \n```javascript\n[ \n  [  // First stroke \n    [x0, x1, x2, x3, ...],\n    [y0, y1, y2, y3, ...],\n    [t0, t1, t2, t3, ...]\n  ],\n  [  // Second stroke\n    [x0, x1, x2, x3, ...],\n    [y0, y1, y2, y3, ...],\n    [t0, t1, t2, t3, ...]\n  ],\n  ... // Additional strokes\n]\n```\n\nHere, `x` and `y` are the pixel coordinates, and `t` is the time in milliseconds since the first point. `x` and `y` are real-valued while `t` is an integer. The raw drawings can have vastly different bounding boxes and numbers of points due to the different devices used for display and input.\n\n## Preprocessed dataset\nWe've preprocessed and split the dataset into different files and formats to make it faster and easier to download and explore.\n\n#### Simplified Drawing files (`.ndjson`)\nWe've simplified the vectors, removed the timing information, and positioned and scaled the data into a 256x256 region. The data is exported in [`ndjson`](https://github.com/ndjson) format with the same metadata as the raw format. The simplification process was:\n\n1. Align the drawing to the top-left corner, to have minimum values of 0.\n2. Uniformly scale the drawing, to have a maximum value of 255. \n3. Resample all strokes with a 1 pixel spacing.\n4. Simplify all strokes using the [Ramer–Douglas–Peucker algorithm](https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm) with an epsilon value of 2.0.\n\nThere is an example in [examples/nodejs/simplified-parser.js](examples/nodejs/simplified-parser.js) showing how to read ndjson files in NodeJS.  
\nAdditionally, the [examples/nodejs/ndjson.md](examples/nodejs/ndjson.md) document details a set of command-line tools that can help explore subsets of these quite large files.\n\n#### Binary files (`.bin`)\nThe simplified drawings and metadata are also available in a custom binary format for efficient compression and loading.\n\nThere is an example in [examples/binary_file_parser.py](examples/binary_file_parser.py) showing how to load the binary files in Python.  \nThere is also an example in [examples/nodejs/binary-parser.js](examples/nodejs/binary-parser.js) showing how to read the binary files in NodeJS.\n\n#### Numpy bitmaps (`.npy`)\nAll the simplified drawings have been rendered into a 28x28 grayscale bitmap in numpy `.npy` format. The files can be loaded with [`np.load()`](https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.load.html). These images were generated from the simplified data, but are aligned to the center of the drawing's bounding box rather than the top-left corner. [See here for the code snippet used for generation](https://github.com/googlecreativelab/quickdraw-dataset/issues/19#issuecomment-402247262).\n\n## Get the data\nThe dataset is available on Google Cloud Storage as [`ndjson`](https://github.com/ndjson) files separated by category. See the list of files in [Cloud Console](https://console.cloud.google.com/storage/browser/quickdraw_dataset/), or read more about [accessing public datasets](https://cloud.google.com/storage/docs/access-public-data) using other methods. 
For example, to download all simplified drawings, you can run the command `gsutil -m cp 'gs://quickdraw_dataset/full/simplified/*.ndjson' .` \n\n#### Full dataset separated by categories\n- [Raw files](https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/raw) (`.ndjson`)\n- [Simplified drawings files](https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/simplified) (`.ndjson`)\n- [Binary files](https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/binary) (`.bin`)\n- [Numpy bitmap files](https://console.cloud.google.com/storage/browser/quickdraw_dataset/full/numpy_bitmap) (`.npy`)\n\n#### Sketch-RNN QuickDraw Dataset\nThis data is also used for training the [Sketch-RNN](https://arxiv.org/abs/1704.03477) model.  An open-source TensorFlow implementation of this model is available in the [Magenta Project](https://magenta.tensorflow.org/sketch_rnn) (link to GitHub [repo](https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn)).  You can also read more about this model in this Google Research [blog post](https://research.googleblog.com/2017/04/teaching-machines-to-draw.html).  The data is stored in compressed `.npz` files, in a format suitable for inputs into a recurrent neural network.\n\nIn this dataset, 75K samples (70K Training, 2.5K Validation, 2.5K Test) have been randomly selected from each category and processed with [RDP](https://en.wikipedia.org/wiki/Ramer%E2%80%93Douglas%E2%80%93Peucker_algorithm) line simplification with an `epsilon` parameter of 2.0.  Each category is stored in its own `.npz` file, for example, `cat.npz`.\n\nWe have also provided the full data for each category, if you want to use more than 70K training examples.  
These are stored with the `.full.npz` extension.\n\n- [Numpy .npz files](https://console.cloud.google.com/storage/browser/quickdraw_dataset/sketchrnn)\n\n*Note:* For Python 3, load the `npz` files using `np.load(data_filepath, encoding='latin1', allow_pickle=True)`.\n\nInstructions for converting raw `ndjson` files to this `npz` format are available in this [notebook](https://github.com/hardmaru/quickdraw-ndjson-to-npz).\n\n## Projects using the dataset\nHere are some projects and experiments that are using or featuring the dataset in interesting ways. Got something to add? [Let us know!](mailto:quickdraw-support@google.com)\n\n*Creative and artistic projects*\n\n- [Letter collages](http://frauzufall.de/en/2017/google-quick-draw/) by [Deborah Schmidt](http://frauzufall.de/)\n- [Face tracking experiment](https://www.instagram.com/p/BUU8TuQD6_v/) by [Neil Mendoza](http://www.neilmendoza.com/)\n- [Faces of Humanity](http://project.laboiteatortue.com/facesofhumanity/) by [Tortue](www.laboiteatortue.com)\n- [Infinite QuickDraw](https://kynd.github.io/infinite_quickdraw/) by [kynd.info](http://kynd.info)\n- [Misfire.io](http://misfire.io/) by Matthew Collyer\n- [Draw This](http://danmacnish.com/2018/07/01/draw-this/) by [Dan Macnish](http://danmacnish.com/)\n- [Scribbling Speech](http://xinyue.de/scribbling-speech.html) by [Xinyue Yang](http://xinyue.de/)\n- illustrAItion by [Ling Chen](https://github.com/lingchen42/illustrAItion)\n- [Dreaming of Electric Sheep](https://medium.com/@libreai/dreaming-of-electric-sheep-d1aca32545dc) by [Dr. 
Ernesto Diaz-Aviles](http://ernesto.diazaviles.com/)\n\n*Data analyses*\n\n- [How do you draw a circle?](https://qz.com/994486/the-way-you-draw-circles-says-a-lot-about-you/) by [Quartz](https://qz.com/)\n- [Forma Fluens](http://formafluens.io/) by [Mauro Martino](http://www.mamartino.com/), [Hendrik Strobelt](http://hendrik.strobelt.com/) and [Owen Cornec](http://www.byowen.com/)\n- [How Long Does it Take to (Quick) Draw a Dog?](http://vallandingham.me/quickdraw/) by [Jim Vallandingham](http://vallandingham.me/)\n- [Finding bad flamingo drawings with recurrent neural networks](http://colinmorris.github.io/blog/bad_flamingos) by [Colin Morris](http://colinmorris.github.io/)\n- [Facets Dive x Quick, Draw!](https://pair-code.github.io/facets/quickdraw.html) by [People + AI Research Initiative (PAIR), Google](https://ai.google/pair)\n- [Exploring and Visualizing an Open Global Dataset](https://research.googleblog.com/2017/08/exploring-and-visualizing-open-global.html) by Google Research\n- [Machine Learning for Visualization](https://medium.com/@enjalot/machine-learning-for-visualization-927a9dff1cab) - Talk / article by Ian Johnson\n\n*Papers*\n- [A Neural Representation of Sketch Drawings](https://arxiv.org/pdf/1704.03477.pdf) by [David Ha](https://scholar.google.com/citations?user=J1j92GsxVUMC\u0026hl=en), [Douglas Eck](https://scholar.google.com/citations?user=bLb3VdIAAAAJ\u0026hl=en), ICLR 2018. [code](https://github.com/tensorflow/magenta/tree/master/magenta/models/sketch_rnn)\n- [Sketchmate: Deep hashing for million-scale human sketch retrieval](http://openaccess.thecvf.com/content_cvpr_2018/papers/Xu_SketchMate_Deep_Hashing_CVPR_2018_paper.pdf) by [Peng Xu](http://www.pengxu.net/) et al., CVPR 2018.\n- [Multi-graph transformer for free-hand sketch recognition](https://arxiv.org/pdf/1912.11258.pdf) by [Peng Xu](http://www.pengxu.net/), [Chaitanya K Joshi](https://chaitjo.github.io/), [Xavier Bresson](https://www.ntu.edu.sg/home/xbresson/), ArXiv 2019. 
[code](https://github.com/PengBoXiangShang/multigraph_transformer)\n- [Deep Self-Supervised Representation Learning for Free-Hand Sketch](https://arxiv.org/pdf/2002.00867.pdf) by [Peng Xu](http://www.pengxu.net/) et al., ArXiv 2020. [code](https://github.com/zzz1515151/self-supervised_learning_sketch)\n- [SketchTransfer: A Challenging New Task for Exploring Detail-Invariance and the Abstractions Learned by Deep Networks](https://arxiv.org/pdf/1912.11570.pdf) by [Alex Lamb](https://sites.google.com/view/alexmlamb), [Sherjil Ozair](https://sherjilozair.github.io/), [Vikas Verma](https://scholar.google.com/citations?user=wo_M4uQAAAAJ\u0026hl=en), [David Ha](https://scholar.google.com/citations?user=J1j92GsxVUMC\u0026hl=en), WACV 2020.\n- [Deep Learning for Free-Hand Sketch: A Survey](https://arxiv.org/pdf/2001.02600.pdf) by [Peng Xu](http://www.pengxu.net/), ArXiv 2020.\n- [A Novel Sketch Recognition Model based on Convolutional Neural Networks](https://ieeexplore.ieee.org/document/9152911) by [Abdullah Talha Kabakus](https://www.linkedin.com/in/talhakabakus), 2nd International Congress on Human-Computer Interaction, Optimization and Robotic Applications, pp. 101-106, 2020.\n\n*Guides \u0026 Tutorials*\n- [TensorFlow tutorial for drawing classification](https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/sequences/recurrent_quickdraw.md)\n- [Train a model in tf.keras with Colab, and run it in the browser with TensorFlow.js](https://medium.com/tensorflow/train-on-google-colab-and-run-on-the-browser-a-case-study-8a45f9b1474e) by Zaid Alyafeai\n\n*Code and tools*\n- [Quick, Draw! Polymer Component \u0026 Data API](https://github.com/googlecreativelab/quickdraw-component) by Nick Jonas\n- [Quick, Draw for Processing](https://github.com/codybenlewis/Quick-Draw-for-Processing) by [Cody Ben Lewis](https://twitter.com/CodyBenLewis)\n- [Quick, Draw! 
prediction model](https://github.com/keisukeirie/quickdraw_prediction_model) by Keisuke Irie \n- [Random sample tool](http://learning.statistics-is-awesome.org/draw/) by [Learning statistics is awesome](http://learning.statistics-is-awesome.org/)\n- [SVG rendering in d3.js example](https://bl.ocks.org/enjalot/a2b28f0ed18b891f9fb70910f1b8886d) by [Ian Johnson](http://enja.org/) (read more about the process [here](https://gist.github.com/enjalot/54c4342eb7527ea523884dbfa52d174b))\n- [Sketch-RNN Classification](https://github.com/payalbajaj/sketch_rnn_classification) by Payal Bajaj\n- [quickdraw.js](https://github.com/wagenaartje/quickdraw.js) by Thomas Wagenaar\n- [~ Doodler ~](https://github.com/krishnasriSomepalli/cs50-project/) by [Krishna Sri Somepalli](https://krishnasrisomepalli.github.io/)\n- [quickdraw Python API](http://quickdraw.readthedocs.io) by [Martin O'Hanlon](https://github.com/martinohanlon)\n- [RealTime QuickDraw](https://github.com/akshaybahadur21/QuickDraw) by [Akshay Bahadur](http://akshaybahadur.com/)\n- [DataFlow processing](https://github.com/gxercavins/dataflow-samples/tree/master/quick-draw) by Guillem Xercavins \n- [QuickDrawGH Rhino Plugin](https://www.food4rhino.com/app/quickdrawgh) by [James Dalessandro](https://github.com/DalessandroJ)\n- [QuickDrawBattle](https://andri.io/quickdrawbattle/) by [Andri Soone](https://github.com/ndri)\n\n\n## Changes\n\nMay 25, 2017: Updated Sketch-RNN QuickDraw dataset, created `.full.npz` complementary sets.\n\n## License\nThis data is made available by Google, Inc. 
under the [Creative Commons Attribution 4.0 International license.](https://creativecommons.org/licenses/by/4.0/)\n\n## Dataset Metadata\nThe following table is necessary for this dataset to be indexed by search\nengines such as \u003ca href=\"https://g.co/datasetsearch\"\u003eGoogle Dataset Search\u003c/a\u003e.\n\u003cdiv itemscope itemtype=\"http://schema.org/Dataset\"\u003e\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003cth\u003eproperty\u003c/th\u003e\n    \u003cth\u003evalue\u003c/th\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ename\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eThe Quick, Draw! Dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ealternateName\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"alternateName\"\u003eQuick Draw Dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003ealternateName\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"alternateName\"\u003equickdraw-dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eurl\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"url\"\u003ehttps://github.com/googlecreativelab/quickdraw-dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003esameAs\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"sameAs\"\u003ehttps://github.com/googlecreativelab/quickdraw-dataset\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003edescription\u003c/td\u003e\n    \u003ctd\u003e\u003ccode itemprop=\"description\"\u003eThe Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game \"Quick, Draw!\". 
The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.\\n\n\\n\nExample drawings:\n![preview](https://raw.githubusercontent.com/googlecreativelab/quickdraw-dataset/master/preview.jpg)\u003c/code\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003eprovider\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cdiv itemscope itemtype=\"http://schema.org/Organization\" itemprop=\"provider\"\u003e\n        \u003ctable\u003e\n          \u003ctr\u003e\n            \u003cth\u003eproperty\u003c/th\u003e\n            \u003cth\u003evalue\u003c/th\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003ename\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eGoogle\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003esameAs\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"sameAs\"\u003ehttps://en.wikipedia.org/wiki/Google\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n        \u003c/table\u003e\n      \u003c/div\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003elicense\u003c/td\u003e\n    \u003ctd\u003e\n      \u003cdiv itemscope itemtype=\"http://schema.org/CreativeWork\" itemprop=\"license\"\u003e\n        \u003ctable\u003e\n          \u003ctr\u003e\n            \u003cth\u003eproperty\u003c/th\u003e\n            \u003cth\u003evalue\u003c/th\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003ename\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"name\"\u003eCC BY 4.0\u003c/code\u003e\u003c/td\u003e\n          \u003c/tr\u003e\n          \u003ctr\u003e\n            \u003ctd\u003eurl\u003c/td\u003e\n            \u003ctd\u003e\u003ccode itemprop=\"url\"\u003ehttps://creativecommons.org/licenses/by/4.0/\u003c/code\u003e\u003c/td\u003e\n    
      \u003c/tr\u003e\n        \u003c/table\u003e\n      \u003c/div\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgooglecreativelab%2Fquickdraw-dataset","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgooglecreativelab%2Fquickdraw-dataset","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgooglecreativelab%2Fquickdraw-dataset/lists"}