URL: https://huggingface.co/datasets/google/quickdraw

Dataset Card for Quick, Draw!

Dataset Summary

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.

Supported Tasks and Leaderboards

  • image-classification: The goal of this task is to classify a given sketch into one of 345 classes. The (closed) leaderboard for this task is available here.

Languages

English.

Dataset Structure

Data Instances

raw

A data point comprises a drawing and its metadata.

{
 'key_id': '5475678961008640',
 'word': 0,
 'recognized': True,
 'timestamp': datetime.datetime(2017, 3, 28, 13, 28, 0, 851730),
 'countrycode': 'MY',
 'drawing': {
 'x': [[379.0, 380.0, 381.0, 381.0, 381.0, 381.0, 382.0], [362.0, 368.0, 375.0, 380.0, 388.0, 393.0, 399.0, 404.0, 409.0, 410.0, 410.0, 405.0, 397.0, 392.0, 384.0, 377.0, 370.0, 363.0, 356.0, 348.0, 342.0, 336.0, 333.0], ..., [477.0, 473.0, 471.0, 469.0, 468.0, 466.0, 464.0, 462.0, 461.0, 469.0, 475.0, 483.0, 491.0, 499.0, 510.0, 521.0, 531.0, 540.0, 548.0, 558.0, 566.0, 576.0, 583.0, 590.0, 595.0, 598.0, 597.0, 596.0, 594.0, 592.0, 590.0, 589.0, 588.0, 586.0]],
 'y': [[1.0, 7.0, 15.0, 21.0, 27.0, 32.0, 32.0], [17.0, 17.0, 17.0, 17.0, 16.0, 16.0, 16.0, 16.0, 18.0, 23.0, 29.0, 32.0, 32.0, 32.0, 29.0, 27.0, 25.0, 23.0, 21.0, 19.0, 17.0, 16.0, 14.0], ..., [151.0, 146.0, 139.0, 131.0, 125.0, 119.0, 113.0, 107.0, 102.0, 99.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 98.0, 100.0, 102.0, 104.0, 105.0, 110.0, 115.0, 121.0, 126.0, 131.0, 137.0, 142.0, 148.0, 150.0]],
 't': [[0, 84, 100, 116, 132, 148, 260], [573, 636, 652, 660, 676, 684, 701, 724, 796, 838, 860, 956, 973, 979, 989, 995, 1005, 1012, 1020, 1028, 1036, 1053, 1118], ..., [8349, 8446, 8468, 8484, 8500, 8516, 8541, 8557, 8573, 8685, 8693, 8702, 8710, 8718, 8724, 8732, 8741, 8748, 8757, 8764, 8773, 8780, 8788, 8797, 8804, 8965, 8996, 9029, 9045, 9061, 9076, 9092, 9109, 9167]]
 }
}

preprocessed_simplified_drawings

A simplified version of the dataset, generated from the raw data by simplifying the vectors, removing the timing information, and positioning and scaling the data into a 256x256 region. The simplification process was:

  1. Align the drawing to the top-left corner, to have minimum values of 0.
  2. Uniformly scale the drawing, to have a maximum value of 255.
  3. Resample all strokes with a 1 pixel spacing.
  4. Simplify all strokes using the Ramer-Douglas-Peucker algorithm with an epsilon value of 2.0.

{
 'key_id': '5475678961008640',
 'word': 0,
 'recognized': True,
 'timestamp': datetime.datetime(2017, 3, 28, 15, 28),
 'countrycode': 'MY',
 'drawing': {
 'x': [[31, 32], [27, 37, 38, 35, 21], [25, 28, 38, 39], [33, 34, 32], [5, 188, 254, 251, 241, 185, 45, 9, 0], [35, 35, 43, 125, 126], [35, 76, 80, 77], [53, 50, 54, 80, 78]],
 'y': [[0, 7], [4, 4, 6, 7, 3], [5, 10, 10, 7], [4, 33, 44], [50, 50, 54, 83, 86, 90, 86, 77, 52], [85, 91, 92, 96, 90], [35, 37, 41, 47], [34, 23, 22, 23, 34]]
 }
}
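
The four simplification steps described above can be sketched in plain Python as follows. This is an illustrative reimplementation under stated assumptions, not the code that generated the dataset: the function names (resample, rdp, simplify_drawing), the arc-length resampling strategy, and the final rounding to integers are all assumptions made for the example.

```python
import math

def resample(points, spacing=1.0):
    """Step 3 (assumed strategy): emit a point every `spacing` units of arc length."""
    if len(points) < 2:
        return list(points)
    out = [points[0]]
    carried = 0.0  # distance travelled since the last emitted point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0:
            continue
        d = spacing - carried  # position of the next sample on this segment
        while d <= seg:
            t = d / seg
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += spacing
        carried = seg - (d - spacing)
    if out[-1] != points[-1]:
        out.append(points[-1])  # always keep the stroke's end point
    return out

def rdp(points, epsilon):
    """Step 4: recursive Ramer-Douglas-Peucker polyline simplification."""
    if len(points) < 3:
        return list(points)
    (x0, y0), (x1, y1) = points[0], points[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = math.hypot(dx, dy)
    best_i, best_d = 0, -1.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        if norm == 0:  # closed stroke: fall back to distance from the start point
            d = math.hypot(px - x0, py - y0)
        else:  # perpendicular distance to the chord through the end points
            d = abs(dy * (px - x0) - dx * (py - y0)) / norm
        if d > best_d:
            best_i, best_d = i, d
    if best_d > epsilon:
        left = rdp(points[: best_i + 1], epsilon)
        right = rdp(points[best_i:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    return [points[0], points[-1]]

def simplify_drawing(drawing, epsilon=2.0):
    """Apply steps 1-4 to a raw-style dict with 'x' and 'y' lists of strokes."""
    strokes = [list(zip(sx, sy)) for sx, sy in zip(drawing["x"], drawing["y"])]
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    min_x, min_y = min(xs), min(ys)                     # step 1: align to (0, 0)
    extent = max(max(xs) - min_x, max(ys) - min_y)
    scale = 255.0 / extent if extent > 0 else 1.0       # step 2: uniform scale
    out_x, out_y = [], []
    for s in strokes:
        pts = [((x - min_x) * scale, (y - min_y) * scale) for x, y in s]
        pts = rdp(resample(pts), epsilon)               # steps 3 and 4
        out_x.append([round(x) for x, _ in pts])
        out_y.append([round(y) for _, y in pts])
    return {"x": out_x, "y": out_y}
```

Calling simplify_drawing on the 'drawing' field of a raw example returns a dictionary with the same 'x'/'y' layout as the simplified example above, although exact coordinates may differ from the published data because the resampling and rounding details here are guesses. The recursive rdp is kept for readability; an iterative version would be safer for very long strokes.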

preprocessed_bitmaps (default configuration)

This configuration contains the 28x28 grayscale bitmap images that were generated from the simplified data, but are aligned to the center of the drawing's bounding box rather than the top-left corner. The code that was used for generation is available here.

{
 'image': <PIL.PngImagePlugin.PngImageFile image mode=L size=28x28 at 0x10B5B102828>,
 'label': 0
}

sketch_rnn and sketch_rnn_full

The sketch_rnn_full configuration stores the data in a format suitable as input to a recurrent neural network and was used for training the Sketch-RNN model. Unlike sketch_rnn, where the samples have been randomly selected from each category, the sketch_rnn_full configuration contains the full data for each category.

{
 'word': 0,
 'drawing': [[132, 0, 0], [23, 4, 0], [61, 1, 0], [76, 0, 0], [22, -4, 0], [152, 0, 0], [50, -5, 0], [36, -10, 0], [8, 26, 0], [0, 69, 0], [-2, 11, 0], [-8, 10, 0], [-56, 24, 0], [-23, 14, 0], [-99, 40, 0], [-45, 6, 0], [-21, 6, 0], [-170, 2, 0], [-81, 0, 0], [-29, -9, 0], [-94, -19, 0], [-48, -24, 0], [-6, -16, 0], [2, -36, 0], [7, -29, 0], [23, -45, 0], [13, -6, 0], [41, -8, 0], [42, -2, 1], [392, 38, 0], [2, 19, 0], [11, 33, 0], [13, 0, 0], [24, -9, 0], [26, -27, 0], [0, -14, 0], [-8, -10, 0], [-18, -5, 0], [-14, 1, 0], [-23, 4, 0], [-21, 12, 1], [-152, 18, 0], [10, 46, 0], [26, 6, 0], [38, 0, 0], [31, -2, 0], [7, -2, 0], [4, -6, 0], [-10, -21, 0], [-2, -33, 0], [-6, -11, 0], [-46, 1, 0], [-39, 18, 0], [-19, 4, 1], [-122, 0, 0], [-2, 38, 0], [4, 16, 0], [6, 4, 0], [78, 0, 0], [4, -8, 0], [-8, -36, 0], [0, -22, 0], [-6, -2, 0], [-32, 14, 0], [-58, 13, 1], [-96, -12, 0], [-10, 27, 0], [2, 32, 0], [102, 0, 0], [1, -7, 0], [-27, -17, 0], [-4, -6, 0], [-1, -34, 0], [-64, 8, 1], [129, -138, 0], [-108, 0, 0], [-8, 12, 0], [-1, 15, 0], [12, 15, 0], [20, 5, 0], [61, -3, 0], [24, 6, 0], [19, 0, 0], [5, -4, 0], [2, 14, 1]]
}

Data Fields

raw

  • key_id: A unique identifier across all drawings.
  • word: Category the player was prompted to draw.
  • recognized: Whether the word was recognized by the game.
  • timestamp: When the drawing was created.
  • countrycode: A two letter country code (ISO 3166-1 alpha-2) of where the player was located.
  • drawing: A dictionary where x and y are the pixel coordinates, and t is the time in milliseconds since the first point. x and y are real-valued while t is an integer. x, y and t match in length and are represented as lists of lists where each sublist corresponds to a single stroke. The raw drawings can have vastly different bounding boxes and numbers of points due to the different devices used for display and input.
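
As a small illustration of the raw drawing layout described above, the following sketch walks the parallel x, y and t lists and reports basic statistics. The helper name summarize_raw_drawing is hypothetical, and the sample input is only the first stroke of the raw example shown earlier, not a full drawing.

```python
def summarize_raw_drawing(drawing):
    """Return stroke count, point count, bounding box and duration of a raw drawing."""
    xs, ys, ts = drawing["x"], drawing["y"], drawing["t"]
    # x, y and t are parallel lists of lists: one sublist per stroke.
    if not (len(xs) == len(ys) == len(ts)):
        raise ValueError("x, y and t must contain the same number of strokes")
    for sx, sy, st in zip(xs, ys, ts):
        if not (len(sx) == len(sy) == len(st)):
            raise ValueError("each stroke needs equally long x, y and t lists")
    flat_x = [v for s in xs for v in s]
    flat_y = [v for s in ys for v in s]
    flat_t = [v for s in ts for v in s]
    return {
        "num_strokes": len(xs),
        "num_points": len(flat_x),
        "bbox": (min(flat_x), min(flat_y), max(flat_x), max(flat_y)),
        "duration_ms": max(flat_t) - min(flat_t),
    }

# First stroke of the raw example above (the remaining strokes are omitted here).
first_stroke = {
    "x": [[379.0, 380.0, 381.0, 381.0, 381.0, 381.0, 382.0]],
    "y": [[1.0, 7.0, 15.0, 21.0, 27.0, 32.0, 32.0]],
    "t": [[0, 84, 100, 116, 132, 148, 260]],
}
summary = summarize_raw_drawing(first_stroke)
print(summary["bbox"])         # (379.0, 1.0, 382.0, 32.0)
print(summary["duration_ms"])  # 260
```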

preprocessed_simplified_drawings

  • key_id: A unique identifier across all drawings.
  • word: Category the player was prompted to draw.
  • recognized: Whether the word was recognized by the game.
  • timestamp: When the drawing was created.
  • countrycode: A two letter country code (ISO 3166-1 alpha-2) of where the player was located.
  • drawing: A simplified drawing represented as a dictionary where x and y are the pixel coordinates. The simplification process is described in the Data Instances section.

preprocessed_bitmaps (default configuration)

  • image: A PIL.Image.Image object containing the 28x28 grayscale bitmap. Note that the image file is automatically decoded when the image column is accessed, e.g. via dataset[0]["image"]. Decoding a large number of image files can take a significant amount of time, so it is important to query the sample index before the "image" column, i.e. dataset[0]["image"] should always be preferred over dataset["image"][0].
  • label: Category the player was prompted to draw.

sketch_rnn and sketch_rnn_full

  • word: Category the player was prompted to draw.
  • drawing: An array of strokes. Strokes are represented as 3-tuples consisting of x-offset, y-offset, and a binary variable which is 1 if the pen is lifted between this position and the next, and 0 otherwise.
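
To make the 3-tuple encoding above concrete, here is a minimal sketch that accumulates the offsets into absolute coordinates and splits the result wherever the pen is lifted. The function name stroke3_to_polylines and the choice of (0, 0) as the starting position are assumptions for illustration.

```python
def stroke3_to_polylines(drawing):
    """Convert [dx, dy, pen_lifted] rows into a list of absolute-coordinate polylines."""
    x = y = 0  # assumed starting position of the pen
    polylines, current = [], []
    for dx, dy, pen_lifted in drawing:
        x, y = x + dx, y + dy
        current.append((x, y))
        if pen_lifted:  # the pen leaves the paper after this point
            polylines.append(current)
            current = []
    if current:  # tolerate a final stroke without a closing pen-lift flag
        polylines.append(current)
    return polylines

# Two short strokes: the third column marks the last point of each stroke.
toy = [[10, 0, 0], [0, 5, 1], [-10, 0, 0], [0, -5, 1]]
print(stroke3_to_polylines(toy))
# [[(10, 0), (10, 5)], [(0, 5), (0, 0)]]
```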

Note: Sketch-RNN takes as input strokes represented as 5-tuples, with drawings padded to a common maximum length and prefixed by the special start token [0, 0, 1, 0, 0]. The 5-tuple representation consists of x-offset, y-offset, and p_1, p_2, p_3, a binary one-hot vector of 3 possible pen states: pen down, pen up, end of sketch. More precisely, the first two elements are the offset distance in the x and y directions of the pen from the previous point. The last 3 elements represent a binary one-hot vector of 3 possible states. The first pen state, p_1, indicates that the pen is currently touching the paper, and that a line will be drawn connecting the next point with the current point. The second pen state, p_2, indicates that the pen will be lifted from the paper after the current point, and that no line will be drawn next. The final pen state, p_3, indicates that the drawing has ended, and subsequent points, including the current point, will not be rendered.
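
The conversion from the stored 3-tuples to the 5-tuple input described in the note can be sketched as below. This is an illustrative helper, not the Sketch-RNN code itself: the name stroke3_to_stroke5 and the max_len parameter are assumptions, and the padding rows use zero offsets with the end-of-sketch state set.

```python
START_TOKEN = [0, 0, 1, 0, 0]  # start token quoted in the note above

def stroke3_to_stroke5(drawing, max_len):
    """Expand [dx, dy, pen_lifted] rows into padded [dx, dy, p1, p2, p3] rows."""
    if len(drawing) > max_len:
        raise ValueError("drawing is longer than max_len")
    rows = [list(START_TOKEN)]
    for dx, dy, pen_lifted in drawing:
        # p1 = pen down, p2 = pen lifted after this point, p3 = end of sketch
        rows.append([dx, dy, 1 - pen_lifted, pen_lifted, 0])
    # Pad up to the common length with end-of-sketch rows.
    rows.extend([0, 0, 0, 0, 1] for _ in range(max_len - len(drawing)))
    return rows

print(stroke3_to_stroke5([[10, 0, 0], [0, 5, 1]], max_len=4))
# [[0, 0, 1, 0, 0], [10, 0, 1, 0, 0], [0, 5, 0, 1, 0], [0, 0, 0, 0, 1], [0, 0, 0, 0, 1]]
```

The output always has max_len + 1 rows, because the start token is added in front of the padded drawing.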

Data Splits

In the configurations raw, preprocessed_simplified_drawings and preprocessed_bitmaps (default configuration), all the data is contained in the training set, which has 50426266 examples.

sketch_rnn and sketch_rnn_full have the data split into training, validation and test splits. In the sketch_rnn configuration, 75K samples (70K Training, 2.5K Validation, 2.5K Test) have been randomly selected from each category. Therefore, the training set contains 24150000 examples, the validation set 862500 examples and the test set 862500 examples. The sketch_rnn_full configuration has the full (training) data for each category, which leads to the training set having 43988874 examples, the validation set 862500 examples and the test set 862500 examples.
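
The sketch_rnn split sizes quoted above follow directly from the per-category sample counts and the 345 categories. A quick arithmetic check, with variable names chosen only for this example:

```python
NUM_CATEGORIES = 345
PER_CATEGORY = {"train": 70_000, "validation": 2_500, "test": 2_500}

# Each split holds the per-category count multiplied by the number of categories.
sketch_rnn_sizes = {split: n * NUM_CATEGORIES for split, n in PER_CATEGORY.items()}
print(sketch_rnn_sizes)
# {'train': 24150000, 'validation': 862500, 'test': 862500}
```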

Dataset Creation

Curation Rationale

From the GitHub repository:

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data.

We're sharing them here for developers, researchers, and artists to explore, study, and learn from.

Source Data

Initial Data Collection and Normalization

This dataset contains vector drawings obtained from Quick, Draw!, an online game where the players are asked to draw objects belonging to a particular object class in less than 20 seconds.

Who are the source language producers?

The participants in the Quick, Draw! game.

Annotations

Annotation process

The annotations are machine-generated and match the category the player was prompted to draw.

Who are the annotators?

The annotations are machine-generated.

Personal and Sensitive Information

Some sketches are known to be problematic (see https://github.com/googlecreativelab/quickdraw-dataset/issues/74 and https://github.com/googlecreativelab/quickdraw-dataset/issues/18).

Considerations for Using the Data

Social Impact of Dataset

[More Information Needed]

Discussion of Biases

[More Information Needed]

Other Known Limitations

Additional Information

Dataset Curators

Jonas Jongejan, Henry Rowley, Takashi Kawashima, Jongmin Kim and Nick Fox-Gieg.

Licensing Information

The data is made available by Google, Inc. under the Creative Commons Attribution 4.0 International license.

Citation Information

@article{DBLP:journals/corr/HaE17,
 author = {David Ha and
 Douglas Eck},
 title = {A Neural Representation of Sketch Drawings},
 journal = {CoRR},
 volume = {abs/1704.03477},
 year = {2017},
 url = {http://arxiv.org/abs/1704.03477},
 archivePrefix = {arXiv},
 eprint = {1704.03477},
 timestamp = {Mon, 13 Aug 2018 16:48:30 +0200},
 biburl = {https://dblp.org/rec/bib/journals/corr/HaE17},
 bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contributions

Thanks to @mariosasko for adding this dataset.
