Ragged tensors are a fundamental data structure in TensorFlow, especially in scenarios where data doesn't conform to fixed shapes, such as sequences of varying lengths or nested structures. In this article, we'll explain what ragged tensors are and why they're useful, and work through hands-on coding examples to illustrate their usage.
In TensorFlow, tensors are the basic building blocks for data representation. A tensor is essentially a multi-dimensional array, where each dimension represents a different mode of indexing. Ragged tensors, however, deviate from this notion by allowing for variable lengths along certain dimensions.
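To make the contrast concrete, here is a minimal sketch comparing the static shape of a regular tensor with that of a ragged tensor. The example values are illustrative assumptions, not taken from the article. The ragged dimension is reported as None because its length differs from row to row.

```python
import tensorflow as tf

# A regular tensor: every row must have the same length.
dense = tf.constant([[1, 2, 3], [4, 5, 6]])

# A ragged tensor: rows may have different lengths.
ragged = tf.ragged.constant([[1, 2, 3], [4], [5, 6]])

print(dense.shape)            # fixed size in every dimension
print(ragged.shape)           # the ragged dimension has no single fixed size
print(ragged.row_lengths())   # the actual length of each row
```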
Ragged tensors possess several distinct features:
Ragged tensors offer immense utility across diverse scenarios where data exhibits irregularities and fails to adhere to fixed shapes. Their versatility becomes apparent in various domains:
Here, we will learn how to create a ragged tensor with TensorFlow.
We use tf.ragged.constant() to create a ragged tensor from nested Python lists. Each nested list represents a sequence of varying length. The resulting ragged tensor accommodates these variable-length sequences.
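A minimal sketch of this call, assuming the nested list below (chosen so that it matches the output that follows):

```python
import tensorflow as tf

# Each inner list is one row; the rows have lengths 2, 3 and 1.
ragged_tensor = tf.ragged.constant([[1, 2], [3, 4, 5], [6]])
print(ragged_tensor)
```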
Output:
<tf.RaggedTensor [[1, 2], [3, 4, 5], [6]]>
RaggedTensors are multi-dimensional tensors that can have rows of different lengths. There are multiple ways to construct them.
One approach is to pair flat values (all values held in a single flattened list) with a row-partitioning tensor that indicates how to divide those values into rows, using factory classmethods such as tf.RaggedTensor.from_value_rowids, tf.RaggedTensor.from_row_lengths, and tf.RaggedTensor.from_row_splits.
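As a sketch of the first of these factory methods, the snippet below builds a ragged tensor with tf.RaggedTensor.from_value_rowids. The values and value_rowids inputs are assumptions, picked to be consistent with the output shown further down.

```python
import tensorflow as tf

# values holds every element in one flat list;
# value_rowids gives the row index of each element (must be non-decreasing).
rt = tf.RaggedTensor.from_value_rowids(
    values=[3, 1, 4, 1, 5, 9, 2],
    value_rowids=[0, 0, 0, 0, 2, 2, 3],
)
print(rt)
```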
tf.RaggedTensor.from_value_rowids: value_rowids is a tensor that specifies which row each element of the values tensor belongs to. Here, row ID 0 marks the values of the first row, 2 those of the third row, and 3 the single value of the fourth row; no value carries row ID 1, so the second row is empty.
Output:
<tf.RaggedTensor [[3, 1, 4, 1], [], [5, 9], [2]]>
tf.RaggedTensor.from_row_lengths: row_lengths is a constant tensor containing the number of elements in each row of the desired RaggedTensor. Below, [2, 2, 2, 2] specifies that we want four rows, each with two elements.
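A minimal sketch of tf.RaggedTensor.from_row_lengths. The flat values list is an assumption, inferred from the output that follows; row_lengths is the [2, 2, 2, 2] mentioned in the text.

```python
import tensorflow as tf

# Eight flat values split into four rows of two elements each.
rt = tf.RaggedTensor.from_row_lengths(
    values=[1, 2, 3, 0, 4, 0, 5, 6],
    row_lengths=[2, 2, 2, 2],
)
print(rt)
```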
Output:
<tf.RaggedTensor [[1, 2],
[3, 0],
[4, 0],
[5, 6]]>
tf.RaggedTensor.from_row_splits: row_splits is a constant tensor that, instead of lengths, holds the boundaries of the rows in the values tensor, so it has one more entry than there are rows. Below, [0, 2, 4, 6, 8] means the four rows start at indices 0, 2, 4, and 6 (the first, third, fifth, and seventh elements), and the final entry, 8, marks the end of the last row.
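A minimal sketch of tf.RaggedTensor.from_row_splits. As before, the flat values list is an assumption inferred from the output that follows; row i covers values[row_splits[i]:row_splits[i + 1]].

```python
import tensorflow as tf

# row_splits has one more entry than the number of rows:
# consecutive pairs (0, 2), (2, 4), (4, 6), (6, 8) delimit the four rows.
rt = tf.RaggedTensor.from_row_splits(
    values=[1, 2, 3, 0, 4, 0, 5, 6],
    row_splits=[0, 2, 4, 6, 8],
)
print(rt)
```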
Output:
<tf.RaggedTensor [[1, 2],
[3, 0],
[4, 0],
[5, 6]]>
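The three encodings describe the same partitioning, so a ragged tensor built from one of them can report the other two. A brief sketch, using assumed example values that include an empty row:

```python
import tensorflow as tf

# Build from row lengths; row 1 is empty.
rt = tf.RaggedTensor.from_row_lengths(
    values=[3, 1, 4, 1, 5, 9, 2],
    row_lengths=[4, 0, 2, 1],
)

print(rt.value_rowids())  # row index of each flat value
print(rt.row_lengths())   # number of values in each row
print(rt.row_splits)      # start/end offsets of each row in the flat values
```

Row lengths and row splits can represent trailing empty rows on their own, whereas value row IDs cannot, which is why from_value_rowids also accepts an optional nrows argument.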
Ragged tensors support various standard operations similar to regular tensors in TensorFlow:
These include element-wise operations such as tf.add and concatenation with tf.concat.
The following example demonstrates several standard operations on ragged tensors.
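The original code is not available, so the snippet below is a reconstruction under stated assumptions: the two input tensors are inferred from the output that follows, the cast to float32 before taking the mean is a choice made here to avoid integer division, and row_lengths() is assumed to be what the last output line reports.

```python
import tensorflow as tf

# Two ragged tensors with matching row lengths (assumed inputs).
rt1 = tf.ragged.constant([[1, 2], [3, 0]])
rt2 = tf.ragged.constant([[4, 0], [5, 6]])

# Element-wise addition; the row partitions of both inputs must match.
added = tf.add(rt1, rt2)
print("Added Ragged Tensor:")
print(added)

# Mean over all values of the summed tensor, computed in floating point.
mean = tf.reduce_mean(tf.cast(added, tf.float32))
print("Mean Value of Ragged Tensor:", mean.numpy())

# Concatenation along the row axis stacks the rows of rt2 under rt1.
concatenated = tf.concat([rt1, rt2], axis=0)
print("Concatenated Ragged Tensor:")
print(concatenated)

# Per-row lengths of the summed tensor.
print("Shapes of Elements Inside the Ragged Tensor:", added.row_lengths())
```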
Output:
Added Ragged Tensor:
<tf.RaggedTensor [[5, 2],
[8, 6]]>
Mean Value of Ragged Tensor: 5.25
Concatenated Ragged Tensor:
<tf.RaggedTensor [[1, 2],
[3, 0],
[4, 0],
[5, 6]]>
Shapes of Elements Inside the Ragged Tensor: tf.Tensor([2 2], shape=(2,), dtype=int64)
Keras makes ragged tensors easy to use for training with:
ragged=True argument: When defining the input layer using tf.keras.Input, set the ragged=True argument. This tells Keras the input will be a ragged tensor.
Once the input is defined with ragged=True, simply pass your ragged tensor as the input to the model. Keras handles the ragged structure internally.
inputs = tf.keras.Input(shape=(None,), dtype=tf.int64, ragged=True)