New in TensorFlow 2.0: Ragged Tensors

April 5, 2019 · Deep Learning Daily Digest

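Many real-world machine learning tasks involve tensors whose shape is not fixed. In text classification, the number of words in a sentence varies; in speech recognition, the length of the input audio varies; in speech synthesis, the length of the input text varies. The usual workaround is to pick a fixed length, truncate inputs that are too long, and pad inputs that are too short with a special value.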
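
The truncate-and-pad workaround can be sketched in a few lines of plain Python. This is only an illustration: the helper name `pad_or_truncate`, the padding value `PAD_ID`, and the token ids are hypothetical and do not come from any library.

```python
from typing import List

PAD_ID = 0  # hypothetical padding value, chosen only for illustration


def pad_or_truncate(seq: List[int], max_len: int, pad_value: int = PAD_ID) -> List[int]:
    """Return a copy of seq that is exactly max_len items long."""
    clipped = seq[:max_len]                    # drop anything past max_len
    missing = max_len - len(clipped)           # 0 when the input was long enough
    return clipped + [pad_value] * missing     # fill the remainder with padding


# Two made-up sequences of token ids with different lengths.
batch = [[7, 3, 9, 4, 1, 8], [5, 2, 6]]
fixed = [pad_or_truncate(s, max_len=4) for s in batch]
print(fixed)  # [[7, 3, 9, 4], [5, 2, 6, 0]]
```

The first sequence loses its last two items and the second gains one padding value that the model then has to learn to ignore. Both costs are what ragged tensors are meant to avoid.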

TF 2.0 brings a new solution, the Ragged Tensor, which is the topic of this post. For example, a piece of text can be defined as follows:

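import tensorflow as tf

# Two sentences of different lengths, one character per element.
speech = tf.ragged.constant(
    [['今', '天', '天', '气', '很', '好'],   # "The weather is nice today"
     ['我', '很', '好']])                    # "I am fine"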

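Here speech is a RaggedTensor. You may wonder whether such a value supports addition, subtraction, multiplication, division, and other mathematical operations. The good news is that RaggedTensor supports a large number of common math operations; see https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/ragged?hl=en for details. RaggedTensor also supports Python-style indexing. For example, taking the first element, speech[0], gives the following (TensorFlow prints each string as UTF-8 bytes; the characters are shown decoded here for readability):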

tf.Tensor(['今', '天', '天', '气', '很', '好'], shape=(6,), dtype=string)
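
As a minimal sketch of the arithmetic and indexing described above, the snippet below uses a small numeric RaggedTensor. The data is made up for illustration, and the snippet assumes TensorFlow 2.x, where eager execution is the default.

```python
import tensorflow as tf

# Hypothetical numeric data, chosen only for illustration.
digits = tf.ragged.constant([[3, 1, 4, 1], [], [5, 9, 2], [6]])

doubled = digits * 2                          # elementwise; row lengths are unchanged
summed = digits + doubled                     # both operands have identical row lengths
row_totals = tf.reduce_sum(digits, axis=1)    # one value per row, returned as a dense tensor

print(summed.to_list())             # [[9, 3, 12, 3], [], [15, 27, 6], [18]]
print(row_totals.numpy().tolist())  # [9, 0, 16, 6]

# Python-style indexing and slicing.
print(digits[2].numpy().tolist())   # [5, 9, 2]
print(digits[:, :2].to_list())      # [[3, 1], [], [5, 9], [6]]
```

Indexing a single row returns an ordinary dense tensor, while slicing across rows returns another RaggedTensor because the rows can still differ in length.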

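In addition, the tf.ragged package contains operations designed specifically for RaggedTensor. For example, tf.ragged.map_flat_values applies an operation to the values stored in a RaggedTensor, and the result keeps the row structure of the input RaggedTensor.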
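
A short sketch of tf.ragged.map_flat_values on made-up data may help here. It assumes TensorFlow 2.x; the variable names are illustrative only.

```python
import tensorflow as tf

# Hypothetical data: four rows of lengths 3, 0, 2, and 1.
rt = tf.ragged.constant([[1, 2, 3], [], [4, 5], [6]])

# The op is called once on the flat values [1, 2, 3, 4, 5, 6];
# the result is then re-wrapped with the row structure of rt.
ones = tf.ragged.map_flat_values(tf.ones_like, rt)

# Extra arguments are passed through to the op; any RaggedTensor among
# them is replaced by its flat values before the call.
shifted = tf.ragged.map_flat_values(tf.add, rt, 5)

print(ones.to_list())     # [[1, 1, 1], [], [1, 1], [1]]
print(shifted.to_list())  # [[6, 7, 8], [], [9, 10], [11]]
```

Because the op sees only a flat tensor, functions that know nothing about ragged data, such as tf.nn.embedding_lookup, can still be applied to every value.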

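A RaggedTensor may sound just like a SparseTensor, but the two are quite different. Operations on a SparseTensor have to respect the coordinate axes of the dense tensor it encodes, whereas a RaggedTensor only has to consider its own rows and columns. The difference is easy to see in how each type combines two tensors, for example when concatenating them along the row axis.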
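
One way to see the difference is to concatenate the same data in both representations. The sketch below assumes TensorFlow 2.x and uses made-up integers; it is an illustration, not a full comparison of the two types.

```python
import tensorflow as tf

# Hypothetical data: two ragged tensors with three rows each.
ragged_x = tf.ragged.constant([[1], [2, 3, 4], [5, 6]])
ragged_y = tf.ragged.constant([[7, 8], [9], [10, 11]])

# Ragged concat joins each pair of rows into one longer row.
ragged_result = tf.concat([ragged_x, ragged_y], axis=1)
print(ragged_result.to_list())
# [[1, 7, 8], [2, 3, 4, 9], [5, 6, 10, 11]]

# The same data as sparse tensors with dense shapes [3, 3] and [3, 2].
sparse_x = ragged_x.to_sparse()
sparse_y = ragged_y.to_sparse()

# Sparse concat behaves like concatenating the corresponding dense tensors,
# so the missing positions of sparse_x stay in the middle of each row.
sparse_result = tf.sparse.concat(axis=1, sp_inputs=[sparse_x, sparse_y])
dense_view = tf.sparse.to_dense(sparse_result)
print(dense_view.numpy().tolist())
# [[1, 0, 0, 7, 8], [2, 3, 4, 9, 0], [5, 6, 0, 10, 11]]
```

The zeros in the second result are the implicit missing values of the sparse tensors. They occupy positions in a fixed 3 x 5 grid, whereas the ragged result has no such grid and no filler.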

Finally, here is a longer RaggedTensor example:

import math
import tensorflow as tf

# Eager execution is on by default in TF 2.0, so tf.enable_eager_execution()
# is no longer needed (or available).

# Set up the embeddings.
num_buckets = 1024
embedding_size = 16
embedding_table = tf.Variable(
    tf.random.truncated_normal(
        [num_buckets, embedding_size],
        stddev=1.0 / math.sqrt(embedding_size)),
    name="embedding_table")

# Input tensor: three queries of different lengths.
queries = tf.ragged.constant([
    ['Who', 'is', 'Dan', 'Smith'],
    ['Pause'],
    ['Will', 'it', 'rain', 'later', 'today']])

# Look up the embedding for each word. map_flat_values applies an operation
# to the flat values of a RaggedTensor and keeps its row structure.
word_buckets = tf.strings.to_hash_bucket_fast(queries, num_buckets)
word_embeddings = tf.ragged.map_flat_values(
    tf.nn.embedding_lookup, embedding_table, word_buckets)      # ①

# Add markers to the beginning and end of each sentence.
marker = tf.fill([queries.nrows(), 1], '#')
padded = tf.concat([marker, queries, marker], axis=1)           # ②

# Build word bigrams and look up their embeddings.
bigrams = tf.strings.join(
    [padded[:, :-1], padded[:, 1:]], separator='+')             # ③
bigram_buckets = tf.strings.to_hash_bucket_fast(bigrams, num_buckets)
bigram_embeddings = tf.ragged.map_flat_values(
    tf.nn.embedding_lookup, embedding_table, bigram_buckets)    # ④

# Find the average embedding for each sentence.
all_embeddings = tf.concat(
    [word_embeddings, bigram_embeddings], axis=1)               # ⑤
avg_embedding = tf.reduce_mean(all_embeddings, axis=1)          # ⑥

print(word_embeddings)
print(bigram_embeddings)
print(all_embeddings)
print(avg_embedding)