tf.data.experimental.index_table_from_dataset

Returns an index lookup table based on the given dataset.

Compat aliases for migration

tf.compat.v1.data.experimental.index_table_from_dataset

tf.data.experimental.index_table_from_dataset(
    dataset=None,
    num_oov_buckets=0,
    vocab_size=None,
    default_value=-1,
    hasher_spec=lookup_ops.FastHashSpec,
    key_dtype=tf.dtypes.string,
    name=None
)

This operation constructs a lookup table from the given dataset of keys, mapping each key to its zero-based position in the dataset.

If `num_oov_buckets` is greater than zero, looking up an out-of-vocabulary token returns a bucket ID computed from the token's hash. Otherwise the lookup returns `default_value`. Bucket IDs fall in the range [vocabulary size, vocabulary size + `num_oov_buckets` - 1].
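
The out-of-vocabulary behavior described above can be sketched as follows. This is a minimal illustration that assumes eager execution in TensorFlow 2.x; the three-word vocabulary and the token "durian" are made-up examples, and the exact bucket an unknown token lands in depends on the hash function, so only the range is checked.

```python
import tensorflow as tf

# Hypothetical vocabulary: indices follow dataset order (apple=0, banana=1, cherry=2).
vocab = tf.data.Dataset.from_tensor_slices(["apple", "banana", "cherry"])

# With two OOV buckets, unknown tokens map into [3, 4] instead of default_value.
table = tf.data.experimental.index_table_from_dataset(
    vocab, num_oov_buckets=2, key_dtype=tf.string)

ids = table.lookup(tf.constant(["banana", "durian"])).numpy()

in_vocab_id = int(ids[0])   # position of "banana" in the dataset
oov_id = int(ids[1])        # a hash bucket ID for the unknown token "durian"
print(in_vocab_id, oov_id)
```

Here the vocabulary size is 3 and `num_oov_buckets` is 2, so the bucket ID for any unknown token should be either 3 or 4.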

Sample usage:

>>> ds = tf.data.Dataset.range(100).map(lambda x: tf.strings.as_string(x * 2))
>>> table = tf.data.experimental.index_table_from_dataset(
...     ds, key_dtype=tf.string)
>>> table.lookup(tf.constant(['0', '2', '4'], dtype=tf.string)).numpy()
array([0, 1, 2])
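
Integer keys work the same way when `key_dtype` matches the dataset's element type. The sketch below assumes eager execution in TensorFlow 2.x; the key values 10, 20, 30 and the missing key 99 are arbitrary illustrations.

```python
import tensorflow as tf

# Hypothetical integer vocabulary: 10 -> 0, 20 -> 1, 30 -> 2.
int_keys = tf.data.Dataset.from_tensor_slices(
    tf.constant([10, 20, 30], dtype=tf.int64))

# key_dtype must agree with the dtype of the dataset elements.
int_table = tf.data.experimental.index_table_from_dataset(
    int_keys, key_dtype=tf.int64)

# 20 is in the vocabulary; 99 is not, and with num_oov_buckets left at 0
# the lookup falls back to default_value (-1).
int_ids = int_table.lookup(tf.constant([20, 99], dtype=tf.int64)).numpy()
print(int_ids)
```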

Args
`dataset`	A dataset of keys.
`num_oov_buckets`	The number of out-of-vocabulary buckets.
`vocab_size`	The number of elements in the vocabulary, if known.
`default_value`	The value to use for out-of-vocabulary feature values. Defaults to -1.
`hasher_spec`	A `HasherSpec` that specifies the hash function used to assign out-of-vocabulary buckets.
`key_dtype`	The `key` data type.
`name`	A name for this op (optional).

Returns
The lookup table based on the given dataset.

Raises
`ValueError`	If `num_oov_buckets` is negative, `vocab_size` is not greater than zero, or `key_dtype` is not integer or string.
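
The first two error conditions can be observed with a short sketch. It assumes TensorFlow 2.x; `try_build` is a hypothetical helper written for this illustration, not part of the API, and the exact error messages may differ between versions.

```python
import tensorflow as tf

keys = tf.data.Dataset.from_tensor_slices(["a", "b", "c"])

def try_build(**kwargs):
    """Hypothetical helper: return the ValueError message, or None on success."""
    try:
        tf.data.experimental.index_table_from_dataset(keys, **kwargs)
    except ValueError as err:
        return str(err)
    return None

# A negative bucket count and a non-positive vocab_size are both rejected.
negative_buckets_error = try_build(num_oov_buckets=-1)
zero_vocab_error = try_build(vocab_size=0)
print(negative_buckets_error)
print(zero_vocab_error)
```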

© 2022 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 4.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.9/api_docs/python/tf/data/experimental/index_table_from_dataset
