On this page
tf.data.experimental.make_batched_features_dataset
Returns a Dataset of feature dictionaries from Example protos.
tf.data.experimental.make_batched_features_dataset(
    file_pattern, batch_size, features, reader=None, label_key=None,
    reader_args=None, num_epochs=None, shuffle=True, shuffle_buffer_size=10000,
    shuffle_seed=None, prefetch_buffer_size=None, reader_num_threads=None,
    parser_num_threads=None, sloppy_ordering=False, drop_final_batch=False
)
If label_key argument is provided, returns a Dataset of tuple comprising of feature dictionaries and label.
Example:
serialized_examples = [
  features {
    feature { key: "age" value { int64_list { value: [ 0 ] } } }
    feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
    feature { key: "kws" value { bytes_list { value: [ "code", "art" ] } } }
  },
  features {
    feature { key: "age" value { int64_list { value: [] } } }
    feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
    feature { key: "kws" value { bytes_list { value: [ "sports" ] } } }
  }
]
We can use arguments:
features: {
  "age": FixedLenFeature([], dtype=tf.int64, default_value=-1),
  "gender": FixedLenFeature([], dtype=tf.string),
  "kws": VarLenFeature(dtype=tf.string),
}
And the expected output is:
{
  "age": [[0], [-1]],
  "gender": [["f"], ["f"]],
  "kws": SparseTensor(
    indices=[[0, 0], [0, 1], [1, 0]],
    values=["code", "art", "sports"]
    dense_shape=[2, 2]),
}
| Args | |
|---|---|
| file_pattern | List of files or patterns of file paths containing Examplerecords. Seetf.io.gfile.globfor pattern rules. | 
| batch_size | An int representing the number of records to combine in a single batch. | 
| features | A dictmapping feature keys toFixedLenFeatureorVarLenFeaturevalues. Seetf.io.parse_example. | 
| reader | A function or class that can be called with a filenamestensor and (optional)reader_argsand returns aDatasetofExampletensors. Defaults totf.data.TFRecordDataset. | 
| label_key | (Optional) A string corresponding to the key labels are stored in tf.Examples. If provided, it must be one of thefeatureskey, otherwise results inValueError. | 
| reader_args | Additional arguments to pass to the reader class. | 
| num_epochs | Integer specifying the number of times to read through the dataset. If None, cycles through the dataset forever. Defaults to None. | 
| shuffle | A boolean, indicates whether the input should be shuffled. Defaults to True. | 
| shuffle_buffer_size | Buffer size of the ShuffleDataset. A large capacity ensures better shuffling but would increase memory usage and startup time. | 
| shuffle_seed | Randomization seed to use for shuffling. | 
| prefetch_buffer_size | Number of feature batches to prefetch in order to improve performance. Recommended value is the number of batches consumed per training step. Defaults to auto-tune. | 
| reader_num_threads | Number of threads used to read Examplerecords. If >1, the results will be interleaved. Defaults to1. | 
| parser_num_threads | Number of threads to use for parsing Exampletensors into a dictionary ofFeaturetensors. Defaults to2. | 
| sloppy_ordering | If True, reading performance will be improved at the cost of non-deterministic ordering. IfFalse, the order of elements produced is deterministic prior to shuffling (elements are still randomized ifshuffle=True. Note that if the seed is set, then order of elements after shuffling is deterministic). Defaults toFalse. | 
| drop_final_batch | If True, and the batch size does not evenly divide the input dataset size, the final smaller batch will be dropped. Defaults toFalse. | 
| Returns | |
|---|---|
| A dataset of dictelements, (or a tuple ofdictelements and label). Eachdictmaps feature keys toTensororSparseTensorobjects. | 
| Raises | |
|---|---|
| TypeError | If readeris of the wrong type. | 
| ValueError | If label_keyis not one of thefeatureskeys. | 
© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
 https://www.tensorflow.org/versions/r2.4/api_docs/python/tf/data/experimental/make_batched_features_dataset