tf.data.experimental.TFRecordWriter

Class TFRecordWriter

Writes data to a TFRecord file.

To write a dataset to a single TFRecord file:

dataset = ... # dataset to be written
writer = tf.data.experimental.TFRecordWriter(PATH)
writer.write(dataset)
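
The single-file case can be sketched end to end as follows. This assumes TensorFlow 2.x with eager execution enabled, so that write runs immediately instead of returning an operation to run later. The record contents and the temporary file path are hypothetical, chosen only for illustration. The dataset elements are scalar strings, which is what the writer expects.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical records; in practice these would usually be serialized protos.
records = [b"alpha", b"beta", b"gamma"]
dataset = tf.data.Dataset.from_tensor_slices(records)

# Hypothetical output location inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")

writer = tf.data.experimental.TFRecordWriter(path)
writer.write(dataset)  # assumption: eager mode, so the write happens here

# Read the file back to confirm what was written.
read_back = [r.numpy() for r in tf.data.TFRecordDataset(path)]
```

Reading the file back with tf.data.TFRecordDataset is a convenient check that the records round-trip unchanged.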

To shard a dataset across multiple TFRecord files:

dataset = ... # dataset to be written

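def reduce_func(key, dataset):
  # Build one file name per shard by appending the shard key to PATH_PREFIX.
  filename = tf.strings.join([PATH_PREFIX, tf.strings.as_string(key)])
  writer = tf.data.experimental.TFRecordWriter(filename)
  # Drop the index added by enumerate() and write only the original elements.
  writer.write(dataset.map(lambda _, x: x))
  return tf.data.Dataset.from_tensors(filename)

# Pair each element with its index, then group elements by index modulo NUM_SHARDS.
dataset = dataset.enumerate()
dataset = dataset.apply(tf.data.experimental.group_by_window(
  lambda i, _: i % NUM_SHARDS, reduce_func, tf.int64.max
))

# Nothing is written until the resulting dataset is iterated.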
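
To make the shard assignment concrete, here is a pure-Python sketch of the key function only. It does not call TensorFlow or write any files, and the values of NUM_SHARDS and the sample elements are hypothetical. It mirrors enumerate() followed by the key i % NUM_SHARDS, which distributes elements round-robin across shards.

```python
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical shard count
elements = ["a", "b", "c", "d", "e", "f", "g"]  # hypothetical stand-in for dataset elements

# Mirror dataset.enumerate() followed by the key function `i % NUM_SHARDS`.
shards = defaultdict(list)
for i, x in enumerate(elements):
    shards[i % NUM_SHARDS].append(x)

shard_contents = dict(shards)
print(shard_contents)
```

Because the window size passed to group_by_window is tf.int64.max, all elements sharing a key should end up in a single window, so reduce_func is expected to run once per shard.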

__init__

__init__(
    filename,
    compression_type=None
)

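Creates a TFRecordWriter.

Args:

filename: A string path indicating where to write the TFRecord data.
compression_type: (Optional.) A string indicating the type of compression to use when writing the file. Defaults to None (no compression).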
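
A minimal sketch of writing a compressed file, assuming TensorFlow 2.x with eager execution enabled and using "GZIP" as the compression type. The records and the temporary path are hypothetical. The same compression type has to be passed to the reader, otherwise the file cannot be decoded.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical records and output path, for illustration only.
records = [b"first", b"second"]
dataset = tf.data.Dataset.from_tensor_slices(records)
gzip_path = os.path.join(tempfile.mkdtemp(), "example.tfrecord.gz")

# Write the dataset with GZIP compression.
writer = tf.data.experimental.TFRecordWriter(gzip_path, compression_type="GZIP")
writer.write(dataset)  # assumption: eager mode, so the write happens here

# The reader needs the matching compression_type to decode the file.
reader = tf.data.TFRecordDataset(gzip_path, compression_type="GZIP")
decoded = [r.numpy() for r in reader]
```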

Methods

tf.data.experimental.TFRecordWriter.write

write(dataset)

Returns a tf.Operation to write a dataset to a file.

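Args:

dataset: A tf.data.Dataset whose elements are to be written to a file. The elements must be scalar strings.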

Returns:

A tf.Operation that, when run, writes contents of dataset to a file.