Class StaticVocabularyTable
String to Id table wrapper that assigns out-of-vocabulary keys to buckets.
Inherits From: StaticVocabularyTable
Aliases:
For example, if an instance of StaticVocabularyTable is initialized with a string-to-id initializer that maps:
emerson -> 0
lake -> 1
palmer -> 2
The Vocabulary object will perform the following mapping:
emerson -> 0
lake -> 1
palmer -> 2
<other term> -> bucket_id
where bucket_id will be between 3 and 3 + num_oov_buckets - 1, calculated by: hash(<term>) % num_oov_buckets + vocab_size
If input_tensor is ["emerson", "lake", "palmer", "king", "crimson"], the lookup result is [0, 1, 2, 4, 7].
If initializer is None, only out-of-vocabulary buckets are used.
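The bucket arithmetic described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: stand_in_hash and lookup_id are hypothetical helper names, and the hash is built on SHA-256 rather than the hash TensorFlow uses, so the out-of-vocabulary ids it produces will not match what a real table returns.

```python
import hashlib


def stand_in_hash(term):
    """Deterministic 64-bit hash of a string.

    Assumption: a stand-in for illustration only, NOT TensorFlow's hash.
    """
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little")


def lookup_id(term, vocab, num_oov_buckets):
    """Return the vocabulary id, or an out-of-vocabulary bucket id."""
    vocab = vocab or {}  # no vocabulary: every term goes to a bucket
    if term in vocab:
        return vocab[term]
    # hash(<term>) % num_oov_buckets + vocab_size
    return stand_in_hash(term) % num_oov_buckets + len(vocab)


vocab = {"emerson": 0, "lake": 1, "palmer": 2}
terms = ["emerson", "lake", "palmer", "king", "crimson"]
ids = [lookup_id(t, vocab, num_oov_buckets=3) for t in terms]

print(ids[:3])  # in-vocabulary terms keep their ids
print(ids[3:])  # each id lies in [3, 5]; exact values depend on the hash
```

With a vocabulary of size 3 and 3 buckets, every unknown term lands in the range 3 to 5. Passing None as the vocabulary makes vocab_size zero, so all ids fall in the range 0 to num_oov_buckets - 1.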
Example usage:
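import tensorflow as tf
num_oov_buckets = 3
input_tensor = tf.constant(["emerson", "lake", "palmer", "king", "crimson"])
# filename is the path to a vocabulary file with one term per line.
table = tf.lookup.StaticVocabularyTable(
    tf.lookup.TextFileInitializer(
        filename, tf.string, tf.lookup.TextFileIndex.WHOLE_LINE,
        tf.int64, tf.lookup.TextFileIndex.LINE_NUMBER),
    num_oov_buckets)
out = table.lookup(input_tensor)
with tf.Session():  # graph mode: run() and eval() need a default session
    table.initializer.run()
    print(out.eval())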
The hash function used for generating out-of-vocabulary bucket IDs is Fingerprint64.
__init__
__init__(
    initializer,
    num_oov_buckets,
    lookup_key_dtype=None,
    name=None
)
Construct a StaticVocabularyTable object.
Args:
initializer: A TableInitializerBase object that contains the data used to initialize the table. If None, only out-of-vocabulary buckets are used.
num_oov_buckets: Number of buckets to use for out-of-vocabulary keys. Must be greater than zero.
lookup_key_dtype: Data type of keys passed to lookup. Defaults to initializer.key_dtype if initializer is specified, otherwise tf.string. Must be string or integer, and must be castable to initializer.key_dtype.
name: A name for the operation (optional).
Raises:
ValueError: when num_oov_buckets is not positive.
TypeError: when lookup_key_dtype or initializer.key_dtype are not integer or string. Also when initializer.value_dtype != int64.
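The documented argument checks can be mirrored in a small stand-alone sketch. Everything here is hypothetical and for illustration only: check_table_args is a made-up name, dtypes are represented as plain strings, and INTEGER_DTYPES is an assumed sample of integer types rather than the set the library accepts.

```python
# Assumption: a small sample of integer dtype names, written as plain strings.
INTEGER_DTYPES = {"int32", "int64"}


def is_string_or_integer(dtype):
    """True if the dtype name is string or one of the sampled integer types."""
    return dtype == "string" or dtype in INTEGER_DTYPES


def check_table_args(num_oov_buckets,
                     lookup_key_dtype="string",
                     initializer_key_dtype="string",
                     initializer_value_dtype="int64"):
    """Raise the documented errors for invalid constructor arguments."""
    if num_oov_buckets <= 0:
        raise ValueError(
            "num_oov_buckets must be greater than zero, got %d"
            % num_oov_buckets)
    for dtype in (lookup_key_dtype, initializer_key_dtype):
        if not is_string_or_integer(dtype):
            raise TypeError(
                "key dtype must be string or integer, got %s" % dtype)
    if initializer_value_dtype != "int64":
        raise TypeError(
            "initializer.value_dtype must be int64, got %s"
            % initializer_value_dtype)
    return True


check_table_args(3)  # valid arguments pass silently
```

The castability requirement between lookup_key_dtype and initializer.key_dtype is left out of this sketch.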
Properties
initializer
key_dtype
The table key dtype.
name
The name of the table.
resource_handle
Returns the resource handle associated with this Resource.
value_dtype
The table value dtype.
Methods
tf.lookup.StaticVocabularyTable.lookup
lookup(
keys,
name=None
)
Looks up keys in the table and outputs the corresponding values.
It assigns out-of-vocabulary keys to buckets based on their hashes.
Args:
keys: Keys to look up. May be either a SparseTensor or dense Tensor.
name: Optional name for the op.
Returns:
A SparseTensor if keys are sparse, otherwise a dense Tensor.
Raises:
TypeError: when keys doesn't match the table key data type.
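The sparse-in, sparse-out behaviour of lookup can be sketched without TensorFlow. This is a minimal illustration under stated assumptions: Sparse is a hypothetical namedtuple standing in for a sparse tensor, a Python list stands in for a dense tensor, and only in-vocabulary keys are handled to keep the example short.

```python
from collections import namedtuple

# Hypothetical stand-in for a sparse tensor: coordinates, values, shape.
Sparse = namedtuple("Sparse", ["indices", "values", "dense_shape"])


def lookup(keys, key_to_id):
    """Map keys to ids, returning the same kind of container as `keys`."""
    if isinstance(keys, Sparse):
        # Only the values are translated; indices and shape pass through.
        ids = [key_to_id(k) for k in keys.values]
        return Sparse(keys.indices, ids, keys.dense_shape)
    return [key_to_id(k) for k in keys]


vocab = {"emerson": 0, "lake": 1, "palmer": 2}
key_to_id = vocab.__getitem__  # in-vocabulary keys only, for brevity

dense_out = lookup(["lake", "emerson"], key_to_id)

sparse_in = Sparse(indices=[(0, 0), (1, 2)],
                   values=["palmer", "lake"],
                   dense_shape=(2, 3))
sparse_out = lookup(sparse_in, key_to_id)

print(dense_out)          # a plain list of ids
print(sparse_out.values)  # ids at the same coordinates as the input keys
```

The point of the design is that a sparse input keeps its layout: only the stored values change from keys to ids.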
tf.lookup.StaticVocabularyTable.size
size(name=None)
Compute the number of elements in this table.