Class Nadam
Optimizer that implements the NAdam algorithm.
Inherits From: Optimizer
Aliases:
- Class tf.compat.v1.keras.optimizers.Nadam
- Class tf.compat.v2.keras.optimizers.Nadam
- Class tf.compat.v2.optimizers.Nadam
Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.
Initialization and update rule: the gradient is evaluated at theta(t) + momentum * v(t), and the variables always store theta + beta_1 * m / sqrt(v) instead of theta.
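For reference, here is a minimal NumPy sketch of one simplified Nadam update step. It follows the standard formulation from Dozat (2015) and omits the momentum-schedule correction that the Keras implementation applies, so the helper name and the exact bias-correction terms are illustrative assumptions, not the library's internals.

import numpy as np

def nadam_step(theta, g, m, v, t, lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    # Simplified Nadam update (illustrative only); t starts at 1.
    m = beta_1 * m + (1.0 - beta_1) * g        # 1st moment estimate
    v = beta_2 * v + (1.0 - beta_2) * g * g    # 2nd moment estimate
    m_hat = m / (1.0 - beta_1 ** (t + 1))      # bias correction with Nesterov look-ahead
    g_hat = g / (1.0 - beta_1 ** t)            # bias-corrected current gradient
    v_hat = v / (1.0 - beta_2 ** t)            # bias-corrected 2nd moment
    theta = theta - lr * (beta_1 * m_hat + (1.0 - beta_1) * g_hat) / (np.sqrt(v_hat) + epsilon)
    return theta, m, v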
References: See Dozat, T., 2015, "Incorporating Nesterov Momentum into Adam".
__init__
__init__(
learning_rate=0.001,
beta_1=0.9,
beta_2=0.999,
epsilon=1e-07,
name='Nadam',
**kwargs
)
Construct a new Nadam optimizer.
Args:
- learning_rate: A Tensor or a floating point value. The learning rate.
- beta_1: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
- beta_2: A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.
- epsilon: A small constant for numerical stability.
- name: Optional name for the operations created when applying gradients. Defaults to "Nadam".
- **kwargs: Keyword arguments. Allowed to be {clipnorm, clipvalue, lr, decay}. clipnorm is clip gradients by norm; clipvalue is clip gradients by value; decay is included for backward compatibility to allow time inverse decay of learning rate; lr is included for backward compatibility, and it is recommended to use learning_rate instead.
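A minimal usage sketch follows; the toy model and loss are hypothetical, but the constructor arguments are the ones documented above.

import tensorflow as tf

# Hypothetical toy model; any Keras model is compiled the same way.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

opt = tf.keras.optimizers.Nadam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-7)
model.compile(optimizer=opt, loss='mse')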
Properties
iterations
Variable. The number of training steps this Optimizer has run.
weights
Returns variables of this Optimizer based on the order created.
Methods
tf.keras.optimizers.Nadam.add_slot
add_slot(
var,
slot_name,
initializer='zeros'
)
Add a new slot variable for var.
tf.keras.optimizers.Nadam.add_weight
add_weight(
name,
shape,
dtype=None,
initializer='zeros',
trainable=None,
synchronization=tf.VariableSynchronization.AUTO,
aggregation=tf.VariableAggregation.NONE
)
tf.keras.optimizers.Nadam.apply_gradients
apply_gradients(
grads_and_vars,
name=None
)
Apply gradients to variables.
This is the second part of minimize(). It returns an Operation that applies gradients.
Args:
- grads_and_vars: List of (gradient, variable) pairs.
- name: Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.
Returns:
An Operation that applies the specified gradients. The iterations counter will be automatically increased by 1.
Raises:
- TypeError: If grads_and_vars is malformed.
- ValueError: If none of the variables have gradients.
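A minimal sketch of the GradientTape-then-apply_gradients pattern is shown below; the variable, loss, and learning rate are arbitrary examples.

import tensorflow as tf

w = tf.Variable([1.0, 2.0])
opt = tf.keras.optimizers.Nadam(learning_rate=0.01)

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(w))    # toy quadratic loss
grads = tape.gradient(loss, [w])
opt.apply_gradients(zip(grads, [w]))      # updates w in place
print(int(opt.iterations))                # the step counter is now 1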
tf.keras.optimizers.Nadam.from_config
from_config(
cls,
config,
custom_objects=None
)
Creates an optimizer from its config.
This method is the reverse of get_config, capable of instantiating the same optimizer from the config dictionary.
Arguments:
- config: A Python dictionary, typically the output of get_config.
- custom_objects: A Python dictionary mapping names to additional Python objects used to create this optimizer, such as a function used for a hyperparameter.
Returns:
An optimizer instance.
tf.keras.optimizers.Nadam.get_config
get_config()
Returns the config of the optimizer.
An optimizer config is a Python dictionary (serializable) containing the configuration of an optimizer. The same optimizer can be reinstantiated later (without any saved state) from this configuration.
Returns:
Python dictionary.
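As a sketch of how get_config and from_config work together (the hyperparameter value is arbitrary):

import tensorflow as tf

opt = tf.keras.optimizers.Nadam(learning_rate=0.002)
config = opt.get_config()                                   # plain Python dict of hyperparameters
restored = tf.keras.optimizers.Nadam.from_config(config)    # same hyperparameters, no slot state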
tf.keras.optimizers.Nadam.get_gradients
get_gradients(
loss,
params
)
Returns gradients of loss with respect to params.
Arguments:
- loss: Loss tensor.
- params: List of variables.
Returns:
List of gradient tensors.
Raises:
- ValueError: In case any gradient cannot be computed (e.g. if gradient function not implemented).
tf.keras.optimizers.Nadam.get_slot
get_slot(
var,
slot_name
)
tf.keras.optimizers.Nadam.get_slot_names
get_slot_names()
A list of names for this optimizer's slots.
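A short sketch of the slot-related methods follows; the slot names 'm' and 'v' reflect the current Keras implementation of Nadam and should be treated as an implementation detail.

import tensorflow as tf

w = tf.Variable([1.0, 2.0])
opt = tf.keras.optimizers.Nadam()

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(w * w)
opt.apply_gradients(zip(tape.gradient(loss, [w]), [w]))   # first step creates the slot variables

print(opt.get_slot_names())      # e.g. ['m', 'v'] for the 1st and 2nd moment estimates
m_slot = opt.get_slot(w, 'm')    # per-variable slot, same shape as w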
tf.keras.optimizers.Nadam.get_updates
get_updates(
loss,
params
)
tf.keras.optimizers.Nadam.get_weights
get_weights()
tf.keras.optimizers.Nadam.minimize
minimize(
loss,
var_list,
grad_loss=None,
name=None
)
Minimize loss by updating var_list.
This method simply computes gradients using tf.GradientTape and calls apply_gradients(). If you want to process the gradients before applying them, call tf.GradientTape and apply_gradients() explicitly instead of using this function.
Args:
- loss: A callable taking no arguments which returns the value to minimize.
- var_list: List or tuple of Variable objects to update to minimize loss, or a callable returning the list or tuple of Variable objects. Use a callable when the variable list would otherwise be incomplete before minimize, since the variables are created the first time loss is called.
- grad_loss: Optional. A Tensor holding the gradient computed for loss.
- name: Optional name for the returned operation.
Returns:
An Operation that updates the variables in var_list. If global_step was not None, that operation also increments global_step.
Raises:
- ValueError: If some of the variables are not Variable objects.
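A minimal sketch of minimize with a callable loss (the variable and loss are arbitrary examples):

import tensorflow as tf

w = tf.Variable([3.0, -2.0])
opt = tf.keras.optimizers.Nadam(learning_rate=0.01)

loss = lambda: tf.reduce_sum(tf.square(w))    # must be a zero-argument callable
for _ in range(100):
    opt.minimize(loss, var_list=[w])          # computes gradients and applies them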
tf.keras.optimizers.Nadam.set_weights
set_weights(weights)
tf.keras.optimizers.Nadam.variables
variables()
Returns variables of this Optimizer based on the order created.
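The sketch below shows get_weights, set_weights, and variables together; copying weights between two Nadam instances is assumed to require that both have already created slot variables of matching shapes, so each runs one update step first.

import tensorflow as tf

w1 = tf.Variable([1.0, 2.0])
w2 = tf.Variable([1.0, 2.0])
opt_a = tf.keras.optimizers.Nadam()
opt_b = tf.keras.optimizers.Nadam()

# One step per optimizer so that their slot variables exist with matching shapes.
for opt, w in ((opt_a, w1), (opt_b, w2)):
    with tf.GradientTape() as tape:
        loss = tf.reduce_sum(w * w)
    opt.apply_gradients(zip(tape.gradient(loss, [w]), [w]))

opt_b.set_weights(opt_a.get_weights())   # copies the iteration count and moment estimates
print(len(opt_b.variables()))            # iterations plus the per-variable slots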