CausalVariant

class torch.nn.attention.bias.CausalVariant(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)

Enum for causal variants used in attention mechanisms.

Defines two types of causal biases:

UPPER_LEFT: Represents upper-left triangular bias for standard causal attention. The equivalent PyTorch code for constructing this bias is:

torch.tril(torch.ones(size, dtype=torch.bool))

For instance, with shape=(3,4), the materialized bias tensor will be:

[[1, 0, 0, 0],
 [1, 1, 0, 0],
 [1, 1, 1, 0]]

LOWER_RIGHT: Represents lower-right triangular bias; the included values are aligned to the lower-right corner of the matrix.

The equivalent PyTorch code for constructing this bias is:

diagonal_offset = size[1] - size[0]
torch.tril(
    torch.ones(size, dtype=torch.bool),
    diagonal=diagonal_offset,
)

For instance, with shape=(3,4), the materialized bias tensor will be:

[[1, 1, 0, 0],
 [1, 1, 1, 0],
 [1, 1, 1, 1]]

Note that these variants are equivalent to each other when the sequence lengths of the query and key/value tensors are equal, since the triangular matrix is then square and diagonal_offset is 0.
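The alignment rule behind both variants can be sketched without any tensor library. The snippet below is a minimal, dependency-free illustration and not the library's implementation: the helper names causal_mask, upper_left, and lower_right are hypothetical, and the nested lists stand in for the boolean tensor that torch.tril would produce. It assumes the rule that position (i, j) is kept when j <= i + diagonal.

```python
def causal_mask(rows, cols, diagonal=0):
    """Hypothetical helper: mimic torch.tril on an all-True (rows, cols) matrix.

    Position (i, j) is kept when j <= i + diagonal.
    """
    return [[j <= i + diagonal for j in range(cols)] for i in range(rows)]


def upper_left(rows, cols):
    # UPPER_LEFT: the kept triangle is anchored at the top-left corner.
    return causal_mask(rows, cols, diagonal=0)


def lower_right(rows, cols):
    # LOWER_RIGHT: shift the diagonal by (cols - rows) so the kept
    # triangle is anchored at the bottom-right corner.
    return causal_mask(rows, cols, diagonal=cols - rows)


def show(mask):
    # Render booleans as 0/1 rows, matching the matrices printed above.
    return [[int(v) for v in row] for row in mask]


for row in show(upper_left(3, 4)):
    print(row)
for row in show(lower_right(3, 4)):
    print(row)
```

With equal row and column counts the offset is zero, so both helpers return the same mask, which is the equivalence stated in the note above.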

Warning

This enum is a prototype and subject to change.

as_integer_ratio()

Return integer ratio.

Return a pair of integers, whose ratio is exactly equal to the original int and with a positive denominator.

>>> (10).as_integer_ratio()
(10, 1)
>>> (-10).as_integer_ratio()
(-10, 1)
>>> (0).as_integer_ratio()
(0, 1)

bit_count()

Number of ones in the binary representation of the absolute value of self.

Also known as the population count.

>>> bin(13)
'0b1101'
>>> (13).bit_count()
3

bit_length()

Number of bits necessary to represent self in binary.

>>> bin(37)
'0b100101'
>>> (37).bit_length()
6

conjugate(): Returns self, the complex conjugate of any int.

denominator: the denominator of a rational number in lowest terms

from_bytes(bytes, byteorder='big', *, signed=False)

Return the integer represented by the given array of bytes.

bytes: Holds the array of bytes to convert. The argument must either support the buffer protocol or be an iterable object producing bytes. bytes and bytearray are examples of built-in objects that support the buffer protocol.
byteorder: The byte order used to represent the integer. If byteorder is ‘big’, the most significant byte is at the beginning of the byte array. If byteorder is ‘little’, the most significant byte is at the end of the byte array. To request the native byte order of the host system, use sys.byteorder as the byte order value. The default is ‘big’.
signed: Indicates whether two’s complement is used to represent the integer.

imag: the imaginary part of a complex number

numerator: the numerator of a rational number in lowest terms

real: the real part of a complex number

to_bytes(length=1, byteorder='big', *, signed=False)

Return an array of bytes representing an integer.

length: Length of the bytes object to use. An OverflowError is raised if the integer is not representable with the given number of bytes. The default is 1.
byteorder: The byte order used to represent the integer. If byteorder is ‘big’, the most significant byte is at the beginning of the byte array. If byteorder is ‘little’, the most significant byte is at the end of the byte array. To request the native byte order of the host system, use sys.byteorder as the byte order value. The default is ‘big’.
signed: Determines whether two’s complement is used to represent the integer. If signed is False and a negative integer is given, an OverflowError is raised.
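The byteorder and signed parameters described above can be illustrated with a short round trip. This is only a sketch on plain Python integers rather than on enum members; the values and the helper name fits are chosen for illustration and are not part of the documented API.

```python
# Illustrative values only; plain ints are used instead of enum members.
value = 1024

big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # most significant byte last

# Reading the bytes back with the same byte order recovers the integer.
restored = int.from_bytes(big, byteorder="big")

# Negative integers need signed=True (two's complement) in both directions.
negative = (-1).to_bytes(1, byteorder="big", signed=True)
restored_negative = int.from_bytes(negative, byteorder="big", signed=True)


def fits(n, length, signed=False):
    """Hypothetical helper: report whether n is representable in `length` bytes."""
    try:
        n.to_bytes(length, byteorder="big", signed=signed)
        return True
    except OverflowError:
        return False


print(big, little, restored)
print(negative, restored_negative)
print(fits(255, 1), fits(256, 1), fits(-1, 1), fits(-1, 1, signed=True))
```

The fits helper exercises both OverflowError cases mentioned above: a value too large for the requested length, and a negative value with signed left at False.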
