xlstm_jax.models.xlstm_pytorch.components.init

Functions

bias_linspace_init_(param[, start, end])

Linearly spaced bias init across dimensions.

small_init_init_(param, dim)

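Fills the input Tensor with values according to the method described in "Transformers without Tears: Improving the Normalization of Self-Attention" (Nguyen, T. & Salazar, J., 2019), using a normal distribution.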

wang_init_(param, dim, num_blocks)

Adopted from EleutherAI/gpt-neox.

Module Contents

xlstm_jax.models.xlstm_pytorch.components.init.bias_linspace_init_(param, start=3.4, end=6.0)

Linearly spaced bias init across dimensions.

Parameters:

param
start (default: 3.4)
end (default: 6.0)

Return type:

torch.Tensor
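As an illustration of linearly spaced bias initialization, the following is a minimal sketch rather than this module's implementation. It assumes `param` is a 1-D tensor and that the values run from `start` to `end` inclusive; the name `linspace_bias_sketch_` is hypothetical.

```python
import torch


def linspace_bias_sketch_(param: torch.Tensor, start: float = 3.4, end: float = 6.0) -> torch.Tensor:
    """Hypothetical stand-in: fill a 1-D tensor in place with evenly spaced values."""
    if param.dim() != 1:
        raise ValueError(f"expected a 1-D tensor, got {param.dim()} dimensions")
    # One value per entry, evenly spaced from start to end (both included).
    values = torch.linspace(start, end, param.shape[0])
    with torch.no_grad():  # keep the in-place copy out of the autograd graph
        param.copy_(values)
    return param


bias = torch.nn.Parameter(torch.zeros(5))
linspace_bias_sketch_(bias)
```

With five entries and the default range, neighbouring entries differ by (6.0 - 3.4) / 4 = 0.65.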

xlstm_jax.models.xlstm_pytorch.components.init.small_init_init_(param, dim)

Fills the input Tensor with values drawn from a normal distribution, following the method described in "Transformers without Tears: Improving the Normalization of Self-Attention" (Nguyen, T. & Salazar, J., 2019). Adopted from EleutherAI/gpt-neox.

Parameters:

param
dim

Return type:

torch.Tensor
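The cited paper's small-init scheme draws weights from a zero-mean normal distribution with standard deviation sqrt(2 / (5 * dim)). The sketch below assumes this function follows that formula; `small_init_sketch_` is a hypothetical name used for illustration, not the module's code.

```python
import math

import torch


def small_init_sketch_(param: torch.Tensor, dim: int) -> torch.Tensor:
    """Hypothetical stand-in: in-place normal init with std = sqrt(2 / (5 * dim))."""
    std = math.sqrt(2.0 / (5.0 * dim))
    torch.nn.init.normal_(param, mean=0.0, std=std)
    return param


torch.manual_seed(0)
dim = 512
weight = torch.empty(dim, dim)
small_init_sketch_(weight, dim)

# Target standard deviation for comparison with the sampled weights.
expected_std = math.sqrt(2.0 / (5.0 * dim))
```

Because the standard deviation shrinks with `dim`, wider layers start with proportionally smaller weights.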

xlstm_jax.models.xlstm_pytorch.components.init.wang_init_(param, dim, num_blocks)

Adopted from EleutherAI/gpt-neox.

Parameters:

param
dim
num_blocks
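In GPT-NeoX, the initializer of this name draws from a zero-mean normal distribution with standard deviation 2 / (num_layers * sqrt(dim)). The sketch below assumes `wang_init_` applies the same formula with `num_blocks` in place of the layer count; `wang_init_sketch_` is a hypothetical name, not the module's code.

```python
import math

import torch


def wang_init_sketch_(param: torch.Tensor, dim: int, num_blocks: int) -> torch.Tensor:
    """Hypothetical stand-in: in-place normal init with std = 2 / (num_blocks * sqrt(dim))."""
    std = 2.0 / num_blocks / math.sqrt(dim)
    torch.nn.init.normal_(param, mean=0.0, std=std)
    return param


torch.manual_seed(0)
dim, num_blocks = 512, 4
out_proj_weight = torch.empty(dim, dim)
wang_init_sketch_(out_proj_weight, dim, num_blocks)

# Target standard deviation for comparison with the sampled weights.
expected_wang_std = 2.0 / num_blocks / math.sqrt(dim)
```

Under this assumption the standard deviation decreases as `num_blocks` grows, so deeper stacks start with smaller weights in the layers initialized this way.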