xlstm_jax.models.xlstm_pytorch.components.init
Functions
| bias_linspace_init_ | Linearly spaced bias init across dimensions. |
| small_init_init_ | Fills the input Tensor with values according to the method described in Transformers without Tears: Improving the Normalization of Self-Attention. |
| wang_init_ | Adopted from EleutherAI/gpt-neox. |
Module Contents
- xlstm_jax.models.xlstm_pytorch.components.init.bias_linspace_init_(param, start=3.4, end=6.0)
Initializes a bias in place with values linearly spaced from start to end across its dimensions.
- Parameters:
param (torch.Tensor | torch.distributed._tensor.DTensor)
start (float)
end (float)
- Return type:
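To make "linearly spaced across dimensions" concrete, the following dependency-free sketch computes the values such an init would write into a 1-D bias. It assumes the function behaves like a linspace from `start` to `end` with both endpoints included; `linspace_values` is a hypothetical helper for illustration and is not part of this module.

```python
def linspace_values(start: float = 3.4, end: float = 6.0, n_dims: int = 4) -> list:
    """Return n_dims values evenly spaced from start to end (both inclusive)."""
    if n_dims < 1:
        raise ValueError("n_dims must be at least 1")
    if n_dims == 1:
        # With a single dimension only the start value is used.
        return [start]
    step = (end - start) / (n_dims - 1)
    return [start + i * step for i in range(n_dims)]


# Values for a hypothetical bias with 5 entries, using the documented defaults.
bias_values = linspace_values(3.4, 6.0, 5)
```

With the documented defaults, the first entry of the bias would be 3.4 and the last 6.0, with the remaining entries evenly spaced in between.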
- xlstm_jax.models.xlstm_pytorch.components.init.small_init_init_(param, dim)
Fills the input Tensor in place with values drawn from a normal distribution, following the method described in "Transformers without Tears: Improving the Normalization of Self-Attention" (Nguyen, T. & Salazar, J., 2019). Adopted from EleutherAI/gpt-neox.
- Parameters:
param (torch.Tensor)
dim (int)
- Return type:
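The entry above does not state the standard deviation. As a sketch, assuming the same formula as the EleutherAI/gpt-neox small-init method, `std = sqrt(2 / (5 * dim))`, the sampling can be illustrated with the standard library alone. The names `small_init_std` and `sample_small_init` are hypothetical helpers, not part of this module.

```python
import math
import random


def small_init_std(dim: int) -> float:
    """Standard deviation under the assumed small-init rule: sqrt(2 / (5 * dim))."""
    return math.sqrt(2 / (5 * dim))


def sample_small_init(rows: int, cols: int, dim: int, seed: int = 0) -> list:
    """Draw a rows x cols matrix from N(0, std^2) with the small-init std."""
    rng = random.Random(seed)
    std = small_init_std(dim)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


# Hypothetical 4 x 512 weight matrix for a model with embedding dimension 512.
weights = sample_small_init(4, 512, dim=512)
```

In PyTorch, the equivalent in-place fill would typically be `torch.nn.init.normal_(param, mean=0.0, std=std)`.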
- xlstm_jax.models.xlstm_pytorch.components.init.wang_init_(param, dim, num_blocks)
Adopted from EleutherAI/gpt-neox.
- Parameters:
param (torch.Tensor)
dim (int)
num_blocks (int)
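The entry above gives no formula. As a sketch, assuming the same rule as the EleutherAI/gpt-neox Wang init, `std = 2 / num_blocks / sqrt(dim)`, the snippet below shows how the standard deviation of the normal distribution would shrink as the model gets deeper or wider. `wang_init_std` is a hypothetical helper, not part of this module.

```python
import math


def wang_init_std(dim: int, num_blocks: int) -> float:
    """Standard deviation under the assumed Wang-init rule: 2 / num_blocks / sqrt(dim)."""
    return 2 / num_blocks / math.sqrt(dim)


# Compare a shallow and a deep hypothetical model with the same width.
shallow_std = wang_init_std(dim=1024, num_blocks=12)
deep_std = wang_init_std(dim=1024, num_blocks=48)
```

Under this assumption, quadrupling the number of blocks divides the standard deviation by four, which scales down deeper models' initial weights accordingly.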