xlstm_jax.dataset.hf_tokenizer
Functions
| load_tokenizer | Loads the tokenizer. |
Module Contents
- xlstm_jax.dataset.hf_tokenizer.load_tokenizer(tokenizer_path, add_bos, add_eos, hf_access_token=None, cache_dir=None)
Loads the tokenizer.
- Parameters:
tokenizer_path (str) – The path to the tokenizer.
add_bos (bool) – Whether to add the beginning-of-sequence (BOS) token.
add_eos (bool) – Whether to add the end-of-sequence (EOS) token.
hf_access_token (str | None) – The access token for Hugging Face. Defaults to None.
cache_dir (str | None) – The cache directory for the tokenizer. Defaults to None.
- Returns:
The tokenizer.
- Return type:
transformers.AutoTokenizer
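
The signature above can be exercised roughly as follows. This is a minimal sketch, not the library's documented usage: `build_load_tokenizer_kwargs` and `load_example_tokenizer` are hypothetical helpers introduced here, the `"gpt2"` identifier is only an illustrative value for `tokenizer_path`, and reading the access token from the `HF_TOKEN` environment variable is an assumption about how the token is stored.

```python
import os


def build_load_tokenizer_kwargs(tokenizer_path, add_bos, add_eos, cache_dir=None):
    """Hypothetical helper (not part of xlstm_jax): gather load_tokenizer arguments."""
    return {
        "tokenizer_path": tokenizer_path,
        "add_bos": add_bos,
        "add_eos": add_eos,
        # Assumption: the access token is kept in the HF_TOKEN environment
        # variable; .get() yields None when unset, matching the documented default.
        "hf_access_token": os.environ.get("HF_TOKEN"),
        "cache_dir": cache_dir,
    }


def load_example_tokenizer():
    """Call load_tokenizer with the arguments built by the helper above."""
    # Imported lazily so this sketch can be imported without xlstm_jax installed.
    from xlstm_jax.dataset.hf_tokenizer import load_tokenizer

    # "gpt2" is only an illustrative identifier; substitute your own tokenizer path.
    kwargs = build_load_tokenizer_kwargs("gpt2", add_bos=False, add_eos=True)
    return load_tokenizer(**kwargs)
```

Passing every argument by keyword keeps the two boolean flags, `add_bos` and `add_eos`, from being swapped by accident, since both are required and sit next to each other in the signature.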