When is the add_prefix_space option required, and why?
What is the purpose of add_prefix_space, and how can I tell which models require it?
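For context, a minimal sketch of what the option changes, using a byte-level BPE tokenizer such as GPT-2 (the model name here is just an example):

```python
from transformers import AutoTokenizer

# Byte-level BPE tokenizers (GPT-2, RoBERTa, ...) encode a leading space as
# part of the token, so "Hello" and " Hello" map to different ids.
default = AutoTokenizer.from_pretrained("gpt2")
prefixed = AutoTokenizer.from_pretrained("gpt2", add_prefix_space=True)

print(default.tokenize("Hello world"))   # ['Hello', 'Ġworld']
print(prefixed.tokenize("Hello world"))  # ['ĠHello', 'Ġworld']

# The fast tokenizers of this family also refuse is_split_into_words=True
# unless add_prefix_space=True, since each word then needs its own
# leading space to be encoded consistently.
enc = prefixed(["Hello", "world"], is_split_into_words=True)
```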
How to get a custom-trained BERT tokenizer not to split certain characters
I am training my own tokenizer based on bert-base-cased. The problem I have is that in my data (a dead language), there are tokens that begin with =, and this character should not be split off from the rest of the token. How do I achieve that?
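To make the problem concrete, here is a minimal sketch of this kind of setup, assuming the tokenizer is retrained with train_new_from_iterator (the corpus lines are invented placeholders, not the real data):

```python
from transformers import AutoTokenizer

# Retrain the bert-base-cased tokenizer on the new corpus.
old_tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
corpus = [
    "=ka example line one",
    "=ta example line two",
]
new_tokenizer = old_tokenizer.train_new_from_iterator([corpus], vocab_size=1000)

# The inherited BERT pre-tokenizer treats '=' as punctuation, so the
# leading '=' is always split off, which is the unwanted behaviour:
print(new_tokenizer.tokenize("=ka"))  # e.g. ['=', 'ka']
```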
Thanks for your help!