I have an unsupervised fastText model trained on my company data, which I use for feature extraction. It works better than non-fine-tuned transformer-based models, and it also vectorizes sentences much faster.
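For reference, here is roughly how I use it today (the corpus path and hyperparameters below are just placeholders for my real setup):

```python
import fasttext

# Train an unsupervised (skipgram) fastText model on raw company text.
# "company_corpus.txt" and the hyperparameters are placeholders.
model = fasttext.train_unsupervised(
    "company_corpus.txt",
    model="skipgram",
    dim=100,
    epoch=5,
)

# Feature extraction: one dense vector per sentence, computed from the
# averaged word/subword vectors.
embedding = model.get_sentence_vector("example sentence from my domain")
print(embedding.shape)  # (100,)
```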
But now that fastText is deprecated (https://github.com/facebookresearch/fastText), I'm wondering what good alternatives I could try. I'd like to leverage the advantages of self-attention, but I also don't want to deal with a huge pretrained model full of "knowledge" that I don't need.
Ideally something similar to the Universal Sentence Encoder architecture, whose training code is not open-sourced.
Any suggestions?