Relative Content

Tag Archive for pythonnltk

Adding special tokens to beginning and end of ngram function

I'm writing a function that takes in text and converts it into n-grams of a given order, n. So for bigrams n=2, for five-grams n=5, and so on. I'm trying to add special tokens at the beginning and end: I need to put n-1 special tokens at the beginning and 1 special token at the end.
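One way the padding described above could be sketched in plain Python is shown here. The function name `padded_ngrams`, the whitespace tokenization, and the token strings `<s>` and `</s>` are assumptions made for illustration; the question does not say which tokenizer or special tokens are in use.

```python
def padded_ngrams(text, n, start_token="<s>", end_token="</s>"):
    """Return the n-grams of `text` as tuples, padded with n-1 start tokens and 1 end token.

    Tokenization by whitespace and the default token strings are assumptions.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.split()
    # n-1 start tokens in front, exactly one end token at the back.
    padded = [start_token] * (n - 1) + tokens + [end_token]
    # Slide a window of width n across the padded token list.
    return [tuple(padded[i:i + n]) for i in range(len(padded) - n + 1)]


bigrams = padded_ngrams("the cat sat", 2)
trigrams = padded_ngrams("the cat sat", 3)
```

With this padding, every real token appears once as the last element of an n-gram with a full context of n-1 preceding tokens, and the end token closes the final n-gram.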

Making a python dictionary with a for loop for tokens and their model score

I'm trying to build a Python dictionary that maps each word in my file to its model score. My issue is that I can’t find a way to pass my loop variable, words, to the .score function without it literally giving me the score for the word “words”. The score function returns a probability score for the word passed in, but I need it to cycle through each word in the file and give me the score for each.
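The symptom described above usually means the loop variable was passed as a quoted string (`"words"`) rather than as a bare name. A minimal sketch of the loop follows. The `UnigramModel` class is a hypothetical stand-in so the example runs on its own; the only thing assumed about the real model is that it has a `.score(word)` method, as the question describes.

```python
from collections import Counter


class UnigramModel:
    """Hypothetical stand-in for a trained model that exposes .score(word)."""

    def __init__(self, tokens):
        self.counts = Counter(tokens)
        self.total = len(tokens)

    def score(self, word):
        # Relative frequency of the word in the training tokens.
        return self.counts[word] / self.total if self.total else 0.0


def score_words(model, words):
    """Map each distinct word to model.score(word)."""
    # `word` is passed without quotes, so the current value of the loop
    # variable is scored, not the literal string "word".
    return {word: model.score(word) for word in words}


tokens = "the cat sat on the mat".split()
model = UnigramModel(tokens)
scores = score_words(model, tokens)
```

Because dictionary keys are unique, a word that occurs several times in the file ends up with a single entry.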