Is it possible to parse multiple chunk types with a single nltk.RegexpParser?
Can a grammar define multiple chunk types like this?
import nltk

def parser(s):
    # Two chunk rules: noun phrases (NP) and verb + noun (VN)
    grammar = """
    NP: {<DT>?<JJ>*<NN>+}
    VN: {<VB.?><DT>?<NN>}
    """
    words = nltk.word_tokenize(s)
    tagged_words = nltk.pos_tag(words)
    cp = nltk.RegexpParser(grammar)
    chunked = cp.parse(tagged_words)
    print(chunked)
    # Only top-level children of the parse tree are inspected here
    for subtree in chunked:
        if isinstance(subtree, nltk.tree.Tree):
            if subtree.label() == 'NP':
                print(subtree.leaves(), 'Noun Phrase')
            if subtree.label() == 'VN':
                print(subtree.leaves(), 'a verbed noun')
The result does not include any VN chunks:
(S
  i/NNS
  need/VBP
  (NP a/DT project/NN manager/NN)
  who/WP
  knows/VBZ
  (NP c++/NN))
[('a', 'DT'), ('project', 'NN'), ('manager', 'NN')] Noun Phrase
[('c++', 'NN')] Noun Phrase
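A possible explanation, offered as a sketch rather than a definitive answer: RegexpParser applies the grammar's rules as stages, in the order they are written, and each stage sees the output of the previous one. Once the NP stage has wrapped `a/DT project/NN manager/NN` in an NP chunk, the VN stage no longer finds a bare `<DT>` or `<NN>` after the verb; it sees an `<NP>` node instead. Under that assumption, rewriting the VN rule to match `<VB.?><NP>` lets both chunk types appear. The tagged words below are copied from the output above so the example runs without tokenizer or tagger models; the helper name `find_chunks` and the rewritten VN rule are illustrative and not part of the original code.

```python
import nltk

# Tags copied from the output shown in the question, so no tokenizer or
# tagger models are needed to run this sketch.
tagged_words = [
    ('i', 'NNS'), ('need', 'VBP'), ('a', 'DT'), ('project', 'NN'),
    ('manager', 'NN'), ('who', 'WP'), ('knows', 'VBZ'), ('c++', 'NN'),
]

# Assumption: the VN rule is rewritten to match a verb followed by an
# already-built NP chunk instead of raw <DT>/<NN> tags.
grammar = r"""
NP: {<DT>?<JJ>*<NN>+}
VN: {<VB.?><NP>}
"""

def find_chunks(tagged, labels=('NP', 'VN')):
    """Return (label, words) pairs for every chunk with a wanted label."""
    tree = nltk.RegexpParser(grammar).parse(tagged)
    # subtrees() walks nested chunks too, so an NP inside a VN is reported.
    return [
        (sub.label(), [word for word, _tag in sub.leaves()])
        for sub in tree.subtrees(lambda t: t.label() in labels)
    ]

for label, words in find_chunks(tagged_words):
    print(label, words)
```

Because the NP chunks end up nested inside the VN chunks, a loop over only the top-level children would miss them, which is why the sketch uses `subtrees()`. Listing the VN rule before the NP rule is another option, but it changes which tokens each chunk claims.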