I want to use RAKE (via rake_nltk) to extract technical keywords from a job description that I found on LinkedIn, which looks like this:
input = """In-depth understanding of the Python software development stacks, ecosystems, frameworks and tools such as Numpy, Scipy, Pandas, Dask, spaCy, NLTK, sci-kit-learn and PyTorch.Experience with front-end development using HTML, CSS, and JavaScript.
Familiarity with database technologies such as SQL and NoSQL.Excellent problem-solving ability with solid communication and collaboration skills.
Preferred Skills And QualificationsExperience with popular Python frameworks such as Django, Flask or Pyramid."""
I ran the following code, expecting it to return the keywords:
from rake_nltk import Rake
r = Rake()
r.extract_keywords_from_text(input)
keywords = r.get_ranked_phrases_with_scores()
for score, keyword in keywords:
    if len(keyword.split()) == 1:  # Check if the keyword is one word
        print(f"{keyword}: {score}")
But the output is this:
frameworks: 2.0
tools: 1.0
sql: 1.0
spacy: 1.0
scipy: 1.0
sci: 1.0
qualificationsexperience: 1.0
pytorch: 1.0
pyramid: 1.0
pandas: 1.0
numpy: 1.0
nosql: 1.0
nltk: 1.0
learn: 1.0
kit: 1.0
javascript: 1.0
front: 1.0
flask: 1.0
familiarity: 1.0
experience: 1.0
ecosystems: 1.0
django: 1.0
dask: 1.0
css: 1.0
I only want the explicit names of the tools, skills and frameworks that are mentioned in the text, such as "Numpy", "Scipy", "HTML", etc., and NOT every single word found in it (such as "experience" or "tools").
Is there any way to do this? Or should I provide a list of all possible Python frameworks and related skills, and then filter the output of RAKE against it?
If the latter is the solution, how can I find or build a thorough list?
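For reference, here is a minimal sketch of the filtering approach I have in mind. The names KNOWN_SKILLS and filter_skills are made up for illustration, the allowlist is hand-written and far from complete, and the sample pairs are copied from my output above rather than produced by a fresh RAKE run:

```python
# Hypothetical, hand-written allowlist; a real one would need to be much longer.
KNOWN_SKILLS = {
    "numpy", "scipy", "pandas", "dask", "spacy", "nltk", "pytorch",
    "html", "css", "javascript", "sql", "nosql", "django", "flask", "pyramid",
}


def filter_skills(ranked_phrases, known_skills=KNOWN_SKILLS):
    """Keep only the (score, phrase) pairs whose phrase is in the allowlist."""
    return [
        (score, phrase)
        for score, phrase in ranked_phrases
        if phrase.lower() in known_skills
    ]


# A few (score, phrase) pairs in the same shape as the `keywords` list above.
sample = [
    (2.0, "frameworks"),
    (1.0, "tools"),
    (1.0, "sql"),
    (1.0, "spacy"),
    (1.0, "experience"),
    (1.0, "django"),
]

for score, phrase in filter_skills(sample):
    print(f"{phrase}: {score}")
```

In the real script I would pass `keywords` instead of `sample`. One thing I noticed: "html" does not appear in my one-word output above at all, so a filter like this would presumably also need to look inside the multi-word phrases.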
Any help is appreciated.