Context
I’m trying to improve the relevance scores of my search results. I have a set of candidate profiles, and I’m searching for the best candidate based on their skills and the role they hold in the industry.
I’ve created a rank profile and I’m using it to find the most relevant candidates. I’m combining lexical and semantic matching.
The challenge is that the relevance score Vespa generates is not very useful, so I want to fine-tune both the ranking and the resulting relevance score.
Any pointers on this would be really appreciated!
Here I want to:
a. Improve the relevance score for this profile.
b. Fix bm25(skills): its value is 0.0 in both matchfeatures and summaryfeatures, when the profile actually has both java and python.
Output:
{
  "root": {
    "id": "toplevel",
    "relevance": 1,
    "fields": {
      "totalCount": 143
    },
    "coverage": {
      "coverage": 100,
      "documents": 143,
      "full": true,
      "nodes": 1,
      "results": 1,
      "resultsFull": 1
    },
    "children": [
      {
        "id": "id:candidate_profile:candidate_profile::a866fa7f-7e13-48fe-bdca-5a60a3198fd9",
        "relevance": 0.01639344262295082,
        "source": "candidate_profile",
        "fields": {
          "matchfeatures": {
            "bm25(profile_summary)": 5.470910610067547,
            "bm25(skills)": 0,
            "firstPhase": 0.8789145673605757,
            "nativeRank(profile_summary)": 0.08308099301928237,
            "semantic": 0.8789145673605757
          },
          "skills": [
            "HTML",
            "CSS",
            "Java Script",
            "React Js",
            "Python",
            "Web Designing",
            "Leadership",
            "Teamwork",
            "Observation",
            "Time management",
            "Communication",
            "Avid fitness enthusiast",
            "Volunteering",
            "Sports",
            "English",
            "Hindi"
          ],
          "summaryfeatures": {
            "bm25(latest_industry)": 0,
            "bm25(latest_job_title)": 0,
            "bm25(latest_role)": 0,
            "bm25(profile_summary)": 5.470910610067547,
            "bm25(skills)": 0,
            "embedding_sum": 55.06214759836439,
            "latest_industry_sum": 40.86598728704121,
            "latest_role_sum": 0,
            "skill_sum": 52.88688380786334,
            "vespa.summaryFeatures.cached": 0
          }
        }
      }
    ]
  }
}
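A quick arithmetic check of the numbers in this output, as a small Python sketch. It assumes that the global-phase reciprocal_rank in the rank profile uses a constant k of 60 (which I understand to be Vespa's default; the profile does not set k), and that semantic is the cosine similarity of the same two vectors whose raw dot product is embedding_sum. Both are assumptions, not something the output itself states.

```python
# Values copied from matchfeatures / summaryfeatures in the output.
relevance = 0.01639344262295082
first_phase = 0.8789145673605757
semantic = 0.8789145673605757
embedding_sum = 55.06214759836439
bm25_scores = {
    "skills": 0.0,
    "latest_role": 0.0,
    "latest_industry": 0.0,
    "latest_job_title": 0.0,
}


def reciprocal_rank(rank: int, k: float = 60.0) -> float:
    """Reciprocal rank score 1 / (k + rank); k = 60 is an ASSUMED default."""
    return 1.0 / (k + rank)


# keyword_match in the rank profile is the sum of these four bm25 features.
keyword_match = sum(bm25_scores.values())

# first-phase is keyword_match + semantic, so the difference should be zero.
print("firstPhase - (keyword_match + semantic):", first_phase - (keyword_match + semantic))

# The hit's relevance should equal the reciprocal rank of position 1.
print("relevance matches 1/(60+1):", abs(relevance - reciprocal_rank(1)) < 1e-12)

# dot = |a| * |q| * cos, so dot / cos estimates the product of the vector norms
# (only meaningful under the cosine assumption stated above).
print("implied |a| * |q|:", embedding_sum / semantic)
```

If these assumptions hold, the relevance of about 0.0164 only says that this hit is ranked first by semantic, firstPhase is entirely the semantic score because every bm25 term in keyword_match is 0, and the ratio suggests the embeddings are not unit length.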
Query I’m running in Vespa:
"yql" : " select * from candidate_profile WHERE userQuery() or (all_role_title matches 'Software Developer') AND (skills matches 'python' OR skills matches 'java') AND (latest_role_title matches 'Senior Developer') or ({scoreThreshold:0.032 ,targetHits: 4}nearestNeighbor(embedding, e))",
"input.query(e)" : 'embed(e5, "query: Candidate who is working as Software Developer, Senior Developer has the following skills python, java.")',
"query": " Candidate who is working as Software Developer, Senior Developer has the following skills python, java.",
"ranking" : "common"
Rank profile I’ve created:
rank-profile common {
    weight skills : 500
    weight latest_role : 500
    weight latest_industry : 500
    weight latest_job_title : 400
    inputs {
        query(e) tensor<float>(x[384])
    }
    function semantic() {
        expression: max(0, cos(distance(field, embedding)))
    }
    function semantic_skills() {
        expression: max(0, cos(distance(field, skills_embedding)))
    }
    function semantic_latest_role() {
        expression: max(0, cos(distance(field, latest_role_embedding)))
    }
    function semantic_latest_job_title() {
        expression: max(0, cos(distance(field, latest_job_title_embedding)))
    }
    function semantic_latest_industry() {
        expression: max(0, cos(distance(field, latest_industry_embedding)))
    }
    function keyword_match(){
        expression: bm25(skills) + bm25(latest_role) + bm25(latest_industry) + bm25(latest_job_title)
    }
    first-phase {
        expression: sum(keyword_match + semantic)
    }
    rank-properties {
        fieldMatch(skills).occurrenceImportance: 0.5
        fieldMatch(skills).proximityCompletenessImportance: 0.9
        bm25(skills).k1: 1.5
        bm25(skills).b: 0.85
        fieldMatch(profile_summary).occurrenceImportance: 0.5
        fieldMatch(profile_summary).proximityCompletenessImportance: 0.9
        bm25(profile_summary).k1: 1.5
        bm25(profile_summary).b: 0.85
    }
    summary-features: embedding_sum skill_sum latest_role_sum latest_industry_sum bm25(profile_summary) bm25(skills) bm25(latest_role) bm25(latest_industry) bm25(latest_job_title)
    function embedding_score() {
        expression: attribute(embedding) * query(e)
    }
    function embedding_sum() {
        expression: sum(embedding_score)
    }
    function skill_score(){
        expression : attribute(skills_embedding) * query(e)
    }
    function skill_sum(){
        expression : sum(skill_score)
    }
    function latest_role_score(){
        expression : attribute(latest_role_embedding) * query(e)
    }
    function latest_role_sum(){
        expression : sum(latest_role_score)
    }
    function latest_industry_score(){
        expression : attribute(latest_industry_embedding) * query(e)
    }
    function latest_industry_sum(){
        expression : sum(latest_industry_score)
    }
    match-features {
        bm25(skills)
        bm25(profile_summary)
        nativeRank(profile_summary)
        semantic
        firstPhase
    }
    global-phase {
        expression {
            reciprocal_rank(semantic)
        }
    }
}
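To make the role of the k1 and b rank properties concrete, here is a rough Python sketch of textbook BM25. It is not Vespa's implementation: the lowercase whitespace tokenizer, N_DOCS_WITH_TERM and AVG_FIELD_LEN are made-up placeholders, and only k1, b, the document count and the skills array are taken from this post. The point it illustrates is that a term with zero counted occurrences contributes exactly 0, so a field-level score stays 0 whenever no query term is actually evaluated as a text match against that field, whatever the stored array contains.

```python
import math

K1, B = 1.5, 0.85        # bm25(skills).k1 and bm25(skills).b from the rank profile
N_DOCS = 143             # document count reported in the output
N_DOCS_WITH_TERM = 40    # HYPOTHETICAL: documents containing the term
AVG_FIELD_LEN = 20.0     # HYPOTHETICAL: average number of tokens in the field


def bm25_term(tf: int, field_len: int) -> float:
    """Textbook BM25 contribution of one query term to one field."""
    idf = math.log(1.0 + (N_DOCS - N_DOCS_WITH_TERM + 0.5) / (N_DOCS_WITH_TERM + 0.5))
    norm = tf + K1 * (1.0 - B + B * field_len / AVG_FIELD_LEN)
    return idf * tf * (K1 + 1.0) / norm


def bm25_field(field_tokens: list, matched_terms: list) -> float:
    """Sum the per-term contributions for the terms evaluated against the field."""
    return sum(bm25_term(field_tokens.count(t), len(field_tokens)) for t in matched_terms)


skills = [
    "HTML", "CSS", "Java Script", "React Js", "Python", "Web Designing",
    "Leadership", "Teamwork", "Observation", "Time management", "Communication",
    "Avid fitness enthusiast", "Volunteering", "Sports", "English", "Hindi",
]
# ASSUMED tokenization: lowercase and split on whitespace.
tokens = [tok.lower() for skill in skills for tok in skill.split()]

# Both terms occur as tokens here ("java" comes from "Java Script").
print("terms evaluated against skills:", bm25_field(tokens, ["python", "java"]))
# No term evaluated against the field: the sum is empty, so the score is 0.
print("no terms evaluated against skills:", bm25_field(tokens, []))
```

Under these assumptions a plain text match for python or java would give a positive score, which is why the 0 reported for bm25(skills) looks wrong to me.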