I'm using the Voyager API to get profile information from LinkedIn, and I also have another API called linkedin-api for scraping LinkedIn profiles. The problem is that while I get most of the data, I only get the first 5 experiences even when a profile has more than 5, and only the first 3 educations even when there are more. The response appears to be paginated, but I'm not able to access the remaining education and experience details.
This is a snippet of the JSON response showing the pagination:
    "positionGroupView": {
        "paging": {
            "start": 0,
            "count": 5,
            "total": 9,
            "links": []
        },
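To make the paging fields concrete, here is a small sketch of the arithmetic they seem to imply: with `start` 0, `count` 5 and `total` 9, one more page starting at offset 5 would be needed. The helper name `remaining_starts` is made up for illustration and is not part of either API.

```python
def remaining_starts(paging):
    """Return the start offsets of the pages that have not been fetched yet."""
    start = paging.get("start", 0)
    count = paging.get("count", 0)
    total = paging.get("total", 0)
    if count <= 0:
        return []  # no usable page size, so nothing can be computed
    return list(range(start + count, total, count))


paging = {"start": 0, "count": 5, "total": 9, "links": []}
print(remaining_starts(paging))  # [5]
```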
How can I access all of the education and experience details? Do I need to change the endpoints, or is it something else? I would appreciate any help. Many thanks!
Here are the relevant docs: scrape linkedin data
LinkedIn API endpoints:
    import requests
    import json


    def get_all_experiences(company_id):
        # Properly defined headers
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36",
            "csrf-token": "ajax:xxxxxxxxxxxxxxxxx"  # Placeholder for the CSRF token
        }
        company_link = f'https://www.linkedin.com/voyager/api/identity/profiles/{company_id}/profileView'
        all_experiences = []
        start = 0
        count = 5  # The number of items to retrieve per request

        with requests.Session() as s:
            # Ensure these tokens are valid
            s.cookies['li_at'] = "xxxxxxxxxxxxxxx"
            s.cookies['JSESSIONID'] = "ajax:xxxxxxxxxxx"
            # Update headers to include the csrf-token from JSESSIONID
            s.headers.update(headers)

            while True:
                # Add pagination parameters to the URL
                paginated_link = f"{company_link}?start={start}&count={count}"
                # Make the GET request
                response = s.get(paginated_link)

                # Handle potential errors
                if response.status_code == 200:
                    response_dict = response.json()
                    # Assuming experiences are found in response_dict['positionGroupView']['elements']
                    experiences = response_dict.get('positionGroupView', {}).get('elements', [])
                    # Append retrieved experiences to the all_experiences list
                    all_experiences.extend(experiences)

                    # Get pagination info
                    paging = response_dict.get('positionGroupView', {}).get('paging', {})
                    total = paging.get('total', 0)

                    # Break the loop if we've retrieved all experiences
                    if start + count >= total:
                        break
                    # Update the start index for the next iteration
                    start += count
                else:
                    print(f"Error: {response.status_code} - {response.text}")
                    return None

        return all_experiences


    # Example usage
    experiences = get_all_experiences('adrian-von-lewinski')
    print(json.dumps(experiences, indent=2))
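Since I cannot tell whether `profileView` honors `start` and `count` at all, here is the same loop as a minimal sketch with the HTTP call factored out into a callable, so the loop logic can be checked without hitting LinkedIn. `collect_all` and `fake_fetch` are hypothetical names, and the stub only imitates the `elements`/`paging` shape from the snippet above; it says nothing about how the real endpoint behaves.

```python
def collect_all(fetch_page, count=5):
    """Collect elements from a paginated source.

    fetch_page(start, count) is assumed to return a dict shaped like
    {"elements": [...], "paging": {"total": int}}.
    """
    items = []
    start = 0
    while True:
        page = fetch_page(start, count)
        elements = page.get("elements", [])
        items.extend(elements)
        total = page.get("paging", {}).get("total", 0)
        # Stop when everything is fetched, or when a page comes back empty.
        if not elements or start + count >= total:
            break
        start += count
    return items


# Stub standing in for the real request: 9 fake positions.
DATA = [f"position-{i}" for i in range(9)]


def fake_fetch(start, count):
    return {
        "elements": DATA[start:start + count],
        "paging": {"start": start, "count": count, "total": len(DATA)},
    }


print(len(collect_all(fake_fetch)))  # 9
```

With the stub, the loop collects all 9 items in two calls. If the real endpoint ignores the offset, a loop like this would presumably just receive the first page again, which would show up as duplicates rather than new positions.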