I need help configuring the proxy server properly when using page.request.get. Here is my setup:
I made sure the proxy is configured in the context, and I am able to execute page.goto(pdf_url) with no problem:
browser = playwright.chromium.launch(headless=False, slow_mo=1)
context = browser.new_context(
    proxy={"server": proxy_url},
    http_credentials={"username": proxy_user, "password": proxy_password},
)
page = context.new_page()
page.goto(pdf_url)
But to download the PDF from the page, the only method I have found is page.request.get(pdf_url). The response.body() from response = page.goto(pdf_url) contains only an HTML element, not the PDF itself. So I call:
response = page.request.get(pdf_url)
But this request cannot connect through the proxy server and fails with this error:
response = page.request.get(url)
*** playwright._impl._errors.Error: APIRequestContext.get: Socket is closed
Call log:
→ GET https://www.masslandrecords.com/MiddlesexSouth/ACSResource.axd?SCTTYPE=OPEN&URL=d:i2middlesexsouthtemp1huqe045o4ngnw3dbws1jivr8_16_2024_1_22_29_pm151063_10_7_2022.pdf&EXTINFO=&RESTYPE=PDF&ACT=PRINT
- user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36
- accept: */*
- accept-encoding: gzip,deflate,br
- ← 407 too many failures
- connection: close
- x-proxymesh-error: 407
- proxy-authenticate: Basic realm="us-ca.proxymesh.com"
It seems page.request.get() might need additional proxy authentication to work. Please advise.
I tried adding a header with Basic-encoded credentials and passing it to page.request.get():
import base64

# Base64-encode "user:password" for a Basic proxy auth header
auth = base64.b64encode(f"{proxy_user}:{proxy_password}".encode()).decode()
headers = {
    "Proxy-Authorization": f"Basic {auth}"
}
response = page.request.get(url, headers=headers)
But I got the same authorization required error.
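One variant that may be worth trying is sketched below. It is untested against this proxy. Playwright's proxy option accepts username and password fields alongside server, so the idea is to put the proxy credentials there instead of in http_credentials. The assumption is that page.request picks up the context's proxy settings, credentials included. The helper names build_proxy_settings and download_pdf, and the out_path default, are hypothetical and only for illustration.

```python
def build_proxy_settings(server, username, password):
    """Build the dict passed as Playwright's `proxy` option, credentials included."""
    return {"server": server, "username": username, "password": password}


def download_pdf(pdf_url, proxy_url, proxy_user, proxy_password, out_path="document.pdf"):
    """Hypothetical helper: open the page, then fetch the PDF bytes via page.request."""
    # Imported here so build_proxy_settings can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.launch(headless=False)
        # Assumption: proxy credentials belong in `proxy`, not `http_credentials`.
        context = browser.new_context(
            proxy=build_proxy_settings(proxy_url, proxy_user, proxy_password),
        )
        page = context.new_page()
        page.goto(pdf_url)
        # page.request shares cookies with the browser context.
        response = page.request.get(pdf_url)
        with open(out_path, "wb") as f:
            f.write(response.body())
        browser.close()
        return response.status
```

If the 407 persists with this setup, comparing the request headers that page.goto sends with those in the page.request.get call log might show what differs.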