I’m currently facing an issue with pycurl while trying to download a somewhat large file from a remote server. Despite trying different methods, I consistently encounter a “502 Bad Gateway” error. Here are the different snippets of code I have tried:
Writing directly to file:
if reset:
    c.setopt(c.FORBID_REUSE, 1)
else:
    c.setopt(c.FORBID_REUSE, 0)
c.setopt(c.URL, url + '?' + params_encoded)
c.setopt(c.HTTPHEADER, ['Authorization: Bearer ' + token_json["access_token"]])
c.setopt(c.CONNECTTIMEOUT, 10)
# Stream the response body straight to disk; the with-block closes the file.
with open(r'/home/USER/deleteme.json', 'wb') as f:
    c.setopt(pycurl.WRITEDATA, f)
    c.perform()
# HTTP status of the completed transfer.
status_code = c.getinfo(pycurl.RESPONSE_CODE)
return status_code, c
Writing to buffer:
from io import BytesIO

# Collect the whole response body in memory.
buffer = BytesIO()
if reset:
    c.setopt(c.FORBID_REUSE, 1)
else:
    c.setopt(c.FORBID_REUSE, 0)
c.setopt(c.URL, url + '?' + params_encoded)
c.setopt(c.HTTPHEADER, ['Authorization: Bearer ' + token_json["access_token"]])
c.setopt(c.CONNECTTIMEOUT, 10)
c.setopt(c.WRITEDATA, buffer)
c.perform()
status_code = c.getinfo(pycurl.RESPONSE_CODE)
response_body = buffer.getvalue().decode('utf-8')
return response_body, status_code, c
Using function to write to buffer in chunks:
from io import BytesIO

buffer = BytesIO()

def write_to_buffer(data):
    # libcurl calls this once for each piece of the body it receives.
    buffer.write(data)

if reset:
    c.setopt(c.FORBID_REUSE, 1)
else:
    c.setopt(c.FORBID_REUSE, 0)
c.setopt(c.URL, url + '?' + params_encoded)
c.setopt(c.HTTPHEADER, ['Authorization: Bearer ' + token_json["access_token"]])
c.setopt(c.CONNECTTIMEOUT, 10)
# Sends "Range: bytes=0-", i.e. the whole body from byte 0; this does not split the download into chunks.
c.setopt(pycurl.RANGE, '0-')
# Use the buffer writer function to receive the download.
c.setopt(pycurl.WRITEFUNCTION, write_to_buffer)
c.perform()
status_code = c.getinfo(pycurl.RESPONSE_CODE)
response_body = buffer.getvalue().decode('utf-8')
return response_body, status_code, c
This approach consistently results in:
{"fault":{"faultstring":"Body buffer overflow","detail":{"errorcode":"protocol.http.TooBigBody"}}}
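Since the error arrives as a JSON body rather than as a pycurl exception, a small check like the following could tell the fault apart from a real payload. This is only a sketch: the helper name extract_fault_code is hypothetical, and the fault/detail/errorcode layout is assumed from the single response shown above.

```python
import json


def extract_fault_code(response_body):
    """Return the gateway error code if the body is a fault document, else None."""
    try:
        payload = json.loads(response_body)
    except (TypeError, ValueError):
        # Not JSON at all (or not a string), so it cannot be a fault document.
        return None
    if not isinstance(payload, dict):
        return None
    fault = payload.get("fault")
    if not isinstance(fault, dict):
        return None
    detail = fault.get("detail")
    if not isinstance(detail, dict):
        return None
    return detail.get("errorcode")
```

Calling extract_fault_code(response_body) after c.perform() would return the error code string for the response above and None for an ordinary payload.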
Interestingly, when using Python’s requests library, the same download operation succeeds without encountering any errors:
requests.get(url, params=params, headers=headers, timeout=5)
The file downloads successfully using requests, suggesting that the issue might be specific to pycurl or its configuration. This is an odd use case where I can’t just switch to requests: requests behaves strangely with this server and sometimes hangs indefinitely, which I only managed to resolve by using pycurl instead. I believe server-side issues are largely what forces this workaround. However, something is clearly off on my side as well, given that requests works for this one case.
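One configuration difference that may be worth ruling out: requests sends an Accept-Encoding header (gzip, deflate) and follows redirects by default, whereas libcurl does neither unless told to. The sketch below is an untested guess at aligning the two, not a confirmed fix. The helper name apply_requests_like_defaults is hypothetical, and the idea that a compressed response would stay under the gateway's limit is purely an assumption.

```python
def apply_requests_like_defaults(c, access_token):
    """Set options on a pycurl.Curl handle that mimic some of requests' defaults.

    `c` is expected to be a pycurl.Curl instance; option constants are read
    from the handle itself, the same way the snippets above use c.URL.
    """
    # Ask for a compressed body and let libcurl decompress it transparently
    # (assumes libcurl was built with zlib support).
    c.setopt(c.ACCEPT_ENCODING, "gzip, deflate")
    c.setopt(c.HTTPHEADER, [
        "Authorization: Bearer " + access_token,
        "Accept: */*",
    ])
    # requests follows redirects by default; libcurl does not.
    c.setopt(c.FOLLOWLOCATION, 1)
    return c
```

It would be called as apply_requests_like_defaults(c, token_json["access_token"]) before c.perform(), in place of the existing HTTPHEADER line.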
Any suggestions on fixing this error with pycurl?