I’ve created a script that sends an HTTP POST request to this website, with a payload built from the fields shown in this image. The script should return a 200 status code, indicating that the request succeeded.
However, when I run the script, the second request always returns a 428 status code. How can I fix this? Here is the script:
import requests
url = 'https://wizzair.com/en-gb'
link = 'https://be.wizzair.com/24.9.0/Api/search/search'
payload = {"isFlightChange":False,"flightList":[{"departureStation":"TIA","arrivalStation":"VIE","departureDate":"2024-09-17"},{"departureStation":"VIE","arrivalStation":"TIA","departureDate":"2024-10-20"}],"adultCount":1,"childCount":0,"infantCount":0,"wdc":True}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br, zstd',
    'Accept-Language': 'en-US,en;q=0.9',
    'Host': 'wizzair.com',
    'Referer': 'https://wizzair.com/en-gb',
}
with requests.Session() as session:
    session.headers.update(headers)
    r = session.get(url)
    print(r.status_code)
    session.headers['Accept'] = 'application/json, text/plain, */*'
    session.headers['Referer'] = 'https://wizzair.com/en-gb/booking/select-flight/TIA/VIE/2024-09-17/2024-10-20/1/0/0/null'
    session.headers['Host'] = 'be.wizzair.com'
    session.headers['Origin'] = 'https://wizzair.com'
    resp = session.post(link, json=payload)
    print(resp.status_code)
This simply won’t work.
Why? Because Wizzair protects their sites with an anti-bot solution from kasada.io.
If you inspect the headers of the request (in a browser) you’ll notice the following:
access-control-allow-headers: x-kpsdk-ct, x-kpsdk-cd, x-kpsdk-h, x-kpsdk-fc, x-kpsdk-v, x-kpsdk-r
to name just a few.
There is also a separate header key, x-kpsdk-ct.
"The Kasada blocking errors are usually associated with some custom headers used by Kasada, such as the X-Kpsdk-Ct header" (quote source)
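As a rough diagnostic, you can check whether a failing response carries these markers. The sketch below is only an illustration: the helper names `kasada_markers` and `looks_like_kasada_block` are made up for this example, and treating "4xx status plus any x-kpsdk-* marker" as a sign of a Kasada challenge is an assumption, not documented behavior.

```python
KPSDK_PREFIX = "x-kpsdk-"


def kasada_markers(headers):
    """Return the sorted x-kpsdk-* names found in a header mapping.

    Looks at header names themselves and at the comma-separated values
    of the CORS allow/expose headers, where such names are listed.
    """
    found = set()
    for name, value in headers.items():
        name = name.lower()
        if name.startswith(KPSDK_PREFIX):
            found.add(name)
        if name in ("access-control-allow-headers",
                    "access-control-expose-headers"):
            for token in value.split(","):
                token = token.strip().lower()
                if token.startswith(KPSDK_PREFIX):
                    found.add(token)
    return sorted(found)


def looks_like_kasada_block(status_code, headers):
    # Assumption: a 4xx status combined with any x-kpsdk-* marker
    # suggests the request was stopped by the anti-bot layer.
    return 400 <= status_code < 500 and bool(kasada_markers(headers))
```

With requests, this could be called as `looks_like_kasada_block(resp.status_code, resp.headers)` right after the POST.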
To get past it, your request would have to match what a real browser sends almost exactly, keeping in mind that:
- your headers must be spot on
- you’d probably need to rotate your IP
- there are cookies to collect if you want to appear as a normal user
- you’d have to change your scraping behavior, because that gets analyzed too
- you might have to deal with JavaScript fingerprinting
So you can either go through trial and error and hope for the best, or use an automated browser.
Alternatively, you could use a service that offers web scraping APIs, but those are paid.
On a final note, what you’re trying to do is against their ToS.
The issue you’re facing occurs because you’re using an outdated API endpoint (https://be.wizzair.com/24.9.0/Api/search/search). Wizz Air updates their API version periodically, and using an incorrect version can lead to unexpected errors like the 428 status code you’re seeing.
Here’s how you can fix the problem:
- Fetch the current API version dynamically: Instead of hardcoding the API version, you can extract it from the initial response when accessing https://wizzair.com/en-gb. The API version is embedded within the HTML or JavaScript of the page, typically in a JavaScript variable.
- Set the correct headers: Ensure that you include all necessary headers, particularly the Content-Type header, which should be set to application/json;charset=UTF-8 when sending JSON data in a POST request.
- Avoid manually setting the Host header: The requests library handles the Host header automatically based on the URL you’re accessing. Manually setting it can lead to inconsistencies.
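The header advice above can be sketched as a small helper that builds the headers for the POST from the ones used for the initial GET. This is only an illustration: `build_post_headers` is a hypothetical name, and the header values are the ones already used in this thread. It assumes the base headers use the usual capitalization (for example `Accept`, not `accept`).

```python
def build_post_headers(base_headers, referer, origin="https://wizzair.com"):
    """Return a new header dict for the JSON POST request.

    Any Host entry is dropped so the HTTP library can derive it from
    the URL; the JSON-specific headers are then set explicitly.
    """
    headers = {
        name: value
        for name, value in base_headers.items()
        if name.lower() != "host"
    }
    headers.update({
        "Accept": "application/json, text/plain, */*",
        "Content-Type": "application/json;charset=UTF-8",
        "Origin": origin,
        "Referer": referer,
    })
    return headers
```

The result can be passed per request, for example `session.post(link, json=payload, headers=build_post_headers(headers, referer))`, which leaves the session-wide headers untouched.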
Here’s the corrected script:
import requests
import re
url = 'https://wizzair.com/en-gb'
payload = {
    "isFlightChange": False,
    "flightList": [
        {
            "departureStation": "TIA",
            "arrivalStation": "VIE",
            "departureDate": "2024-09-17"
        },
        {
            "departureStation": "VIE",
            "arrivalStation": "TIA",
            "departureDate": "2024-10-20"
        }
    ],
    "adultCount": 1,
    "childCount": 0,
    "infantCount": 0,
    "wdc": True
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9',
    'Referer': 'https://wizzair.com/en-gb',
}
with requests.Session() as session:
    session.headers.update(headers)
    r = session.get(url)
    print(r.status_code)
    # Extract the API version from the response
    match = re.search(r'apiUrl:\s*"https://be\.wizzair\.com/(.*?)/Api"', r.text)
    if match:
        api_version = match.group(1)
        print('API Version:', api_version)
        link = f'https://be.wizzair.com/{api_version}/Api/search/search'
    else:
        print('Cannot find API version')
        raise SystemExit(1)
    # Update headers for the POST request
    session.headers.update({
        'Accept': 'application/json, text/plain, */*',
        'Referer': 'https://wizzair.com/en-gb/booking/select-flight/TIA/VIE/2024-09-17/2024-10-20/1/0/0/null',
        'Origin': 'https://wizzair.com',
        'Content-Type': 'application/json;charset=UTF-8'
    })
    # Send the POST request
    resp = session.post(link, json=payload)
    print(resp.status_code)
    if resp.status_code == 200:
        # Process the response as needed
        print(resp.json())
    else:
        print('Error:', resp.text)
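The version-extraction step can be checked offline before involving the network. The sketch below isolates it in a function; `extract_api_version` is a hypothetical name, and the sample string is invented to match the `apiUrl: "..."` shape the script assumes, so the real page may embed the URL differently.

```python
import re

# Assumed shape: apiUrl: "https://be.wizzair.com/<version>/Api"
API_URL_PATTERN = re.compile(
    r'apiUrl:\s*"https://be\.wizzair\.com/([^/"]+)/Api"'
)


def extract_api_version(html):
    """Return the API version found in the page source, or None."""
    match = API_URL_PATTERN.search(html)
    return match.group(1) if match else None


# Invented sample input, used only to exercise the pattern.
sample = 'window.config = {apiUrl: "https://be.wizzair.com/24.9.0/Api"};'
print(extract_api_version(sample))
```

Returning None instead of exiting lets the caller decide how to react when the pattern is not found, for example by printing part of `r.text` to see what the server actually returned.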
The 428 HTTP status code indicates “Precondition Required,” which means that the server requires certain conditions to be met before it can process the request.
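If you want the script to print the standard reason phrase next to the code, Python’s standard library already knows it. A minimal sketch (the helper name `describe_status` is made up for this example):

```python
from http import HTTPStatus


def describe_status(code):
    """Return 'code phrase' for known HTTP status codes."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # Not a status code registered in the standard library.
        return f"{code} (unknown status)"
    return f"{status.value} {status.phrase}"
```

For example, `print(describe_status(resp.status_code))` after the POST shows the phrase alongside the number.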
- Ensure all required headers are included:
  session.headers['X-Requested-With'] = 'XMLHttpRequest'
  session.headers['Content-Type'] = 'application/json;charset=UTF-8'
- Check for cookies: Since you’re using a Session, it should handle cookies automatically. However, check whether the GET request sets any cookies that are relevant to the POST request:
  print(session.cookies.get_dict())
This may work:
import requests
url = 'https://wizzair.com/en-gb'
link = 'https://be.wizzair.com/24.9.0/Api/search/search'
payload = {
    "isFlightChange": False,
    "flightList": [
        {"departureStation": "TIA", "arrivalStation": "VIE", "departureDate": "2024-09-17"},
        {"departureStation": "VIE", "arrivalStation": "TIA", "departureDate": "2024-10-20"}
    ],
    "adultCount": 1,
    "childCount": 0,
    "infantCount": 0,
    "wdc": True
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
    'Accept-Encoding': 'gzip, deflate, br, zstd',
    'Accept-Language': 'en-US,en;q=0.9',
    'Host': 'wizzair.com',
    'Referer': 'https://wizzair.com/en-gb',
    'X-Requested-With': 'XMLHttpRequest',
    'Content-Type': 'application/json;charset=UTF-8',
    'Origin': 'https://wizzair.com',
}
with requests.Session() as session:
    session.headers.update(headers)
    # First, get the initial page to start a session and collect cookies
    r = session.get(url)
    print('GET status code:', r.status_code)
    print('Cookies after GET:', session.cookies.get_dict())
    # Update headers and send POST request
    session.headers['Accept'] = 'application/json, text/plain, */*'
    session.headers['Referer'] = 'https://wizzair.com/en-gb/booking/select-flight/TIA/VIE/2024-09-17/2024-10-20/1/0/0/null'
    session.headers['Host'] = 'be.wizzair.com'
    # Now, post to the search API
    resp = session.post(link, json=payload)
    print('POST status code:', resp.status_code)
    print('Response:', resp.text)