Suppose I am reading a CSV file that has a column, UnitPrice, containing decimal values.
In PySpark, the schema was mistakenly defined with
StructField("UnitPrice", IntegerType())
When the file is loaded with this schema, the DataFrame is populated with NULL values in the UnitPrice column instead of failing:
df = spark.read.format("csv").schema(ordersSchema).load(path)
I want an error to be thrown instead, since this column's data is not being loaded correctly from the CSV file.
NOTE: Some rows may legitimately have null values for UnitPrice, and those should still load without error.