Is there a way to get pandas to ignore missing columns in usecols when reading excel?
I know there is a similar solution for read_csv here
But can’t find a solution for Excel.
For example:
# contents of sample.xlsx:
#A,B,C
#1,2,3
#4,5,6
I’d like to return the following, using pandas.read_excel and the subset, but this raises an error:
pd.read_excel(sample.xlsx, usecols=“A:D”)
ParserError: Defining usecols without of bounds indices is not allowed. [3]
I think the solution is to use a callable, but not sure how to implement it.
1
Yeah. You can use callable function with usecols
parameter in pandas.read_excel. It can be used to filter the columns based on the available columns in Excel file.
import pandas as pd
def filter_usecols(cols):
desired_cols = {'A', 'B', 'C', 'D'}
return [col for col in cols if col in desired_cols]
# Read the Excel file using the callable for usecols
df = pd.read_excel('sample.xlsx', usecols=filter_usecols)
print(df)
1