I have to read a set of CSV files that have five columns: name, age, address, type, and distance. Since these column names are strings and I iterate over them in my pandas DataFrame, I created a class that stores each name in a variable, so that if a column name changes in these files I only need to update it in one place and it works everywhere.
class PersonDetailName:
    name = "name"
    age = "age"
    type = "type"
    address = "address"
    distance = "distance"
Now, during iteration, I can access a value directly with row[PersonDetailName.name] and the class handles the name automatically.
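For context, a minimal runnable version of this pattern looks like the following (the sample data here is made up purely for illustration):

```python
import pandas as pd

class PersonDetailName:
    name = "name"
    age = "age"
    type = "type"
    address = "address"
    distance = "distance"

# Stand-in for one of the CSV files.
df = pd.DataFrame({
    "name": ["Alice"],
    "age": [30],
    "type": ["t1"],
    "address": ["1 Main St"],
    "distance": [2.5],
})

for _, row in df.iterrows():
    # Column names come from the class, not from string literals.
    print(row[PersonDetailName.name], row[PersonDetailName.age])
```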
Some files still have these five fields, but instead of age they have personAge, and instead of address they have FullAddress. How can I handle this without changing much of the code? My approaches so far are:
Approach 1: Dynamic Column Name Mapping
class PaymentDetailName:
    age = "age"
    address = "address"

    # Tuples rather than sets: unpacking a set gives an arbitrary order.
    dynamic_col_name_mapping = {
        "type1": ("PersonAge", "PersonAddress"),
        "type2": ("PersonAge1", "PersonAddress1"),
        "default": ("age", "address"),
    }

    @classmethod
    def update_payment_name(cls, file_type):
        cls.age, cls.address = cls.dynamic_col_name_mapping[file_type]
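As a self-contained sketch, calling the classmethod once per file before iterating would work like this (the type keys are the placeholders from above; note that the mapping values are tuples so the unpack order is deterministic):

```python
class PaymentDetailName:
    age = "age"
    address = "address"

    dynamic_col_name_mapping = {
        "type1": ("PersonAge", "PersonAddress"),
        "type2": ("PersonAge1", "PersonAddress1"),
        "default": ("age", "address"),
    }

    @classmethod
    def update_payment_name(cls, file_type):
        # Rebinds the class attributes for the file currently being read.
        cls.age, cls.address = cls.dynamic_col_name_mapping[file_type]

PaymentDetailName.update_payment_name("type1")
print(PaymentDetailName.age, PaymentDetailName.address)   # PersonAge PersonAddress
PaymentDetailName.update_payment_name("default")
print(PaymentDetailName.age, PaymentDetailName.address)   # age address
```

One thing I noticed while trying this: it mutates class-level state, so it would not be safe if files of different types were processed concurrently.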
Approach 2: Updating the Column Names in the Iterator
age_col, address_col = PaymentDetailName.dynamic_col_name_mapping[file_type]
age = row[age_col]
row_type = row[PaymentDetailName.type]
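Put together, approach 2 inside the loop would look roughly like this (the DataFrame contents and the file_type value are stand-ins for whatever the real files contain):

```python
import pandas as pd

class PaymentDetailName:
    type = "type"
    dynamic_col_name_mapping = {
        "type1": ("PersonAge", "PersonAddress"),
        "default": ("age", "address"),
    }

# Stand-in for one CSV of the renamed variety.
df = pd.DataFrame({
    "PersonAge": [30],
    "PersonAddress": ["1 Main St"],
    "type": ["t1"],
})
file_type = "type1"  # assumed to be known per file

# Resolve the column names once, outside the loop.
age_col, address_col = PaymentDetailName.dynamic_col_name_mapping[file_type]
for _, row in df.iterrows():
    age = row[age_col]
    row_type = row[PaymentDetailName.type]
    print(age, row_type)
```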
I want to understand which of these solutions is better, and whether there is another method or programming concept I can use to make this more dynamic with less code. The PaymentDetailName class should remain; everything else I can change to fulfill this requirement.
Creating a base class with two subclasses for this seems like overkill to me.
Please share your insights on a better class design.