I’m a researcher analyzing acoustic and aerodynamic data from speech science research participants. I was recently introduced to OOP, and have been considering it as an alternative to my existing code if I wanted to refactor. If, based on my description, anyone has any viable alternatives to OOP, I’d be happy to hear those, as well.
The data for this project are nested. Research participants (first layer) perform different tasks (second layer) that consist of some number of exercise intervals (third layer), each containing multiple utterances (fourth layer):
-
Participant (Parent Class)
- Attributes (Vital Capacity, Age, Sex, etc.)
- Child Classes (Task)
-
Task (Child Class)
- Attributes (Task ID, Date of Data Collection, Number of Intervals per Task, etc.)
- Child Classes (Interval)
-
Interval (Child Class)
- Attributes (Number of Utterances per Interval, Interval Duration, Mean/SD of dependent variables for interval, etc.)
- Child Classes (Utterance)
-
Utterance (Child Class)
- Attributes (Utterance Length, % Vital Capacity Exhaled, Mean/SD of dependent variables for utterance, etc.)
Is this a valid use of OOP parent and child classes? How would I partition raw CSV data into these different classes, or recombine them once partitioned? Could I use methods to calculate things like means or standard deviations separately for each level (participant, task, interval, and utterance)?
I apologize if these are basic questions. I’m okay at Python for being self-taught, but still a novice.
I’ve previously used Python to store and iterate on these data in Pandas dataframes using conventional functions and variables, but bugs are common and it’s difficult to make sense of the code.