Transferring information between DataFrame rows

I have a DataFrame representing a genealogy tree, with the following columns: ("Generation", "Child_name", "child_hair_color", "Parent_name", "parent_hair_color", "parent_eye_color").
The oldest generation has the value "0" in the Generation column; I assume the youngest generation has the maximum value.
I'd like to transfer the information about black hair color from daughter to mother, from mother to grandmother, and so on. It is important to do this step by step, not by going through the whole DataFrame in a single pass.
The only condition is that the transfer must stop when parent_eye_color is "Hazel".
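
The step-by-step transfer described above can be sketched in plain Python on a list of row dictionaries. This is a minimal illustration of the logic, not a PySpark solution. It assumes several things that are not stated in the question: Generation is an integer describing the child on each row, names are unique, a parent appears as Child_name on a row one generation up, and a parent with "Hazel" eyes does not receive the color, which ends the chain. The function name propagate_black_hair and the sample rows are hypothetical.

```python
def propagate_black_hair(rows):
    """Move "Black" hair upward one generation per pass (illustrative sketch)."""
    rows = [dict(r) for r in rows]  # work on copies, leave the input untouched
    max_gen = max(r["Generation"] for r in rows)

    # Walk from the youngest generation (max value) towards generation 0.
    for gen in range(max_gen, 0, -1):
        # Parents who receive the color in this pass: their child on this
        # generation has black hair and the parent's eyes are not Hazel.
        receivers = {
            r["Parent_name"]
            for r in rows
            if r["Generation"] == gen
            and r["child_hair_color"] == "Black"
            and r["parent_eye_color"] != "Hazel"
            and r["Parent_name"] is not None
        }
        for r in rows:
            # Update the person wherever they appear as a parent ...
            if r["Parent_name"] in receivers:
                r["parent_hair_color"] = "Black"
            # ... and wherever they appear as a child (one generation up),
            # so the next pass can carry the color further.
            if r["Child_name"] in receivers:
                r["child_hair_color"] = "Black"
    return rows


# Hypothetical sample data: Ann -> Beth -> Cara -> Dana (oldest to youngest).
rows = [
    {"Generation": 0, "Child_name": "Ann", "child_hair_color": "Brown",
     "Parent_name": None, "parent_hair_color": None, "parent_eye_color": None},
    {"Generation": 1, "Child_name": "Beth", "child_hair_color": "Brown",
     "Parent_name": "Ann", "parent_hair_color": "Brown", "parent_eye_color": "Hazel"},
    {"Generation": 2, "Child_name": "Cara", "child_hair_color": "Brown",
     "Parent_name": "Beth", "parent_hair_color": "Brown", "parent_eye_color": "Blue"},
    {"Generation": 3, "Child_name": "Dana", "child_hair_color": "Black",
     "Parent_name": "Cara", "parent_hair_color": "Brown", "parent_eye_color": "Green"},
]

result = propagate_black_hair(rows)
hair_by_child = {r["Child_name"]: r["child_hair_color"] for r in result}
```

With these sample rows, Cara and Beth end up with black hair, while Ann keeps brown hair because her eye color is Hazel. In PySpark, each pass of the loop would roughly correspond to one self-join of the DataFrame (parent matched to the row where the same person is the child) followed by a conditional column update.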