I’m trying to parse a nested JSON using pandas and currently have below example structure.
input = {'id': ['uid1', 'uid2'],
'code': ['A:D;X:X;Y:E;R:Name;P:3;R:Name2;P:1;Z:3', 'B:1,B:2,R:SomeName,P:1,R:SomeName_Z,P:0,M:1,M:3'],
'values': [';;;;1.0;2.0;3.0;;;0.5;0.1;1.5;;;', ';;;1.5;1.1;1.4;;;4.1;3.2;3.3;;;;'],
'reference': [['Name', 'Name2'], ['SomeName', 'SomeName_T']]}
df = pd.DataFrame(data=input, index=[0, 1])
id code values reference
0 uid1 A:D;X:X;Y:E;R:Name;P:3;R:Name2;P:1;Z:3 ;;;;1.0;2.0;3.0;;;0.5;0.1;1.5;;; [Name, Name2]
1 uid2 B:1,B:2,R:SomeName,P:1,R:SomeName_Z,P:0,M:1,M:3 ;;;1.5;1.1;1.4;;;4.1;3.2;3.3;;;; [SomeName, SomeName_T]
I’m stuck at a point where I need to map each reference
to 3 values. The logic to find those 3 values is as follows:
- Items in code/values columns are separated by
;
. Each item incode
will have one value invalues
column. - If there is any
R:
in code column and the next item isP:
then thatP:
will have 4 values in thevalues
column, first 3 values will be decimals and the 4th one will be empty. We need to assign these 3 values to givenR:
and put them underfirst
,second
,third
columns.
Expected:
id code values reference first second third
0 uid1 A:D;X:X;Y:E;R:Name;P:3;R:Name2;P:1;Z:3 ;;;;1.0;2.0;3.0;;;0.5;0.1;1.5;;; Name 1.0 2.0 3.0
1 uid1 A:D;X:X;Y:E;R:Name;P:3;R:Name2;P:1;Z:3 ;;;;1.0;2.0;3.0;;;0.5;0.1;1.5;;; Name2 0.5 0.1 1.5
2 uid2 B:1,B:2,R:SomeName,P:1,R:SomeName_Z,P:0,M:1,M:3 ;;;1.5;1.1;1.4;;;4.1;3.2;3.3;;;; SomeName 1.5 1.1 1.4
3 uid2 B:1,B:2,R:SomeName,P:1,R:SomeName_Z,P:0,M:1,M:3 ;;;1.5;1.1;1.4;;;4.1;3.2;3.3;;;; SomeName_Z 4.1 3.2 3.3