I have a Pandas Series that has these unique values:
array(['17', '19', '21', '20', '22', '23', '12', '13', '15', '24', '25',
'18', '16', '14', '26', '11', '10', '12/16', '27', '10/14',
'16/22', '16/21', '13/17', '14/19', '11/15', '10/15', '15/21',
'13/19', '13/18', '32', '28', '12/15', '29', '42', '30', '31',
'34', '46', '11/14', '18/25', '19/26', '17/24', '19/24', '17/23',
'13/16', '11/16', '15/20', '36', '17/25', '19/25', '17/22',
'18/26', '39', '41', '35', '50', '9/13', '33', '10/13', '9/12',
'93/37', '14/20', '10/16', '14/18', '16/23', '37', '9/11', '37/94',
'20/54', '22/31', '22/30', '23/33', '44', '40', '50/95', '38',
'16/24', '15/23', '15/22', '18/23', '16/20', '37/98', '19/27',
'38/88', '23/31', '14/22', '45', '39/117', '28/76', '33/82',
'15/19', '23/30', '47', '46/115', '14/21', '17/18', '25/50',
'12/18', '12/17', '21/28', '20/27', '26/58', '22/67', '22/47',
'25/51', '35/83', '39/86', '31/72', '24/56', '30/80', '32/85',
'42/106', '40/99', '30/51', '21/43', '52', '56', '25/53', '34/83',
'30/71', '27/64', '35/111', '26/62', '32/84', '39/95', '18/24',
'22/29', '42/97', '48', '55', '58', '39/99', '49', '43', '40/103',
'22/46', '54/133', '25/54', '36/83', '29/72', '28/67', '35/109',
'25/62', '14/17', '42/110', '52/119', '20/60', '46/105', '25/56',
'27/65', '25/74', '21/49', '29/71', '26/59', '27/62'], dtype=object)
The ones that have the ‘/’, I want to split these into arrays and then average their values. One simpler but a flawed approach is to simply extract the first value:
master_data["Cmb MPG"].str.split('/').str[0].astype('int8')
However, what I truly require is the two values being averaged.
I have tried several commands and this one:
np.array(master_data["Cmb MPG"].str.split('/')).astype('int8').mean()
Should ideally do the job, but I get a ValueError followed by a TypeError:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'list'
The above exception was the direct cause of the following exception:
ValueError Traceback (most recent call last)
Cell In[88], line 1
----> 1 np.array(master_data["Cmb MPG"].str.split('/')).astype('int8')
ValueError: setting an array element with a sequence.
The slice()
method returns a Series but it won’t proceed either with the splitting of strings.
What is required is:
'18/25' ---> [18, 25] ---> 22 (rounded)