I’m a medical doctor, just starting Python, so excuse me if I misuse some terms.
We use an Oracle Userform for our Electronic Health System.
Blood Pressure data appears in a text field, and can contain up to 650 individual blood pressure measurements.
This is an example of 4 measurements:
= 01-02-2024 16:05 - A. I AHMED
128 / 84
= 01-02-2024 08:07 - G. I AL KHAIR
122 / 80
= 01-02-2024 00:48 - E. H AL MALKY
122 / 80
= 31-01-2024 00:48 - M. SHIBI
124 /
The main issue with this data is that it can’t be directly analysed, because it’s a string.
Analyses such as averages or terminal digit preference require the data to be cleaned first, with dates as datetime.date, times as datetime.time, and blood pressure values as int.
It seems that, before cleaning the data and storing each field in its own variable (date, time, nurse_name, systolic, diastolic), I can (or must?) parse and split this multiline text into separate lines.
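To make the goal concrete, here is a minimal sketch of the kind of analysis that becomes possible once the values are integers. It assumes the systolic values from the four example measurements have already been extracted by hand; the variable names are only illustrative.

```python
from collections import Counter
from statistics import mean

# Systolic values from the four example measurements, already converted to int.
systolic = [128, 122, 122, 124]

# Arithmetic mean of the readings.
average = mean(systolic)

# Terminal digit preference: count how often each last digit (0-9) occurs.
terminal_digits = Counter(value % 10 for value in systolic)

print(average)
print(terminal_digits)
```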
I tried this:
data = """
= 23-10-2023 01:43 - M. M AL FAKEEH
119 / 79
= 03-10-2023 23:49 - H. I ADAM
112 /
= 03-10-2023 23:49 - H. I ADAM
/ 70
= 03-10-2023 16:32 - S. M JEEZ
/ 76
= 03-10-2023 16:32 - S. M JEEZ
120 /
= 03-10-2023 08:54 - S. I YOUNIS
122 / 81
"""
# Split the data into lines
lines = data.split('\n')
raw_BP_data_odd_lines = []
raw_BP_data_even_lines = []
for i, line in enumerate(lines):
    if i % 2 == 0:
        raw_BP_data_even_lines.append(line)
    else:
        raw_BP_data_odd_lines.append(line)
# Split the data inside raw_BP_data_odd_lines and raw_BP_data_even_lines
odd_lines_split = [line.split() for line in raw_BP_data_odd_lines]
even_lines_split = [line.split() for line in raw_BP_data_even_lines]
# This splits the data in each line into separate words
print("Odd lines:")
#for line in odd_lines_split:
for line in raw_BP_data_odd_lines:
    print(line)  # Print every line in the variable "raw_BP_data_odd_lines"
print("Even lines:")
#for line in even_lines_split:
for line in raw_BP_data_even_lines:
    print(line)
The result contained an unexpected empty line after “Even lines:”:
Odd lines:
= 23-10-2023 01:43 - M. M AL FAKEEH
= 03-10-2023 23:49 - H. I ADAM
= 03-10-2023 23:49 - H. I ADAM
= 03-10-2023 16:32 - S. M JEEZ
= 03-10-2023 16:32 - S. M JEEZ
= 03-10-2023 08:54 - S. I YOUNIS
Even lines:
119 / 79
112 /
/ 70
/ 76
120 /
122 / 81
I’m guessing the issue lies in choosing the “\n” delimiter when splitting the lines.
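One possible explanation, sketched below, is that the empty line comes from the triple-quoted string itself rather than from the delimiter: the literal starts with a newline right after the opening quotes and ends with one before the closing quotes. The shortened sample and the names naive_lines, clean_lines, header_lines and bp_lines are assumptions for illustration only.

```python
data = """
= 23-10-2023 01:43 - M. M AL FAKEEH
119 / 79
= 03-10-2023 23:49 - H. I ADAM
112 /
"""

# A plain split keeps an empty string at each end, because the text
# begins and ends with a newline character.
naive_lines = data.split("\n")
print(naive_lines[0] == "", naive_lines[-1] == "")

# strip() removes the leading and trailing whitespace first; splitlines()
# then splits on line endings without leaving a trailing empty string.
clean_lines = data.strip().splitlines()

# With the empty first element gone, the "=" lines sit at even indexes.
header_lines = clean_lines[0::2]  # date, time and nurse name
bp_lines = clean_lines[1::2]      # "systolic / diastolic"

print(header_lines)
print(bp_lines)
```

If this is the cause, the same empty string would also shift every line by one position, which would explain why the "=" lines ended up in the odd list.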
So I decided to try pasting the blood pressure data from the clipboard into a variable, hoping it would solve the issue of the unexpected empty line. But the output was worse (nothing at all was printed):
import pyperclip
# Paste data from the clipboard into a variable
clipboard_data = pyperclip.paste()
# Check if clipboard data is empty
if not clipboard_data:
    print("No data found in the clipboard.")
else:
    # Split the data into lines
    lines = clipboard_data.split('\n')
I read, on this forum, about using os.linesep:
import os
import pyperclip
text = pyperclip.paste()
# paste() returns the clipboard contents as one big string
# Separate the text into lines
lines = text.split(os.linesep)
Is os.linesep my best option to avoid the unwanted empty line?
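For comparison, here is a small sketch of how os.linesep and str.splitlines() differ. It assumes the clipboard text uses Windows-style "\r\n" line endings, which is a guess; the sample string stands in for the real clipboard content so the example runs without pyperclip.

```python
import os

# Hypothetical clipboard content with Windows-style "\r\n" line endings.
text = "= 03-10-2023 08:54 - S. I YOUNIS\r\n122 / 81\r\n"

# os.linesep is "\r\n" on Windows and "\n" on Linux and macOS, so this
# result depends on where the script runs and on what the clipboard holds.
platform_dependent = text.split(os.linesep)

# str.splitlines() recognises "\n", "\r\n" and "\r" on every platform.
# The filter additionally drops lines that are empty or only whitespace.
lines = [line for line in text.splitlines() if line.strip()]

print(lines)
# ['= 03-10-2023 08:54 - S. I YOUNIS', '122 / 81']
```

Because splitlines() does not depend on the operating system, it may be the safer choice when the source of the text is not known in advance.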
A side question:
Am I on the right track for cleaning the blood pressure data and separating it into the 4 different data types it contains?
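As a rough sketch of where this could lead, the code below pairs each "=" line with the blood pressure line that follows it and converts the pieces to datetime.date, datetime.time, str and int. It assumes every record is exactly two lines in the layout shown in the samples; the Reading class, the parse_readings function and both regular expressions are hypothetical names introduced here, not part of any library.

```python
import re
from datetime import date, datetime, time
from typing import List, NamedTuple, Optional


class Reading(NamedTuple):
    day: date
    clock: time
    nurse_name: str
    systolic: Optional[int]   # None when the value is missing, as in "/ 70"
    diastolic: Optional[int]  # None when the value is missing, as in "112 /"


# Assumed layout: "= DD-MM-YYYY HH:MM - NAME" followed by "SYS / DIA".
HEADER = re.compile(r"^=\s*(\d{2}-\d{2}-\d{4})\s+(\d{2}:\d{2})\s*-\s*(.+)$")
PRESSURE = re.compile(r"^(\d*)\s*/\s*(\d*)$")


def parse_readings(text: str) -> List[Reading]:
    # Drop blank lines so that headers and pressures alternate cleanly.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    readings = []
    for header, pressure in zip(lines[0::2], lines[1::2]):
        h = HEADER.match(header)
        p = PRESSURE.match(pressure)
        if h is None or p is None:
            continue  # skip pairs that do not fit the assumed layout
        stamp = datetime.strptime(f"{h.group(1)} {h.group(2)}", "%d-%m-%Y %H:%M")
        readings.append(Reading(
            day=stamp.date(),
            clock=stamp.time(),
            nurse_name=h.group(3).strip(),
            systolic=int(p.group(1)) if p.group(1) else None,
            diastolic=int(p.group(2)) if p.group(2) else None,
        ))
    return readings


sample = (
    "= 23-10-2023 01:43 - M. M AL FAKEEH\n119 / 79\n"
    "= 03-10-2023 23:49 - H. I ADAM\n112 /\n"
)
readings = parse_readings(sample)
for reading in readings:
    print(reading)
```

The sample data also contains pairs such as "112 /" and "/ 70" that share a timestamp and nurse; whether to merge them into one measurement is a separate decision that this sketch does not make.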