I want my program to do this: Open a .csv file, the file consists of stock prices with 1 minute intervals. My problem is, if there at a speficic time, eg. 06:37:00, where no trades, there are no data in the .csv file for that given time.
So, what I want is to copy the data from the next timestamp with trades, so 06:38:00, and add a 06:37:00 timestamp, with the data from 06:38:00. In my .csv file, I deleted the data for the first 5 minutes, so the first timestamp is 04:06:00.
So what my program does, is that it adds the 04:00:00 timestamp, and adds the data from the 04:06:00 timestamp. Next it should add the 04:01:00 timestamp, again with the 04:06:00 timestamps data, but it adds the data from the 04:07:00 timestamp.
Here is my code:
import csv
import datetime
import itertools
import matplotlib.pyplot as plt
import numpy as np
import datetime as dt
csv_path = "D:/Stockmarket-data/frd_complete_sample/"
with open(csv_path + "SPY_1min_sample_mod.csv", "r") as source,
open(csv_path + "SPY_1min_sample_noGap.csv", "w", newline="") as target:
csv_reader = csv.reader(source)
csv_writer = csv.writer(target)
def write_header():
for row in csv_reader:
# Write the row to the target CSV file
csv_writer.writerow(row)
break
write_header()
def start_time():
#We want the start time to be 04:00:00, but if no trades were made at the given time, no data for this time point exists.
start_datetime = datetime.datetime(1970,1,1,4,0,0)
start_time = start_datetime.time()
csv_next_data = []
for row in csv_reader:
csv_datetime = row[0]
csv_start_date = csv_datetime[0:10]
csv_start_time = csv_datetime[11:19]
if not csv_next_data:
csv_next_data = row[1:]
if csv_start_time != str(start_time):
new_data_list = [csv_start_date + " " + str(start_time)]
for data in csv_next_data:
new_data_list.append(data)
time_change = datetime.timedelta(minutes=1)
start_datetime = start_datetime + time_change
start_time = start_datetime.time()
csv_next_data = row[1:]
csv_writer.writerow(new_data_list)
start_time()
And here is a sample of my data:
timestamp,open,high,low,close,volume
2024-06-24 04:06:00,545.3100,545.4000,545.2500,545.4000,1100
2024-06-24 04:07:00,545.4700,545.4700,545.3600,545.3600,600
2024-06-24 04:08:00,545.4600,545.4600,545.4600,545.4600,305
2024-06-24 04:09:00,545.5000,545.5600,545.4600,545.5400,1125
2024-06-24 04:10:00,545.5400,545.5500,545.5400,545.5500,468
2024-06-24 04:11:00,545.5500,545.5500,545.4500,545.4500,660
2024-06-24 04:12:00,545.5300,545.6200,545.5300,545.6200,314
2024-06-24 04:14:00,545.5300,545.5300,545.5300,545.5300,400
2024-06-24 04:16:00,545.5900,545.5900,545.5800,545.5800,434
2024-06-24 04:19:00,545.4700,545.4700,545.4700,545.4700,650
2024-06-24 04:20:00,545.4900,545.4900,545.4900,545.4900,500
2024-06-24 04:21:00,545.5000,545.5000,545.5000,545.5000,100
2024-06-24 04:24:00,545.5400,545.6000,545.5400,545.6000,635
2024-06-24 04:25:00,545.6100,545.6600,545.6100,545.6600,599
2024-06-24 04:29:00,545.7300,545.7300,545.7300,545.7300,100
2024-06-24 04:31:00,545.6000,545.6000,545.6000,545.6000,200
2024-06-24 04:33:00,545.5500,545.5500,545.5500,545.5500,400
2024-06-24 04:37:00,545.5200,545.5200,545.5200,545.5200,2000
2024-06-24 04:41:00,545.5700,545.5700,545.5700,545.5700,100
2024-06-24 04:44:00,545.5800,545.5800,545.5800,545.5800,162
2024-06-24 04:46:00,545.5700,545.5700,545.5700,545.5700,200
2024-06-24 04:53:00,545.3700,545.4000,545.3700,545.4000,250
2024-06-24 04:54:00,545.4100,545.4100,545.4100,545.4100,100
2024-06-24 04:59:00,545.3800,545.3800,545.3800,545.3800,990
2024-06-24 05:02:00,545.4900,545.4900,545.4900,545.4900,100
2024-06-24 05:03:00,545.4000,545.4000,545.4000,545.4000,184
2024-06-24 05:04:00,545.3600,545.3600,545.3600,545.3600,100
2024-06-24 05:08:00,545.4900,545.4900,545.4900,545.4900,150
2024-06-24 05:11:00,545.3100,545.3100,545.3100,545.3100,525
2024-06-24 05:15:00,545.2100,545.2100,545.2100,545.2100,1600
2024-06-24 05:17:00,545.2300,545.2300,545.1800,545.1800,475
2024-06-24 05:18:00,545.2200,545.2200,545.2200,545.2200,100
2024-06-24 05:20:00,545.2500,545.2500,545.2500,545.2500,500
2024-06-24 05:22:00,545.3300,545.3300,545.3300,545.3300,200
2024-06-24 05:24:00,545.3600,545.3600,545.3600,545.3600,198
2024-06-24 05:25:00,545.3200,545.3200,545.3200,545.3200,100
2024-06-24 05:29:00,545.2500,545.2500,545.2500,545.2500,501
2024-06-24 05:30:00,545.1000,545.1000,545.1000,545.1000,122
2024-06-24 05:31:00,545.1100,545.1100,545.1100,545.1100,200
2024-06-24 05:32:00,545.0900,545.0900,545.0900,545.0900,100
2024-06-24 05:33:00,545.0100,545.0100,544.9100,544.9100,900
2024-06-24 05:34:00,544.9800,544.9800,544.9800,544.9800,100
2024-06-24 05:36:00,544.9600,544.9600,544.9600,544.9600,331
2024-06-24 05:40:00,544.9700,544.9700,544.9700,544.9700,100
2024-06-24 05:45:00,544.8200,544.8200,544.8200,544.8200,100
2024-06-24 05:47:00,544.6900,544.6900,544.6900,544.6900,1000
2024-06-24 05:52:00,544.7000,544.7100,544.7000,544.7100,1500
2024-06-24 05:53:00,544.7200,544.7200,544.7200,544.7200,500
I tried making the program add lines in a .csv file, where the data from the next line is copied, until I met the timestamp of the next line, with 1 minute intervals.
Marco Poulsen is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.