I’ve built a model with 65 input neurons and 4880 output neurons. My training data is stored in two large text files: ‘X_train.txt’, where each line is a list of 65 numbers, and ‘Y_train.txt’, where each line is a single index. I need to one-hot encode ‘Y_train’, i.e. turn each index into a list of 4880 zeros with a single ‘1’ at that index.
Because these files are so large, I want to train my model in batches. How can I effectively train my model from these txt files in Python using TensorFlow?
I tried to load these files into variables with:
import numpy as np

x_train = []
with open(xPath, 'r') as file:
    line = file.readline()
    while line:
        # each line is a Python-style list of 65 numbers
        x_train.append(eval(line.strip()))
        line = file.readline()
x_train = np.array(x_train)
But since the files are so big, this takes a long time. The time is not the main problem, I can wait; the main problem is that it uses too much memory…
Try using these functions to parse each line, then map them over a TensorFlow dataset.
import tensorflow as tf

num_outputs = 4880  # size of the one-hot vector (number of output neurons)

def parse_x(line):
    # Convert a space-separated line to a vector of 65 floats
    # (if your lines are Python-style lists, strip the brackets and commas first)
    values = tf.strings.to_number(tf.strings.split(line, ' '), tf.float32)
    return values

def parse_y(line):
    # Convert the line to an integer index, then one-hot encode it
    index = tf.strings.to_number(line, tf.int32)
    one_hot = tf.one_hot(index, num_outputs)
    return one_hot
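To actually train in batches without loading everything into memory, you can stream both files with tf.data.TextLineDataset and apply the parse functions with map. A minimal sketch, assuming the model object and file names from the question (the batch size of 32 and epoch count are just placeholders):

import tensorflow as tf

batch_size = 32  # placeholder: tune to your memory budget

# Stream each file line by line; nothing is loaded up front
x_ds = tf.data.TextLineDataset('X_train.txt').map(parse_x)
y_ds = tf.data.TextLineDataset('Y_train.txt').map(parse_y)

# Pair inputs with labels, batch them, and prefetch for throughput
dataset = tf.data.Dataset.zip((x_ds, y_ds)).batch(batch_size)
dataset = dataset.prefetch(tf.data.AUTOTUNE)

# Keras consumes the dataset one batch at a time
model.fit(dataset, epochs=10)

Since model.fit pulls one batch at a time from the dataset, memory use stays roughly proportional to batch_size rather than to the file size.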