I'm writing my very first neural network model in Python. The training data has 14 features, 4 classes and 1953 instances in total, and I train with mini-batches of 4 instances. However, the training error keeps increasing with each epoch, even when I adjust the learning rate. I've included the methods that I believe might cause the issue below. This is the graph I get for 60 training epochs with a learning rate of 0.001 (changing the learning rate doesn't change the overall shape of the graph):
[graph of training error over 60 training epochs]
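To make the shapes concrete, here is what one mini-batch looks like in my setup (the one-hot encoding and the orientation of y shown here are just an illustration, not my exact preprocessing code):

import numpy as np

# One mini-batch: 4 instances, 14 features, 4 classes
batch_X = np.random.rand(4, 14)       # (batch_size, n_features)
batch_y = np.eye(4)[[0, 2, 1, 3]].T   # one-hot labels, oriented (n_classes, batch_size) to match y_hat
# forward_propagation transposes X, so activations[0] ends up as (14, 4),
# i.e. (n_features, batch_size), and y_hat should come out as (n_classes, batch_size)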
My forward propagation function:
def forward_propagation(self, X, y):
    # Initialize activations for the input layer
    self.activations[0] = X.T
    # Forward propagation through each hidden layer
    for i in range(self.n_layers):
        # Calculate weighted inputs and apply the activation function
        self.weighted_inputs[i+1] = np.dot(self.weights[i].T, self.activations[i]) + self.biases[i].T
        #print("n°", i, " weighted inputs = ", self.weighted_inputs[i+1].shape, " weights = ", self.weights[i].T.shape)
        self.activations[i+1], self.df[i] = self.activation_function(self.weighted_inputs[i+1])
    # Forward propagation to the output layer with softmax activation
    self.weighted_inputs[-1] = np.dot(self.weights[-2].T, self.activations[-2]) + self.biases[-2].T
    self.activations[-1], self.df[-1] = self.softmax(self.weighted_inputs[-1])
    # Compute the error using the cross-entropy cost function
    y_hat = self.activations[-1]
    error = self.cross_entropy_cost(y_hat, y)
    # test:
    #print("y_hat, error : ", y_hat, error)
    #print("y_hat shape = ", y_hat.shape)
    return y_hat, error
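Since they are referenced above but not shown, here is a minimal sketch of what I assume softmax, cross_entropy_cost and activation_function compute (these bodies are simplified stand-ins, not my exact code; rows are classes, columns are samples):

def softmax(self, z):
    # Numerically stable softmax over the class axis
    z_shifted = z - np.max(z, axis=0, keepdims=True)
    exp_z = np.exp(z_shifted)
    a = exp_z / np.sum(exp_z, axis=0, keepdims=True)
    return a, None  # no separate derivative; it is folded into the output-layer delta

def cross_entropy_cost(self, y_hat, y):
    # Mean cross-entropy over the mini-batch; y is assumed one-hot with the same orientation as y_hat
    eps = 1e-12
    return -np.sum(y * np.log(y_hat + eps)) / y.shape[1]

def activation_function(self, z):
    # Example for tanh: return both the activation and its derivative
    a = np.tanh(z)
    return a, 1.0 - a ** 2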
Backward propagation:
def backward_pass(self, X, y):
    # Initialize the lists for the errors and the weight/bias updates
    delta = [None] * (self.n_layers + 1)
    dW = [None] * (self.n_layers + 1)
    db = [None] * (self.n_layers + 1)
    # Forward pass
    y_hat, error = self.forward_propagation(X, y)
    # Error of the output layer
    delta[-1] = y_hat - error
    #print("output error = ", delta[-1])
    # Weight and bias updates for the output layer
    dW[-1] = np.dot(delta[-1], self.activations[-2].T)
    db[-1] = np.sum(delta[-1], axis=1, keepdims=True)
    # Back-propagate the error through the hidden layers
    for l in range(self.n_layers-1, -1, -1):
        # Error of the current layer
        delta[l] = np.multiply(np.dot(self.weights[l+1], delta[l+1]), self.df[l])
        # Weight & bias updates for the current layer
        if l == 0:
            dW[l] = np.dot(delta[l], X)  # first layer
        else:
            dW[l] = np.dot(delta[l], self.activations[l].T)
        db[l] = np.sum(delta[l], axis=1, keepdims=True)
        self.weights[l] -= (self.learning_rate * dW[l]).T
        self.biases[l] -= (self.learning_rate * db[l]).T
    return error
The function that runs one training epoch:
def epoch(self):
    total_error = 0
    self.batches = self.batch_generator(4)
    self.batches_X_train = [batch[0] for batch in self.batches]
    self.batches_y_train = [batch[1] for batch in self.batches]
    for batch_X, batch_y in zip(self.batches_X_train, self.batches_y_train):
        # Forward pass and back-propagation for each training mini-batch
        error = self.backward_pass(batch_X, batch_y)
        total_error += error
    # Average error over all training mini-batches for this epoch
    average_error = total_error / self.n_batches
    #print("average_error = ", average_error)
    return average_error
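batch_generator isn't shown either; a simplified sketch of what I assume it does is below (the self.X_train / self.y_train attribute names are placeholders for however the training data is actually stored; n_batches is the attribute used above):

def batch_generator(self, batch_size):
    # Shuffle the training set and split it into mini-batches of the given size
    n_samples = self.X_train.shape[0]        # placeholder attribute names
    indices = np.random.permutation(n_samples)
    batches = []
    for start in range(0, n_samples, batch_size):
        idx = indices[start:start + batch_size]
        batches.append((self.X_train[idx], self.y_train[idx]))
    self.n_batches = len(batches)
    return batches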
The fit function, which calls epoch n times and plots the training error:
def fit(self, n_epochs):
    self.n_epochs = n_epochs
    average_error_train = []
    for epoch in range(1, n_epochs + 1):
        print("epoch n°", epoch)
        # Calculate and store the training error for this epoch
        error_train = self.epoch()
        average_error_train.append(error_train)
        print(f"Epoch {epoch}/{n_epochs} - Train Error: {error_train}")
    # Plot the training error
    plt.plot(range(1, n_epochs + 1), average_error_train, label='Train Error')
    plt.xlabel('Epochs')
    plt.ylabel('Error')
    plt.title('Training Error')
    plt.legend()
    plt.show()
    print("total average error train : ", average_error_train)
    return average_error_train
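For completeness, this is roughly how I call it (the class name and constructor arguments here are only placeholders, the exact signature isn't shown):

model = NeuralNetwork(n_inputs=14, n_hidden=[16, 16], n_outputs=4, learning_rate=0.001)  # hypothetical constructor
errors = model.fit(60)  # 60 epochs, as in the graph above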
I've tested each method separately and the results were coherent. But when I call fit(n_epochs), and therefore use all of these methods together, the error only increases with each epoch. The softmax, cross_entropy_cost and activation functions (tanh or ReLU depending on input) return the correct values, and the data has been normalised before training. There is clearly something I've overlooked, but I just can't figure out what.