I apologize in advance for any poor English.
I believe most neural networks out there are static, i.e. their structure stays exactly as it was when first created. I have not heard of any truly dynamic neural networks so far. After much fiddling around, I came up with an idea for a dynamic neural network. My wish to implement it is held back by the fact that it is computationally expensive even for simple models. Please note that the device I have is not the most advanced one, so I believe others with capable hardware could try it. I am all ears for feedback on this idea.
NOTE: If anything below is unclear, I am sorry about that. I will include an example at the end which should make things clearer.
The concept:
I am heavily inspired by the biological neuron for this idea. I truly believe that the thing limiting our modern AI models is the fact that they are unable to change their own structure. Then why hasn’t anyone tried this? The reason will be clear to you by the end of this post.
The Neuron:
Each neuron will have certain values, namely a Creation Potential, a Destruction Potential, a Creation Threshold, a Destruction Threshold and Activation Thresholds. A bias or other values that add to the capabilities of the neuron can be added as needed by the creator. So what do these values mean? The Creation Potential describes how much a neuron wants to create a new connection. The Destruction Potential describes how much a neuron doesn’t want to create a new connection, or how much it wants to destroy its existing connections. The Creation and Destruction Thresholds describe the minimum requirements for the creation and destruction of connections respectively.
The Creation and Destruction Thresholds don’t have to be single values; each can be a set of values that describe the neuron, and once again it is up to the creator to use any values they desire.
Activation Thresholds will be explained in the example below.
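As a rough sketch of how these values might interact (the comparison rules below are my own assumption; the post leaves the exact math entirely to the creator):

// Hypothetical decision rules built from the values described above.
// The exact formulas are up to the creator; these are one simple choice.
struct NeuronValues {
    double creation_potential, destruction_potential;
    double creation_threshold, destruction_threshold;
};

// A neuron grows a new connection when its urge to create clears the bar.
bool wants_new_connection(const NeuronValues &n) {
    return n.creation_potential >= n.creation_threshold;
}

// A neuron prunes a connection when its urge to destroy clears the bar.
bool wants_to_prune(const NeuronValues &n) {
    return n.destruction_potential >= n.destruction_threshold;
}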
The Structure:
How should a neural network using the above be created? Here is a detailed explanation, which should cover the Activation Thresholds as well:
Neuron all_neurons[NEURON_COUNT];
We start with a lot of neurons. We first have to initialize each:
for (size_t i = 0; i < NEURON_COUNT; i++)
{
    all_neurons[i].init();
}
The init function should initialize the neuron with random values for everything, and the connections array in the neuron to 0 connections. What is the connections array?
std::vector<std::pair<double, Neuron*>> connections; // pointer, so a connection references the other neuron instead of copying it
What does this array do? It holds the Activation Threshold for each connection and a pointer to the neuron it activates. This is the best time to explain Activation Thresholds. Each time a neuron makes a connection with another neuron, it builds a path, and it can have many such paths. How does the neuron know which path to send a signal through? This is answered by the Activation Thresholds. Each path has its own unique Activation Threshold. A path is activated, i.e. the connected neuron is notified, only when its Activation Threshold is fulfilled. Again, it doesn’t have to be just a single number. Each new connection adds an entry to this vector and each destroyed connection removes one.
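As a minimal sketch of how propagation could work (the signature and the comparison rule are my assumptions, using the Neuron class given at the end of this post):

// One possible propagation rule: the signal travels down every path whose
// Activation Threshold it meets. Beware: a looped connection would make
// this recurse forever, which is exactly the loop problem discussed later.
void Neuron::activate(double signal) {
    for (auto &path : connections) {
        if (signal >= path.first)          // path.first: this path's Activation Threshold
            path.second->activate(signal); // path.second: the connected neuron
    }
}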
A full explanation:
We start with an array of neurons. We don’t yet have input neurons and output neurons; there are no connections, and all we have are neurons with random values. Now we randomly choose our input and output neurons. Then we create random connections here and there, while making sure that a neuron doesn’t create a connection with itself. How do we make the network useful? Start with the training data and feed it to the input neurons. The input neurons will create new connections, destroy some old ones, and eventually form a hidden layer on their own. The connections reach the output layer as well.
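A sketch of that bootstrap step (the function name is mine; it assumes the class from the end of the post, and rand() is used only for brevity):

#include <cstdlib>
#include <cstddef>

// Hypothetical bootstrap: wire up random pairs of distinct neurons so that
// a neuron never connects to itself. Assumes count > 1.
void create_random_connections(Neuron *neurons, size_t count, size_t how_many) {
    for (size_t i = 0; i < how_many; i++) {
        size_t from = rand() % count;
        size_t to = rand() % count;
        while (to == from)                 // re-roll to forbid self-connections
            to = rand() % count;
        neurons[from].add_connection(&neurons[to]);
    }
}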
Advantages:
- Dynamic model
- Build a complex network with just a few neurons
- Freedom to add interpretation to output values from chosen neurons
- Freedom to add new types of values to make the network more versatile
- More adaptive
- Closer to biological neurons
- And much more
- Continues to learn all the time
Disadvantages:
- Too much memory required
- Computationally expensive
- Complex
- Slow to learn
- Possibility of closed (looped) connections
The biggest hurdles for this model are memory consumption, computational cost and, hardest of all, the possibility of looped connections, which degrade the overall network. Because the network can continuously build and destroy connections, it keeps learning all the time. With large enough networks, we could have structures forming similar to a biological brain, where a given set of inputs triggers a group of neurons that specialize in doing one thing. This gives rise to local specialty in the model itself, where the model learns to divide itself into groups that perform different tasks. This should be easier to achieve with more parameters as part of the thresholds. The creator also has the freedom to add new values which may add to the model’s capabilities, such as a “Division Threshold” which allows a neuron to divide into two, potentially imitating cell division.
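The implementation at the end of the post declares check_for_connection_loops but doesn’t fix how it works. One possible approach is a depth-first search over the connection graph (a sketch; it assumes connections is accessible, e.g. to a friend function, and a separate “done” set would speed it up on large networks):

#include <set>

// One possible loop check: depth-first search that tracks the neurons on
// the current path. Revisiting a neuron already on the path means a cycle.
bool has_loop(Neuron *n, std::set<Neuron*> &on_path) {
    if (!on_path.insert(n).second)
        return true;                       // already on this path: cycle found
    for (auto &path : n->connections)      // assumes access to connections
        if (has_loop(path.second, on_path))
            return true;
    on_path.erase(n);                      // backtrack before returning
    return false;
}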
The creator has the freedom to choose which output neuron means what. Say 3 output neurons were chosen. If neuron 1 fired, a program behind the network may interpret it as “Use camera to take picture”, so the input could have been “Take a picture of me” or something equivalent. If neurons 1 and 3 fired, it could be interpreted as “Listen for socket connections”. The model would then require a port and an IP, which it won’t know. This adds room for the model to improve: the program may instruct the model to “listen” to the next input and provide it as an IP address and port. The model truly is dynamic, but extremely complex.
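As a toy illustration of that interpretation layer (the action strings are just the examples above; the mapping is entirely made up):

#include <bitset>
#include <cstdio>

// Toy interpretation layer for 3 output neurons. The network knows nothing
// about these meanings; they live entirely in the program behind it.
void interpret_outputs(const std::bitset<3> &fired) {
    if (fired[0] && fired[2])              // neurons 1 and 3 fired
        std::puts("Listen for socket connections"); // then ask for IP and port
    else if (fired[0])                     // only neuron 1 fired
        std::puts("Use camera to take picture");
    else
        std::puts("No mapped action");
}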
By starting with an array of neurons, we begin by creating our input and output layers. The very act of using the network will improve it and create a hidden layer. Take this example: you start with 10 neurons and pick 2 input neurons and 2 output neurons. The 2 input neurons can make a great many connections with other neurons. Say input neuron 1 made a connection with neuron 4. Does that mean input neuron 1 can no longer make further connections to neuron 4? No! For one Activation Threshold, neuron 4 may be activated by one value, so the signal could propagate through one path, while for another Activation Threshold, neuron 4 may be activated by a different value, which could propagate through a different path.
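In code, that is just two entries in the connections vector pointing at the same neuron, distinguished only by their Activation Thresholds (a sketch assuming input1 and neuron4 are the Neurons from the example and that the vector is directly accessible):

// Two distinct paths from input neuron 1 to neuron 4: a weaker signal meets
// only the first threshold, a stronger one meets both, so the same pair of
// neurons can carry different routes through the network.
input1.connections.push_back({0.3, &neuron4}); // met by weaker signals
input1.connections.push_back({0.8, &neuron4}); // met only by stronger signals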
How the different values are interpreted and used together mathematically depends on the creator again. One could add a new threshold, a “Death Threshold”. Upon reaching it, the neuron would kill itself (remove itself from the array and destroy all its connections). But how would this threshold be evaluated internally? It could be something like “(creation_threshold + connections[2].first) >= destruction_potential”.
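Inside the class that check might look like this (should_die is a name I made up; the size guard is needed because connections[2] only exists once there are at least three connections):

// Hypothetical self-destruction check using the expression above.
bool Neuron::should_die() const {
    return connections.size() > 2 &&
           (creation_threshold + connections[2].first) >= destruction_potential;
}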
There are a number of interesting things with this approach:
Since the values are randomly selected, a network where the majority of neurons have a high Creation Potential, a high Destruction Threshold, a low Creation Threshold and a low Destruction Potential will create a very complex yet slow network: connections are created readily and rarely destroyed, so the graph keeps growing. The ideal ratio is something that would need experimentation. And again, the creator has all the freedom they need to build this in whatever way they like. All I ask for is feedback on how this type of model performs.
A potential implementation in C++ (I don’t know why I chose this):
#include <vector>
#include <utility>
#include <cstddef>
class Neuron
{
    std::vector<std::pair<double, Neuron*>> connections; // (Activation Threshold, connected neuron)
    double creation_potential, destruction_potential,
           creation_threshold, destruction_threshold, other_val1, other_val2;
public:
    Neuron() { /* Randomly initialize the values */ }
    void check_for_connection_loops();        // detect looped connections (see sketch above)
    void add_connection(Neuron *connect_to);
    void destroy_connection(Neuron *destroy_from);
    void activate(/*Parameters*/);
};
static Neuron all_neurons[10];
void init(); // initialize all_neurons
void choose_io(size_t in_count, size_t out_count); // Randomly choose IO neurons
// Other functions...
int main()
{
    init();
    choose_io(3, 3);
    // Other things...
}