I am using Norse and PyTorch to train a Liquid State Machine (LSM) on the Spiking Heidelberg Digits (SHD) speech dataset. I have designed the LSM architecture, but I am having difficulty training the readout layer.
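For context, the dataloader used further below yields dense frame tensors of shape (batch, time, 1, channels) together with integer labels. A minimal sketch of such a pipeline, assuming tonic's SHD dataset and a fixed-bin ToFrame transform (the bin count and other settings here are illustrative, not my exact configuration):

import tonic
import torch

# Illustrative sketch: bin SHD events into dense frames of shape (time, 1, 700).
frame_transform = tonic.transforms.ToFrame(
    sensor_size = tonic.datasets.SHD.sensor_size,  # (700, 1, 1)
    n_time_bins = 100,                             # illustrative bin count
)
train_set = tonic.datasets.SHD(save_to = './data', train = True, transform = frame_transform)

def collate(batch):
    # Stack fixed-length frame arrays into a float tensor and collect integer labels.
    frames = torch.stack([torch.as_tensor(x, dtype = torch.float) for x, _ in batch])
    labels = torch.tensor([y for _, y in batch], dtype = torch.long)
    return frames, labels  # frames: (batch, time, 1, 700), labels: (batch,)

dataloader = torch.utils.data.DataLoader(train_set, batch_size = 32, shuffle = True, collate_fn = collate)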
Here is how I designed the LSM:
import numpy as np
import torch
from norse.torch import LIFParameters, LIFRecurrent

n_classes = 20      # SHD has 20 spoken-digit classes
n_channels = 700    # SHD input channels
n_hidden = 1000     # reservoir neurons
tau_mem = 2
tau_syn = 1
v_th = 0.6
Jin = 0.05

LIF_params = LIFParameters(
    tau_mem_inv = torch.tensor([1 / tau_mem]),
    tau_syn_inv = torch.tensor([1 / tau_syn]),
    v_th = torch.tensor(v_th),
)

# Dense random weights, then sparsified: ~2% recurrent and ~20% input connectivity.
w0 = 1 / np.sqrt(n_hidden)
w_rec = torch.tensor(np.random.uniform(-w0, w0, (n_hidden, n_hidden)), dtype = torch.float)
w_in = torch.tensor(np.random.uniform(0, Jin, (n_hidden, n_channels)), dtype = torch.float)
w_rec[np.random.rand(w_rec.shape[0], w_rec.shape[1]) < 0.98] = 0
w_in[np.random.rand(w_in.shape[0], w_in.shape[1]) < 0.8] = 0
print('Average Connection per Neuron:', np.mean(np.sum(np.abs(w_rec.detach().cpu().numpy()) > 0, 1)))

reservoir = LIFRecurrent(
    input_size = n_channels,
    hidden_size = n_hidden,
    p = LIF_params,
    recurrent_weights = w_rec,
    input_weights = w_in,
)

def simulate(events):
    # (batch, time, 1, channels) -> (time, batch, channels); Norse expects time-major input
    event = events.squeeze(2).permute(1, 0, 2)
    return reservoir(event)
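As a quick sanity check on shapes (a hypothetical dummy batch; I assume frames of shape (batch, time, 1, channels), matching what simulate expects):

# Hypothetical dummy batch: 4 samples, 100 time bins, 700 channels of random spikes.
dummy = (torch.rand(4, 100, 1, n_channels) < 0.05).float()
spikes, state = simulate(dummy)
print(spikes.shape)  # (time, batch, n_hidden), e.g. torch.Size([100, 4, 1000])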
Here is the activity of the reservoir over a single event:
To train the readout layer, I sum the spikes over the time steps and feed this sum into a linear layer (whose weights are the only trainable parameters), followed by a softmax activation. However, I suspect that summing the spikes over time might not be the correct approach. Could someone help me identify and fix the issue in the following code?
read_out = torch.nn.Sequential(
    torch.nn.Linear(n_hidden, n_classes),
    torch.nn.Softmax(dim = 1),
)
loss = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(read_out.parameters(), lr = 0.1)

for events, labels in dataloader:
    output = simulate(events)            # (spikes, state); spikes: (time, batch, n_hidden)
    output = read_out(output[0].sum(0))  # sum spikes over time, then apply the readout
    loss_value = loss(output, labels)
    optimizer.zero_grad()
    loss_value.backward()
    optimizer.step()
    print(loss_value.item())