The net config:
protected int[] cnnStrides = {1, 2};   // strides for each CNN layer
protected int[] cnnNeurons = {72, 36}; // number of units (nOut) for each CNN layer
protected int[] rnnNeurons = {64, 32}; // number of units for each RNN layer
int[] cnnKernelSizes = {3, 3};         // kernel sizes for each CNN layer
int[] cnnPaddings = {1, 1};            // paddings for each CNN layer
public MultiLayerConfiguration getNetConf() {
    DataType dataType = DataType.FLOAT;
    NeuralNetConfiguration.Builder nncBuilder = new NeuralNetConfiguration.Builder()
            .seed(System.currentTimeMillis())
            .weightInit(WeightInit.XAVIER)
            .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
            .updater(new Adam(lrSchedule))
            // .gradientNormalization(GradientNormalization.RenormalizeL2PerLayer)
            .dataType(dataType);
    nncBuilder.l1(l1);
    nncBuilder.l2(l2);
    NeuralNetConfiguration.ListBuilder listBuilder = nncBuilder.list();
    int nIn = featuresCount; // 36
    int layerIndex = 0;
    final int cnnLayerCount = cnnNeurons.length;
    // Add CNN layers
    for (int i = 0; i < cnnLayerCount; i++) {
        listBuilder.layer(layerIndex, new Convolution1D.Builder()
                .kernelSize(cnnKernelSizes[i])
                .stride(cnnStrides[i])
                .padding(cnnPaddings[i])
                .nIn(nIn)
                .nOut(cnnNeurons[i])
                .activation(Activation.RELU)
                .build());
        nIn = cnnNeurons[i];
        ++layerIndex;
    }
    // Add RNN layers
    for (int i = 0; i < this.rnnNeurons.length; ++i) {
        listBuilder.layer(layerIndex, new LSTM.Builder()
                .dropOut(dropOut)
                .activation(Activation.SOFTSIGN)
                .nIn(nIn)
                .nOut(rnnNeurons[i])
                .build());
        nIn = rnnNeurons[i];
        ++layerIndex;
    }
    listBuilder.layer(layerIndex, new RnnOutputLayer.Builder(new LossMSE())
            .updater(new Adam(outLrSchedule))
            .activation(Activation.IDENTITY)
            .nIn(nIn)
            .nOut(1)
            .build());
    return listBuilder.build();
}
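For reference, here is how the sequence length would evolve through the two Convolution1D layers under the standard convolution output-size formula, out = floor((in + 2*padding - kernel) / stride) + 1 (a minimal standalone sketch using the kernel/stride/padding values from the config above and the sequence length 30 from the error message; it does not call DL4J, and it assumes DL4J applies this usual arithmetic):

```java
public class ConvSeqLengthCheck {
    // Standard 1D convolution output-length arithmetic (floor division)
    static int convOutLength(int in, int kernel, int stride, int padding) {
        return (in + 2 * padding - kernel) / stride + 1;
    }

    public static void main(String[] args) {
        int[] kernels  = {3, 3}; // cnnKernelSizes from the config
        int[] strides  = {1, 2}; // cnnStrides from the config
        int[] paddings = {1, 1}; // cnnPaddings from the config
        int seqLen = 30;         // sequence length reported in the error
        for (int i = 0; i < kernels.length; i++) {
            seqLen = convOutLength(seqLen, kernels[i], strides[i], paddings[i]);
            System.out.println("after conv layer " + i + ": seqLen = " + seqLen);
        }
        // LSTM layers preserve sequence length, so this would also be the
        // sequence length arriving at the RnnOutputLayer.
    }
}
```

By this arithmetic the second layer's stride of 2 would shrink the sequence from 30 to 15, yet the error still reports a length of 30 on both sides, which adds to the confusion about the shapes.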
Exception in thread "main" java.lang.IllegalStateException: Sequence lengths do not match for RnnOutputLayer input and labels: Arrays should be rank 3 with shape [minibatch, size, sequenceLength] - mismatch on dimension 2 (sequence length) - input=[256, 32, 30] vs. label=[256, 1, 30]
at org.nd4j.common.base.Preconditions.throwStateEx(Preconditions.java:639)
at org.nd4j.common.base.Preconditions.checkState(Preconditions.java:337)
at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.backpropGradient(RnnOutputLayer.java:59)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.calcBackpropGradients(MultiLayerNetwork.java:1998)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2813)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.computeGradientAndScore(MultiLayerNetwork.java:2756)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:174)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:61)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fitHelper(MultiLayerNetwork.java:1767)
at org.deeplearning4j.nn.multilayer.MultiLayerNetwork.fit(MultiLayerNetwork.java:1688)
at com.cq.aifocusstocks.train.RnnPredictModel.train(RnnPredictModel.java:175)
at com.cq.aifocusstocks.train.CnnLstmRegPredictor.trainModel(CnnLstmRegPredictor.java:209)
at com.cq.aifocusstocks.train.TrainCnnLstmModel.main(TrainCnnLstmModel.java:15)
The exception message says the sequence lengths of the RnnOutputLayer input and the labels do not match, yet both are 30 (dimension 2). The actual mismatch is on dimension 1, the size (input=[256, 32, 30] vs. label=[256, 1, 30]): 32 is the output layer's nIn, while the labels have size 1, which matches its nOut. Shouldn't the labels be compared against the shape of the layer's output rather than its input? Why is the label shape being checked against the input shape?