This is my first model with DJL, using PyTorch as the underlying engine. I am trying to use StanfordQuestionAnsweringDataset for a question-answering task. Starting from the DJL example, I came up with the following code:
package scs.deeplearning.inference;

import ai.djl.Device;
import ai.djl.Model;
import ai.djl.basicdataset.nlp.StanfordQuestionAnsweringDataset;
import ai.djl.examples.training.util.Arguments; // helper from the DJL examples module
import ai.djl.ndarray.types.Shape;
import ai.djl.nn.Block;
import ai.djl.nn.Blocks;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.nn.norm.BatchNorm;
import ai.djl.nn.recurrent.LSTM;
import ai.djl.training.DefaultTrainingConfig;
import ai.djl.training.EasyTrain;
import ai.djl.training.Trainer;
import ai.djl.training.dataset.Dataset;
import ai.djl.training.evaluator.Accuracy;
import ai.djl.training.loss.Loss;
import ai.djl.training.util.ProgressBar;
import ai.djl.translate.SimplePaddingStackBatchifier;

public class LstmExample {

    public static void main(String[] args) throws Exception {
        int batchSize = 32;
        Arguments arguments = new Arguments().parseArgs(args);
        if (arguments == null) {
            return;
        }

        // Step 1: Prepare StanfordQuestionAnsweringDataset for training
        StanfordQuestionAnsweringDataset trainDataset =
                StanfordQuestionAnsweringDataset.builder()
                        .setSampling(batchSize, true)
                        .optUsage(Dataset.Usage.TRAIN)
                        .optDataBatchifier(new SimplePaddingStackBatchifier(0))
                        .optLabelBatchifier(new SimplePaddingStackBatchifier(0))
                        .build();
        trainDataset.prepare(new ProgressBar());

        // Prepare StanfordQuestionAnsweringDataset for testing
        StanfordQuestionAnsweringDataset testDataset =
                StanfordQuestionAnsweringDataset.builder()
                        .setSampling(batchSize, true)
                        .optUsage(Dataset.Usage.TEST)
                        .optDataBatchifier(new SimplePaddingStackBatchifier(0))
                        .optLabelBatchifier(new SimplePaddingStackBatchifier(0))
                        .build();
        testDataset.prepare(new ProgressBar());

        // Step 2: Create your model
        Model model = Model.newInstance("lstm", arguments.getEngine());
        model.setBlock(getLSTMModel());

        // Step 3: Create a trainer
        DefaultTrainingConfig config =
                new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss())
                        .addEvaluator(new Accuracy())
                        .optDevices(new Device[] {Device.cpu()});

        try (Trainer trainer = model.newTrainer(config)) {
            // Step 4: Initialize trainer with proper input shape
            trainer.initialize(new Shape(32, 5, 28, 28));

            // Step 5: Train your model
            EasyTrain.fit(trainer, 2, trainDataset, testDataset);

            // Step 6: Evaluate your model
            EasyTrain.evaluateDataset(trainer, testDataset);
        }
    }

    private static Block getLSTMModel() {
        SequentialBlock block = new SequentialBlock();
        // Collapse the (batch, channel, ...) input into (batch, time, channel) for the LSTM
        block.addSingleton(
                input -> {
                    Shape inputShape = input.getShape();
                    long batchSize = inputShape.get(0);
                    long channel = inputShape.get(1);
                    long time = inputShape.size() / (batchSize * channel);
                    return input.reshape(new Shape(batchSize, time, channel));
                });
        block.add(
                new LSTM.Builder()
                        .setStateSize(64)
                        .setNumLayers(1)
                        .optDropRate(0)
                        .optReturnState(false)
                        .build());
        block.add(BatchNorm.builder().optEpsilon(1e-5f).optMomentum(0.9f).build());
        block.add(Blocks.batchFlattenBlock());
        block.add(Linear.builder().setUnits(10).build());
        return block;
    }
}
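A quick note on the two batchifier lines: my understanding is that SimplePaddingStackBatchifier pads each array in a record up to the largest one in the batch before stacking, which is why I chose it for variable-length question/context sequences. Below is a minimal sketch of what I expect it to do in isolation (the arrays and pad value are made up for illustration):

import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;
import ai.djl.translate.Batchifier;
import ai.djl.translate.SimplePaddingStackBatchifier;

try (NDManager manager = NDManager.newBaseManager()) {
    Batchifier batchifier = new SimplePaddingStackBatchifier(0);
    // Two "records" whose single feature arrays have different lengths
    NDList shorter = new NDList(manager.ones(new Shape(3)));
    NDList longer = new NDList(manager.ones(new Shape(5)));
    NDList batch = batchifier.batchify(new NDList[] {shorter, longer});
    // Expecting shape (2, 5): the length-3 array padded with 0s to length 5, then stacked
    System.out.println(batch.head().getShape());
}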
It looks like the dataset needs to be set up differently. I am getting the error below. Please advise.
18:14:33,120 INFO ~ PyTorch graph executor optimizer is enabled, this may impact your inference latency and throughput. See: https://docs.djl.ai/docs/development/inference_performance_optimization.html#graph-executor-optimization
18:14:33,133 INFO ~ Number of inter-op threads is 4
18:14:33,133 INFO ~ Number of intra-op threads is 4
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 0 out of bounds for length 0
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:266)
at java.base/java.util.Objects.checkIndex(Objects.java:359)
at java.base/java.util.ArrayList.get(ArrayList.java:427)
at ai.djl.translate.PaddingStackBatchifier.findMaxSize(PaddingStackBatchifier.java:141)
at ai.djl.translate.SimplePaddingStackBatchifier.batchify(SimplePaddingStackBatchifier.java:53)
at ai.djl.training.dataset.DataIterable.fetch(DataIterable.java:180)
at ai.djl.training.dataset.DataIterable.next(DataIterable.java:145)
at ai.djl.training.dataset.DataIterable.next(DataIterable.java:43)
at ai.djl.training.EasyTrain.fit(EasyTrain.java:54)
at scs.deeplearning.inference.LstmExample.main(LstmExample.java:63)
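If I read the trace correctly, PaddingStackBatchifier.findMaxSize is indexing into an empty list, which would mean the per-record NDLists coming out of the dataset contain no arrays at all. A check along these lines (a sketch; get(manager, index) is the plain RandomAccessDataset accessor, bypassing the batchifier) should show whether that is the case:

import ai.djl.ndarray.NDManager;
import ai.djl.training.dataset.Record;

try (NDManager manager = NDManager.newBaseManager()) {
    // Look at the first record directly, after trainDataset.prepare(...) has run
    Record record = trainDataset.get(manager, 0);
    System.out.println("data arrays:  " + record.getData().size());   // 0 would explain the exception
    System.out.println("label arrays: " + record.getLabels().size());
}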