I am using PyTorch to train a neural network for bond classification, and the problem is that the input data has variable size. For example, bonds differ in the number of cash flows: a short-term bond may have only a few cash flows, while a long-term bond may have tens or hundreds. To handle this, I extend the input nodes to the maximum number of cash flows and pad the missing values with 0. Another problem is that much of the data is time series, and the dates matter for training: each cash flow happens on a certain date, but those dates vary from bond to bond. For the network's input nodes, I also extend the dates to the maximum number of cash flows and pad the missing values with 0. Suppose the maximum number of cash flows is 100; then the input size is 200 nodes, like this:
[date1,date2,date3…date99,date100][cash1,cash2,cash3…cash99,cash100]
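For reference, here is a minimal sketch of the padding scheme I described, assuming the dates have already been converted to numeric features (e.g. days from a valuation date); the function name `encode_bond` and the constant `MAX_FLOWS` are just illustrative:

```python
import torch

MAX_FLOWS = 100  # assumed maximum number of cash flows per bond

def encode_bond(dates, cashes, max_flows=MAX_FLOWS):
    """Pad a bond's cash-flow schedule to a fixed-size input vector.

    `dates` is a list of numeric date features and `cashes` the matching
    cash amounts; both are zero-padded to `max_flows` and concatenated,
    giving a vector of size 2 * max_flows (200 here).
    """
    n = len(dates)
    date_part = torch.zeros(max_flows)
    cash_part = torch.zeros(max_flows)
    date_part[:n] = torch.tensor(dates, dtype=torch.float32)
    cash_part[:n] = torch.tensor(cashes, dtype=torch.float32)
    return torch.cat([date_part, cash_part])  # shape: (2 * max_flows,)

# Example: a short-term bond with 3 cash flows
x = encode_bond([30.0, 120.0, 210.0], [5.0, 5.0, 105.0])
print(x.shape)  # torch.Size([200])
```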
Is this the correct way to handle variable-size input data? Is there a better solution to this problem? Any experience with or suggestions for handling these problems would be appreciated.