I built a simple feed-forward network, but it predicts almost the same value for every test sample. I have tuned many hyperparameters, and the predictions stay almost the same. I would appreciate any suggestions!
Predictions for test samples:
- 0.00526326
- 0.00526551
- 0.00526246
- 0.00526332
- 0.00526256
My data (Google Drive):
https://drive.google.com/drive/folders/1nC48KTEftjYPjYBu5h43cHQt3jbSwhQA
data shape:
X_trn.shape,X_vld.shape,X_tst.shape
((98051, 1), (59638, 1), (2911, 1))
y_trn.shape,y_vld.shape,y_tst.shape
((98051, 1), (59638, 1), (2911, 1))
To keep the model simple, it has just one hidden layer and one input variable.
- Training and validation loss plot: (image)
- Actual vs. predicted values for the test sample: (image)
- MSE and R2 of the training, validation, and test samples: (image)
Loss (MSE):
- training sample: around 0.75%
- validation sample: around 0.87%
- test sample: around 0.79%
R2 (calculated with my own function, R_oos, shown below):
- training sample: around 0.45%
- validation sample: around 0.55%
- test sample: around 2.8%
The test R2 is 2.8%, which is much higher than the R2 of the training and validation samples, even though the predictions for the test sample are all nearly the same.
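Part of the gap between the R2 figures may come from the definition itself: R_oos divides by the sum of squared targets (a zero benchmark), while sklearn's r2_score divides by the sum of squared deviations from the mean. A minimal sketch on synthetic data shows that a constant prediction can score above zero under the first definition while scoring zero under the second. The seed, sample size, mean, and scale are assumptions chosen for illustration, not values from my files, and the helper names are hypothetical:

```python
import numpy as np

def r2_zero_benchmark(actual, predicted):
    """1 - SSE / sum(actual^2): benchmark is 'always predict 0' (same idea as R_oos)."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    resid = actual - predicted
    return 1.0 - np.dot(resid, resid) / np.dot(actual, actual)

def r2_mean_benchmark(actual, predicted):
    """1 - SSE / sum((actual - mean)^2): the usual coefficient of determination."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    resid = actual - predicted
    centered = actual - actual.mean()
    return 1.0 - np.dot(resid, resid) / np.dot(centered, centered)

# Synthetic targets (assumed scale, NOT the real data)
rng = np.random.default_rng(0)
y = rng.normal(loc=0.005, scale=0.09, size=3000)

# A constant prediction equal to the sample mean
const_pred = np.full_like(y, y.mean())

r2_zero = r2_zero_benchmark(y, const_pred)
r2_mean = r2_mean_benchmark(y, const_pred)
print("zero-benchmark R2:", r2_zero)
print("mean-benchmark R2:", r2_mean)
```

For a constant prediction equal to the sample mean, the zero-benchmark value works out to n * mean^2 / sum(y^2), so it is positive whenever the mean is nonzero and grows when a sample's mean is large relative to its spread.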
import numpy as np
import pandas as pd
import tensorflow as tf  # needed by R_oos_tf
import matplotlib.pyplot as plt  # needed by the plots
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l1, l2, l1_l2
from sklearn.metrics import r2_score as r2_score1
X_trn = pd.read_pickle("X_trn.pkl")
X_vld = pd.read_pickle("X_vld.pkl")
X_tst = pd.read_pickle("X_tst.pkl")
y_trn = pd.read_pickle("y_trn.pkl")
y_vld = pd.read_pickle("y_vld.pkl")
y_tst = pd.read_pickle("y_tst.pkl")
features = list(set(X_trn.columns).difference({'permno','DATE','ret_exc_lead1m','ret','ret_exc'}))
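def R_oos_tf(y_true, y_pred):
    # Out-of-sample R2 against a zero benchmark (the denominator is not demeaned)
    resid = tf.square(y_true - y_pred)
    denom = tf.square(y_true)
    return 1 - tf.divide(tf.reduce_mean(resid), tf.reduce_mean(denom))

def R_oos(actual, predicted):
    # NumPy version of the same metric; both inputs are flattened so that
    # shapes (n, 1) and (n,) do not broadcast against each other
    actual, predicted = np.array(actual).flatten(), np.array(predicted).flatten()
    return 1 - (np.dot(actual - predicted, actual - predicted) / np.dot(actual, actual))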
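# Hyperparameters
learning = 0.0001
l1_rate = 0.01
l2_rate = np.nan  # no L2 penalty is applied; recorded only for bookkeeping
batch_size_num = 500
activation = 'relu'
epochs_num = 30
delta_num = 1.0  # Huber delta; recorded below but not used by the 'mse' loss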
# Define the model
mod = Sequential()
mod.add(Input(shape=(X_trn[features].shape[1],)))
mod.add(Dense(32, activation = activation, kernel_regularizer=l1(l1_rate)))
mod.add(Dense(1))
# Adam optimizer
opt = Adam(learning_rate=learning)
# Compile the model
mod.compile(loss='mse', optimizer=opt, metrics=[R_oos_tf])
# Fit the model
history = mod.fit(X_trn[features], np.array(y_trn).reshape((len(y_trn), 1)),
                  epochs=epochs_num, batch_size=batch_size_num,
                  verbose=1,
                  validation_data=(X_vld[features], np.array(y_vld).reshape((len(y_vld), 1))))
y_pred = mod.predict(X_tst[features]).flatten()
# Plot the loss
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train and Validation Loss')
plt.legend()
# Plot the metric
plt.subplot(1, 2, 2)
plt.plot(history.history['r_oos_tf'], label='Train R2')
plt.plot(history.history['val_r_oos_tf'], label='Validation R2')
plt.xlabel('Epoch')
plt.ylabel('R2')
plt.title('Train and Validation OOS_R2')
plt.legend()
plt.tight_layout()
plt.show()
pa = {}
pa["loss"] = mod.loss
pa["huber_delta"] = delta_num
pa["activation"] = activation
pa["learning"] = opt.learning_rate.numpy()
pa["n_layers"] = len(mod.layers)
pa["l1"] = l1_rate
pa["l2"] = l2_rate
pa["train_mse"] = history.history['loss'][-1]
pa["valid_mse"] = history.history['val_loss'][-1]
pa["tst_mse"] = float(np.mean((np.array(y_tst).flatten() - y_pred) ** 2))  # test MSE, computed inline
pa["train_r2"] = history.history['r_oos_tf'][-1]
pa["valid_r2"] = history.history['val_r_oos_tf'][-1]
pa["test_r2"] = R_oos(y_tst,y_pred)
pa["test_r2_1"] = r2_score1(y_tst,y_pred)
pa["epochs"] = epochs_num
df = pd.DataFrame([pa])
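As a sanity check on the metrics collected above, a nearly constant prediction can be compared against the simplest possible baseline, the training-set mean. The sketch below runs on synthetic arrays (assumed seed, sizes, and scale), and the helper `constant_baseline_report` is a hypothetical name, not part of my script:

```python
import numpy as np

def constant_baseline_report(y_train, y_true, y_pred):
    """Compare model predictions with a 'predict the training mean' baseline."""
    y_train = np.asarray(y_train, dtype=float).ravel()
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    baseline = y_train.mean()  # the constant the baseline always predicts
    return {
        "pred_std": float(y_pred.std()),      # spread of the predictions
        "target_std": float(y_true.std()),    # spread of the actual values
        "model_mse": float(np.mean((y_true - y_pred) ** 2)),
        "baseline_mse": float(np.mean((y_true - baseline) ** 2)),
    }

# Synthetic stand-ins for y_trn, y_tst and y_pred (NOT the real data)
rng = np.random.default_rng(1)
y_train_demo = rng.normal(0.005, 0.09, size=1000)
y_test_demo = rng.normal(0.005, 0.09, size=200)
flat_pred_demo = np.full(200, 0.00526)  # a flat prediction, like the ones listed above

report = constant_baseline_report(y_train_demo, y_test_demo, flat_pred_demo)
for name, value in report.items():
    print(name, value)
```

With the real data, the call would be `constant_baseline_report(y_trn, y_tst, y_pred)`. If `model_mse` and `baseline_mse` are close and `pred_std` is tiny next to `target_std`, the network is behaving like a constant predictor.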
Is there something wrong with my setup, or is a nearly constant prediction really the best this network can do?
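One possible explanation worth checking (an assumption, not something I have confirmed): if the L1 penalty shrinks the first layer's kernel to roughly zero, the hidden activations no longer depend on the input, and the output collapses to a single constant. The NumPy sketch below mimics the Dense(32, relu) -> Dense(1) forward pass with random, made-up weights to illustrate the effect; it does not use the trained model:

```python
import numpy as np

def forward(x, w1, b1, w2, b2):
    """Forward pass of Dense(32, relu) followed by Dense(1), in plain NumPy."""
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU hidden layer
    return (hidden @ w2 + b2).ravel()      # linear output layer

# Made-up weights and inputs for illustration only
rng = np.random.default_rng(2)
x_demo = rng.normal(size=(5, 1))   # 5 samples, 1 feature
b1 = rng.normal(size=32)
w2 = rng.normal(size=(32, 1))
b2 = np.array([0.005])

w1_zero = np.zeros((1, 32))            # kernel fully shrunk by the penalty
w1_random = rng.normal(size=(1, 32))   # kernel that still reacts to the input

pred_zero = forward(x_demo, w1_zero, b1, w2, b2)
pred_random = forward(x_demo, w1_random, b1, w2, b2)

spread_zero = pred_zero.max() - pred_zero.min()
spread_random = pred_random.max() - pred_random.min()
print("spread with zero kernel:", spread_zero)
print("spread with random kernel:", spread_random)
```

In the actual script, calling `get_weights()` on the first Dense layer returns its kernel and bias, so the kernel's magnitude can be inspected directly. Rerunning with a much smaller `l1_rate` would show whether the penalty is what flattens the predictions.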