I have readings from a single sensor. The sample data looks as follows:
<code>import pandas as pd

df = pd.DataFrame({
    'date': ['2010-01-01 00:00:00+00:00', '2010-01-01 01:00:00+00:00',
             '2010-01-01 02:00:00+00:00', '2010-01-01 03:00:00+00:00',
             '2010-01-01 04:00:00+00:00', '2010-01-01 05:00:00+00:00',
             '2010-01-01 06:00:00+00:00', '2010-01-01 07:00:00+00:00',
             '2010-01-01 08:00:00+00:00', '2010-01-01 09:00:00+00:00'],
    'sensor_861': [13.9285, 13.278501, 12.7285, 12.3285, 12.8285,
                   16.478498, 21.4285, 24.228498, 24.7785, 23.8785]
})
</code>
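As an aside, the 'date' column above holds plain strings; if time-based indexing or resampling is needed later, pd.to_datetime parses them into timezone-aware timestamps. A minimal sketch using the first two rows of the sample:

```python
import pandas as pd

# Minimal sketch: the ISO strings ending in '+00:00' parse to
# timezone-aware (UTC) timestamps.
df = pd.DataFrame({
    'date': ['2010-01-01 00:00:00+00:00', '2010-01-01 01:00:00+00:00'],
    'sensor_861': [13.9285, 13.278501],
})
df['date'] = pd.to_datetime(df['date'])
print(df['date'].dtype)  # datetime64[ns, UTC]
```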
<code>df.shape
(10, 2)
</code>
I am developing an LSTM autoencoder, using the following steps:
- First I prepare the data by reshaping it with the function create_dataset, which builds a NumPy ndarray suitable for training from df. I use a sequence length of 3 to create the samples:
<code>import numpy as np
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler = scaler.fit(df[['sensor_861']])
df['sensor_861'] = scaler.transform(df[['sensor_861']])

def create_dataset(X, sequence_length=3, cols=['date', 'sensor_861']):
    # Slide a window of length sequence_length over the series,
    # one window per starting row.
    Xs, idx = [], []
    for i in range(len(X) - sequence_length + 1):
        v = X[cols[1]].iloc[i:(i + sequence_length)].values
        v = v.reshape(sequence_length, 1)
        Xs.append(v)
        idx.append(X[cols[0]].iloc[i:(i + sequence_length)].reset_index(drop=True))
    return np.array(Xs), list(idx)

# Creating training data
sequence_length = 3
train_df, train_idx = create_dataset(df, sequence_length, ['date', 'sensor_861'])
</code>
Here is the shape of train_df:
<code>train_df.shape
(8, 3, 1)
</code>
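The window count follows from len(df) - sequence_length + 1 = 10 - 3 + 1 = 8, and consecutive windows overlap by two rows. A self-contained sketch of the same windowing (standardizing with plain pandas in place of StandardScaler, just to keep the sketch dependency-free):

```python
import numpy as np
import pandas as pd

dates = [f'2010-01-01 {h:02d}:00:00+00:00' for h in range(10)]
values = [13.9285, 13.278501, 12.7285, 12.3285, 12.8285,
          16.478498, 21.4285, 24.228498, 24.7785, 23.8785]
df = pd.DataFrame({'date': dates, 'sensor_861': values})

# Standardize (population std, matching StandardScaler's behaviour)
col = df['sensor_861']
df['sensor_861'] = (col - col.mean()) / col.std(ddof=0)

def create_dataset(X, sequence_length=3, cols=('date', 'sensor_861')):
    # One window of length sequence_length per starting row
    Xs, idx = [], []
    for i in range(len(X) - sequence_length + 1):
        v = X[cols[1]].iloc[i:i + sequence_length].values
        Xs.append(v.reshape(sequence_length, 1))
        idx.append(X[cols[0]].iloc[i:i + sequence_length].reset_index(drop=True))
    return np.array(Xs), list(idx)

train_X, train_idx = create_dataset(df)
print(train_X.shape)  # (8, 3, 1)
```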
- Next I define the Keras model for training as follows:
<code>from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, RepeatVector, TimeDistributed, Dense

model = Sequential()
model.add(LSTM(units=64, input_shape=(train_df.shape[1], train_df.shape[2])))
model.add(Dropout(rate=0.2))
model.add(RepeatVector(n=train_df.shape[1]))
model.add(LSTM(units=64, return_sequences=True))
model.add(Dropout(rate=0.2))
model.add(TimeDistributed(Dense(units=train_df.shape[2])))
model.compile(loss='mae', optimizer='adam')
</code>
The model summary confirms that the output from the last layer is a 3 x 1 vector (one value per timestep of the window).
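To make the RepeatVector step in the model concrete: the encoder LSTM compresses each (3, 1) window into a single 64-dimensional latent vector, and RepeatVector tiles that vector 3 times so the decoder LSTM receives a sequence again. A numpy sketch of the tiling, illustrative only, with a made-up 4-unit latent code standing in for the 64-unit encoder output:

```python
import numpy as np

# Hypothetical 4-unit latent vector standing in for the encoder output
latent = np.array([0.5, -1.2, 0.3, 2.0])

# RepeatVector(n=3) tiles the latent code along a new time axis:
# (units,) -> (3, units), giving the decoder a sequence to unroll.
repeated = np.repeat(latent[np.newaxis, :], 3, axis=0)
print(repeated.shape)  # (3, 4)
```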
- Next I train the model as an autoencoder, where the input and the target are the same:
<code>history = model.fit(train_df, train_df,epochs=10,batch_size=32,validation_split=0.1,shuffle=False)
</code>
- The model trains successfully. However, when I use the model for prediction I get an output of shape (3, 3, 1), i.e. three vectors of size 3 x 1 each. I was expecting an output of size (3, 1).
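For context on the shapes involved: a sequence-to-sequence autoencoder reconstructs each window at the same shape it came in, so predicting on a batch of k windows returns (k, 3, 1), and the leading axis is the batch size, not the timestep axis. A toy sketch of that shape contract, with an identity "model" standing in for the Keras one:

```python
import numpy as np

# Toy stand-in for model.predict: a seq2seq autoencoder returns one
# reconstruction per input window, same (timesteps, features) shape.
def fake_predict(batch):
    return np.asarray(batch, dtype=float)

windows = np.zeros((3, 3, 1))       # a batch of 3 windows
print(fake_predict(windows).shape)  # (3, 3, 1) -> leading 3 is batch size

single = fake_predict(windows[:1])  # one window still comes back batched
print(single.shape)                 # (1, 3, 1)
```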
Where am I going wrong?