I'm working through neural network exercises for an upcoming test, but I missed the last few classes, so I can't figure out how to solve these transformer problems. Asking the teacher didn't really work out either…
He gave us the following code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Simple Keras implementation of a transformer encoder block
class TransformerBlock(layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        self.att = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim)
        self.ffn = keras.Sequential(
            [layers.Dense(ff_dim, activation="relu"), layers.Dense(embed_dim),]
        )
        self.layernorm1 = layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = layers.Dropout(rate)
        self.dropout2 = layers.Dropout(rate)

    def call(self, inputs):
        # Self-attention: the sequence attends to itself
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output)
        # Residual connection followed by layer normalization
        out1 = self.layernorm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output)
        return self.layernorm2(out1 + ffn_output)
Use tf.keras.layers.Embedding layers:
- To create an initial embedding for a categorical variable.
- To create a positional embedding (to be added to another, already existing sequence embedding):
  - Define maxLen, the maximum length of your sequences. Let x be a batch of sequences of k-dimensional vectors (embeddings).
  - batch_size, l, k = x.shape
class positionEmb(layers.Layer):
    def __init__(self, max_len, output_dim):
        super().__init__()
        self.output_dim = output_dim
        self.posEmb = layers.Embedding(input_dim=max_len, output_dim=output_dim)

    def call(self, inputs):
        length = tf.shape(inputs)[1]
        batchSize = tf.shape(inputs)[0]
        positions = tf.range(length)
        position_embeddings = self.posEmb(positions)
        # Repeat the same position vectors for every sequence in the batch
        position_embeddings = tf.broadcast_to(position_embeddings,
                                              [batchSize, length, self.output_dim])
        return position_embeddings

positionEmbedding = positionEmb(1000, 3)
x = np.ones((2, 4, 3))
pos = positionEmbedding(x)
pos
Which gave the following output:
<tf.Tensor: shape=(2, 4, 3), dtype=float32, numpy=
array([[[ 0.0496983 , -0.01929103, 0.04586143],
[ 0.03327824, -0.04059364, 0.01518938],
[-0.02593035, -0.00701543, 0.03312309],
[-0.04562145, -0.02720568, 0.01570702]],
[[ 0.0496983 , -0.01929103, 0.04586143],
[ 0.03327824, -0.04059364, 0.01518938],
[-0.02593035, -0.00701543, 0.03312309],
[-0.04562145, -0.02720568, 0.01570702]]], dtype=float32)>
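If I understand the notes correctly, this positional embedding is then supposed to be added to the sequence embedding. Here is a minimal sketch of that step as I imagine it. It uses a plain Embedding layer instead of the teacher's positionEmb class so that it runs on its own, and the names pos_table and x_with_pos are my own:

```python
import tensorflow as tf

max_len, k = 1000, 3  # same sizes as in the teacher's example

# One trainable k-dimensional vector per position 0..max_len-1
pos_table = tf.keras.layers.Embedding(input_dim=max_len, output_dim=k)

x = tf.ones((2, 4, k))                   # batch of 2 sequences, 4 steps each
positions = tf.range(tf.shape(x)[1])     # [0, 1, 2, 3]
pos = pos_table(positions)               # shape (4, 3)

# Broadcasting adds the same position vectors to every sequence in the batch
x_with_pos = x + pos                     # shape (2, 4, 3)
print(x_with_pos.shape)
```

Because the addition broadcasts over the batch dimension, I suspect the explicit tf.broadcast_to in the class above is only there to make the output shape obvious.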
The first exercise is to use a transformer on the given dataset. I've already written some code to show what the data looks like:
import pandas as pd

dfwindTrain = pd.read_csv('wdTrain.csv')
dfwindVal = pd.read_csv('wdVal.csv')
dfwindTest = pd.read_csv('wdTest.csv')
print("Train")
print(dfwindTrain.head())
print("Columns:", dfwindTrain.columns)
print("Val")
print(dfwindVal.head())
print("Columns:", dfwindVal.columns)
print("Test")
print(dfwindTest.head())
print("Columns:", dfwindTest.columns)
Which gave me:
Train
p (mbar) T (degC) Tpot (K) Tdew (degC) rh (%) VPmax (mbar)
0 996.50 -8.05 265.38 -8.78 94.4 3.33
1 996.62 -8.88 264.54 -9.77 93.2 3.12
2 996.84 -8.81 264.59 -9.66 93.5 3.13
3 996.99 -9.05 264.34 -10.02 92.6 3.07
4 997.46 -9.63 263.72 -10.65 92.2 2.94
VPact (mbar) VPdef (mbar) sh (g/kg) H2OC (mmol/mol) rho (g/m**3)
0 3.14 0.19 1.96 3.15 1307.86
1 2.90 0.21 1.81 2.91 1312.25
2 2.93 0.20 1.83 2.94 1312.18
3 2.85 0.23 1.78 2.85 1313.61
4 2.71 0.23 1.69 2.71 1317.19
wv (m/s) max. wv (m/s) wind direction
0 0.21 0.63 S
1 0.25 0.63 S
2 0.18 0.63 S
3 0.10 0.38 E
4 0.40 0.88 S
Columns: Index(['p (mbar)', 'T (degC)', 'Tpot (K)', 'Tdew (degC)', 'rh (%)',
'VPmax (mbar)', 'VPact (mbar)', 'VPdef (mbar)', 'sh (g/kg)',
'H2OC (mmol/mol)', 'rho (g/m**3)', 'wv (m/s)', 'max. wv (m/s)',
'wind direction'],
dtype='object')
Val
p (mbar) T (degC) Tpot (K) Tdew (degC) rh (%) VPmax (mbar)
0 994.77 6.07 279.64 2.92 80.20 9.40
1 994.54 9.88 283.48 4.18 67.58 12.20
2 994.35 13.53 287.15 5.79 59.42 15.53
3 994.04 15.43 289.08 6.84 56.50 17.56
4 993.88 15.54 289.21 7.58 59.02 17.68
VPact (mbar) VPdef (mbar) sh (g/kg) H2OC (mmol/mol) rho (g/m**3)
0 7.54 1.86 4.73 7.58 1237.52
1 8.24 3.95 5.17 8.29 1220.23
2 9.23 6.30 5.79 9.28 1204.01
3 9.92 7.64 6.23 9.98 1195.39
4 10.44 7.25 6.56 10.50 1194.50
wv (m/s) max. wv (m/s) wind direction
0 0.74 1.68 O
1 0.58 1.12 O
2 0.70 1.46 S
3 0.92 1.84 S
4 0.30 0.56 E
Columns: Index(['p (mbar)', 'T (degC)', 'Tpot (K)', 'Tdew (degC)', 'rh (%)',
'VPmax (mbar)', 'VPact (mbar)', 'VPdef (mbar)', 'sh (g/kg)',
'H2OC (mmol/mol)', 'rho (g/m**3)', 'wv (m/s)', 'max. wv (m/s)',
'wind direction'],
dtype='object')
Test
p (mbar) T (degC) Tpot (K) Tdew (degC) rh (%) VPmax (mbar)
0 980.12 18.27 293.11 13.23 72.4 21.03
1 980.66 17.85 292.64 13.14 73.9 20.48
2 981.13 16.65 291.39 13.00 79.0 18.98
3 981.43 15.85 290.56 12.92 82.7 18.04
4 981.71 15.09 289.78 12.86 86.5 17.18
VPact (mbar) VPdef (mbar) sh (g/kg) H2OC (mmol/mol) rho (g/m**3)
0 15.22 5.80 9.72 15.53 1164.71
1 15.13 5.34 9.65 15.43 1167.07
2 14.99 3.99 9.56 15.28 1172.53
3 14.92 3.12 9.51 15.20 1176.18
4 14.86 2.32 9.47 15.14 1179.64
wv (m/s) max. wv (m/s) wind direction
0 2.57 3.64 S
1 2.47 3.62 S
2 1.72 2.64 S
3 1.40 2.04 S
4 1.67 2.38 E
Columns: Index(['p (mbar)', 'T (degC)', 'Tpot (K)', 'Tdew (degC)', 'rh (%)',
'VPmax (mbar)', 'VPact (mbar)', 'VPdef (mbar)', 'sh (g/kg)',
'H2OC (mmol/mol)', 'rho (g/m**3)', 'wv (m/s)', 'max. wv (m/s)',
'wind direction'],
dtype='object')
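The wind direction column is the only categorical one, so I guess that is where the "initial embedding for a categorical variable" from the notes comes in. A rough sketch of what I think that means is below. The vocabulary ["N", "E", "S", "O"] is my assumption based on the values visible above (the real files may contain other directions), and the embedding size of 2 is arbitrary:

```python
import tensorflow as tf

# A few example values like the ones in the 'wind direction' column
directions = tf.constant(["S", "S", "E", "O", "N"])

# Assumed vocabulary; index 0 is reserved for out-of-vocabulary strings
lookup = tf.keras.layers.StringLookup(vocabulary=["N", "E", "S", "O"])

# One trainable 2-dimensional vector per integer id
emb = tf.keras.layers.Embedding(input_dim=lookup.vocabulary_size(), output_dim=2)

ids = lookup(directions)   # strings -> integer ids
vectors = emb(ids)         # integer ids -> vectors, shape (5, 2)
print(ids.numpy(), vectors.shape)
```

The resulting vectors could then presumably be concatenated with the numeric columns before they go into the transformer.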
The second exercise is similar, but it asks for a transformer that predicts the temperature for each of the next 6 hours in the same dataset.
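My best guess at the overall shape of a solution is sketched below, so it is clearer what I am stuck on. It rests on several assumptions of mine: one row per hour, a 24-step input window, random numbers standing in for the 13 numeric columns, temperature as column 1, and built-in Keras layers instead of the teacher's TransformerBlock so the snippet runs on its own. make_windows and AddPosition are names I made up:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def make_windows(features, target, window=24, horizon=6):
    """Cut a series into (past window, next `horizon` target values) pairs."""
    X, y = [], []
    for start in range(len(features) - window - horizon + 1):
        X.append(features[start:start + window])
        y.append(target[start + window:start + window + horizon])
    return np.stack(X), np.stack(y)

class AddPosition(layers.Layer):
    """Adds a learned positional embedding to a sequence embedding."""
    def __init__(self, max_len, dim):
        super().__init__()
        self.pos_emb = layers.Embedding(input_dim=max_len, output_dim=dim)

    def call(self, x):
        positions = tf.range(tf.shape(x)[1])
        return x + self.pos_emb(positions)

# Random stand-in data: 100 time steps, 13 numeric features (assumption)
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 13)).astype("float32")
target = features[:, 1]                  # pretend column 1 is 'T (degC)'
X, y = make_windows(features, target)    # X: (71, 24, 13), y: (71, 6)

window, n_features, horizon, embed_dim = 24, 13, 6, 32
inputs = keras.Input(shape=(window, n_features))
h = layers.Dense(embed_dim)(inputs)                # project features to embed_dim
h = AddPosition(window, embed_dim)(h)              # inject order information
attn = layers.MultiHeadAttention(num_heads=2, key_dim=embed_dim)(h, h)
h = layers.LayerNormalization(epsilon=1e-6)(layers.Add()([h, attn]))
h = layers.GlobalAveragePooling1D()(h)             # collapse the time axis
outputs = layers.Dense(horizon)(h)                 # one value per future hour
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")

model.fit(X, y, epochs=1, verbose=0)
preds = model.predict(X[:4], verbose=0)
print(X.shape, y.shape, preds.shape)
```

With the real data I assume the attention and normalization lines would be replaced by the teacher's TransformerBlock, and the columns would need to be scaled first, but I am not sure whether this is the structure the exercise expects.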
I watched some YouTube videos to try to understand, but I'm only supposed to use TensorFlow and Keras to solve this, so I haven't been able to figure it out yet…
Any help would be appreciated!
I tried to do it by myself and failed miserably. I also tried ChatGPT, but that didn't work very well either…