I am using a fully connected neural network as a function approximator inside a more complex model. When I set the model up with the functional API, it converges to a proper solution and the network does its job:
inputs = keras.Input(shape=(setupDict['inputShape'],), name='input')
x1 = keras.layers.Dense(setupDict['layerNodes'][0], activation='relu', kernel_initializer='he_normal', name='hidden1')(inputs)
x2 = keras.layers.Dense(setupDict['layerNodes'][1], activation='relu', kernel_initializer='he_normal', name='hidden2')(x1)
output1 = keras.layers.Dense(1, kernel_initializer='he_normal', name='outputD')(x2)
output1 = LogisticActivation(0.15, 1.5, 5, 'debt')(output1)
output2 = keras.layers.Dense(1, kernel_initializer='he_normal', name='outputS')(x2)
output2 = LogisticActivation(0.005, 6, 10, 'stab')(output2)
output3 = keras.layers.Dense(1, activation='softplus', kernel_initializer='he_normal', bias_initializer='ones', name='outputV')(x2)
outputs = keras.layers.concatenate([output1, output2, output3], name='concatAll')
# Define the model
model = keras.Model(inputs=inputs, outputs=outputs)
As there are some more complex use cases, I wanted to port this to model subclassing and work with graph execution via tf.function. This is where things get odd. I subclassed the Model class and rewrote it in the following way:
class defaultFreeModel(keras.Model):
    def __init__(self, econDict, setupDict, GHDict, **kwargs):
        super().__init__(**kwargs)
        self.econDict = econDict
        self.setupDict = setupDict
        self.GHDict = GHDict
        # Layers
        self.hidden1 = keras.layers.Dense(setupDict['layerNodes'][0], activation='relu', kernel_initializer='he_normal', name='hidden1')
        self.hidden2 = keras.layers.Dense(setupDict['layerNodes'][1], activation='relu', kernel_initializer='he_normal', name='hidden2')
        self.output1tmp = keras.layers.Dense(1, kernel_initializer='he_normal', name='outputD')
        self.output1 = LogisticActivation(0.15, 1.5, 5, 'debt')
        self.output2tmp = keras.layers.Dense(1, kernel_initializer='he_normal', name='outputS')
        self.output2 = LogisticActivation(0.005, 6, 10, 'stab')
        self.output3 = keras.layers.Dense(1, activation='softplus', kernel_initializer='he_normal', bias_initializer='ones', name='outputV')
        self.outputs = keras.layers.Concatenate()

    def call(self, inputs):
        x = self.hidden1(inputs)
        x = self.hidden2(x)
        x1 = self.output1tmp(x)
        x1 = self.output1(x1)
        x2 = self.output2tmp(x)
        x2 = self.output2(x2)
        x3 = self.output3(x)
        return self.outputs([x1, x2, x3])

    def compile(self, optimizer, loss_fn):
        super().compile()
        self.optimizer = optimizer
        self.loss_fn = loss_fn
If I run this model with my training loop, everything works fine and I again get a proper solution. However, if I decorate the call function with tf.function in order to speed up execution, there is no convergence anymore and I do not get a solution. Does anyone know what the reason for this could be? The gradient function is not decorated with tf.function, only call.
As soon as I decorate call with tf.function and run the training loop, the loss stops going down and instead just hovers around some value (different for each run because of the initializer).
As a test, I also decorated the gradient function with tf.function, which enables full graph execution. This speeds up computation substantially, but it does not converge either.
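For context, the training step has roughly the following shape (a stripped-down sketch with placeholder data and loss handling, not my exact code); the tf.function decorator on this step, or on the model's call, is what I am toggling:

import tensorflow as tf

# Simplified sketch of the custom training step; x_batch, y_batch and loss_fn
# are placeholders standing in for the model-specific pieces.
# @tf.function  # <- decorating this (or call) is what changes the behaviour
def train_step(model, x_batch, y_batch):
    with tf.GradientTape() as tape:
        preds = model(x_batch, training=True)
        loss = model.loss_fn(y_batch, preds)
    grads = tape.gradient(loss, model.trainable_variables)
    model.optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss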
EDIT:
Here is the code of the Layer:
import tensorflow as tf
import keras

class LogisticActivation(keras.layers.Layer):
    def __init__(self, level, slope, upper, name):
        super().__init__()
        self.slope = tf.constant(slope, dtype=tf.float32, name=name + "_slope")
        self.level = tf.constant(level, dtype=tf.float32, name=name + "_level")
        self.upper = tf.constant(upper, dtype=tf.float32, name=name + "_upper")

    @tf.function
    def call(self, inputs):
        return tf.divide(tf.multiply(self.level, self.upper), tf.constant(1, dtype=tf.float32) + tf.exp(-self.slope * (inputs - self.level)))

    def get_config(self):
        return {'level': self.level, 'slope': self.slope, 'upper': self.upper}
Your code is okay, but here are some issues I found from looking at the TensorFlow docs.
Check the result of your tf.constant calls (add a print statement) to verify they do exactly what you expect; you want them to end up as distinct variables. You might need to change tf.constant to tf.Variable, because tf.function may have interpreted the constants wrongly. Since these are constants, they might not be correctly tracked as trainable variables or could cause issues with backpropagation. The better way is to use tf.Variable. If you look at the TensorFlow reference documentation on Variables (DOC), you will see that they use tf.Variable, for example:
x = tf.Variable(1., shape=tf.TensorShape(None))
Having said that, you need to change all the attributes that use tf.constant, i.e. self.slope, self.level and self.upper, to the code snippet provided below:
self.slope = tf.Variable(slope, dtype=tf.float32, name=name + "_slope")
self.level = tf.Variable(level, dtype=tf.float32, name=name + "_level")
self.upper = tf.Variable(upper, dtype=tf.float32, name=name + "_upper")
OR, if your intention really is to have constants, there is a standard way TensorFlow handles them (see the docs referenced above), for example:
w = tf.Variable([[1.], [2.]])
x = tf.constant([[3., 4.]])
tf.matmul(w, x)
tf.sigmoid(w + x)
OR you can modify the get_config function:
def get_config(self):
    config = super().get_config()
    config.update({
        'level': self.level.numpy(),  # convert to a plain scalar
        'slope': self.slope.numpy(),  # convert to a plain scalar
        'upper': self.upper.numpy()   # convert to a plain scalar
    })
    return config
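Putting the two snippets together, a revised version of the layer could look like the sketch below. Whether the parameters should be trainable is up to you; trainable=False (my assumption here) keeps them fixed during training, and call is left unchanged from your original code:

import tensorflow as tf
import keras

# Sketch: tf.constant replaced by non-trainable tf.Variable, get_config made serialisable.
class LogisticActivation(keras.layers.Layer):
    def __init__(self, level, slope, upper, name):
        super().__init__()
        self.slope = tf.Variable(slope, dtype=tf.float32, trainable=False, name=name + "_slope")
        self.level = tf.Variable(level, dtype=tf.float32, trainable=False, name=name + "_level")
        self.upper = tf.Variable(upper, dtype=tf.float32, trainable=False, name=name + "_upper")

    @tf.function
    def call(self, inputs):
        # Same logistic transform as in the original layer
        return tf.divide(tf.multiply(self.level, self.upper),
                         tf.constant(1, dtype=tf.float32) + tf.exp(-self.slope * (inputs - self.level)))

    def get_config(self):
        config = super().get_config()
        config.update({
            'level': float(self.level.numpy()),  # store plain Python floats in the config
            'slope': float(self.slope.numpy()),
            'upper': float(self.upper.numpy()),
        })
        return config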