TL;DR before the long story:
A 1-channel TF model behaves differently from the 3-channel one. Both converted successfully from Darknet to TF, but the 1-channel model does not perform as well as it did before conversion.
Task at hand and statements:
I have two trained yolov4-tiny Darknet weight files (.weights): one takes grayscale (1-channel) input and the other takes color (3-channel) input. I am converting both weight files to TensorFlow checkpoint format with a repository commonly used for this task:
https://github.com/hunglc007/tensorflow-yolov4-tflite.git
Performance of both models has been tested with C++ OpenCV LoadFromDarknet() and with the Python equivalent. Both models were trained on grayscale images and operate on grayscale images. The input for the 3-channel model is just the grayscale image expanded to 3 channels.
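To make the two input layouts concrete, here is a minimal sketch of how one grayscale image could be turned into a batch for each model. The helper name `to_model_inputs`, the division by 255, and the random stand-in image are assumptions for illustration, not the exact preprocessing used in my tests.

```python
import numpy as np

def to_model_inputs(gray_u8):
    """Build batched float32 inputs for the 1-channel and 3-channel models."""
    if gray_u8.ndim != 2:
        raise ValueError("expected a single-channel H x W image")
    # Scaling to [0, 1] is an assumption; use whatever the models were trained with.
    gray = gray_u8.astype(np.float32) / 255.0
    one_ch = gray[np.newaxis, :, :, np.newaxis]   # shape (1, H, W, 1)
    three_ch = np.repeat(one_ch, 3, axis=-1)      # shape (1, H, W, 3), same values in each channel
    return one_ch, three_ch

# Stand-in for an image read with cv2.imread(path, cv2.IMREAD_GRAYSCALE)
# and resized to 640 x 640.
dummy = np.random.default_rng(0).integers(0, 256, size=(640, 640), dtype=np.uint8)
one_ch, three_ch = to_model_inputs(dummy)
print(one_ch.shape, three_ch.shape)
```

Both arrays carry identical pixel values, so any difference in detections should come from the models rather than from the input data.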
Python version: 3.10.11
TF version: 2.10.1
Problem statement:
The weight file with color input converts fine and works well afterwards when loaded with tf.keras.models.load_model(X). However, after converting the grayscale-input weight file, the performance of the model when loaded with TensorFlow drops drastically: detections are bad or non-existent in the most obvious cases, where the color-input model works perfectly. Notably, the boxes are not out of place: when detections are found they are at about the correct position, but, for example, the width and height might be off.
I am aware of the common problems with this repository (hardcoded values, etc.) and changed the parameters accordingly for each conversion and model load. No errors whatsoever occur during conversion or model loading.
I have confirmed Input layers:
Grayscale: (None,640,640,1)
Color: (None,640,640,3)
Test images (for performance testing) are loaded with opencv-python, and their validity has also been checked; in any case, an error is raised if data with the wrong dimensions is fed to the input layer.
The architectures, excluding the input layer, are the same, as confirmed with model.summary().
I have noticed that the architecture printed by model.summary() for a 3-channel model that I converted a few years ago with a different TF version is somewhat different. Some tile layers seem to be missing from the new model. Some of the TF operations also have different names, but that might just be due to the different TF version.
Old Color model:
tf_op_layer_Sigmoid (TensorFlowOpLayer)  (None, 40, 40, 3, 2)  0  ['tf_op_layer_split_3[0][0]']
tf_op_layer_Tile/multiples (TensorFlowOpLayer)  (5,)  0  ['tf_op_layer_strided_slice[0][0]']
tf_op_layer_Sigmoid_3 (TensorFlowOpLayer)  (None, 20, 20, 3, 2)  0  ['tf_op_layer_split_4[0][0]']
New Grayscale model:
tf.math.sigmoid (TFOpLambda)  (None, 40, 40, 3, 2)  0  ['tf.split_3[0][0]']
(tile layer missing here)
tf.math.sigmoid_3 (TFOpLambda)  (None, 20, 20, 3, 2)  0  ['tf.split_4[0][0]']
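Comparing two long summaries by eye is error-prone, so a rough sketch of an automated comparison might look like the following. The name normalisation is only a heuristic assumption about how the prefixes differ between TF versions, and `normalise` and `diff_layers` are hypothetical helpers. With real models, the name lists would come from `[layer.name for layer in model.layers]`.

```python
import difflib
import re

def normalise(name):
    """Map TF-version-specific layer names to a common form (heuristic)."""
    name = name.lower()
    name = re.sub(r"^tf_op_layer_", "", name)        # prefix used by older TF 2.x
    name = re.sub(r"^tf\.(math\.|nn\.)?", "", name)  # prefix used by TFOpLambda layers
    return name.replace("/", "_")

def diff_layers(old_names, new_names):
    """Return the non-matching spans between two ordered layer-name lists."""
    old = [normalise(n) for n in old_names]
    new = [normalise(n) for n in new_names]
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

# Layer names taken from the summary excerpts above.
old_names = ["tf_op_layer_Sigmoid", "tf_op_layer_Tile/multiples", "tf_op_layer_Sigmoid_3"]
new_names = ["tf.math.sigmoid", "tf.math.sigmoid_3"]
differences = diff_layers(old_names, new_names)
print(differences)
```

On the excerpt above this reports only the tile entry as present in the old model and absent from the new one, which matches what I see when reading the summaries manually.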
I am quite stuck at the moment, and would appreciate any help. If any additional information is required, I am happy to provide it at the earliest convenience.
Some trial and error:
- Playing with the Yolov4-tiny head decoding block: no change, even in the cases where the model could still be loaded (wrong dimensions in decode raise errors).
- The older 3-channel model (already converted from Darknet to TF years ago), referenced earlier, works perfectly when loaded the same way as the new models.