I have a Coral USB Accelerator. I ran the setup on Windows with Python 3.9 and was able to run the example without any issues:
> python examples/classify_image.py --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite --labels test_data/inat_bird_labels.txt --input test_data/parrot.jpg
----INFERENCE TIME----
Note: The first inference on Edge TPU is slow because it includes loading the model into Edge TPU memory.
13.1ms
2.7ms
2.6ms
2.6ms
2.7ms
-------RESULTS--------
Ara macao (Scarlet Macaw): 0.75781
I now want to run a Hugging Face model that I have converted to TensorFlow Lite using optimum-cli:
> optimum-cli export tflite --model tahaenesaslanturk/mental-health-classification-v0.2 --quantize int8 --sequence_length 128 mental_tflite8/
This created a TensorFlow Lite model with version 2.12.1, which was incompatible with the tflite-runtime version I had installed. Unfortunately, a matching tflite-runtime package doesn't exist for Windows with Python 3.9, so I had to compile it from source.
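(In case anyone hits the same mismatch: this is how I confirmed the installed runtime version against the converter's, using just the standard library. It assumes the source build was installed as a pip package.)

import importlib.metadata
# Prints the version of the installed tflite-runtime package, e.g. 2.12.0
print(importlib.metadata.version('tflite-runtime'))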
Now that the versions match, the following code works:
import tflite_runtime.interpreter as tflite

# Load the converted model and attach the Edge TPU delegate.
interpreter = tflite.Interpreter(model_path='mental_tflite8/model.tflite',
                                 experimental_delegates=[tflite.load_delegate('edgetpu.dll')])
interpreter.allocate_tensors()

# Print the details of every input and output tensor.
inputs = interpreter.get_input_details()
for i in range(len(inputs)):
    print('Input: ', i, inputs[i])

outputs = interpreter.get_output_details()
for i in range(len(outputs)):
    print('Output: ', i, outputs[i])
Which produces the following output:
> python test.py
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Input: 0 {'name': 'model_attention_mask:0', 'index': 0, 'shape': array([ 1, 128]), 'shape_signature': array([ 1, 128]), 'dtype': <class 'numpy.int64'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}
Input: 1 {'name': 'model_input_ids:0', 'index': 1, 'shape': array([ 1, 128]), 'shape_signature': array([ 1, 128]), 'dtype': <class 'numpy.int64'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}
Input: 2 {'name': 'model_token_type_ids:0', 'index': 2, 'shape': array([ 1, 128]), 'shape_signature': array([ 1, 128]), 'dtype': <class 'numpy.int64'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}
Output: 0 {'name': 'StatefulPartitionedCall:0', 'index': 5099, 'shape': array([ 1, 15]), 'shape_signature': array([ 1, 15]), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}
Does anyone know how I can feed text to the TFLite model? With the Hugging Face Transformers pipeline the code is as simple as:
from transformers import pipeline
pipe = pipeline("text-classification", model="tahaenesaslanturk/mental-health-classification-v0.2")
print(pipe('I worry a lot'))
Which returns:
[{'label': 'anxiety', 'score': 0.6711282730102539}]
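From the input details above, my understanding is that the tokenizer has to run on the host and the three int64 [1, 128] tensors (model_input_ids, model_attention_mask, model_token_type_ids) have to be fed in manually. Below is a minimal sketch of what I have in mind; the name-to-array mapping and the int64 cast are my assumptions from the printed details, and it presumes the tokenizer returns token_type_ids (which the exported graph seems to expect). I'm not sure this is correct:

import numpy as np
import tflite_runtime.interpreter as tflite
from transformers import AutoTokenizer

# The TFLite graph only contains the model body, so tokenize on the host.
tokenizer = AutoTokenizer.from_pretrained('tahaenesaslanturk/mental-health-classification-v0.2')
enc = tokenizer('I worry a lot', padding='max_length', truncation=True,
                max_length=128, return_tensors='np')

interpreter = tflite.Interpreter(model_path='mental_tflite8/model.tflite',
                                 experimental_delegates=[tflite.load_delegate('edgetpu.dll')])
interpreter.allocate_tensors()

# Map each input tensor to the matching encoded array by name (names taken
# from the get_input_details() output above) and cast to int64.
feeds = {
    'model_input_ids:0': enc['input_ids'],
    'model_attention_mask:0': enc['attention_mask'],
    'model_token_type_ids:0': enc['token_type_ids'],
}
for detail in interpreter.get_input_details():
    interpreter.set_tensor(detail['index'], feeds[detail['name']].astype(np.int64))

interpreter.invoke()
logits = interpreter.get_tensor(interpreter.get_output_details()[0]['index'])
print(logits)  # shape (1, 15): one raw score per label

The padding='max_length' argument is there because the exported graph has a fixed [1, 128] input shape, so shorter inputs need to be padded out to 128. Is this the right approach, or is there a more standard way to drive text models through tflite-runtime?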