My setup
I am trying to use the USB FS host on a GD32F405 (an STM32-compatible) microcontroller. I use CMSIS and operate directly on registers.
I have an Android device that works as a VCP (virtual COM port) by default. An app running on it implements a simple protocol: one device sends parcels framed by STX and ETX bytes and receives an ACK symbol in return as acknowledgment. The devices communicate in turns. If a parcel is not ACKed, the sender resends it after a timeout.
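For reference, the framing and retransmit logic described above can be sketched roughly as follows. This is a minimal illustration, not my actual firmware: it assumes the ASCII control codes for STX, ETX and ACK, and the `send_fn` / `recv_byte_fn` transport hooks are hypothetical names standing in for the bulk OUT and bulk IN routines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed ASCII control codes; the real protocol may use other values. */
#define STX 0x02u
#define ETX 0x03u
#define ACK 0x06u

/* Hypothetical transport hooks (not from any real library).
   Both return 0 on success and nonzero on failure or timeout. */
typedef int (*send_fn)(const uint8_t *data, size_t len, void *ctx);
typedef int (*recv_byte_fn)(uint8_t *out, uint32_t timeout_ms, void *ctx);

/* Wrap a payload in STX ... ETX. Returns the frame length, or 0 if the
   output buffer is too small. */
size_t frame_parcel(const uint8_t *payload, size_t len, uint8_t *out, size_t cap)
{
    if (cap < len + 2u) {
        return 0;
    }
    out[0] = STX;
    memcpy(&out[1], payload, len);
    out[len + 1u] = ETX;
    return len + 2u;
}

/* Send a frame and wait for ACK, resending on timeout or wrong reply.
   Returns the attempt number that succeeded, or -1 if all attempts failed. */
int send_parcel(const uint8_t *frame, size_t len, int max_tries,
                uint32_t timeout_ms, send_fn tx, recv_byte_fn rx, void *ctx)
{
    for (int attempt = 1; attempt <= max_tries; ++attempt) {
        uint8_t reply = 0;
        if (tx(frame, len, ctx) != 0) {
            continue;               /* transmit failed, try again */
        }
        if (rx(&reply, timeout_ms, ctx) == 0 && reply == ACK) {
            return attempt;         /* parcel acknowledged */
        }
    }
    return -1;
}
```

Returning the attempt number makes it easy to log how many retransmits happened before the link stalls.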
The device descriptor looks completely basic: CDC class, subclass 0, protocol 0, one configuration.
There are 3 endpoints in the CDC descriptor: BULK IN, BULK OUT and IRQ (interrupt) IN. I use only the BULK ones and exchange raw data in 64-byte packets (the size listed in the endpoint descriptors).
The problem:
At some point during communication, the Android device stops receiving data from the host. At the packet level I still see USB ACKs from the device (the interrupt flag is set), the DPIDs toggle correctly, etc., but if I keep repeating the request many times in an attempt to get an answer, I eventually get a HALT flag during the OUT operation. It feels like a ZeroWindow-like event, as if the Android device has stopped performing read operations.
The problem happens at random moments. I have seen it in both the “Sender” and “Responder” roles: sometimes the system freezes on retransmits from my side, and sometimes I get spammed with requests (my ACK is lost).
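To narrow down what the host channel actually reports when the link stalls, a per-channel event counter along these lines might help. This is only a sketch: the bit positions are assumed from the STM32F4 OTG_FS `HCINTx` register layout and should be checked against the GD32F4xx user manual, and the `hc_stats` type and function names are made up for illustration.

```c
#include <stdint.h>

/* Bit positions ASSUMED from the STM32F4 OTG_FS HCINTx layout.
   Verify them against the GD32F4xx manual before relying on this. */
#define HC_XFRC  (1u << 0)   /* transfer completed */
#define HC_CHH   (1u << 1)   /* channel halted */
#define HC_STALL (1u << 3)   /* STALL handshake received */
#define HC_NAK   (1u << 4)   /* NAK handshake received */
#define HC_ACK   (1u << 5)   /* ACK handshake received or sent */
#define HC_TXERR (1u << 7)   /* transaction error */
#define HC_BBERR (1u << 8)   /* babble error */
#define HC_FRMOR (1u << 9)   /* frame overrun */
#define HC_DTERR (1u << 10)  /* data toggle error */

/* Hypothetical per-channel counters, one instance per host channel. */
typedef struct {
    uint32_t xfrc, chh, stall, nak, ack, txerr, bberr, frmor, dterr;
} hc_stats;

/* Count every flag present in one snapshot of the channel interrupt register. */
void hc_stats_update(hc_stats *s, uint32_t hcint)
{
    if (hcint & HC_XFRC)  { s->xfrc++;  }
    if (hcint & HC_CHH)   { s->chh++;   }
    if (hcint & HC_STALL) { s->stall++; }
    if (hcint & HC_NAK)   { s->nak++;   }
    if (hcint & HC_ACK)   { s->ack++;   }
    if (hcint & HC_TXERR) { s->txerr++; }
    if (hcint & HC_BBERR) { s->bberr++; }
    if (hcint & HC_FRMOR) { s->frmor++; }
    if (hcint & HC_DTERR) { s->dterr++; }
}

/* Nonzero if the snapshot contains any error-type flag (not NAK/ACK/XFRC/CHH). */
int hc_has_error(uint32_t hcint)
{
    return (hcint & (HC_STALL | HC_TXERR | HC_BBERR | HC_FRMOR | HC_DTERR)) != 0u;
}
```

Dumping these counters just before and after the freeze would show whether the OUT channel sees a STALL handshake, a plain channel halt, or only NAKs, which are quite different situations.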
What I have tried
I tried polling the IRQ endpoint in parallel with BULK IN, but I received only a couple of transfers during the first polls, then only silence (NAKs).
I tried limiting the payload to 63 bytes per packet to force a “last packet”, but with no luck. In some cases the same problem arises even with 20-byte requests, so I don't think it is an odd-packet issue.
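The reasoning behind the 63-byte experiment can be written down as a small helper. A bulk transfer is terminated by a packet shorter than the maximum packet size, so a transfer whose length is an exact multiple of 64 ends on a full packet and may need a trailing zero-length packet, depending on how the receiver terminates its reads. The sketch below only does the arithmetic; `bulk_plan` and `plan_bulk_out` are illustrative names, and 64 is the packet size from the endpoint descriptors.

```c
#include <stddef.h>
#include <stdint.h>

#define BULK_MPS 64u  /* wMaxPacketSize of the bulk endpoints in this setup */

typedef struct {
    size_t full_packets;        /* packets carrying exactly BULK_MPS bytes */
    size_t short_len;           /* length of the trailing short packet, 0 if none */
    int    ends_on_full_packet; /* 1 if a zero-length packet may be needed */
} bulk_plan;

/* Split a transfer of len bytes into full packets plus an optional short one. */
bulk_plan plan_bulk_out(size_t len)
{
    bulk_plan p;
    p.full_packets = len / BULK_MPS;
    p.short_len = len % BULK_MPS;
    p.ends_on_full_packet = (len > 0u && p.short_len == 0u) ? 1 : 0;
    return p;
}
```

With this split, a 20-byte or 63-byte request is always a single short packet, which is why I do not think the packet boundary is the cause.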
The only solution that works so far
To recover, I just reinitialize the USB device (send a bus reset, then Set Address and Set Configuration). The system starts working and the device is responsive again, but the buffers seem to be flushed. The same problem recurs within the next couple of minutes.
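For clarity, the two control requests in that recovery sequence are the standard SET_ADDRESS and SET_CONFIGURATION requests from chapter 9 of the USB 2.0 specification. A minimal sketch of how their 8-byte SETUP packets are built is shown below; the `usb_setup` type and the function names are illustrative only, and the bus reset itself is done through the port control register and is not shown.

```c
#include <stdint.h>

/* Raw 8-byte SETUP packet as it goes on the wire (little-endian fields). */
typedef struct {
    uint8_t bytes[8];
} usb_setup;

/* Pack the five SETUP fields into wire order. */
usb_setup usb_make_setup(uint8_t bmRequestType, uint8_t bRequest,
                         uint16_t wValue, uint16_t wIndex, uint16_t wLength)
{
    usb_setup s;
    s.bytes[0] = bmRequestType;
    s.bytes[1] = bRequest;
    s.bytes[2] = (uint8_t)(wValue & 0xFFu);
    s.bytes[3] = (uint8_t)(wValue >> 8);
    s.bytes[4] = (uint8_t)(wIndex & 0xFFu);
    s.bytes[5] = (uint8_t)(wIndex >> 8);
    s.bytes[6] = (uint8_t)(wLength & 0xFFu);
    s.bytes[7] = (uint8_t)(wLength >> 8);
    return s;
}

/* SET_ADDRESS: host-to-device, standard, device recipient; bRequest = 5. */
usb_setup usb_setup_set_address(uint8_t address)
{
    return usb_make_setup(0x00u, 0x05u, address, 0u, 0u);
}

/* SET_CONFIGURATION: host-to-device, standard, device recipient; bRequest = 9.
   config_value is bConfigurationValue from the configuration descriptor. */
usb_setup usb_setup_set_configuration(uint8_t config_value)
{
    return usb_make_setup(0x00u, 0x09u, config_value, 0u, 0u);
}
```

Both requests have no data stage (`wLength` is 0), so each one is a SETUP transaction followed by a zero-length IN status stage.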
If I connect the same device to a Linux PC and flood it with requests, the problem does not occur.
The same MCU setup works well with another CDC device (STM32-based), so it should be an implementation-dependent issue.
What might be the issue? How would one troubleshoot a problem like this?