Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

USB CDC can receive but cannot transmit data (IDFGH-12740) #13722

Closed
3 tasks done
StevenMacias opened this issue Apr 30, 2024 · 10 comments
Closed
3 tasks done

USB CDC can receive but cannot transmit data (IDFGH-12740) #13722

StevenMacias opened this issue Apr 30, 2024 · 10 comments
Assignees
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@StevenMacias
Copy link
Contributor

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v5.1.2

Espressif SoC revision.

ESP32-S3

Operating System used.

Linux

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

ESP32-S3-DevKitC-1 v1.0

Power Supply used.

USB

What is the expected behavior?

Reliable bi-directional USB CDC communication.

What is the actual behavior?

The board can receive data but cannot send data back to the host.

Steps to reproduce.

  1. get_idf
  2. cp -r $IDF_PATH/examples/peripherals/usb/device/tusb_serial_device Workspace/local/espressif/serial_issue
  3. cd Workspace/local/espressif/serial_issue
  4. idf.py set-target esp32s3
  5. Connect micro USB to the UART connector on the DevKit
  6. idf.py flash monitor
  7. Connect micro USB to the USB connector on the DevKit
  8. Execute spamming script python serial_spammer.py
import serial
import time

# Configure the serial port
ser = serial.Serial(
    port='/dev/ttyACM0',  # Change this to your UART port
    baudrate=115200,      # Change this to match your device's baud rate
    timeout=0.01,         # Timeout for readline() function
    rtscts = False, 
    dsrdtr = False,
)

# Open the serial port
#if(ser.isOpen() == False):
#    ser.open()

if ser.is_open:
    try:
        # Track the start time
        start_time = time.time()
        for i in range(10000):
            message = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.\n'
            ser.write(message.encode())

            # Read the reply
            replied = False
            while replied == False:
                reply = ser.readline().decode().strip()
                if reply != "":
                    print("Message sent:", message)
                    print("Reply received:", reply)
                    replied = True
            
            time.sleep(0.25)

        # Calculate the elapsed time
        elapsed_time = time.time() - start_time
        print("Script has been running for {:.2f} minutes.".format(elapsed_time / 60))

    except Exception as e:
        print("Error:", e)
        print("Closing Serial")
        ser.close()

    finally:
        # Close the serial port
        ser.close()
else:
    print("Serial port is not open.")
  1. Disconnect USB cable while spamming. The spammer program will exit.
  2. Reconnect USB cable and execute spamming script again.
  3. Repeat step 9 and 10 until the spammer program does not show replies. At this point the issue is triggered. You can send data, the ESP32 will receive data, it will 'reply' but the data will never reach the host.
  4. Double check the issue with minicom -D /dev/ttyACM0. Bytes are received and plotted in the idf.py monitor but not echoed to minicom.
    ...

Debug Logs.

No response

More Information.

Our project only has one USB (not self-powered device) but this issue appears by just using the interface. However, the symptoms are exactly the same than when the USB is disconnected in the DevKit. The USB disconnection helps us to trigger the issue quickly but from our experience we can say it can happen at any moment. Right after reset or after an hour of usage.

This issue seems related to #13518, however we believe that reproducing it with a vanilla example and a DevKit might be helpful to triangulate the root cause. Following some of the comments, we are using the latest version of espressif/tinyusb (0.15.0~8) but the issue persists.

@StevenMacias StevenMacias added the Type: Bug bugs in IDF label Apr 30, 2024
@StevenMacias StevenMacias changed the title USB CDC can received but not transmit data USB CDC can receive but cannot transmit data Apr 30, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Apr 30, 2024
@github-actions github-actions bot changed the title USB CDC can receive but cannot transmit data USB CDC can receive but cannot transmit data (IDFGH-12740) Apr 30, 2024
@peter-marcisovsky
Copy link
Collaborator

Hi @StevenMacias thanks for reporting.
I have managed to replicate the issue by following your steps to reproduce.
Will take a look at it an let you know.

@espressif-bot espressif-bot added Status: In Progress Work is in progress and removed Status: Opened Issue is new labels May 3, 2024
@StevenMacias
Copy link
Contributor Author

Hi @peter-marcisovsky,
do you have any update regarding this issue? It is impacting seriously our project and we cannot find a workaround...
Thanks!

@peter-marcisovsky
Copy link
Collaborator

Hi @StevenMacias, sorry for a delay. After some extensive testing, following your steps to reproduce, I have found a repeating pattern, causing an error.

I observed, that the USB device sometimes sends a 0 length CDC IN transfer to the USB host, right after re-plugging the USB device. The device sends the transfer before the USB host is ready to receive one, which in general might cause transfer problems. (The host indicates readiness for data reception by usb_cdc_line_state_changed_callback.)

Once the 0 length CDC in transfer appears before the host is ready to receive any transfer, the error is triggered and the USB device behaves just like you are explaining.

To fix the issue, I canceled all the pending transfers after the USB device is mounted, by stalling and un-stalling the endpoints. I could not reproduce the error anymore after that.

A pseudo function to cancel the pending transfers, would be placed into the TinyUSB device stack usbd.c and would look like this:

void usbd_edpt_stall_unstall(uint8_t rhport, uint8_t ep_addr) {
    usbd_edpt_stall(rhport, ep_addr);
    usbd_edpt_clear_stall(rhport, ep_addr);
}

The function shall be invoked before any data transfer is made, or right after unplugging the USB device. In my case, I used mount callback from TinyUSB TU_ATTR_WEAK void tud_mount_cb(void);
The callback is used, for example, in our addition to the MSC Device class here, this way the endpoints are cleared, right after enumerating and mounting of the USB device.

The TinyUSB log of stalling and un-stalling the endpoints shall look like this:

Mounted                       // USB device mounted

    Stall EP 81               // Stall Interrupt IN EP 0x81 
    Clear Stall EP 81
    Stall EP 02               // Stall Bulk OUT EP 0x02
    Clear Stall EP 02
    Stall EP 82               // Stall Bulk IN EP 0x82
    Clear Stall EP 82

What could also be a problem in your case especially, is the way how the tinyusb_cdc_rx_callback in tusb_serial_device example is written. There is sending CDC data and receiving CDC data happening in the callback, which might be causing issues in your case, if you are using the similar logic in your code. I will be updating the callbacks in our examples accordingly using xSendQueue and xReceiveQueue in following days.

I hope this workaround resolves the issue you are having.

@StevenMacias
Copy link
Contributor Author

Thanks a lot for your reply! We will test it immediately and see if it solves the issue. Thanks again 😄

@peter-marcisovsky
Copy link
Collaborator

Hi @StevenMacias any update on this issue?
If you problem was solved, feel free to close the issue.

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: In Progress Work is in progress labels Jul 9, 2024
@StevenMacias
Copy link
Contributor Author

@peter-marcisovsky so sorry for the delay, we have been a quite overloaded lately. This does not seems to solve our issue since it also fails during runtime, not only when plugging and unplugging the cable.

This is how our TX callback looks like:

esp_err_t usb_cdc_serial_tx(const char* buffer, size_t length)
{
    ESP_LOGD(TAG, "Send:\t%s", buffer);
    size_t bytes_written = tinyusb_cdcacm_write_queue(TINYUSB_CDC_ACM_0, (uint8_t*)buffer, length);
    if (bytes_written != length)
        ESP_LOGW(TAG, "USB serial TX FIFO lacked space, dropping outgoing message; bytes_written: %d, length: %d", bytes_written, length);
    return tinyusb_cdcacm_write_flush(TINYUSB_CDC_ACM_0, pdMS_TO_TICKS(200));
}

When the issue is triggered it looks like this, please pay attention to the "USB serial TX FIFO lacked space" log and correlate it with the above function.

2024-07-09 10:02:37 I (10309906) DOWNSTREAM: Processing: msg_num: 2196252184, command: previous
2024-07-09 10:02:38 I (10310264) UPSTREAM: src_act: playing  - src_pos: 5
2024-07-09 10:02:38 E (10310378) rmt-uart: not a stop bit byte_num=0
2024-07-09 10:02:38 W (10310466) tusb_cdc_acm: Flush failed
2024-07-09 10:02:53 I (10325815) MAIN_APP: Current audio buffer utilization: 87.33%
2024-07-09 10:02:53 I (10325816) MAIN_APP: Audio buffer difference: [0.00% in 30 seconds]
2024-07-09 10:02:53 I (10325862) DOWNSTREAM: Processing: msg_num: 672158606, command: previous [Original Message from Host]
2024-07-09 10:02:54 W (10326165) tusb_cdc_acm: Flush failed
2024-07-09 10:02:55 I (10327363) DOWNSTREAM: Processing: msg_num: 672158606, command: previous [Host Retry]
2024-07-09 10:02:55 I (10327621) UPSTREAM: src_act: playing  - src_pos: 4
2024-07-09 10:02:55 I (10327622) MAIN_APP: [Masterlink event] Track change
2024-07-09 10:02:55 W (10327666) tusb_cdc_acm: Flush failed
2024-07-09 10:02:55 W (10328067) tusb_cdc_acm: Flush failed
2024-07-09 10:02:57 I (10329167) DOWNSTREAM: Processing: msg_num: 672158606, command: previous [Host Retry]
2024-07-09 10:02:57 W (10329469) tusb_cdc_acm: Flush failed
2024-07-09 10:02:59 I (10331169) DOWNSTREAM: Processing: msg_num: 672158606, command: previous [Host Retry]
2024-07-09 10:02:59 W (10331270) USB-CDC: USB serial TX FIFO lacked space, dropping outgoing message; bytes_written: 80, length: 82
2024-07-09 10:02:59 W (10331471) tusb_cdc_acm: Flush failed
2024-07-09 10:03:01 I (10333171) DOWNSTREAM: Processing: msg_num: 672158606, command: previous [Host Retry]
2024-07-09 10:03:01 W (10333272) USB-CDC: USB serial TX FIFO lacked space, dropping outgoing message; bytes_written: 0, length: 82
2024-07-09 10:03:01 W (10333473) tusb_cdc_acm: Flush failed
2024-07-09 10:03:03 I (10335173) DOWNSTREAM: Processing: msg_num: 902212124, type: system, command: reset
2024-07-09 10:03:03 W (10335173) USB: Stopping TinyUSB task
2024-07-09 10:03:03 W (10335274) USB-CDC: USB serial TX FIFO lacked space, dropping outgoing message; bytes_written: 0, length: 77
2024-07-09 10:03:03 W (10335475) tusb_cdc_acm: Flush failed

Basically we are sending a JSON message over CDC from Host to ESP32 (Downstream), a message is reported upstream, you can also see tha we are using the RMT peripheral at the same time (I don't know if that can be related). When message 2196252184 is sent, the Flush failed issue is triggered. However the reply for 2196252184 seems to be correctly sent to the host since the host does not retry to send the message. After that, the host sends a new message 672158606 that never receives a reply and that is why it is re-sent many times.

Can you see something wrong in our callback? Not sure if you shared the new version of the examples already.
Also, we did some modifications to the esp_tinyusb component to support USB audio, do you know if it is in the roadmap to officially support that feature anytime soon?

Thanks again and sorry for the late feedback.

@peter-marcisovsky
Copy link
Collaborator

peter-marcisovsky commented Jul 10, 2024

Hi @StevenMacias sorry to hear that the issue has not been resolved.
Could first try to modify your callback according to the as-promised tusb_serial_device example update here

Is there any reason, why are you using timeout other than 0 in your tinyusb_cdcacm_write_flush function?
Does the tinyusb_cdcacm_write_flush return any error?

Regarding the audio support. We just released our UAC driver (USB Host side) here. The USB Device side is not a priority right now for us.

@StevenMacias
Copy link
Contributor Author

Hi @peter-marcisovsky,
I think our TX and RX callbacks are already detached similarly to your example. We use xMessageBufferSend instead for sending information downstream. Here you have them, I have updated the TX callback to print the tinyusb_cdcacm_write_flush error:

esp_err_t usb_cdc_serial_tx(const char* buffer, size_t length)
{
    ESP_LOGD(TAG, "Send:\t%s", buffer);
    size_t bytes_written = tinyusb_cdcacm_write_queue(TINYUSB_CDC_ACM_0, (uint8_t*)buffer, length);
    if (bytes_written != length) {
        ESP_LOGW(TAG, "USB CDC failed to write queue. Written %d out of %d bytes.", bytes_written, length);
        ESP_LOG_BUFFER_HEXDUMP(TAG, buffer, length, ESP_LOG_WARN);
    }
    esp_err_t err = tinyusb_cdcacm_write_flush(TINYUSB_CDC_ACM_0, pdMS_TO_TICKS(200));
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Write flush failed. Reason: %s", esp_err_to_name(err));
    }
    return err;
}

Since we are expecting JSON strings that end with \n we do some extra logic to build the whole message.

void usb_cdc_rx_callback(int interface_index, cdcacm_event_t*)
{
    size_t rx_size = 0;
    esp_err_t ret = tinyusb_cdcacm_read(interface_index, usb_cdc_rx_buf,
        CONFIG_TINYUSB_CDC_RX_BUFSIZE, &rx_size);

    if (ret == ESP_OK) {
        if (msg_idx <= (USB_CDC_MSG_MAX_SIZE - rx_size)) {
            memcpy(serial_msg + msg_idx, usb_cdc_rx_buf, rx_size);
            msg_idx += rx_size;
        } else {
            ESP_LOGE(TAG, "Message too big (%zu+%zu bytes) for buffer size (%d bytes)",
                msg_idx,
                rx_size,
                USB_CDC_MSG_MAX_SIZE);
            msg_idx = 0;
            usb_cdc_clear_serial_buffers();
        }

        if (wanted_char_flag) {
            wanted_char_flag = false;
            // ESP_LOG_BUFFER_HEXDUMP(TAG, serial_msg, msg_idx, ESP_LOG_INFO);
            if (msg_buffer_handle != NULL) {
                size_t written_bytes = xMessageBufferSend(msg_buffer_handle, serial_msg, msg_idx, 0);
                if (written_bytes < msg_idx) {
                    ESP_LOGW(TAG, "RX Message buffer is full.");
                }
            } else {
                ESP_LOGW(TAG, "Received data but message buffer not registered.");
            }
            usb_cdc_clear_serial_buffers();
        }
    } else {
        ESP_LOGE(TAG, "Read error");
    }
}

No special reason to use a timeout in tinyusb_cdcacm_write_flush other that playing with that value to see if it made a difference, I will change it to 0. The error returned by the flush function when the issue is triggered is:

2024-07-15 10:34:40 W (251977906) tusb_cdc_acm: Flush failed
2024-07-15 10:34:40 E (251977906) USB-CDC: Write flush failed. Reason: ESP_ERR_TIMEOUT

Thanks again for your time and support! 😃

@peter-marcisovsky
Copy link
Collaborator

  1. There should also be a xMessageBufferReceive() somewhere in your usb_cdc_serial_tx() if I am not mistaken. Just like in our example, there is xQueueSend() in rx_callback and xQueueReceive() in the tx while loop

  2. How and from where are you calling your usb_cdc_serial_tx() ? In our example, we are calling a transmitting part of the code, once an Queue item is successfully received from the Queue

  3. Do you have multiple writers/readers of teh messagebuffer? Doc here

  4. xMessageBuffer is in IDF internally ported to xStreamBuffer here. Please mind the differences between those two and make sure you are using it correctly.

  5. Regarding the tinyusb_cdcacm_write_flush() returning errors, or warnings: please mind this warning

@StevenMacias
Copy link
Contributor Author

Hi @peter-marcisovsky,

  • Replying to 1 and 2:
    I wanted to keep my USB CDC component as a pure C library (no task involved). My usb_cdc.h looks like this:
#pragma once

#include "freertos/FreeRTOS.h"
#include "freertos/message_buffer.h"
#include "esp_err.h"

#define USB_CDC_TERMINATION_CHAR 0x0A
#define USB_CDC_MSG_MAX_SIZE 2048u

esp_err_t usb_cdc_init(void);
esp_err_t usb_cdc_serial_tx(const char* buffer, size_t length);
void usb_cdc_serial_register_message_buffer(MessageBufferHandle_t msg_buffer);

The goal of usb_cdc_init is to configure and call tusb_cdc_acm_init. Also it sets up the wanted character (tud_cdc_n_set_wanted_char).
Then I use usb_cdc_serial_register_message_buffer to store a reference to the message buffer handle and therefore being able to decouple the usb_cdc_rx_callback from the message processing logic.

At the other end of the message buffer it will be a task (Message Router) that will process the JSON message and trigger different actions:

static char message_router_rx_msg[USB_CDC_MSG_MAX_SIZE] = { 0 };
static char message_router_tx_msg[USB_CDC_MSG_MAX_SIZE] = { 0 };
static MessageBufferHandle_t message_router_tx_msg_buf_handle;

void message_router_application_task(void* pvParams)
{
    // This is the same handler used in usb_cdc_serial_register_message_buffer
    MessageBufferHandle_t message_router_rx_msg_buf_handle = (MessageBufferHandle_t)pvParams;
    // This tx handler is used internally by this task
    message_router_tx_msg_buf_handle = xMessageBufferCreate(USB_CDC_MSG_MAX_SIZE);
    size_t rx_message_size = 0;
    size_t tx_message_size = 0;

    for (;;) {
        rx_message_size = xMessageBufferReceive(
            message_router_rx_msg_buf_handle,
            message_router_rx_msg,
            sizeof(message_router_rx_msg),
            pdMS_TO_TICKS(MESSAGE_ROUTER_TX_WAIT_MS));

        if (rx_message_size > 0) {
            message_router_process_incoming_message(message_router_rx_msg, rx_message_size, NULL);
            // Limit downstream message throughput
            vTaskDelay(pdMS_TO_TICKS(MESSAGE_ROUTER_TX_SENT_DELAY_MS));
            ESP_LOGD(TAG, "Sending message downstream");
            ESP_LOG_BUFFER_HEXDUMP(TAG, message_router_rx_msg, rx_message_size, ESP_LOG_DEBUG);
        }

        tx_message_size = xMessageBufferReceive(
            message_router_tx_msg_buf_handle,
            message_router_tx_msg,
            sizeof(message_router_tx_msg),
            pdMS_TO_TICKS(MESSAGE_ROUTER_RX_WAIT_MS));

        if (tx_message_size > 0) {
            usb_cdc_serial_tx(message_router_tx_msg, tx_message_size);
            // Limit upstream message throughput
            vTaskDelay(pdMS_TO_TICKS(MESSAGE_ROUTER_RX_SENT_DELAY_MS));
            ESP_LOGD(TAG, "Sending message upstream");
            ESP_LOG_BUFFER_HEXDUMP(TAG, message_router_tx_msg, tx_message_size, ESP_LOG_DEBUG);
        }
    }
}

The task loop basically checks two message buffers, the one shared with the USB CDC component (message_router_rx_msg_buf_handle) and the internal one used to send replies back (message_router_tx_msg_buf_handle). The message_router_process_incoming_message will analyze the JSON message coming from USB CDC and do some action. For example, if a syntax error is detected in the incoming message, a JSON string will be build with information about the error and written into the message_router_tx_msg_buf_handle. Finally, since data has been written into message_router_tx_msg_buf_handle, the function usb_cdc_serial_tx will be called.

  • Replying to 3: We do not have multiple reader or writers.
    USB CDC ------message_router_rx_msg_buf_handle---> Message Router
    USB CDC <-----------usb_cdc_serial_tx------------- Message Router

I do not have a message buffer from Message Router to USB CDC because I do not have a task that can block waiting for a message buffer to be written. Do you think that is an issue? As I mentioned at the beginning we did not find the need of the USB CDC component to be a task.

  • Replying to 4: Since we are dealing with discrete JSON messages we are using Message Buffers. We are creating the message buffer in the main task somewhat like this:
static MessageBufferHandle_t message_router_msg_buf_handle;

static void init_tasks()
{
    message_router_msg_buf_handle = xMessageBufferCreate(USB_CDC_MSG_MAX_SIZE);

    xTaskCreate(message_router_application_task,
        "message_router_application_task",
        (1024 * 6),
        (void*)message_router_msg_buf_handle,
        tskIDLE_PRIORITY + 1,
        NULL);
}

void main_application_task(void* pvParams)
{
    [...]
    init_tasks();
    init_led();
    [..]
    usb_cdc_serial_register_message_buffer(message_router_msg_buf_handle);
    [...]
    err = usb_init(); // calls tinyusb_driver_install and usb_cdc_init
    if (err != ESP_OK) {
        ESP_LOGE(TAG, "Error initializing USB: %s", esp_err_to_name(err));
    }
    for (;;) {
        [...]
    }
}
  • Replying to 5: I will add a retry mechanism that re-flushes and delays for some attempts.

As always, thank you so much for your time and help! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

3 participants