-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SmartLedsAdapter::new does not have async impl for Rmt::new_async? #4
Comments
I somehow missed that on Matrix. There is no async-version of |
Thanks! Perhaps worth opening another issue/discussion although applies to this particular case too: What do you think about the use of
Is it worth making this uniform across peripherals in /cc @MabezDev |
Seeing as the (Though one could argue that since this struct's sole purpose was to provide an adapter for smart_leds, maybe it shouldn't have an async variant without an upstream trait, but no reason to be strict I guess)
This could probably be done for the drivers that mandate DMA, as the async mode could be inferred from the dma channel as well. (Personally I'd say it's too soon as the DMA based apis aren't mature enough) |
Thanks for the comments and ideas!
Indeed, that Rmt struct member,
I'm not sure if limiting the async-ness to just the For now I'm being cheeky, breaking stuff locally, patching and defaulting to on smartleds-esp-hal: diff --git a/esp-hal-smartled/src/lib.rs b/esp-hal-smartled/src/lib.rs
index 98164779..86a91249 100644
--- a/esp-hal-smartled/src/lib.rs
+++ b/esp-hal-smartled/src/lib.rs
@@ -30,7 +30,7 @@ use esp_hal::{
clock::Clocks,
gpio::OutputPin,
peripheral::Peripheral,
- rmt::{Error as RmtError, PulseCode, TxChannel, TxChannelConfig, TxChannelCreator},
+ rmt::{asynch::TxChannelAsync, Error as RmtError, PulseCode, TxChannelConfig, TxChannelCreatorAsync},
};
use smart_leds_trait::{SmartLedsWrite, RGB8};
@@ -73,7 +73,7 @@ macro_rules! smartLedBuffer {
/// interaction functionality using the `smart-leds` crate
pub struct SmartLedsAdapter<TX, const BUFFER_SIZE: usize>
where
- TX: TxChannel,
+ TX: TxChannelAsync,
{
channel: Option<TX>,
rmt_buffer: [u32; BUFFER_SIZE],
@@ -82,7 +82,7 @@ where
impl<'d, TX, const BUFFER_SIZE: usize> SmartLedsAdapter<TX, BUFFER_SIZE>
where
- TX: TxChannel,
+ TX: TxChannelAsync,
{
/// Create a new adapter object that drives the pin using the RMT channel.
pub fn new<C, O>(
@@ -93,7 +93,7 @@ where
) -> SmartLedsAdapter<TX, BUFFER_SIZE>
where
O: OutputPin + 'd,
- C: TxChannelCreator<'d, TX, O>,
+ C: TxChannelCreatorAsync<'d, TX, O>,
{
let config = TxChannelConfig {
clk_divider: 1,
@@ -160,7 +160,7 @@ where
impl<TX, const BUFFER_SIZE: usize> SmartLedsWrite for SmartLedsAdapter<TX, BUFFER_SIZE>
where
- TX: TxChannel,
+ TX: TxChannelAsync,
{
type Error = LedAdapterError;
type Color = RGB8;
@@ -168,7 +168,7 @@ where
/// Convert all RGB8 items of the iterator to the RMT format and
/// add them to internal buffer, then start a singular RMT operation
/// based on that buffer.
- fn write<T, I>(&mut self, iterator: T) -> Result<(), Self::Error>
+ async fn write<T, I>(&mut self, iterator: T) -> Result<(), Self::Error>
where
T: IntoIterator<Item = I>,
I: Into<Self::Color>,
@@ -187,14 +187,15 @@ where
*seq_iter.next().ok_or(LedAdapterError::BufferSizeExceeded)? = 0;
// Perform the actual RMT operation. We use the u32 values here right away.
- let channel = self.channel.take().unwrap();
- match channel.transmit(&self.rmt_buffer).wait() {
- Ok(chan) => {
- self.channel = Some(chan);
+ let mut channel = self.channel.take().unwrap();
+ match channel.transmit(&self.rmt_buffer).await {
+ Ok(_) => {
+ //self.channel = Some(chan);
Ok(())
}
- Err((e, chan)) => {
- self.channel = Some(chan);
+ //Err((e, chan)) => {
+ Err(e) => {
+ //self.channel = Some(chan);
Err(LedAdapterError::TransmissionError(e))
}
} Then adding |
Oh, scratch that comment about what to convert to async @Dominaezzz: I just realised that With this out of the way, it seems like I just need to carefully gate my changes through |
Digging around with this today (see comments in esp-rs/esp-hal#1779) I seem to be stuck. The async context switching is surprisingly slow. Sometimes so slow that by the first time the future is polled, the RMT has already read 32 pulses. And wake up triggered by interrupt is particularly bad. (though I am also suspicious I may be doing something wrong with the interrupts) I have this kind of working in my branch (esp-rs/esp-hal@main...tschundler:esp-hal:async-led-iterator) in very messy code with no interrupts. Maybe it can serve as inspiration? I may personally give up on async and focus on my original goal in esp-rs/esp-hal#1768 of iterator input for RMT since that part I have working well. I'm well beyond my depth at this point. I'm debugging by panicking and dumping RMT registers. I feel there has to be a better way. |
Are you running in release mode? |
Yes. Even though the cargo generate template has some optimization in debug mode, I also tried with Digging around FastLED I see a lot of mention of IRAM for timing critical stuff. Maybe that is a factor? Specifically for Embassy itself? Fwiw without async, I get working LED signals in Wokwi at 40MHz stimulated, so that gives time for 5x more code than that is running, which is why I feel the issue is the overhead of async executor. But I also haven't played with |
@tschundler Maybe async is an issue, but you'll likely also want to place the rmt driver in RAM (esp-idf has an option for this), for better perf. We already have a config option for this with SPI, once esp-rs/esp-hal#2156 is merged we could add one for RMT. |
I actually implemented this before I saw the issue. I opened a PR here: I successfully used it to drive a 46 LED lightstrip asynchronously :) |
I wasn't able to get this working well enough for Burning Man so priority of fixing it up dropped. I'll play with your PR when I get a chance. I have a variety of LEDs with different timing tolerances. Skimming it... So you didn't work around the bug that async RMT right now can't refill the buffer and only supports one-shot tx, rather you're sending a bunch of single shorts. As long as the next single shot is less than the latch time, that would work. That's simpler than my approach trying to fix RMT to allow arbitrary input lengths in async mode. (And then doing to conversion to pulse-codes on the fly) Have you checked how long it is between chunks? I wonder if it could do the pulse code conversion while RMT is sending in the SmartLEDs code. My goals are parallel output and lower memory usage to support >2k total LEDs. |
I actually measured the time compared to the blocking API. Writing all 46 LEDs with async API takes around 3-4 ms, blocking API takes ~2 ms (but is blocking, which incurred issues with the IR receiver handling) . Considering than my LED fine ticker timer is 20 ms, this was not a problem for me. I actually saw that the blocking API is doing some magic of continously filling the other buffer, and doing that with the async API would probably require changes or a custom driver, because the async API does not listen to the threshold interrupts. I did not measure the time between chunks yet, but I can try to do that. |
Each write takes around 52 microseconds. (all tests of course done with |
That's close to what I got testing your code. It work with only 2 of 6 types of LEDs that I have handy. For others, the extra 20-some microsecond delay between pixels is treated as the latch time, and you only get one pixel working. So I think it is reasonable with comments and we can work on improving it. |
Hmm, does that mean that these pixels would require a new transfer to be started immediately? The current async API does not allow this from what I can see, custom (interrupt) code would probably required to do this :/ |
The new transfer needs to be within 6us judging by https://wp.josh.com/2014/05/13/ws2812-neopixels-are-not-so-finicky-once-you-get-to-know-them/amp/
despite the datasheet saying 50us for latch time. Though it seems some clones do have a longer latch time. |
Okay, that's unfortunate. Just so I understand correctly: The amount of time required for the RMT to finish, wake the task again, context switch back to start the next transfer is longer than those 6 us? I don't see another reliable way to deal with this other than interrupts or blocking/polling just like the blocking API does.. |
Yes, it looks like it is taking 20uS (~30us to send 24 bits + ~50uS total per pixel) I found that when I faked async by doing a loop of polling I had no problem, and I forgot how, but I estimated the async context switch to take about 8us, which seemed long, but I was also suspicious that I'm doing something incorrectly. I've realized that testing my variant, I should probably flip pins for when ISR is running and when the task is awake, then I can see the timing relationship better. (And in my case, since my real goal is parallel output, maybe I2S/LCD out is what I should really be looking into) |
Mentioned in esp-rs matrix and here too, flagging it here:
The text was updated successfully, but these errors were encountered: