B0ney edited this page Jun 21, 2023 · 8 revisions

Foundation

Before the engine is created, it needs a foundation to stand on. This includes basic data structures, traits, etc (I can't think of anything else).

(B0ney) TODO for me: Refine notes, I'm currently regurgitating my personal notes.

Units

  • Sample: Usually stored as an f32 and must lie between -1.0 and 1.0. All values beyond this range must be "clamped".
  • Frame: A group of N samples, where N is the number of channels. For example, for stereo audio (2 channels), a frame would look like this [f32; 2].

Question: Do we ever need to worry about the audio engine handling more than 2 channels? We can of course separate the internal processing frames from the output frames.

  • Chunk: A list of frames.

Structs

ADSR

An envelope with four stages: Attack, Decay, Sustain, Release. It can be represented as an enum state machine:

enum ADSR {
    Attack,
    Decay,
    Sustain,
    Release,
}
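To make the state machine concrete, here's a minimal sketch of how the enum could drive a per-sample envelope. The Envelope struct, its field names, and the linear ramps are all illustrative assumptions, not a settled design:

```rust
// Sketch of how the ADSR enum could drive a per-sample envelope.
// Field names and the linear ramps are illustrative assumptions.
#[derive(Clone, Copy)]
enum ADSR {
    Attack,
    Decay,
    Sustain,
    Release,
}

struct Envelope {
    state: ADSR,
    level: f32,   // current amplitude, 0.0..=1.0
    attack: f32,  // per-sample increment while attacking
    decay: f32,   // per-sample decrement while decaying
    sustain: f32, // level held while the key stays down
    release: f32, // per-sample decrement after the key is released
}

impl Envelope {
    // Advance the state machine by one sample and return the amplitude.
    fn next_sample(&mut self, key_held: bool) -> f32 {
        match self.state {
            ADSR::Attack => {
                self.level += self.attack;
                if self.level >= 1.0 {
                    self.level = 1.0;
                    self.state = ADSR::Decay;
                }
            }
            ADSR::Decay => {
                self.level -= self.decay;
                if self.level <= self.sustain {
                    self.level = self.sustain;
                    self.state = ADSR::Sustain;
                }
            }
            ADSR::Sustain if !key_held => self.state = ADSR::Release,
            ADSR::Sustain => {}
            ADSR::Release => self.level = (self.level - self.release).max(0.0),
        }
        self.level
    }
}

fn main() {
    let mut env = Envelope {
        state: ADSR::Attack,
        level: 0.0,
        attack: 0.5,
        decay: 0.25,
        sustain: 0.5,
        release: 0.5,
    };
    assert_eq!(env.next_sample(true), 0.5);  // attacking
    assert_eq!(env.next_sample(true), 1.0);  // peak reached -> Decay
    assert_eq!(env.next_sample(true), 0.75); // decaying
    assert_eq!(env.next_sample(true), 0.5);  // sustain level reached
    assert_eq!(env.next_sample(false), 0.5); // key released -> Release
    assert_eq!(env.next_sample(false), 0.0); // faded out
}
```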

Note

A note has:

  • Position
  • Key
  • octave/pitch
  • ADSR (maybe)
  • Velocity
  • Length
  • Panning
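For illustration, the fields above could be sketched as a struct. All names and types here are placeholders, not a settled design:

```rust
// Illustrative field names and types only; open to change.
struct Note {
    position: u32, // position within the pattern, in ticks
    key: u8,       // e.g. a MIDI-style key number, combining note and octave
    velocity: f32, // 0.0..=1.0
    length: u32,   // duration in ticks
    panning: f32,  // -1.0 (left) ..= 1.0 (right)
    // adsr: Option<ADSR>, // maybe
}

fn main() {
    let note = Note {
        position: 0,
        key: 69, // A4 in MIDI numbering
        velocity: 0.8,
        length: 96,
        panning: 0.0,
    };
    assert_eq!(note.key, 69);
    assert_eq!(note.length, 96);
}
```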

Pattern

A list of notes. Could also be a HashSet.

Instrument

Sample Buffer

A sample buffer can be a Vec<Vec<f32>> (a vector of vectors of 32-bit floats), where each inner Vec represents a channel.

struct SampleBuffer {
    pub rate: u32,
    pub pcm: Vec<Vec<f32>>,
    /* need suggestions */
}

When iterating through its frames, we need to consider the number of channels the buffer has.

I have a couple of ideas:

  1. Store the frame in a Box<[f32]>. It's the easiest to work with, but uses heap allocations. 44100 heap allocations per second doesn't sound optimal, even for modern allocators.

  2. Store the frame in an enum like so:

#[derive(Clone)]
pub enum AudioFrame {
    Mono([f32; 1]),
    Stereo([f32; 2]),
    Multi(Box<[f32]>),
}

Though this is too complicated.

  3. Have the sample buffer write its frames directly to a pre-allocated buffer.

  4. Force the frames to be (f32, f32). When a sample buffer is created, it will duplicate mono audio, or mix channels if there are over 2.

  5. Force the frames to be (f32, f32), but ignore channels beyond 2, and clone the sample if it's mono. This is done when the audio data is streamed, not when the sample buffer is created.
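The last idea could be sketched as a lazy frame iterator over the Vec<Vec<f32>> layout, duplicating mono by reusing the left channel and ignoring channels beyond the first two at stream time. The frames method name is a placeholder:

```rust
struct SampleBuffer {
    rate: u32,
    pcm: Vec<Vec<f32>>, // one inner Vec per channel
}

impl SampleBuffer {
    // Yield stereo frames at stream time: duplicate mono,
    // ignore channels beyond the first two. No allocation per frame.
    fn frames(&self) -> impl Iterator<Item = (f32, f32)> + '_ {
        let left = &self.pcm[0];
        // Fall back to the left channel when the buffer is mono.
        let right = self.pcm.get(1).unwrap_or(left);
        left.iter().zip(right.iter()).map(|(&l, &r)| (l, r))
    }
}

fn main() {
    // Mono audio is duplicated into both output channels.
    let mono = SampleBuffer { rate: 44100, pcm: vec![vec![0.1, 0.2]] };
    assert_eq!(mono.frames().collect::<Vec<_>>(), vec![(0.1, 0.1), (0.2, 0.2)]);

    // The third channel is ignored when streaming.
    let multi = SampleBuffer {
        rate: 44100,
        pcm: vec![vec![0.1], vec![0.2], vec![0.9]],
    };
    assert_eq!(multi.frames().collect::<Vec<_>>(), vec![(0.1, 0.2)]);
}
```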

Sample Cache

TODO: explain how this works...

use std::collections::HashMap;
use std::sync::{Arc, Weak};
use parking_lot::RwLock; // external crate btw


#[derive(Default)]
pub struct SampleCache {
    cache: RwLock<HashMap<String, Weak<SampleBuffer>>>,
}

impl SampleCache {
    pub fn get(&self, id: &str) -> Option<Arc<SampleBuffer>> {
        self.cache.read().get(id)?.upgrade()
    }

    #[must_use = "Cache will be immediately invalidated as this is the only owning reference."]
    pub fn add<K, V>(&self, id: K, sample: V) -> Arc<SampleBuffer>
    where
        K: Into<String>,
        V: Into<Arc<SampleBuffer>>,
    {
        let sample = sample.into();

        self.cache
            .write()
            .insert(id.into(), Arc::downgrade(&sample));

        sample
    }
}

I'm not sure if this is optimal, or if it even serves our purpose.

The cache is a HashMap that stores Weak references to the sample buffers. A weak reference is used so that a sample buffer can be deallocated once there are no more strong references to it. This acts as a basic form of cache invalidation.
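A minimal self-contained sketch of that invalidation behaviour. It uses std::sync::RwLock instead of parking_lot, a stand-in SampleBuffer, and a simplified add signature so that it compiles on its own:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

// Stand-in for the real SampleBuffer.
struct SampleBuffer {
    rate: u32,
}

#[derive(Default)]
struct SampleCache {
    // std RwLock here (parking_lot in the real design) to stay self-contained.
    cache: RwLock<HashMap<String, Weak<SampleBuffer>>>,
}

impl SampleCache {
    // Returns None once the last strong reference has been dropped.
    fn get(&self, id: &str) -> Option<Arc<SampleBuffer>> {
        self.cache.read().unwrap().get(id)?.upgrade()
    }

    // Simplified signature; the cache only keeps a Weak reference.
    fn add(&self, id: &str, sample: SampleBuffer) -> Arc<SampleBuffer> {
        let sample = Arc::new(sample);
        self.cache
            .write()
            .unwrap()
            .insert(id.to_string(), Arc::downgrade(&sample));
        sample
    }
}

fn main() {
    let cache = SampleCache::default();
    let kick = cache.add("kick", SampleBuffer { rate: 44100 });
    assert_eq!(kick.rate, 44100);
    assert!(cache.get("kick").is_some()); // alive while `kick` is held
    drop(kick);
    assert!(cache.get("kick").is_none()); // last Arc dropped: entry invalidated
}
```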

The HashMap is wrapped in a RwLock to allow multiple concurrent reads and exclusive writes across threads. While it may seem that reading from the cache only provides immutable access to the buffer, Rust's Arc smart pointer does allow mutation through the Arc::make_mut method, with a few caveats:

  • If there are no other strong or weak references, the inner data is mutated in place.
  • If there are other strong references, the inner data is cloned first (clone-on-write). If only weak references remain (such as the cache's own entry), the data is moved to a new allocation and the weak references are disassociated.
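A small demonstration of the clone-on-write behaviour, using a plain Vec in place of a sample buffer:

```rust
use std::sync::Arc;

fn main() {
    let mut a = Arc::new(vec![1, 2, 3]);
    let b = Arc::clone(&a);

    // Two strong references: make_mut clones the inner Vec,
    // so `b` keeps seeing the original data.
    Arc::make_mut(&mut a).push(4);
    assert_eq!(*a, vec![1, 2, 3, 4]);
    assert_eq!(*b, vec![1, 2, 3]);

    drop(b);
    // Sole strong reference: mutates in place, no clone.
    Arc::make_mut(&mut a).push(5);
    assert_eq!(*a, vec![1, 2, 3, 4, 5]);
}
```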

MidiInput

todo

MidiOutput

todo

Traits

AudioOutputDevice

Represents a generic output device.

The audio engine needs to send its audio data somewhere; how else are we going to get sound?

Implementation idea:

// A frame is a type alias for a tuple of floats: (f32, f32)
use rmms::core::Frame; 

pub trait AudioOutputDevice {
    // sample rate of output device
    fn rate(&self) -> u32;

    fn channels(&self) -> u32;
    
    // Send processed frames
    fn write(&mut self, chunk: impl AsRef<[Frame]>);

    /* need suggestions, feel free to edit*/
}

Possible implementers:

  • Dummy: Does absolutely nothing. Can also act as a fallback.
  • Stdout: Standard output, basically what you see in the terminal. This can be piped to other applications.
  • CPAL: Cross Platform Audio Library, a rust crate to output sound.
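As a sanity check that the trait shape works, here's a sketch of the Dummy implementer. The Frame alias is defined locally to keep it self-contained (in the engine it would come from rmms::core), and the frame counter and hard-coded rate are illustrative:

```rust
// Defined locally for the sketch; the engine would use rmms::core::Frame.
type Frame = (f32, f32);

trait AudioOutputDevice {
    fn rate(&self) -> u32;
    fn channels(&self) -> u32;
    fn write(&mut self, chunk: impl AsRef<[Frame]>);
}

// Discards all audio; useful as a fallback and in tests.
struct Dummy {
    frames_written: usize,
}

impl AudioOutputDevice for Dummy {
    fn rate(&self) -> u32 {
        44100 // illustrative; a real device reports its own rate
    }

    fn channels(&self) -> u32 {
        2
    }

    fn write(&mut self, chunk: impl AsRef<[Frame]>) {
        // Throw the audio away, but keep a count for debugging/testing.
        self.frames_written += chunk.as_ref().len();
    }
}

fn main() {
    let mut out = Dummy { frames_written: 0 };
    out.write(vec![(0.0, 0.0); 512]); // a chunk of silence
    out.write([(0.1, -0.1)]);         // a single frame
    assert_eq!(out.frames_written, 513);
    assert_eq!(out.channels(), 2);
}
```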

Possible issues:

TODO: writing processed frames should not block the engine (note to self so I don't forget).

AudioInputDevice

Represents a generic input device.

Possible implementers:

  • Dummy
  • CPAL

Implementation idea:

use rmms::core::Frame; 

pub trait AudioInputDevice {
    // sample rate of input device
    fn rate(&self) -> u32;

    // number of channels
    fn channels(&self) -> u32;
    
    // Receive processed frames
    fn read(&mut self, buffer: TODO);

    /* need suggestions, feel free to edit*/
}