Awesome-Audio

A curated list of awesome audio technology resources for developers

Code

Software applications, tools, and APIs you can use to solve audio-related problems in your own awesome audio projects.

How-To Analyze Audio

  • APIs
    • Dolby.io Media Analyze API - services that analyze an audio file to identify codec, clipping, loudness, sound classification, silence, etc. Also has options for Speech and Diagnostics.
  • Apps
  • Python
    • Librosa - python package for music and audio analysis
    • PyAudio Analysis - python package for audio analysis and feature extraction

How-To Edit Audio

  • APIs
    • Dolby.io Media Enhance API - services to enhance media such as correcting audio impurities like noise, sibilance, equalization, tonality, loudness
    • Dolby.io Media Transcode API - Convert and assemble content that looks and sounds great no matter the device or where it’s viewed. With support for high resolution, high frame rates, and web and streaming formats.
    • Dolby.io Media Music Mastering API - Get professional-sounding audio masters that keep your creative intent intact with the powerful Music Mastering API from Dolby.io — the result of thousands of hours of musical analysis.
  • Apps
    • Avid Pro Tools - music software for audio recording, composing, editing, and mastering
    • iZotope - audio software for music production and post production, composing, editing, and mastering
    • FL Studio - DAW for macOS and Windows
    • Ableton Live - DAW for macOS and Windows
    • Nuendo - DAW for macOS and Windows that has support for Dolby Atmos and other forms of spatial audio
    • Logic Pro - Logic Pro is a digital audio workstation and MIDI sequencer software application for macOS
    • GarageBand - Free tool for macOS users to record and edit audio
    • Audacity - Audacity is a free and open-source digital audio editor and recording application software
    • Reaper - Proprietary cross-platform DAW
    • Bitwig Studio - Cross-platform DAW made by ex-Ableton employees
    • Ardour - Ardour is a hard disk recorder and digital audio workstation application that runs on Linux, macOS, FreeBSD and Microsoft Windows.
    • LMMS - free, open-source, cross-platform DAW

How-To Playback Audio

  • Android
    • AudioTrack - Android class that streams PCM audio buffers to audio hardware for playback
    • ExoPlayer - library for local or streaming playback of audio and video
    • MediaPlayer - class for controlling playback of a pre-existing audio or video file
    • Oboe - C++ library that wraps OpenSL ES and AAudio for high performance audio operations
  • JavaScript
  • Python
    • PyAudio - python bindings for PortAudio to interface with audio drivers to record or play back audio (Open-Source/MIT)

How-To Read and Write Audio Files

  • CLI
    • ffmpeg - A complete, cross-platform solution to record, convert and stream audio and video.
    • GStreamer - library for constructing graphs of media-handling components
    • mpv - mpv is a free (as in freedom) media player for the command line. It supports a wide variety of media file formats, audio and video codecs, and subtitle types.
    • VLC - VLC is a free and open source cross-platform multimedia player and framework that plays most multimedia files as well as DVDs, Audio CDs, VCDs, and various streaming protocols.
    • Handbrake - HandBrake is a tool for converting video from nearly any format to a selection of modern, widely supported codecs.
    • Sound eXchange - SoX is billed as the Swiss Army knife of sound processing programs
  • Python
    • PyAV - python bindings for ffmpeg to access media via containers, streams, packets, codecs, and frames

How-To Record Audio

Audio capture solutions...

  • Android
    • AudioRecord - Android class for reading buffers of raw audio data from audio hardware
    • MediaRecorder - records encoded audio or video and saves the recording to an output file
    • Oboe - C++ library that wraps OpenSL ES and AAudio for high performance audio operations compatible across API levels
  • JavaScript
    • MediaRecorder - Web API for processing a stream of media content such as audio tracks
    • react-mic - JavaScript / React library to record audio cross-platform
  • Python
    • PyAudio - python bindings for PortAudio to interface with audio drivers to record or play back audio (Open-Source/MIT)
  • Swift
    • AVFoundation - framework for audiovisual assets, control devices, audio processing, and system audio interactions
      • AVCapture - device, input, session, and output classes for a graph processing architecture allowing buffer analysis and processing (including video support)
    • AVFAudio - foundation framework to play, record, and process audio and configure system behavior
      • AVAudioRecorder - class to record audio to a file and may be simplest when getting started
      • AVAudioEngine - group of audio nodes to generate and process audio signals for input and output; does not natively support video capture but offers highly configurable processing nodes
    • Audio Toolbox - framework to record or play audio, convert formats, parse audio streams, and configure your audio session
  • Windows

How-To Send Real-Time Audio

Communications solutions...

  • APIs
    • Dolby.io Communications API - services with SDKs for adding audio and video conferencing and communications
    • Dolby.io Streaming API - Millicast, acquired by Dolby, is now part of the Dolby.io platform. Millicast is a WebRTC-based real-time streaming service that enables sub-second latency, broadcast-quality color and sound, global scale, and end-to-end encryption
  • JavaScript
    • WebRTC API - capture and stream audio / video media between browsers without requiring an intermediary
    • HLS Streaming - HLS lets you deploy content using ordinary web servers and content delivery networks. HLS is designed for reliability and dynamically adapts to network conditions by optimizing playback for the available speed of wired and wireless connections.
  • Local
    • PulseAudio - PulseAudio is a sound server system for POSIX OSes
    • JACK - JACK Audio Connection Kit is a professional sound server API and pair of daemon implementations to provide real-time, low-latency connections for both audio and MIDI data between applications
    • Loopback - Cable-free audio routing for Mac
    • Soundflower - macOS system extension that allows applications to pass audio to other applications.
    • BlackHole - BlackHole is a modern macOS virtual audio driver that allows applications to pass audio to other applications with zero additional latency.

How-To Transcribe Audio into Text

Transcription solutions...

  • APIs
  • Apps
    • descript.com - use transcripts to cut and edit video
    • Otter.ai - Generate rich notes for meetings, interviews, lectures, and other important voice conversations
  • Python
    • PyKaldi - Python scripting layer for the Kaldi speech recognition toolkit.

How-To Turn Text into Voice and Speech

Speech synthesis solutions...

How-To Visualize Audio

  • Apps
    • headliner.app - create engaging social video with audio editing, transcription, and visualization
    • getaudiogram.com - create engaging social video with audio visualizations
  • JavaScript
    • Wavesurfer - a customizable audio waveform visualization built on Web Audio API; supporting spectrograms and other features

Audio Plugin Development Tools

  • JUCE - JUCE is an open-source cross-platform C++ application framework for desktop and mobile applications, including VST, VST3, AU, AUv3, RTAS and AAX audio plug-ins.
    • react-juce - React-JUCE is a hybrid JavaScript/C++ framework that enables a React.js frontend for a JUCE application or plugin.
  • iPlug2 - iPlug 2 is a simple-to-use C++ framework for developing cross platform audio plug-ins/apps and targeting multiple plug-in APIs with the same minimalistic code.
  • AudioKit - AudioKit is an entire audio development ecosystem of code repositories, packages, libraries, algorithms, applications, playgrounds, tests, and scripts, built and used by a community of audio programmers, app developers, engineers, researchers, scientists, musicians, gamers, and people new to programming.
  • Plug'n Script - Blue Cat's Plug'n Script is an audio and MIDI scripting plug-in and application that can be programmed to build custom effects or virtual instruments, without quitting your favorite DAW software.
  • Faust - Faust (Functional Audio Stream) is a functional programming language for sound synthesis and audio processing with a strong focus on the design of synthesizers, musical instruments, audio effects, etc. Faust targets high-performance signal processing applications and audio plug-ins for a variety of platforms and standards.
  • SOUL - The SOUL project is creating a new language and infrastructure for writing and deploying audio code. It aims to unlock improvements in latency, performance, portability and ease-of-development that aren't possible with the current mainstream techniques that are being used.
  • Max - Max is an infinitely flexible space to create your own interactive software

Community

Social media, discussion groups, events, and audio experiences you can seek out to increase your appreciation for awesome audio.

Awesome Lists

Collections

Conferences and Events

  • Audio Developers Conference - ADC is an annual event celebrating audio development technologies from music applications and game audio to audio processing and embedded systems. ADC's mission is to help attendees acquire and develop new skills.
  • Demuxed - video-tech community event for technical topics related to video technology
  • KrankyGeek - annual event for WebRTC technology used for real time communications in a web browser
  • Web Audio Conference - WAC is an international conference dedicated to web audio technologies and applications. The conference addresses academic research, artistic research, development, design, evaluation and standards concerned with emerging audio-related web technologies such as Web Audio API, WebRTC, WebSockets and JavaScript.

Experiences and Places

Groups

  • Audio Engineering Society - AES is an international organization that unites audio engineers, creative artists, scientists, and students by promoting advances in audio and disseminating new knowledge and research, with many local communities
  • International Society for Music Information Retrieval - ISMIR is a non-profit seeking to advance access, organization, and understanding of music information
  • Women's Audio Mission - WAM is a non-profit built and run by women to inspire and educate on the subject of audio in music and media

Podcasts

  • Audio Programmer Podcast - all things audio programming, including DSP (digital signal processing), coding, and audio tech.
  • Dissect - long-form music analysis of albums that goes track by track, discussing music theory and artist intention
  • Game Audio Podcast - aims to provide sound designers, composers, and everyone else interested in game audio a biweekly show
  • Song Exploder - music podcast where musicians take apart their songs and tell the story of how they were made
  • Twenty Thousand Hertz - the stories behind the world's most recognizable and interesting sounds

Social Forums

  • Music and Audio Professionals - LinkedIn group for audio engineers, music arrangers, music composers, etc.
  • r/audioengineering - products, practices, and stories about the profession or hobby of recording, editing, and producing audio
  • Signal Processing StackExchange - question and answer site for practitioners of the art and science of signal, image, and video processing
  • The Audio Programmer Discord - We invite you to the Audio Programmer community, where you can connect with other audio programmers, ask questions about coding and choosing the right career path, find job opportunities and more!

Social Networks

  • Display - social platform for creators
  • Lava - social network for audio

Video Channels

  • The Audio Programmer - SOUL tutorials, JUCE tutorials, teaching audio programming for beginners, etc.

Education

Resources such as books, courses, tutorials, journals, and blogs that are worth checking out to become more awesome with audio yourself.

See something missing? View the contribute section and let us know.

Books

  • Corey, Jason. (2016). Audio Production and Critical Listening: Technical Ear Training. Focal Press.
  • Dittmar, Tim. (2017). Audio Engineering 101: A Beginner's Guide to Music Production. Routledge.
  • Watkinson, John. (2002). Introduction to Digital Audio. Focal Press.

Courses

  • Audio Signal Processing - audio signal methodologies for music. Topics include spectral processing techniques; analyzing, synthesizing, and transforming audio signals; and Python (Coursera)
  • Digital Media Foundations - Audio Made Simple. Topics include creating space with channels, measuring power of sound, capturing tone as frequency, phase. (LinkedIn Learning)
  • Communication Acoustics - a comprehensive course that starts from the basics (what sound is and how it propagates) and gradually builds up to the human auditory system, psychoacoustics (connecting the physical world to how we perceive sounds), speech acoustics (the human speech production system), and finally electroacoustics (the world of loudspeakers and microphones) (edX)
  • Fundamentals of Audio and Music Engineering - basic concepts of acoustics and electronics and how they can be applied to understand musical sound and make music with electronic instruments. Topics include: sound waves, musical sound, basic electronics, and applications of these basic principles in amplifiers and speaker design (Coursera)

Journals

  • Computer Music Journal - a peer-reviewed academic journal that covers a wide range of topics related to digital audio signal processing and electroacoustic music
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing - dedicated to innovative theory and methods for processing signals representing audio, speech and language, and their applications. This includes analysis, synthesis, enhancement, transformation, classification and interpretation of such signals as well as the design, development, and evaluation of associated signal processing systems
  • Journal of the Acoustical Society of America - a monthly peer-reviewed scientific journal covering aspects of acoustics
  • Journal of the Audio Engineering Society - peer-reviewed journal devoted to audio technology
  • SMPTE Motion Imaging Journal - the key publication of the Society, providing peer-reviewed articles on topics in 3D, imaging processing, display technologies, audio, compression, digital cinema, and much more

Tutorials and Blogs


Hardware

Resources for hardware considerations for recording and listening to awesome audio.

View the contribute section and let us know what you think would be great resources for this section.


Industry

Domains and use-case specific resources such as broadcasting, communications, gaming, music, and the web where awesome audio is applied.

Standards

  • AES Standards - 2-channel digital audio, MADI, analog XLR pin-out, networked audio, etc.
  • ATSC A/85 - Advanced Television Systems Committee (ATSC) Techniques for establishing and maintaining audio loudness for digital television
  • EBU R 128 - European Broadcasting Union (EBU) loudness normalisation and permitted maximum level of audio signals
  • ITU-R BS.1770 - International Telecommunication Union (ITU) algorithms to measure audio programme loudness and true-peak audio level
  • ITU-R BS.2159-7 - International Telecommunication Union (ITU) multi-channel speaker configurations for home and broadcast applications
  • MPEG Advanced Audio Coding - AAC wideband perceptual audio coding algorithm that provides state of the art levels of compression for audio signals
  • SMPTE Audio Standards - collection of standards related to audio

Research

Areas of experimentation and exploration for awesome algorithms.

Data

  • AudioSet - large-scale dataset of manually annotated audio events with sound ontology
  • CSTR VCTK - speech data uttered by 110 English speakers with various accents reading about 400 sentences from newspapers
  • Freesound - Freesound is a collaborative database of Creative Commons Licensed sounds.
  • LibriSpeech - speech recognition (ASR) corpus with approximately 1000 hours of read English speech derived from audiobooks of the LibriVox project
  • Mozilla Common Voice - open-source, multi-language dataset of voices to train speech-enabled applications with 68 validated hours and 18 languages
  • Netflix Open Content - test titles with documentary, live action, and animation films
  • Spoken Wikipedia Corpora - SWC is comprised of spoken articles in multiple languages
  • Voice Datasets - A comprehensive list of open source voice and music datasets.

Contribute

Contributions welcome! Read the contribution guidelines first.