Android Voice Recorder Demo App (Kotlin, Jetpack Compose, MVVM, JNI, OpenAI Whisper, and TensorFlow Lite)

This is a dictaphone app where users can record and log short voice messages. The app maintains a history of recordings, allowing users to play them later. Users can transcribe selected recordings, search through recordings by transcribed text, delete recordings, and share selected items. When a user plays a recording, subsequent recordings automatically play in descending order.

Brief Project Description

This project was created using the default Android Studio New App template (Empty Activity) and implemented as a Kotlin Jetpack Compose project.

Environment

Android Studio Iguana | 2023.2.1 Patch 1
Gradle Version: 8.4

Changes Made

Added a dependency on an extended icons library, which provides the Mic and Stop icons for the Record/Stop button.
Added the user permissions for recording audio and for holding a wake lock, which keeps the screen on while audio is playing.
Declared the microphone hardware feature as required, so the app is offered only to devices that have a microphone.

Components

Record/Stop Button: Includes a circle gauge indicating the remaining time until the end of recording (capped at 1 minute).
- RecordButton
List Item View Components:
- Regular item (ItemView): Displays only the recording time.
- Selected item (SelectedItemView): Allows users to play audio, control feedback, and delete the recording.
- ItemViews
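
The circle gauge on the Record/Stop button can be driven by a single "remaining time" fraction. The snippet below is a minimal sketch of that calculation, not code from the project: the helper name `remainingFraction` and the constant `MAX_RECORDING_MS` are hypothetical, and only the 1-minute cap comes from the description above.

```kotlin
// Hypothetical helper, not taken from the project sources.
const val MAX_RECORDING_MS = 60_000L // 1-minute cap from the component description

/** Returns the fraction of recording time left: 1.0 at the start, 0.0 once the cap is reached. */
fun remainingFraction(elapsedMs: Long, maxMs: Long = MAX_RECORDING_MS): Float {
    require(maxMs > 0) { "maxMs must be positive" }
    // Clamp so that a late timer tick or a negative value never over- or underflows the gauge.
    val clamped = elapsedMs.coerceIn(0L, maxMs)
    return 1f - clamped.toFloat() / maxMs.toFloat()
}
```

A composable could recompute this value on every timer tick and use it as the sweep of the gauge arc; when it reaches 0, the recording would be stopped.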

Architecture

MVVM pattern:
- View: HomeScreen
- Data Model: RecordingItem
- ViewModel: HomeViewModel
- Repository: HomeRepository
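
To illustrate how these four pieces could fit together, here is a rough sketch in plain Kotlin. Only the class names (`RecordingItem`, `HomeViewModel`, `HomeRepository`) come from the list above; every field, method, and the simple observer mechanism are assumptions made for this example, and the real project most likely relies on Android and Compose state holders instead.

```kotlin
// Illustrative only: class names come from the list above; all members are assumptions.
data class RecordingItem(
    val id: Long,
    val filePath: String,
    val recordedAtMs: Long,
    val transcription: String? = null,
)

/** Data source; an in-memory list stands in for whatever storage the app uses. */
class HomeRepository {
    private val items = mutableListOf<RecordingItem>()

    fun add(item: RecordingItem) { items += item }
    fun delete(id: Long) { items.removeAll { it.id == id } }

    /** Newest recordings first. */
    fun all(): List<RecordingItem> = items.sortedByDescending { it.recordedAtMs }
}

/** Holds UI state and pushes it to the View (HomeScreen) through a simple observer. */
class HomeViewModel(private val repository: HomeRepository) {
    var recordings: List<RecordingItem> = repository.all()
        private set

    private val observers = mutableListOf<(List<RecordingItem>) -> Unit>()

    /** Registers an observer and immediately delivers the current state. */
    fun observe(observer: (List<RecordingItem>) -> Unit) {
        observers += observer
        observer(recordings)
    }

    fun onRecordingFinished(item: RecordingItem) { repository.add(item); refresh() }
    fun onDeleteClicked(id: Long) { repository.delete(id); refresh() }

    /** Case-insensitive search over transcribed text; items without a transcription never match. */
    fun search(query: String): List<RecordingItem> =
        recordings.filter { it.transcription?.contains(query, ignoreCase = true) == true }

    private fun refresh() {
        recordings = repository.all()
        observers.forEach { it(recordings) }
    }
}
```

In this arrangement the View never touches the repository directly: it calls the ViewModel's event handlers and redraws whenever the observer fires.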

TODO:

Integrate Hilt for dependency injection.
Add MVI implementation for comparison.
Add Timber as a logger.
Add functionality to share audio files and transcriptions.
Add navigation and a Settings screen.
Add UI/Unit tests.
Experiment with different models so that the transcriber works for languages other than English and generates timecodes for transcribed words, which would allow words to be highlighted in sync with audio playback.

Offline Speech Recognition (Transcription) with OpenAI Whisper and TensorFlow Lite

The offline transcription in this project is based on work by vilassn. This Whisper implementation supports two modes only: transcribing English audio to English text, and translating audio in any other language to English text. The project includes the Whisper Tiny Model (39M parameters), TensorFlow Lite, and FlatBuffers.

I updated the transcribeFile function in TFLiteEngine.cpp to support audio files longer than 30 seconds, although testing has been limited to 60-second files. I also had to fix the WAV recorder, which saved every WAV file at the maximum buffer size regardless of the actual recording duration.
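
Both changes can be illustrated with a short Kotlin sketch. This is not the code from TFLiteEngine.cpp (which is C++) or from the project's recorder; the function names `splitIntoWindows` and `wavHeader` are hypothetical. It assumes 16 kHz mono input, which is what Whisper consumes in 30-second windows, and 16-bit PCM for the WAV file. One plausible way to handle longer audio is to cut it into consecutive windows and transcribe each in turn; the WAV fix amounts to writing the header from the number of bytes actually recorded.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

const val SAMPLE_RATE = 16_000               // Whisper expects 16 kHz mono input
const val WINDOW_SAMPLES = SAMPLE_RATE * 30  // one 30-second Whisper window

/** Splits samples into consecutive 30-second windows; the last window is zero-padded. */
fun splitIntoWindows(samples: FloatArray): List<FloatArray> =
    (samples.indices step WINDOW_SAMPLES).map { start ->
        val end = minOf(start + WINDOW_SAMPLES, samples.size)
        // copyOf pads with 0f up to the full window length.
        samples.copyOfRange(start, end).copyOf(WINDOW_SAMPLES)
    }

/** Builds a 44-byte PCM WAV header sized from the bytes actually recorded. */
fun wavHeader(
    dataBytes: Int,
    sampleRate: Int = SAMPLE_RATE,
    channels: Int = 1,
    bitsPerSample: Int = 16,
): ByteArray {
    val blockAlign = channels * bitsPerSample / 8
    return ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN).apply {
        put("RIFF".toByteArray(Charsets.US_ASCII))
        putInt(36 + dataBytes)              // RIFF chunk size = total file size - 8
        put("WAVE".toByteArray(Charsets.US_ASCII))
        put("fmt ".toByteArray(Charsets.US_ASCII))
        putInt(16)                          // fmt chunk size for PCM
        putShort(1.toShort())               // audio format 1 = PCM
        putShort(channels.toShort())
        putInt(sampleRate)
        putInt(sampleRate * blockAlign)     // byte rate
        putShort(blockAlign.toShort())
        putShort(bitsPerSample.toShort())
        put("data".toByteArray(Charsets.US_ASCII))
        putInt(dataBytes)                   // actual payload size, not the buffer capacity
    }.array()
}
```

With this splitting, a 60-second recording yields two windows and a 45-second recording yields two as well, the second one half-filled with silence. Passing the recorded byte count to `wavHeader` rather than the buffer capacity keeps the file's declared duration equal to the real one.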

How to generate TFLite model from Whisper

The original Whisper models are in PyTorch format and need to be converted into TensorFlow Lite format. Google Colab

Screenshots

Initial screen	A few recordings added	A transcription example

Download

Note that this APK bundles the whisper-tiny.tflite model, so it is large, around 100 MB. Also, make sure your Android device allows installation of APKs from sources other than Google Play. Story Rec Demo App

Contact

Connect and follow me on LinkedIn: Sergey N

