Android Voice Recorder Demo App (Kotlin, Jetpack Compose, MVVM, JNI, OpenAI Whisper, and TensorFlow Lite)
This is a dictaphone app where users can record and log short voice messages. The app maintains a history of recordings, allowing users to play them later. Users can transcribe selected recordings, search through recordings by transcribed text, delete recordings, and share selected items. When a user plays a recording, subsequent recordings automatically play in descending order.
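The search and auto-play behaviour described above can be sketched in plain Kotlin. This is a minimal illustration, not the app's actual code: the `Recording` class, its fields, and both function names are hypothetical, and "descending order" is assumed to mean newest first by recording time.

```kotlin
// Hypothetical model: the field names are assumptions, not the app's actual data class.
data class Recording(
    val id: Long,
    val recordedAtMillis: Long,
    val transcription: String? = null, // null until the user transcribes the recording
)

/** Case-insensitive search over transcribed text; untranscribed items never match. */
fun searchByTranscription(items: List<Recording>, query: String): List<Recording> {
    val needle = query.trim()
    if (needle.isEmpty()) return items
    return items.filter { it.transcription?.contains(needle, ignoreCase = true) == true }
}

/**
 * Playback queue that starts at the chosen recording and continues through
 * the rest of the history, newest first (the assumed meaning of "descending order").
 */
fun playbackQueue(items: List<Recording>, startId: Long): List<Recording> {
    val newestFirst = items.sortedByDescending { it.recordedAtMillis }
    val startIndex = newestFirst.indexOfFirst { it.id == startId }
    return if (startIndex < 0) emptyList() else newestFirst.drop(startIndex)
}
```

Keeping both operations as pure functions over a list makes them easy to unit test without an Android device.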
This project was created using the default Android Studio New App template (Empty Activity) and implemented as a Kotlin Jetpack Compose project.
- Android Studio Iguana | 2023.2.1 Patch 1
- Gradle Version: 8.4
- Added a dependency on an extended icons library to get the Mic and Stop icons for the Record/Stop button.
- Added user permissions for recording audio and for holding a wake lock, which keeps the screen on while audio is playing.
- Declared the microphone feature as required, so the app is offered only to devices equipped with a microphone.
- Record/Stop Button: Includes a circular gauge showing the time remaining before recording stops (recordings are capped at 1 minute).
- List Item View Components:
  - Regular item (ItemView): Displays only the recording time.
  - Selected item (SelectedItemView): Allows users to play audio, control feedback, and delete the recording.
- ItemViews
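
The Record/Stop button's gauge boils down to mapping elapsed recording time onto a fraction of the 1-minute cap. The sketch below shows one way to compute it; the constant and function names are hypothetical, and the app's real implementation may differ.

```kotlin
const val MAX_RECORDING_MILLIS = 60_000L // the 1-minute cap mentioned above

/** Fraction of recording time still available: 1.0 at the start, 0.0 at the cap. */
fun remainingFraction(elapsedMillis: Long, maxMillis: Long = MAX_RECORDING_MILLIS): Float {
    require(maxMillis > 0) { "maxMillis must be positive" }
    // Clamp so a late timer tick or a negative value cannot push the gauge out of range.
    val clamped = elapsedMillis.coerceIn(0L, maxMillis)
    return 1f - clamped.toFloat() / maxMillis.toFloat()
}

/** Sweep angle in degrees for an arc-style gauge. */
fun gaugeSweepDegrees(elapsedMillis: Long): Float = 360f * remainingFraction(elapsedMillis)
```

In Compose, a value like this could be passed as the `sweepAngle` of `drawArc` inside a `Canvas`, or as the progress of a `CircularProgressIndicator`.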
- MVVM pattern:
  - View: HomeScreen
  - Data Model: RecordingItem
  - ViewModel: HomeViewModel
  - Repository: HomeRepository
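
To illustrate how these four pieces might relate, here is a minimal plain-Kotlin sketch. Only the type names `RecordingItem`, `HomeViewModel`, and `HomeRepository` come from the list above; every member, `HomeUiState`, and `InMemoryHomeRepository` are assumptions made for the example. A real Android ViewModel would typically extend `androidx.lifecycle.ViewModel` and expose observable state, which is left out here so the code stays self-contained.

```kotlin
// All members below are hypothetical; only the layer names come from the project.
data class RecordingItem(
    val id: Long,
    val recordedAtMillis: Long,
    val transcription: String? = null,
)

/** Repository: the single source of truth for recordings. */
interface HomeRepository {
    fun loadRecordings(): List<RecordingItem>
    fun delete(id: Long)
}

/** In-memory stand-in so the sketch runs without Android storage. */
class InMemoryHomeRepository(initial: List<RecordingItem>) : HomeRepository {
    private val items = initial.toMutableList()

    override fun loadRecordings(): List<RecordingItem> =
        items.sortedByDescending { it.recordedAtMillis }

    override fun delete(id: Long) {
        items.removeAll { it.id == id }
    }
}

/** Immutable UI state that the View (HomeScreen) would render. */
data class HomeUiState(
    val recordings: List<RecordingItem> = emptyList(),
    val selectedId: Long? = null,
)

/** ViewModel: holds state and turns UI events into repository calls. */
class HomeViewModel(private val repository: HomeRepository) {
    var uiState: HomeUiState = HomeUiState(recordings = repository.loadRecordings())
        private set

    fun onItemSelected(id: Long) {
        uiState = uiState.copy(selectedId = id)
    }

    fun onDeleteSelected() {
        val id = uiState.selectedId ?: return
        repository.delete(id)
        // Reload from the repository and clear the selection.
        uiState = HomeUiState(recordings = repository.loadRecordings())
    }
}
```

The View only reads `uiState` and calls the `on...` methods, so the ViewModel can be exercised in a unit test by swapping in the in-memory repository.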
Planned improvements (TODO):
- Integrate Hilt for dependency injection.
- Add MVI implementation for comparison.
- Add Timber as a logger.
- Add functionality to share audio files and transcriptions.
- Add navigation and a Settings screen.
- Add UI/Unit tests.
- Experiment with different models so that the transcriber works for languages other than English, and generate timecodes for transcribed words so they can be highlighted in sync with audio playback.
The offline transcription in this project is based on the work of vilassn. This Whisper implementation only transcribes English audio to English text, or translates audio in any language into English text. The project includes the Whisper Tiny model (39M parameters), TensorFlow Lite, and FlatBuffers.
I updated the transcribeFile function in TFLiteEngine.cpp to support audio files longer than 30 seconds, although testing has been limited to 60-second files. I also had to fix the WAV recorder, which saved every WAV file at the maximum buffer size regardless of the actual recording duration.
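
Both changes can be illustrated with a short Kotlin sketch. It is not the code from TFLiteEngine.cpp (which is C++) or from the app's recorder. It only shows the two ideas: splitting long audio into the 30-second windows that Whisper processes, and writing a WAV header whose size fields reflect the bytes actually recorded. The function names, the zero-padding of the last window, and the 16 kHz mono 16-bit defaults are assumptions.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

const val SAMPLE_RATE = 16_000              // Whisper expects 16 kHz audio
const val WINDOW_SAMPLES = SAMPLE_RATE * 30 // one 30-second window

/** Splits samples into consecutive windows, zero-padding the last one to full length. */
fun splitIntoWindows(samples: FloatArray, windowSize: Int = WINDOW_SAMPLES): List<FloatArray> {
    require(windowSize > 0) { "windowSize must be positive" }
    val windows = mutableListOf<FloatArray>()
    var start = 0
    while (start < samples.size) {
        val end = minOf(start + windowSize, samples.size)
        // copyOf pads with 0f when the requested size exceeds the slice length.
        windows.add(samples.copyOfRange(start, end).copyOf(windowSize))
        start = end
    }
    return windows
}

/** Builds a 44-byte PCM WAV header sized for the bytes actually recorded. */
fun wavHeader(
    pcmDataBytes: Int,
    sampleRate: Int = SAMPLE_RATE,
    channels: Int = 1,
    bitsPerSample: Int = 16,
): ByteArray {
    val blockAlign = channels * bitsPerSample / 8
    val byteRate = sampleRate * blockAlign
    return ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN).apply {
        put("RIFF".toByteArray(Charsets.US_ASCII))
        putInt(36 + pcmDataBytes)       // RIFF chunk size: rest of header plus data
        put("WAVE".toByteArray(Charsets.US_ASCII))
        put("fmt ".toByteArray(Charsets.US_ASCII))
        putInt(16)                      // fmt chunk size for PCM
        putShort(1.toShort())           // audio format 1 = PCM
        putShort(channels.toShort())
        putInt(sampleRate)
        putInt(byteRate)
        putShort(blockAlign.toShort())
        putShort(bitsPerSample.toShort())
        put("data".toByteArray(Charsets.US_ASCII))
        putInt(pcmDataBytes)            // actual recorded bytes, not the buffer capacity
    }.array()
}
```

With this approach a 45-second recording yields two windows, and the header for a one-second clip reports 32,000 data bytes rather than the size of the full recording buffer.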
The original Whisper models are in PyTorch format and need to be converted to TensorFlow Lite format (Google Colab).
Screenshots: Initial screen | A few recordings added | A transcription example
APK: Story Rec Demo App. Note that this APK contains the whisper-tiny.tflite model, so it is large (around 100 MB). Also, make sure your Android device allows installation of APKs from sources other than Google Play.
Connect and follow me on LinkedIn: Sergey N