Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Unknown committed Nov 10, 2024
1 parent 0a8f95a, commit a0378a9
Showing 55 changed files with 1,723 additions and 2,281 deletions.
Binary file removed (BIN, -65.5 KB): _images/858c0c3a67cb46fbe88e58ec11eb1d4cb103c52f5262b85390cd4c9c87137962.png
Binary file not shown.
Binary file added (BIN, +64.3 KB): _images/e692e9ea55390448c55fedef11b8b25312e4ffd011734351d267200b0f3774f0.png
File renamed without changes
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
# Beyond Audio Modality |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,18 @@ | ||
# Conclusion | ||
|
||
Congratulations! You finished the book, executed every code we typed, and read every line we wrote! | ||
Congratulations! You've completed the book, working through all the code examples and content we've prepared! | ||
|
||
In the first chapter, The Basics, we defined music classification and introduced its applications. We then looked into input representations with a special focus on biological plausibility. We also looked into music classification datasets with a special focus on the secrets of how to use some popular datasets correctly. In the evaluation section, we showed the concepts of important metrics such as precision and recall as well as code demo to compute them. After finishing this chapter, we hope you’re ready to start working on your music classification model. | ||
In Chapter 2, we provided a comprehensive overview of language models, examining their key components from tokenizers to training methodologies and conditioning methods. We also investigated the challenges that arise when using language modeling as a framework and explored how these challenges are currently being addressed in NLP and multimodal domains. | ||
|
||
In the second chapter, Supervised Learning, we reviewed popular architectures - their definitions, pros, and cons. We also demonstrated data augmentation methods for music audio - the code, spectrograms, and audio signals you can play. At the end of the chapter, we showed a full example of data preparation, model training, and evaluation on Pytorch. After this chapter, you can implement a majority of music classification models that were introduced during the deep learning era. | ||
In Chapter 3, we introduced Music Description as a novel MIR task. We discussed how the abstractness and specificity of music description, combined with the flexibility of language, create unique advantages for music and language models. This chapter traced the evolution of methodologies from classification models to encoder-decoder architectures and audio LLMs, demonstrating how the field has leveraged music description in increasingly sophisticated ways. | ||
|
||
In the third chapter, Semi-Supervised Learning, we covered transfer learning and semi-supervised learning – approaches that became popular, recently, due to annotation cost. Both are strategies one can consider when there is only a small number of labeled items. These approaches can be useful in many real-world situations where you only have, for example, less than a thousand labeled items. | ||
In Chapter 4, we focused on traditional Music Retrieval approaches and how audio-text joint embedding helps overcome their limitations. We explored the advantages and disadvantages of multimodal metric learning using triplet and contrastive losses, and examined how advances in text encoders have enhanced joint embedding capabilities. The chapter concluded by analyzing the current limitations of joint embedding models and exploring the possibilities of conversational music retrieval. | ||
|
||
In the fourth chapter, Self-Supervised Learning, an even more radical approach. The goal of self-supervised learning is to learn useful representations without any labels. To achieve the goal, researchers assume some structural/internal patterns purely within input and design loss functions to predict the patterns. We covered a wide range of self-supervised learning methods introduced in music, speech, and computer vision. The lesson of this chapter liberates you from the worry of getting annotations. | ||
In Chapter 5, we reviewed two prominent text-to-music generation methods: discrete token-based language models and diffusion-based generative models operating in continuous space. We also conducted an in-depth discussion about the importance of evaluation and current challenges in evaluation methodologies. | ||
|
||
In the fifth chapter, Towards Real-world Applications, we introduce you to what people care about in industry. After finishing this chapter, you can understand the procedures and tasks researchers and engineers in industry spend time on. | ||
We're delighted that you've studied these topics with us. Have you achieved your learning goals? Were your questions answered? We hope we've succeeded in our aims: making these complex topics more accessible to newcomers, providing practical solutions for data challenges, and bridging the gap between academic research and practical applications. Please don't hesitate to reach out if you have any questions or feedback. | ||
|
||
We’re delighted that you have studied music classification with us. Did you achieve your goal while reading it? Are your questions solved now? We hope we also achieved our goals - lowering the barrier of music classification to the newcomers, providing methods to cope with data issues, and narrowing the gap between academia and industry. Please feel free to reach out to us if you have any questions or feedback. | ||
As a sweet dessert, we've prepared two exciting future directions in the following pages. Don't miss these delightful treats! | ||
|
||
Best wishes, | ||
|
||
Minz, Janne, and Keunwoo. | ||
|
||
SeungHeon, Ilaria, Zachary, JongWook, Ke |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Background | ||
|
||
```{figure} ../img/history.png | ||
```{figure} ./img/qbd.png | ||
--- | ||
name: history | ||
--- | ||
|