"Reproduce experiments" don't work #58

Open
FelipeMarra opened this issue Sep 17, 2024 · 0 comments

The error

The data loader raises a stack error when building a batch, because the tensors have different shapes.
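For illustration, the same failure mode can be reproduced outside the repository. This is only a sketch: it uses NumPy as a stand-in for the collate step (an assumption; the actual loader stacks torch tensors, where torch.stack raises a RuntimeError rather than a ValueError), and the second frame count is made up.

```python
import numpy as np

# Two mel-spectrograms with the same number of bands (96) but different
# numbers of frames, as happens when each mel covers a whole track.
# 9602 is the frame count reported for 00/13400.mp3; 7000 is illustrative.
mel_a = np.zeros((96, 9602), dtype=np.float32)
mel_b = np.zeros((96, 7000), dtype=np.float32)


def try_stack(items):
    """Return the stacked batch, or the error message if shapes differ."""
    try:
        return np.stack(items)
    except ValueError as err:
        return str(err)


result = try_stack([mel_a, mel_b])
print(result)  # an error message, since the shapes do not match
```

Stacking only succeeds once every item in the batch has the same shape, which is why the loader needs fixed-length segments.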

How to reproduce:

  1. Run the preprocessing step on the mood/theme subset (since baseline.pth outputs 56 classes): python3 scripts/baseline/get_npy.py run 'your_path_to_spectrogram_npy'
  2. Run the train command: python3 scripts/baseline/main.py --mode 'TRAIN'

Trying to solve:

The article says:

we only used a centered 29.1s audio segment

I believe this is equivalent to computing the mel with melspectrograms.py with full_audio set to False. That yields a [96, 1366] tensor, which is the shape needed to run inference with the baseline model.

Since the mels in the dataset were computed over the full duration of each audio file, the data loader might need to crop a centered [96, 1366] segment from the dataset's mels.
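A minimal sketch of such a crop, assuming the mel is a [bands, frames] array. The helper name center_crop and its n_frames parameter are hypothetical and not part of the repository; NumPy is used here, but the same slicing works on a torch tensor.

```python
import numpy as np


def center_crop(mel, n_frames=1366):
    """Return the centered [n_bands, n_frames] window of a [n_bands, T] mel."""
    total = mel.shape[1]
    if total < n_frames:
        raise ValueError(f"mel has {total} frames, need at least {n_frames}")
    start = (total - n_frames) // 2
    return mel[:, start:start + n_frames]


# Dummy mel with the dimensions reported for 00/13400.mp3.
full_mel = np.arange(96 * 9602, dtype=np.float32).reshape(96, 9602)
segment = center_crop(full_mel)
print(segment.shape)  # (96, 1366)
```

Whether this frame-level crop matches a mel computed from the centered 29.1s of audio is exactly the open question here, so the sketch only guarantees the output shape.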

To check whether taking the 29.1s audio segment is equivalent to cropping a centered [96, 1366] segment from the dataset's mels, I computed a mel from the audio myself. I obtained the same dimensions, but different values. For example, for the 00/13400.mp3 audio, both the precomputed mel and the mel computed with melspectrograms.py have dimensions [96, 9602]. But if you print both at [:,0], the precomputed one from the dataset contains the following numbers:

[-69.5358, -64.7463, -61.8604, -59.8808, -58.1119, -58.2752, -58.9025,
-60.2660, -62.0527, -64.3706, -68.4771, -72.2208, -75.7047, -79.4953,
-85.4376, -85.6893, -81.9504, -80.0834, -79.7122, -82.1272, -89.4751,
-90.0000, -90.0000, -90.0000, -90.0000, -88.8482, -86.1220, -84.0110,
-81.6328, -81.6245, -82.9754, -83.6547, -85.0630, -88.5137, -90.0000,
-87.7471, -85.0853, -82.7995, -84.5712, -88.1776, -88.0879, -86.8838,
-89.5533, -90.0000, -84.0632, -81.3411, -83.6548, -87.9001, -90.0000,
-90.0000, -88.2064, -84.8365, -85.5288, -87.3742, -88.8410, -90.0000,
-90.0000, -85.1121, -83.0755, -86.6247, -90.0000, -89.6840, -87.7929,
-84.6036, -86.9026, -90.0000, -90.0000, -87.8175, -83.3707, -84.7766,
-90.0000, -90.0000, -90.0000, -90.0000, -90.0000, -88.1323, -90.0000,
-88.8589, -90.0000, -90.0000, -90.0000, -88.7473, -90.0000, -89.0149,
-90.0000, -90.0000, -90.0000, -90.0000, -90.0000, -90.0000, -88.6646,
-90.0000, -90.0000, -90.0000, -90.0000, -90.0000]

While the one computed with melspectrograms.py contains:

[-139.0715, -129.4926, -123.7208, -119.7616, -116.2238, -116.5503,
-117.8051, -120.5321, -124.1054, -128.7413, -136.9542, -144.4415,
-151.4094, -158.9905, -170.8752, -171.3786, -163.9008, -160.1669,
-159.4244, -164.2545, -178.9503, -193.2552, -186.2103, -188.0788,
-188.0027, -177.6964, -172.2440, -168.0220, -163.2655, -163.2491,
-165.9508, -167.3093, -170.1259, -177.0275, -180.8245, -175.4943,
-170.1705, -165.5989, -169.1423, -176.3552, -176.1757, -173.7676,
-179.1065, -182.1857, -168.1263, -162.6822, -167.3096, -175.8002,
-185.6764, -189.3085, -176.4127, -169.6730, -171.0577, -174.7484,
-177.6820, -192.4283, -181.8572, -170.2243, -166.1510, -173.2494,
-181.5207, -179.3679, -175.5858, -169.2072, -173.8052, -189.5120,
-199.9228, -175.6349, -166.7414, -169.5531, -190.6465, -191.5059,
-186.6069, -193.5956, -188.5288, -176.2646, -181.7400, -177.7178,
-189.9011, -180.9200, -181.7761, -177.4945, -183.4301, -178.0298,
-189.3605, -186.7196, -189.7235, -185.6219, -188.4031, -185.2255,
-177.3292, -184.3699, -185.4904, -200.0000, -200.0000, -200.0000]
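Comparing the two listings, the recomputed values look like twice the precomputed ones, with the precomputed mel floored at -90. The sketch below checks that on a handful of values copied from the listings above. Treat the relation as a hypothesis drawn from this one column: the cause (for example a different dB scaling or clipping threshold) is an assumption and not confirmed by the repository.

```python
# First eight values and four later ones (indices 20-23) of column [:, 0],
# copied from the two listings above.
precomputed = [-69.5358, -64.7463, -61.8604, -59.8808, -58.1119, -58.2752,
               -58.9025, -60.2660, -89.4751, -90.0000, -90.0000, -90.0000]
recomputed = [-139.0715, -129.4926, -123.7208, -119.7616, -116.2238, -116.5503,
              -117.8051, -120.5321, -178.9503, -193.2552, -186.2103, -188.0788]

# Hypothesis: precomputed == max(recomputed / 2, -90).
rescaled = [max(v / 2.0, -90.0) for v in recomputed]
max_abs_diff = max(abs(a - b) for a, b in zip(rescaled, precomputed))
print(max_abs_diff < 1e-3)  # True
```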

Also, there is a bug in the validation function: the data loader returns 3 values, not 2.
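A minimal illustration of that mismatch, using a hypothetical stand-in for the loader (fake_loader and the element names are placeholders; the issue does not say what the third value is):

```python
def fake_loader():
    """Hypothetical stand-in for the data loader: yields 3-tuples."""
    # "extra" is a placeholder for the unnamed third value.
    yield ("mel_batch", "label_batch", "extra")


def unpack_two(loader):
    """Mimics a loop that expects only (x, y) from the loader."""
    try:
        for x, y in loader:
            pass
    except ValueError as err:
        return str(err)
    return "ok"


def unpack_three(loader):
    """Same loop, but accepting and ignoring the third value."""
    for x, y, _ in loader:
        pass
    return "ok"


print(unpack_two(fake_loader()))    # too many values to unpack (expected 2)
print(unpack_three(fake_loader()))  # ok
```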
