I want to change the window width and step width of the frame in openSMILE #42

Open
Zinc0816 opened this issue Sep 2, 2021 · 10 comments


@Zinc0816

Zinc0816 commented Sep 2, 2021

Hello.

As the title says, I want to change the frame window size and frame step size in openSMILE
(for example, frameSize: 0.025 → 0.050 and frameStep: 0.010 → 0.020).

How can I do this?
(I'm sorry, I am not a native speaker of English, so my writing may be unnatural.)

@chausner-audeering

You can make a copy of the config you would like to adapt, apply the changes in frameSize and frameStep there, and then run it via the openSMILE Python library as described at https://audeering.github.io/opensmile-python/usage.html#custom-config.
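
As a rough illustration of "apply the changes in frameSize and frameStep", the sketch below rewrites both values inside one named section of a config text. The helper name `set_frame_params` and the sample text are made up for this example; it assumes the settings appear as `frameSize = ...` and `frameStep = ...` lines under a `[name:cFramer]` header, as in the ComParE configs.

```python
import re


def set_frame_params(config_text, section, frame_size, frame_step):
    """Return config_text with frameSize/frameStep replaced in one section only."""
    out, in_section = [], False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Section headers look like [instanceName:cComponentType].
            in_section = stripped[1:-1].split(":")[0] == section
        elif in_section:
            line = re.sub(r"^(\s*frameSize\s*=\s*)\S+",
                          rf"\g<1>{frame_size:.3f}", line)
            line = re.sub(r"^(\s*frameStep\s*=\s*)\S+",
                          rf"\g<1>{frame_step:.3f}", line)
        out.append(line)
    return "\n".join(out)


# Hypothetical excerpt with two framers; only the first one is edited.
sample = """[is13_frame25:cFramer]
reader.dmLevel=wave
frameSize = 0.020
frameStep = 0.010

[is13_frame60:cFramer]
frameSize = 0.060
frameStep = 0.010
"""

edited = set_frame_params(sample, "is13_frame25", 0.050, 0.020)
print(edited)
```

Editing the copy by hand in a text editor works just as well; a helper like this is mainly useful when several window sizes need to be generated.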

@Zinc0816
Author

Zinc0816 commented Sep 3, 2021

Thank you for your answer!
In addition, how do I change a component that is defined in an .inc file rather than in the .conf file?
Specifically, I want to change the frame settings for LLD extraction in ComParE_2016. How do I do that?

@chausner-audeering

chausner-audeering commented Sep 3, 2021

There is no difference in the format of .conf and .inc files. The only difference is that .inc files get included from other config files.

To change frame parameters in ComParE_2016, make a copy of all configs in https://github.com/audeering/opensmile-python/tree/master/opensmile/core/config/compare and then adapt the settings, e.g. https://github.com/audeering/opensmile-python/blob/master/opensmile/core/config/compare/ComParE_2016_core.lld.conf.inc#L51. Then specify the copy of ComParE_2016.conf in the call to opensmile.Smile, as demonstrated at https://audeering.github.io/opensmile-python/usage.html#custom-config.
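
A minimal sketch of the copy step, assuming the configs are available locally in the same layout as the repository (a `config` directory containing `compare` and `shared`). The function name `copy_config_folders` is hypothetical. Copying `shared` alongside `compare` is an assumption made here because the ComParE LLD config refers to files such as `../shared/BufferModeRb.conf.inc`.

```python
import shutil
import tempfile
from pathlib import Path


def copy_config_folders(src_root, dst_root, folders=("compare", "shared")):
    """Copy the given config folders, keeping their relative layout."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for name in folders:
        src = src_root / name
        if not src.is_dir():
            continue  # skip folders that do not exist in this source tree
        shutil.copytree(src, dst_root / name, dirs_exist_ok=True)
        copied.append(name)
    return copied


# Demo on a throwaway tree with placeholder files that mimics the repo layout.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "config"
    (src / "compare").mkdir(parents=True)
    (src / "shared").mkdir()
    (src / "compare" / "ComParE_2016.conf").write_text("; placeholder\n")
    (src / "shared" / "BufferModeRb.conf.inc").write_text("; placeholder\n")
    dst = Path(tmp) / "my_config"
    copied = copy_config_folders(src, dst)
    copied_files = sorted(
        p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file()
    )

print(copied, copied_files)
```

For an installed package, the source directory is presumably `Path(opensmile.__file__).parent / "core" / "config"`, mirroring the repository path in the links above; that location is an assumption and should be checked against the actual installation.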

@Zinc0816
Author

Zinc0816 commented Sep 3, 2021

Thank you for your answer again!

I first tried what you said without modifying anything, but I got the following error. What could be the cause?
"opensmile.core.SMILEapi.OpenSmileException: Code: 6"
I looked it up and it seems to be related to threading...

I've attached the contents of the config file I created below. It is the contents of two files from GitHub, ComParE_2016.conf and ComParE_2016_core.lld.conf.inc, merged into one file, so it may differ from the style described at https://audeering.github.io/opensmile-python/usage.html#custom-config. I'm trying to find the difference myself, but what's wrong?

/////////////////////////////////////////////////////////////////////////////////////////////////////////////
[componentInstances:cComponentManager]
instance[dataMemory].type=cDataMemory

;;; default source
{\cm[source{?}:source include config]}

[componentInstances:cComponentManager]
instance[is13_frame60].type=cFramer
instance[is13_win60].type=cWindower
instance[is13_fft60].type=cTransformFFT
instance[is13_fftmp60].type=cFFTmagphase

[is13_frame60:cFramer]
reader.dmLevel=wave
writer.dmLevel=is13_frame60
{\cm[bufferModeRbConf{../shared/BufferModeRb.conf.inc}:path to included config to set the buffer mode for the standard ringbuffer levels]}
frameSize = 0.060
frameStep = 0.010
frameCenterSpecial = left

[is13_win60:cWindower]
reader.dmLevel=is13_frame60
writer.dmLevel=is13_winG60
winFunc=gauss
gain=1.0
sigma=0.4

[is13_fft60:cTransformFFT]
reader.dmLevel=is13_winG60
writer.dmLevel=is13_fftcG60
zeroPadSymmetric = 1

[is13_fftmp60:cFFTmagphase]
reader.dmLevel=is13_fftcG60
writer.dmLevel=is13_fftmagG60

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

[componentInstances:cComponentManager]
instance[is13_frame25].type=cFramer
instance[is13_win25].type=cWindower
instance[is13_fft25].type=cTransformFFT
instance[is13_fftmp25].type=cFFTmagphase

[is13_frame25:cFramer]
reader.dmLevel=wave
writer.dmLevel=is13_frame25
{\cm[bufferModeRbConf]}
frameSize = 0.020
frameStep = 0.010
frameCenterSpecial = left

[is13_win25:cWindower]
reader.dmLevel=is13_frame25
writer.dmLevel=is13_winH25
winFunc=hamming

[is13_fft25:cTransformFFT]
reader.dmLevel=is13_winH25
writer.dmLevel=is13_fftcH25
zeroPadSymmetric = 1

[is13_fftmp25:cFFTmagphase]
reader.dmLevel=is13_fftcH25
writer.dmLevel=is13_fftmagH25

;;;;;;;;;;;;;;;;;;;; HPS pitch

[componentInstances:cComponentManager]
instance[is13_scale].type=cSpecScale
instance[is13_shs].type=cPitchShs

[is13_scale:cSpecScale]
reader.dmLevel=is13_fftmagG60
writer.dmLevel=is13_hpsG60
copyInputName = 1
processArrayFields = 0
scale=octave
sourceScale = lin
interpMethod = spline
minF = 25
maxF = -1
nPointsTarget = 0
specSmooth = 1
specEnhance = 1
auditoryWeighting = 1

[is13_shs:cPitchShs]
reader.dmLevel=is13_hpsG60
writer.dmLevel=is13_pitchShsG60
{\cm[bufferModeRbLagConf{../shared/BufferModeRbLag.conf.inc}:path to included config to set the buffer mode for levels which will be joint with Viterbi smoothed -lagged- F0]}
copyInputName = 1
processArrayFields = 0
maxPitch = 620
minPitch = 52
nCandidates = 6
scores = 1
voicing = 1
F0C1 = 0
voicingC1 = 0
F0raw = 1
voicingClip = 1
voicingCutoff = 0.700000
inputFieldSearch = Mag_octScale
octaveCorrection = 0
nHarmonics = 15
compressionFactor = 0.850000
greedyPeakAlgo = 1

;;;;; Pitch with Viterbi smoother
[componentInstances:cComponentManager]
instance[is13_energy60].type=cEnergy

[is13_energy60:cEnergy]
reader.dmLevel=is13_winG60
writer.dmLevel=is13_e60
; This must be > than buffersize of viterbi smoother
{\cm[bufferModeRbLagConf]}
rms=1
log=0

[componentInstances:cComponentManager]
instance[is13_pitchSmoothViterbi].type=cPitchSmootherViterbi

[is13_pitchSmoothViterbi:cPitchSmootherViterbi]
reader.dmLevel=is13_pitchShsG60
reader2.dmLevel=is13_pitchShsG60
writer.dmLevel=is13_pitchG60_viterbi
{\cm[bufferModeRbLagConf]}
copyInputName = 1
bufferLength=30
F0final = 1
F0finalEnv = 0
voicingFinalClipped = 0
voicingFinalUnclipped = 1
F0raw = 0
voicingC1 = 0
voicingClip = 0
wTvv =10.0
wTvvd= 5.0
wTvuv=10.0
wThr = 4.0
wTuu = 0.0
wLocal=2.0
wRange=1.0

[componentInstances:cComponentManager]
instance[is13_volmerge].type = cValbasedSelector

[is13_volmerge:cValbasedSelector]
reader.dmLevel = is13_e60;is13_pitchG60_viterbi
writer.dmLevel = is13_pitchG60
{\cm[bufferModeRbLagConf]}
idx=0
threshold=0.001
removeIdx=1
zeroVec=1
outputVal=0.0

;;;;;;;;;;;;;;;;;;; Voice Quality (VQ)

[componentInstances:cComponentManager]
instance[is13_pitchJitter].type=cPitchJitter

[is13_pitchJitter:cPitchJitter]
reader.dmLevel = wave
writer.dmLevel = is13_jitterShimmer
{\cm[bufferModeRbLagConf]}
copyInputName = 1
F0reader.dmLevel = is13_pitchG60
F0field = F0final
searchRangeRel = 0.250000
jitterLocal = 1
jitterDDP = 1
jitterLocalEnv = 0
jitterDDPEnv = 0
shimmerLocal = 1
shimmerLocalEnv = 0
onlyVoiced = 0
logHNR = 1
inputMaxDelaySec = 2.0
;periodLengths = 0
;periodStarts = 0
useBrokenJitterThresh = 0

;;;;;;;;;;;;;;;;;;;;; Energy / loudness

[componentInstances:cComponentManager]
instance[is13_energy].type=cEnergy
instance[is13_melspec1].type=cMelspec
instance[is13_audspec].type=cPlp
instance[is13_audspecRasta].type=cPlp
instance[is13_audspecSum].type=cVectorOperation
instance[is13_audspecRastaSum].type=cVectorOperation

[is13_energy:cEnergy]
reader.dmLevel = is13_frame25
writer.dmLevel = is13_energy
log=0
rms=1

[is13_melspec1:cMelspec]
reader.dmLevel=is13_fftmagH25
writer.dmLevel=is13_melspec1
; htk compatible sample value scaling
htkcompatible = 0
nBands = 26
; use power spectrum instead of magnitude spectrum
usePower = 1
lofreq = 20
hifreq = 8000
specScale = mel
showFbank = 0

; perform auditory weighting of spectrum
[is13_audspec:cPlp]
reader.dmLevel=is13_melspec1
writer.dmLevel=is13_audspec
firstCC = 0
lpOrder = 5
cepLifter = 22
compression = 0.33
htkcompatible = 0
doIDFT = 0
doLpToCeps = 0
doLP = 0
doInvLog = 0
doAud = 1
doLog = 0
newRASTA=0
RASTA=0

; perform RASTA style filtering of auditory spectra
[is13_audspecRasta:cPlp]
reader.dmLevel=is13_melspec1
writer.dmLevel=is13_audspecRasta
nameAppend = Rfilt
firstCC = 0
lpOrder = 5
cepLifter = 22
compression = 0.33
htkcompatible = 0
doIDFT = 0
doLpToCeps = 0
doLP = 0
doInvLog = 0
doAud = 1
doLog = 0
newRASTA=1
RASTA=0

[is13_audspecSum:cVectorOperation]
reader.dmLevel = is13_audspec
writer.dmLevel = is13_audspecSum
// nameAppend =
copyInputName = 1
processArrayFields = 0
operation = ll1
nameBase = audspec

[is13_audspecRastaSum:cVectorOperation]
reader.dmLevel = is13_audspecRasta
writer.dmLevel = is13_audspecRastaSum
// nameAppend =
copyInputName = 1
processArrayFields = 0
operation = ll1
nameBase = audspecRasta

;;;;;;;;;;;;;;; spectral

[componentInstances:cComponentManager]
instance[is13_spectral].type=cSpectral

[is13_spectral:cSpectral]
reader.dmLevel=is13_fftmagH25
writer.dmLevel=is13_spectral
bands[0]=250-650
bands[1]=1000-4000
rollOff[0] = 0.25
rollOff[1] = 0.50
rollOff[2] = 0.75
rollOff[3] = 0.90
flux=1
centroid=1
maxPos=0
minPos=0
entropy=1
variance=1
skewness=1
kurtosis=1
slope=1
harmonicity=1
sharpness=1

;;;;;;;;;;;;;;; mfcc

[componentInstances:cComponentManager]
instance[is13_melspecMfcc].type=cMelspec
instance[is13_mfcc].type=cMfcc

[is13_melspecMfcc:cMelspec]
reader.dmLevel=is13_fftmagH25
writer.dmLevel=is13_melspecMfcc
copyInputName = 1
processArrayFields = 1
; htk compatible sample value scaling
htkcompatible = 1
nBands = 26
; use power spectrum instead of magnitude spectrum
usePower = 1
lofreq = 20
hifreq = 8000
specScale = mel
inverse = 0

[is13_mfcc:cMfcc]
reader.dmLevel=is13_melspecMfcc
writer.dmLevel=is13_mfcc1_12
copyInputName = 0
processArrayFields = 1
firstMfcc = 1
lastMfcc = 14
cepLifter = 22.0
htkcompatible = 1

;;;;;;;;;;;;;;;; zcr

[componentInstances:cComponentManager]
instance[is13_mzcr].type=cMZcr

[is13_mzcr:cMZcr]
reader.dmLevel = is13_frame60
writer.dmLevel = is13_zcr
copyInputName = 1
processArrayFields = 1
zcr = 1
mcr = 0
amax = 0
maxmin = 0
dc = 0

;;;;;;;;;;;;;;;;;;;; smoothing

[componentInstances:cComponentManager]
instance[is13_smoNz].type=cContourSmoother
instance[is13_smoA].type=cContourSmoother
instance[is13_smoB].type=cContourSmoother
instance[is13_f0sel].type=cDataSelector

[is13_smoNz:cContourSmoother]
reader.dmLevel = is13_pitchG60;is13_jitterShimmer
writer.dmLevel = is13_lld_nzsmo
{\cm[bufferModeConf{../shared/BufferMode.conf.inc}:path to included config to set the buffer mode for the levels before the functionals]}
nameAppend = sma
copyInputName = 1
noPostEOIprocessing = 0
smaWin = 3
noZeroSma = 1

[is13_f0sel:cDataSelector]
reader.dmLevel = is13_lld_nzsmo
writer.dmLevel = is13_lld_f0_nzsmo
{\cm[bufferModeConf]}
nameAppend = ff0
selected = F0final_sma

[is13_smoA:cContourSmoother]
reader.dmLevel = is13_audspecSum;is13_audspecRastaSum;is13_energy;is13_zcr
writer.dmLevel = is13_lldA_smo
{\cm[bufferModeConf]}
nameAppend = sma
copyInputName = 1
noPostEOIprocessing = 0
smaWin = 3

[is13_smoB:cContourSmoother]
reader.dmLevel = is13_audspecRasta;is13_spectral;is13_mfcc1_12
writer.dmLevel = is13_lldB_smo
{\cm[bufferModeConf]}
nameAppend = sma
copyInputName = 1
noPostEOIprocessing = 0
smaWin = 3

;;;;;;;;; deltas
[componentInstances:cComponentManager]
instance[is13_deNz].type=cDeltaRegression
instance[is13_deA].type=cDeltaRegression
instance[is13_deB].type=cDeltaRegression
instance[is13_def0sel].type=cDeltaRegression

[is13_deNz:cDeltaRegression]
reader.dmLevel = is13_lld_nzsmo
writer.dmLevel = is13_lld_nzsmo_de
{\cm[bufferModeConf]}
onlyInSegments = 1
zeroSegBound = 1

[is13_deA:cDeltaRegression]
reader.dmLevel = is13_lldA_smo
writer.dmLevel = is13_lldA_smo_de
{\cm[bufferModeConf]}

[is13_deB:cDeltaRegression]
reader.dmLevel = is13_lldB_smo
writer.dmLevel = is13_lldB_smo_de
{\cm[bufferModeConf]}

[is13_def0sel:cDeltaRegression]
reader.dmLevel = is13_lld_f0_nzsmo
writer.dmLevel = is13_lld_f0_nzsmo_de
{\cm[bufferModeConf]}
onlyInSegments = 1
zeroSegBound = 1

;ComParE_2016.conf

[componentInstances:cComponentManager]
instance[is13_lldconcat].type=cVectorConcat
instance[is13_llddeconcat].type=cVectorConcat
instance[is13_funcconcat].type=cVectorConcat

[is13_lldconcat:cVectorConcat]
reader.dmLevel = is13_lld_nzsmo;is13_lldA_smo;is13_lldB_smo
writer.dmLevel = lld
includeSingleElementFields = 1

[is13_llddeconcat:cVectorConcat]
reader.dmLevel = is13_lld_nzsmo_de;is13_lldA_smo_de;is13_lldB_smo_de
writer.dmLevel = lld_de
includeSingleElementFields = 1

[is13_funcconcat:cVectorConcat]
reader.dmLevel = is13_functionalsA;is13_functionalsB;is13_functionalsNz;is13_functionalsF0;is13_functionalsLLD;is13_functionalsDelta
writer.dmLevel = func
includeSingleElementFields = 1

;;; default sink
{\cm[sink{?}:include external sink]}

@chausner-audeering

ComParE_2016_core.func.conf.inc is also needed because it gets included in ComParE_2016.conf. All three files must be in the same folder.

@Zinc0816
Author

Zinc0816 commented Sep 8, 2021

I tried what you said, but it still doesn't work. I don't know anyone around me who is familiar with openSMILE, so please forgive me if I keep asking questions here.

These are the steps I took:
(1) Copy the three files mentioned above.
(2) Paste them into the directory of the Python file to be executed and modify the necessary parts.
(3) Run the Python file.
Or:
(1) Create the three files with my modifications, following the method described at https://audeering.github.io/opensmile-python/usage.html#custom-config.
(2) Make sure the files are in the same directory as the Python file to be executed.
(3) Run it.
Is this the correct procedure?

I get the following error in both cases:
'opensmile.core.SMILEapi.OpenSmileException: Code: 1'
What's wrong?

Also, is there something wrong with the Python file I am running?
I have included the basic code below and would like to know if there are any mistakes.

I apologize for asking again and again. Please help me.

//////////////////////////////////////////////////////////////////////////////////////////////
import opensmile

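# Load the local copy of the config; the files it includes must be reachable from it.
smile = opensmile.Smile(
    feature_set='ComParE_2016.conf',
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
)
# Extract the low-level descriptors for one audio file.
x = smile.process_file("wavefile.wav")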

@chausner-audeering

You may want to enable logging to see the exact error: https://audeering.github.io/opensmile-python/usage.html#logging
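
A hedged sketch of what enabling logging could look like for a custom config. It relies on the `loglevel` and `logfile` arguments of `opensmile.Smile` described on the linked page. The helper names, the log file name, and the `(MSG)` sample line are made up for illustration. The feature level `lld` is the level name written by `is13_lldconcat` in the config pasted above. Whether the log is fully written by the time the exception is raised is an assumption.

```python
import tempfile
from pathlib import Path


def smile_kwargs(config_path, logfile="smile.log"):
    """Keyword arguments for opensmile.Smile with logging switched on."""
    return {
        "feature_set": str(config_path),  # path to the copied ComParE_2016.conf
        "feature_level": "lld",           # level name used in the custom config
        "loglevel": 2,
        "logfile": str(logfile),
    }


def error_lines(logfile):
    """Return the lines of an openSMILE log that are marked as errors."""
    path = Path(logfile)
    if not path.is_file():
        return []
    text = path.read_text(errors="replace")
    return [line for line in text.splitlines() if "(ERR)" in line]


def extract(wav_path, config_path, logfile="smile.log"):
    """Run the extraction and print logged errors if it fails."""
    import opensmile  # third-party; imported lazily so the helpers work without it

    smile = opensmile.Smile(**smile_kwargs(config_path, logfile))
    try:
        return smile.process_file(wav_path)
    except Exception:
        for line in error_lines(logfile):
            print(line)
        raise


# Demo of the log filter on a hand-written log (first line is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "smile.log"
    log.write_text(
        "(MSG) [2] hypothetical informational line\n"
        "(ERR) [1] configManager: cFileConfigReader::openInput : "
        "cannot find input file '../shared/BufferModeRbLag.conf.inc'!\n"
    )
    errors = error_lines(log)
    kwargs = smile_kwargs("my_config/compare/ComParE_2016.conf", log)

print(errors)
```

The bare "Code: 1" exception carries little information on its own; the logged `(ERR)` line is what names the actual cause.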

@zilunpeng

zilunpeng commented Jul 29, 2022

Here is the error I got: (ERR) [1] configManager: cFileConfigReader::openInput : cannot find input file '../shared/BufferModeRbLag.conf.inc'!

'BufferModeRbLag.conf.inc' is at https://github.com/audeering/opensmile-python/blob/e3a0f0a4f768b201f58660427cd36a4c762c6867/opensmile/core/config/shared/BufferModeRbLag.conf.inc
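
The error above points at an include that cannot be found from the copied config. One way to catch this before running openSMILE is to scan the copied files for referenced include paths and report the ones that do not exist. The sketch below rests on two assumptions: an include shows up as a path in curly braces ending in `.conf` or `.inc`, and relative paths are resolved against the folder of the including file. The function name `missing_includes` is hypothetical, and only default paths are checked, not values overridden at run time.

```python
import re
import tempfile
from pathlib import Path

# A path in curly braces ending in .conf or .inc, e.g.
# {../shared/BufferModeRbLag.conf.inc}
INCLUDE_RE = re.compile(r"\{([^{}\s]+\.(?:conf|inc))\}")


def missing_includes(config_path, _seen=None):
    """Recursively list (including file, include path) pairs that do not exist."""
    config_path = Path(config_path).resolve()
    seen = _seen if _seen is not None else set()
    if config_path in seen:
        return []
    seen.add(config_path)
    missing = []
    text = config_path.read_text(errors="replace")
    for rel in INCLUDE_RE.findall(text):
        target = (config_path.parent / rel).resolve()
        if target.is_file():
            missing.extend(missing_includes(target, seen))
        elif (config_path.name, rel) not in missing:
            missing.append((config_path.name, rel))
    return missing


# Demo: a copied "compare" folder without the "shared" folder next to it.
with tempfile.TemporaryDirectory() as tmp:
    compare = Path(tmp) / "compare"
    compare.mkdir()
    (compare / "ComParE_2016.conf").write_text(
        "\\{ComParE_2016_core.lld.conf.inc}\n"
    )
    (compare / "ComParE_2016_core.lld.conf.inc").write_text(
        "\\{\\cm[bufferModeRbLagConf{../shared/BufferModeRbLag.conf.inc}:desc]}\n"
    )
    before = missing_includes(compare / "ComParE_2016.conf")

    # Adding the shared folder resolves the missing include.
    shared = Path(tmp) / "shared"
    shared.mkdir()
    (shared / "BufferModeRbLag.conf.inc").write_text("; placeholder\n")
    after = missing_includes(compare / "ComParE_2016.conf")

print(before, after)
```

In this layout the fix is to copy the `shared` folder next to the copied `compare` folder so that `../shared/...` resolves.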

@zehuiwu

zehuiwu commented Feb 7, 2023

I got the exception "OpenSmileException: Code: 1" after I changed the frameSize directly inside the config file in the package. I found that there are two frameSize and frameStep parameters in the lld config, and I need to make sure they are consistent. Problem solved!
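
The comment does not say exactly what "consistent" means. A plausible reading, and only an assumption here, is that both framers should share the same frameStep, since their outputs are later combined frame by frame (for example in `is13_lldconcat`). The sketch below lists every `cFramer` section in a config text and checks that one condition; the function names are hypothetical.

```python
import re


def framer_settings(config_text):
    """Map each cFramer instance to its frameSize/frameStep in seconds."""
    settings, current = {}, None
    for raw in config_text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"\[(\w+):(\w+)\]", line)
        if header:
            # Only track sections whose component type is cFramer.
            current = header.group(1) if header.group(2) == "cFramer" else None
            if current:
                settings[current] = {}
            continue
        match = re.fullmatch(r"(frameSize|frameStep)\s*=\s*([0-9.]+)", line)
        if current and match:
            settings[current][match.group(1)] = float(match.group(2))
    return settings


def steps_consistent(settings):
    """True if all framers use the same frameStep (assumed requirement)."""
    steps = {values.get("frameStep") for values in settings.values()}
    return len(steps) <= 1


# Values taken from the config pasted earlier in this thread.
original = """[is13_frame60:cFramer]
frameSize = 0.060
frameStep = 0.010

[is13_frame25:cFramer]
frameSize = 0.020
frameStep = 0.010
"""

# Changing the step of only one framer, as a naive edit might do.
edited_one = original.replace("frameStep = 0.010", "frameStep = 0.020", 1)

print(framer_settings(original), steps_consistent(framer_settings(original)))
print(steps_consistent(framer_settings(edited_one)))
```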

@rob-son01

> I got the exception "OpenSmileException: Code: 1" after I changed the frameSize directly inside the config file in the package. I found that there are two frameSize and frameStep parameters in the lld config, and I need to make sure they are consistent. Problem solved!

Sorry, everyone. I used openSMILE (the ComParE LLD feature set) to extract data from an audio dataset to use as a training set for machine learning. I need to test the performance with training sets extracted using different window sizes. How can I tell which features use the 0.060 window and which use the 0.020 window? Thanks in advance for any reply.
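
One way to approach this question is to follow the `reader.dmLevel` / `writer.dmLevel` chain in the config back to the framer each level starts from. The sketch below does that for a config text. It is an illustration under assumptions, not an official tool: the function name `frame_origins` is hypothetical, and it treats every key ending in `reader.dmLevel` (including `F0reader.dmLevel`) as an input.

```python
import re


def frame_origins(config_text):
    """Map each data memory level to the cFramer instance(s) it derives from."""
    sections, current = {}, None
    for raw in config_text.splitlines():
        line = raw.strip()
        header = re.fullmatch(r"\[(\w+):(\w+)\]", line)
        if header:
            current = sections.setdefault(
                header.group(1),
                {"type": header.group(2), "reads": [], "writes": None},
            )
            continue
        if current is None or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if re.fullmatch(r"\w*[Rr]eader\d*\.dmLevel", key):
            current["reads"] += [lvl for lvl in value.split(";") if lvl]
        elif key == "writer.dmLevel":
            current["writes"] = value

    # Each framer's output level originates from that framer itself.
    origins = {
        s["writes"]: {name}
        for name, s in sections.items()
        if s["type"] == "cFramer" and s["writes"]
    }
    # Propagate origins from inputs to outputs until nothing changes.
    changed = True
    while changed:
        changed = False
        for s in sections.values():
            if s["type"] == "cFramer" or not s["writes"]:
                continue
            inherited = set().union(*(origins.get(l, set()) for l in s["reads"]))
            if not inherited <= origins.get(s["writes"], set()):
                origins.setdefault(s["writes"], set()).update(inherited)
                changed = True
    return origins


# Short excerpt of the config pasted earlier in this thread.
excerpt = """[is13_frame60:cFramer]
reader.dmLevel=wave
writer.dmLevel=is13_frame60
[is13_frame25:cFramer]
reader.dmLevel=wave
writer.dmLevel=is13_frame25
[is13_mzcr:cMZcr]
reader.dmLevel = is13_frame60
writer.dmLevel = is13_zcr
[is13_energy:cEnergy]
reader.dmLevel = is13_frame25
writer.dmLevel = is13_energy
[is13_smoA:cContourSmoother]
reader.dmLevel = is13_energy;is13_zcr
writer.dmLevel = is13_lldA_smo
"""

origins = frame_origins(excerpt)
for level, framers in sorted(origins.items()):
    print(level, sorted(framers))
```

Reading the config pasted earlier in this thread the same way: the 0.060 framer (`is13_frame60`) feeds the pitch chain, the zero-crossing rate, and, through the F0 input, jitter and shimmer, while the 0.020 framer (`is13_frame25`) feeds energy, the auditory spectrum, the spectral descriptors, and the MFCCs. Smoothed levels such as `is13_lldA_smo` mix features from both framers.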
