feat: Audio evals & data processor for llm_classify() (#5616)
* change 'rails' to 'expected_eval_labels'
* wip
* wip
* revert 'classify.py'
* include MultimodalPrompt
* further changes to audio.py
* getting smth now
* moving audio_classify() to classify.py
* remove print
* move data fetching to where we build the messages
* scrapped and redone - using llm_classify within audio_classify
* rename to proper
* cleanup
* merge llm_classify and audio_classify
* change back to llm_classify
* redo and test file format inference
* make data_processor apply to each data element, add a TEXT_DATA prompt part type, since we're adding a data processor, and remove some test cases to implement
* ruff + clean
* added a test... i have a feeling it needs work. dependency stuff still needs to be sorted out
* ruff cleanup
* pyright cleanup
* allow for users to pass through lists of strings and tuples to llm_classify
* unncessary print
* add tests for data_processor, allow for multiple types to be passed thru
* revise typing - eliminate use of Sequence
* typing enhancements
* formatting is ruff
* more mypy
* dependencies
* enhance tests + address comments
* Add deprecation message when using `dataframe` arg as a kwarg
* stash
* accidentally modified smth i shouldnt have
* ruff
* revert a test name
* revise tests + delete comments
* Tweak series normalization behavior
* dustin's change + tweak temp var logic
* windows tests failing bc. uv-loop not windows compliant
* wrong package
* made a mistake making template vars a set - order is not preserved
* emotion template
* issue w/ import
* trying revert
* update template
* addressing dustin's comments
* formatting is ruff
* update emotion template
* ruff
* Update default audio template
* Improve error message
* Add errors when using unsupported audio format

---------

Co-authored-by: Dustin Ngo <dustin@arize.com>
Co-authored-by: sallyannarize <sdelucia@arize.com>
1 parent 036d2c9, commit 0eda8ce
Showing 10 changed files with 876 additions and 27 deletions.
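The functional core of this commit is that llm_classify() now accepts plain lists of strings or tuples in addition to a DataFrame, and takes a data_processor callable that is applied to each data element (for example, fetching an audio file and base64-encoding it) before the prompt messages are built. The sketch below illustrates that flow using the audio template added later in this diff; the positional data argument, the helper fetch_audio_as_base64, the placeholder URLs, and the choice of an audio-capable model are assumptions for illustration, not taken from this diff.

# Sketch only: the accepted element types and the audio-capable model below are
# assumptions based on the commit message, not verified against the new API.
import base64

import requests

from phoenix.evals import OpenAIModel, llm_classify
from phoenix.evals.default_audio_templates import (
    EMOTION_AUDIO_RAILS,
    EMOTION_PROMPT_TEMPLATE,
)


def fetch_audio_as_base64(url: str) -> str:
    # Hypothetical per-element processor: download the clip and return it base64-encoded
    # so it can be substituted into the {audio} prompt part.
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return base64.b64encode(response.content).decode("utf-8")


audio_urls = [  # placeholder URLs for illustration
    "https://example.com/clips/interview_01.wav",
    "https://example.com/clips/interview_02.wav",
]

results = llm_classify(
    audio_urls,                            # a plain list now works, not just a DataFrame
    data_processor=fetch_audio_as_base64,  # applied to each element before messages are built
    model=OpenAIModel(model="gpt-4o-audio-preview"),  # assumed audio-capable model
    template=EMOTION_PROMPT_TEMPLATE,
    rails=EMOTION_AUDIO_RAILS,
    provide_explanation=True,
)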
@@ -53,6 +53,7 @@ test = [
     "nest_asyncio",
     "pandas-stubs<=2.0.2.230605",
     "types-tqdm",
+    "lameenc"
 ]

 [project.urls]
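The only packaging change shown here is the new lameenc test dependency. A plausible reason, given the audio-format tests mentioned in the commit message, is synthesizing small MP3 fixtures in memory; the helper below is an illustrative guess at that kind of usage, not the project's actual test code.

# Illustrative only: shows the sort of MP3 fixture lameenc can generate for audio-format tests.
import math
import struct

import lameenc


def make_mp3_tone(seconds: float = 1.0, freq_hz: float = 440.0, sample_rate: int = 16000) -> bytes:
    # Build one channel of 16-bit PCM samples for a sine tone.
    n_samples = int(seconds * sample_rate)
    pcm = b"".join(
        struct.pack("<h", int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_samples)
    )
    encoder = lameenc.Encoder()
    encoder.set_bit_rate(128)
    encoder.set_in_sample_rate(sample_rate)
    encoder.set_channels(1)
    encoder.set_quality(2)  # 2 is high quality
    mp3 = encoder.encode(pcm)
    mp3 += encoder.flush()
    return bytes(mp3)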
packages/phoenix-evals/src/phoenix/evals/default_audio_templates.py
133 changes: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
from phoenix.evals.templates import (
    ClassificationTemplate,
    PromptPartContentType,
    PromptPartTemplate,
)

EMOTION_AUDIO_BASE_TEMPLATE_PT_1 = """
You are an AI system designed to classify emotions in audio files.
### TASK:
Analyze the provided audio file and classify the primary emotion based on these characteristics:
- Tone: General tone of the speaker (e.g., cheerful, tense, calm).
- Pitch: Level and variability of the pitch (e.g., high, low, monotone).
- Pace: Speed of speech (e.g., fast, slow, steady).
- Volume: Loudness of the speech (e.g., loud, soft, moderate).
- Intensity: Emotional strength or expression (e.g., subdued, sharp, exaggerated).
The classified emotion must be one of the following:
['anger', 'happiness', 'excitement', 'sadness', 'neutral', 'frustration', 'fear', 'surprise',
'disgust', 'other']
IMPORTANT: Choose the most dominant emotion expressed in the audio. Neutral should only be used when
no other emotion is clearly present; do your best to avoid this label.
************
Here is the audio to classify:
"""

EMOTION_AUDIO_BASE_TEMPLATE_PT_2 = """{audio}"""

EMOTION_AUDIO_BASE_TEMPLATE_PT_3 = """
RESPONSE FORMAT:
Provide a single word from the list above representing the detected emotion.
************
EXAMPLE RESPONSE: excitement
************
Analyze the audio and respond in this format.
"""

EMOTION_AUDIO_EXPLANATION_TEMPLATE_PT_1 = """
You are an AI system designed to classify emotions in audio files.
### TASK:
First, explain in a step-by-step manner how the following characteristics of the provided audio
file indicate the emotion of the speaker:
- Tone: General tone of the speaker (e.g., cheerful, tense, calm).
- Pitch: Level and variability of the pitch (e.g., high, low, monotone).
- Pace: Speed of speech (e.g., fast, slow, steady).
- Volume: Loudness of the speech (e.g., loud, soft, moderate).
- Intensity: Emotional strength or expression (e.g., subdued, sharp, exaggerated).
Then, classify the primary emotion. The classified emotion must be one of the following:
['anger', 'happiness', 'excitement', 'sadness', 'neutral', 'frustration', 'fear', 'surprise',
'disgust', 'other']
IMPORTANT: Choose the most dominant emotion expressed in the audio. Neutral should only be used when
no other emotion is clearly present; do your best to avoid this label.
************
Here is the audio to classify:
"""

EMOTION_AUDIO_EXPLANATION_TEMPLATE_PT_3 = """
EXAMPLE RESPONSE FORMAT:
************
EXPLANATION: An explanation of your reasoning based on the tone, pitch, pace, volume, and intensity
of the audio.
LABEL: "excitement"
************
Analyze the audio and respond in the format shown above.
"""

EMOTION_AUDIO_RAILS = [
    "anger",
    "happiness",
    "excitement",
    "sadness",
    "neutral",
    "frustration",
    "fear",
    "surprise",
    "disgust",
    "other",
]

EMOTION_PROMPT_TEMPLATE = ClassificationTemplate(
    rails=EMOTION_AUDIO_RAILS,
    template=[
        PromptPartTemplate(
            content_type=PromptPartContentType.TEXT,
            template=EMOTION_AUDIO_BASE_TEMPLATE_PT_1,
        ),
        PromptPartTemplate(
            content_type=PromptPartContentType.AUDIO,
            template=EMOTION_AUDIO_BASE_TEMPLATE_PT_2,
        ),
        PromptPartTemplate(
            content_type=PromptPartContentType.TEXT,
            template=EMOTION_AUDIO_BASE_TEMPLATE_PT_3,
        ),
    ],
    explanation_template=[
        PromptPartTemplate(
            content_type=PromptPartContentType.TEXT,
            template=EMOTION_AUDIO_EXPLANATION_TEMPLATE_PT_1,
        ),
        PromptPartTemplate(
            content_type=PromptPartContentType.AUDIO,
            template=EMOTION_AUDIO_BASE_TEMPLATE_PT_2,
        ),
        PromptPartTemplate(
            content_type=PromptPartContentType.TEXT,
            template=EMOTION_AUDIO_EXPLANATION_TEMPLATE_PT_3,
        ),
    ],
)
""" | ||
A template for evaluating the emotion of an audio sample. It return | ||
an emotion and provides a detailed explanation template | ||
to assist users in articulating their judgment on code readability. | ||
""" |