# API ‐ OpenAI V1 Speech Compatible Endpoint
AllTalk provides an endpoint compatible with the OpenAI Speech v1 API. This allows for easy integration with existing systems designed to work with OpenAI's text-to-speech service.
- URL: http://{ipaddress}:{port}/v1/audio/speech
- Method: POST
- Content-Type: application/json
The request body must be a JSON object with the following fields:
| Field | Type | Description |
|---|---|---|
| model | string | The TTS model to use. Currently ignored, but required in the request. |
| input | string | The text to generate audio for. Maximum length is 4096 characters. |
| voice | string | The voice to use when generating the audio. |
| response_format | string | (Optional) The format of the audio. Audio will be transcoded to the requested format. |
| speed | float | (Optional) The speed of the generated audio. Must be between 0.25 and 4.0. Default is 1.0. |
The voice parameter supports the following values:

- alloy
- echo
- fable
- nova
- onyx
- shimmer
These voices are mapped to AllTalk voices on a one-to-one basis within the AllTalk Gradio interface, on a per-TTS engine basis.
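If you build request payloads programmatically, the documented constraints can be checked client-side before a request is sent. The sketch below is purely illustrative: the helper name and the error handling are this example's own, not part of AllTalk's API.

```python
# Illustrative client-side check of the documented request constraints.
# The helper name and error style are examples only, not AllTalk's API.
OPENAI_VOICES = {"alloy", "echo", "fable", "nova", "onyx", "shimmer"}

def validate_speech_payload(payload: dict) -> dict:
    """Raise ValueError if the payload violates the documented limits."""
    if not payload.get("model"):
        raise ValueError("'model' is required (its value is currently ignored).")
    text = payload.get("input", "")
    if not text or len(text) > 4096:
        raise ValueError("'input' must be between 1 and 4096 characters.")
    if payload.get("voice") not in OPENAI_VOICES:
        raise ValueError(f"'voice' must be one of {sorted(OPENAI_VOICES)}.")
    speed = float(payload.get("speed", 1.0))
    if not 0.25 <= speed <= 4.0:
        raise ValueError("'speed' must be between 0.25 and 4.0.")
    return payload
```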
Example request:

```bash
curl -X POST "http://127.0.0.1:7851/v1/audio/speech" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "any_model_name",
    "input": "Hello, this is a test.",
    "voice": "nova",
    "response_format": "wav",
    "speed": 1.0
  }'
```
The endpoint returns the generated audio data directly in the response body.
- There is no capability within this API to specify a language. The response will be in whatever language the currently loaded TTS engine and model support.
- If RVC is globally enabled in AllTalk settings and a voice other than "Disabled" is selected for the character voice, the chosen RVC voice will be applied after the TTS is generated and before the audio is transcoded and sent back out.
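Because the route and payload mirror OpenAI's, existing OpenAI client libraries can usually be pointed at AllTalk by overriding their base URL. The sketch below assumes the openai Python package (v1.x) is installed and assumes AllTalk does not validate the API key, so a placeholder value is supplied; adjust the address and port to your installation.

```python
# Sketch: reusing the official OpenAI Python client against AllTalk.
# Assumes `pip install openai` (v1.x) and that the API key is not checked
# by AllTalk (an assumption), so a placeholder value is used.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:7851/v1",  # AllTalk's OpenAI-compatible root
    api_key="not-needed",                 # placeholder; assumed to be ignored
)

speech = client.audio.speech.create(
    model="any_model_name",   # required but currently ignored by AllTalk
    voice="nova",
    input="Hello, this is a test.",
    response_format="wav",
    speed=1.0,
)

# Save the raw audio bytes returned by the server
speech.write_to_file("output.wav")
```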
Voices can be re-mapped in the Gradio Interface > TTS Engine Settings > {Chosen TTS Engine} > OpenAI Voice Mappings.
You can also remap the 6 OpenAI voices to any voices supported by the currently loaded TTS engine using the following endpoint:
- URL: http://{ipaddress}:{port}/api/openai-voicemap
- Method: PUT
- Content-Type: application/json
Example request:

```bash
curl -X PUT "http://localhost:7851/api/openai-voicemap" \
  -H "Content-Type: application/json" \
  -d '{
    "alloy": "female_01.wav",
    "echo": "female_01.wav",
    "fable": "female_01.wav",
    "nova": "female_01.wav",
    "onyx": "male_01.wav",
    "shimmer": "male_02.wav"
  }'
```
Note: The Gradio interface will not reflect these changes until AllTalk is reloaded, as Gradio caches the list.
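The same remapping can be done from Python. The sketch below simply mirrors the curl call above using the requests library; the .wav file names are examples only, so substitute voices that exist for your currently loaded TTS engine.

```python
# Sketch: remapping the six OpenAI voice names via the voicemap endpoint.
# The .wav names are examples only; use voices available to the currently
# loaded TTS engine.
import requests

mapping = {
    "alloy": "female_01.wav",
    "echo": "female_01.wav",
    "fable": "female_01.wav",
    "nova": "female_01.wav",
    "onyx": "male_01.wav",
    "shimmer": "male_02.wav",
}

response = requests.put("http://localhost:7851/api/openai-voicemap", json=mapping)
print(response.status_code, response.text)  # inspect the server's reply
```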
Python example (speech endpoint):

```python
import requests
import json

# Define the endpoint URL
url = "http://127.0.0.1:7851/v1/audio/speech"

# Define the request payload
payload = {
    "model": "any_model_name",
    "input": "Hello, this is a test.",
    "voice": "nova",
    "response_format": "wav",
    "speed": 1.0
}

# Set the headers
headers = {
    "Content-Type": "application/json"
}

# Send the POST request
response = requests.post(url, data=json.dumps(payload), headers=headers)

# Check the response
if response.status_code == 200:
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print("Audio file saved as output.wav")
else:
    print(f"Error: {response.status_code} - {response.text}")
```
JavaScript example (speech endpoint, browser):

```javascript
// Define the endpoint URL
const url = "http://127.0.0.1:7851/v1/audio/speech";

// Define the request payload
const payload = {
  model: "any_model_name",
  input: "Hello, this is a test.",
  voice: "nova",
  response_format: "wav",
  speed: 1.0
};

// Set the headers
const headers = {
  "Content-Type": "application/json"
};

// Send the POST request
fetch(url, {
  method: "POST",
  headers: headers,
  body: JSON.stringify(payload)
})
  .then(response => {
    if (response.ok) {
      return response.blob();
    } else {
      return response.text().then(text => { throw new Error(text); });
    }
  })
  .then(blob => {
    // Create a link element pointing at the returned audio
    const blobUrl = window.URL.createObjectURL(blob);
    const a = document.createElement('a');
    a.style.display = 'none';
    a.href = blobUrl;
    a.download = 'output.wav';

    // Append the link to the body
    document.body.appendChild(a);

    // Programmatically click the link to trigger the download
    a.click();

    // Clean up the object URL and remove the link
    window.URL.revokeObjectURL(blobUrl);
    document.body.removeChild(a);

    console.log("Audio file saved as output.wav");
  })
  .catch(error => {
    console.error("Error:", error);
  });
```
These examples demonstrate how to use the OpenAI V1 API Compatible Endpoint with AllTalk in both Python and JavaScript environments.