This is example code for a tutorial that shows you how to transcribe a phone call automatically using the Amazon Transcribe API. You'll need two handsets with two different phone numbers to test this.
It uses the Vonage Voice API to initiate and record the call. The call audio is saved to your local recordings folder and uploaded to an S3 bucket.
Amazon CloudWatch triggers a serverless Lambda function when the transcription job has completed. Transcripts are created in S3 and downloaded to your application's transcripts folder.
If you're new to Vonage, you can sign up for a Vonage account and get some free credit to get you started.
Run the following at a terminal prompt to install the CLI and configure it with your api_key and api_secret, which you will find in the Developer Dashboard:
npm install -g @vonage/cli
vonage config:set --apiKey=<API_KEY> --apiSecret=<API_SECRET>
If you don't already have one, buy a Vonage virtual number to receive inbound calls.
List available numbers (replace GB with your two-character country code):
vonage numbers:search GB
Purchase one of the numbers:
vonage numbers:buy 447700900001
Use the CLI to create a Voice API application with the webhooks that answer a call on your Vonage number (/webhooks/answer) and log call events (/webhooks/event). Replace example.com in the following command with your own public-facing host name. Consider using ngrok for testing purposes. If you do, run it now to get the temporary URLs that ngrok provides, and leave it running so that the URLs do not change.
Run the command in the application's root directory:
vonage apps:create "My Echo Server" --voice_answer_url=https://example.com/webhooks/answer --voice_event_url=https://example.com/webhooks/event
Make a note of the Application ID returned by this command. The command also downloads your private.key file, which the Voice API uses to authenticate your application.
Use the Application ID to link your virtual number:
vonage apps:link <APPLICATION_ID> --number=<NUMBER>
In the application's root directory, install the dependencies:
npm install
Complete the following steps:
- Create an AWS account and create an Administrator user. Make a note of your AWS key and secret, because you cannot retrieve the secret later on.
- Install and configure the AWS CLI
- Use the following AWS CLI commands to create two new S3 buckets, one for the call audio and the other for the generated transcripts. Bucket names must be unique across S3. Ensure that the region supports the Amazon Transcribe API and CloudWatch Events:
aws s3 mb s3://your-audio-bucket-name --region us-east-1
aws s3 mb s3://your-transcription-bucket-name --region us-east-1
Copy example.env to .env and configure the following settings:
- VONAGE_APPLICATION_ID: The Vonage Voice Application ID you created earlier
- VONAGE_PRIVATE_KEY_FILE: The name of your private key file, which should be in the root of your application directory, e.g. private.key
- OTHER_PHONE_NUMBER: Another phone number you can call to create a conversation
- AWS_KEY: Your Amazon Web Services key
- AWS_SECRET: Your Amazon Web Services secret
- AWS_REGION: Your Amazon Web Services region, e.g. us-east-1
- S3_PATH: Your path to S3 bucket storage, which should include the AWS_REGION, e.g. https://s3-us-east-1.amazonaws.com
- S3_AUDIO_BUCKET_NAME: A uniquely-named S3 bucket which will contain call audio mp3 files
- S3_TRANSCRIPTS_BUCKET_NAME: A uniquely-named S3 bucket which will contain transcripts of the call audio
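Because a missing setting tends to surface only later as a confusing API error, it can help to check the configuration at startup. The following is a minimal sketch, not part of the demo application: the helper name missingSettings is hypothetical, and it assumes the values from .env have already been loaded into process.env.

```javascript
// Setting names taken from the list above.
const REQUIRED_SETTINGS = [
  "VONAGE_APPLICATION_ID",
  "VONAGE_PRIVATE_KEY_FILE",
  "OTHER_PHONE_NUMBER",
  "AWS_KEY",
  "AWS_SECRET",
  "AWS_REGION",
  "S3_PATH",
  "S3_AUDIO_BUCKET_NAME",
  "S3_TRANSCRIPTS_BUCKET_NAME",
];

// Hypothetical helper: returns the names of settings that are absent or blank.
function missingSettings(env) {
  return REQUIRED_SETTINGS.filter((name) => {
    const value = env[name];
    return typeof value !== "string" || value.trim() === "";
  });
}

// Example: warn about anything missing from the current environment.
// Assumes .env has already been loaded into process.env.
const missing = missingSettings(process.env);
if (missing.length > 0) {
  console.warn(`Missing settings in .env: ${missing.join(", ")}`);
}
```

Returning the list of names, rather than throwing on the first gap, lets you report every missing setting in one pass.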
The serverless function is a Lambda that executes when a transcription job completes. It makes a POST request to your application's /webhooks/transcription endpoint.
Deploy this function:
cd transcribeReadyService
serverless deploy
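To illustrate what such a function does, here is a rough sketch of how a CloudWatch "Transcribe Job State Change" event could be turned into the POST request described above. It is not the code in transcribeReadyService: the helper name buildNotification and the payload fields jobName and status are assumptions, and the demo's Lambda may send something different.

```javascript
// Hypothetical helper: turns a CloudWatch "Transcribe Job State Change" event
// into a description of the POST request sent to the application.
function buildNotification(event, appBaseUrl) {
  const detail = event.detail || {};
  return {
    method: "POST",
    url: new URL("/webhooks/transcription", appBaseUrl).toString(),
    headers: { "Content-Type": "application/json" },
    // The payload shape is an assumption made for this sketch.
    body: JSON.stringify({
      jobName: detail.TranscriptionJobName,
      status: detail.TranscriptionJobStatus,
    }),
  };
}

// Example event, trimmed to the fields this sketch reads.
const sampleEvent = {
  source: "aws.transcribe",
  "detail-type": "Transcribe Job State Change",
  detail: {
    TranscriptionJobName: "example-job",
    TranscriptionJobStatus: "COMPLETED",
  },
};

console.log(buildNotification(sampleEvent, "https://example.com"));
```

Keeping the request construction separate from the code that sends it makes the function easy to exercise locally with a sample event before deploying.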
- In the root directory of your application, execute:
node index.js
- Call your Vonage virtual number from one of your personal numbers. The other number you specified in the OTHER_PHONE_NUMBER setting should ring. Answer it when it does.
- Speak a few words into each handset. When you're finished, disconnect both.
- Watch the console as the transcription job is processed. If everything works properly, you should receive a notification that your job is complete, and you should find the call audio file in your recordings directory and the corresponding transcript (in JSON format) in transcripts. Note how the transcript is split into channels, one for each device you used. The application parses the completed transcription and displays the result for each channel in the console.
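The per-channel parsing can be sketched as follows. This is an illustration rather than the application's actual parser: the helper name transcriptByChannel is hypothetical, and the sample object is a hand-written, heavily trimmed stand-in for the JSON that Amazon Transcribe produces when channel identification is enabled (results.channel_labels.channels, each with a channel_label and a list of items).

```javascript
// Hypothetical helper: joins the words of each channel into one line of text.
function transcriptByChannel(transcript) {
  const channels = transcript.results.channel_labels.channels;
  return channels.map((channel) => {
    let text = "";
    for (const item of channel.items) {
      const content = item.alternatives[0].content;
      if (item.type === "punctuation") {
        // Punctuation attaches directly to the preceding word.
        text += content;
      } else {
        text += (text ? " " : "") + content;
      }
    }
    return { channel: channel.channel_label, text };
  });
}

// Hand-written sample, trimmed to the fields this sketch reads.
const sampleTranscript = {
  results: {
    channel_labels: {
      channels: [
        {
          channel_label: "ch_0",
          items: [
            { type: "pronunciation", alternatives: [{ content: "Hello" }] },
            { type: "pronunciation", alternatives: [{ content: "there" }] },
            { type: "punctuation", alternatives: [{ content: "." }] },
          ],
        },
        {
          channel_label: "ch_1",
          items: [
            { type: "pronunciation", alternatives: [{ content: "Hi" }] },
            { type: "punctuation", alternatives: [{ content: "!" }] },
          ],
        },
      ],
    },
  },
};

for (const { channel, text } of transcriptByChannel(sampleTranscript)) {
  console.log(`${channel}: ${text}`);
}
// prints "ch_0: Hello there." then "ch_1: Hi!"
```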
You can also run the code with Docker Compose:
docker-compose up
If you have more than two numbers, you can add more callers to the conversation. Create a connect action for each in the /webhooks/answer NCCO and increase the number of channels in the record action accordingly.
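As a rough sketch of that idea, the function below builds an NCCO with one record action followed by one connect action per additional number, and sets channels to the total number of participants (the inbound caller plus each connected number). The function name buildNcco, the phone numbers, and the recording event URL are placeholders chosen for this example, not values from the demo application.

```javascript
// Hypothetical helper: builds an NCCO for the inbound caller plus N other numbers.
function buildNcco(vonageNumber, otherNumbers, recordingEventUrl) {
  const record = {
    action: "record",
    eventUrl: [recordingEventUrl],
    // Record each participant on a separate channel.
    split: "conversation",
    // One channel for the inbound caller plus one per connected number.
    channels: otherNumbers.length + 1,
  };

  const connects = otherNumbers.map((number) => ({
    action: "connect",
    from: vonageNumber,
    endpoint: [{ type: "phone", number }],
  }));

  return [record, ...connects];
}

// Example with two additional callers (placeholder numbers and URL).
const ncco = buildNcco(
  "447700900001",
  ["447700900002", "447700900003"],
  "https://example.com/webhooks/recording"
);
console.log(JSON.stringify(ncco, null, 2));
```

Deriving channels from the list of numbers keeps the record action in step with the connect actions when you add or remove callers.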
We love to hear from you, so if you have questions or comments, or find a bug in the project, let us know! You can either:
- Open an issue on this repository
- Tweet at us! We're @VonageDev on Twitter
- Or join the Vonage Community Slack
- Read the tutorial that accompanies this demo application to learn how it was put together.
- Vonage Voice API
- AWS