Live Captions using AWS Transcribe

Uses Amazon Transcribe Streaming to create live captions for live video streams. This solution uses AWS Elemental MediaLive as the encoder, along with AWS Elemental MediaPackage, Amazon Translate, Amazon Transcribe Streaming, Amazon S3, and AWS Lambda.

Navigate to README | Workshop | Uninstall | TranscribeStreamingLambda | CaptionCreationLambda

Architecture Overview

Deployment

The solution is deployed using a CloudFormation template with AWS Lambda backed custom resources. To deploy it, use one of the following CloudFormation templates.

Deploy with CloudFront CDN ~20 minutes

Deploy in US-West-2 without CloudFront CDN <3 minutes (Faster to deploy than with CloudFront)

The CloudFormation templates have only been tested in us-west-2, but the resources they need are duplicated in the following regions for your convenience: ap-south-1, eu-west-3, eu-north-1, eu-west-2, eu-west-1, ap-northeast-2, ap-northeast-1, sa-east-1, ca-central-1, ap-southeast-1, ap-southeast-2, eu-central-1, us-east-1, us-east-2, us-west-1, and us-west-2.

Follow these steps to deploy your stack.

Step 1: Start The CloudFormation

Step 2: Setup The Stack

Choose any name for the stack, then add a Primary and a Secondary source URL.

Lastly, choose the caption languages that you want generated for the live stream. Provide them as comma-delimited language codes, for example:

en, es, pt

Here is the list of supported output subtitle languages. English is currently the only supported input audio language.

LANGUAGE_CODES = {
    'ar': 'Arabic',
    'zh': 'Chinese Simplified',
    'zh-TW': 'Chinese Traditional',
    'cs': 'Czech',
    'da': 'Danish',
    'nl': 'Dutch',
    'en': 'English',
    'fi': 'Finnish',
    'fr': 'French',
    'de': 'German',
    'he': 'Hebrew',
    'id': 'Indonesian',
    'it': 'Italian',
    'ja': 'Japanese',
    'ko': 'Korean',
    'pl': 'Polish',
    'pt': 'Portuguese',
    'ru': 'Russian',
    'es': 'Spanish',
    'sv': 'Swedish',
    'tr': 'Turkish'
}
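As a rough illustration of how the comma-delimited parameter could be checked before launching the stack, here is a minimal sketch. The function name `parse_caption_languages` and the `SUPPORTED_CODES` set are hypothetical helpers written for this example (the set simply repeats the keys of the table above); this is not necessarily how the solution's Lambda functions parse the value.

```python
from typing import List

# Keys of the LANGUAGE_CODES table above, repeated so this sketch runs on its own.
SUPPORTED_CODES = {
    "ar", "zh", "zh-TW", "cs", "da", "nl", "en", "fi", "fr", "de", "he",
    "id", "it", "ja", "ko", "pl", "pt", "ru", "es", "sv", "tr",
}


def parse_caption_languages(raw: str) -> List[str]:
    """Split a comma-delimited string into validated, de-duplicated codes."""
    codes: List[str] = []
    for part in raw.split(","):
        code = part.strip()
        if not code:
            continue  # tolerate trailing commas and doubled separators
        if code not in SUPPORTED_CODES:
            raise ValueError(f"Unsupported caption language code: {code!r}")
        if code not in codes:
            codes.append(code)  # keep first occurrence, preserve order
    if not codes:
        raise ValueError("At least one caption language code is required")
    return codes


print(parse_caption_languages("en, es, pt"))
```

The check is case-sensitive on purpose, since the table distinguishes `zh` from `zh-TW`.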

Step 3: Finish Starting the CloudFormation

Click the I accept button then click "Create" to start the stack.

Step 4: Start AWS MediaLive Channel

To avoid unnecessary cost, the CloudFormation template does not start the MediaLive channel. Open the MediaLive console and click the Start button to start your channel.

The MediaLive channel must be running before you can see a video stream.
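If you would rather start the channel from a script than from the console, the sketch below shows one way to do it. It assumes a boto3 MediaLive client is passed in (created with something like `boto3.client("medialive", region_name="us-west-2")`) and uses that client's `start_channel` and `describe_channel` calls. The function name, the polling interval, and the timeout are illustrative choices, not part of the solution.

```python
import time


def start_channel_and_wait(medialive, channel_id, poll_seconds=5.0, timeout_seconds=300.0):
    """Start a MediaLive channel, then poll until it reports RUNNING.

    `medialive` is expected to behave like a boto3 MediaLive client.
    """
    medialive.start_channel(ChannelId=channel_id)
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        state = medialive.describe_channel(ChannelId=channel_id)["State"]
        if state == "RUNNING":
            return state
        time.sleep(poll_seconds)  # channel is usually STARTING at this point
    raise TimeoutError(f"Channel {channel_id} did not reach RUNNING in time")
```

Passing the client in keeps the function easy to exercise with a stub. The channel ID is the one shown in the MediaLive console for the channel the stack created.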

Step 5: Watch The LiveStream

In the Outputs of the CloudFormation stack, find the MediaPackage endpoint used to view the channel.

Once the MediaLive channel has been running for about a minute, paste the HLS endpoint URL ending in m3u8 from the CloudFormation outputs into this demo player page.

Video JS HLS Demo Player
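To pick the HLS URL out of the stack outputs programmatically, a small filter like the one below can help. It assumes the outputs are in the shape returned by CloudFormation's DescribeStacks call (a list of dicts with `OutputKey` and `OutputValue`). The output key names and URLs in the sample data are made up for illustration, since the template's actual output names are not listed here.

```python
def find_hls_urls(outputs):
    """Return output values that look like HLS manifest URLs (ending in .m3u8)."""
    urls = []
    for output in outputs:
        value = output.get("OutputValue", "")
        path = value.split("?", 1)[0]  # ignore any query string
        if path.endswith(".m3u8"):
            urls.append(value)
    return urls


# Hypothetical stack outputs, for illustration only.
sample_outputs = [
    {"OutputKey": "ExampleHlsEndpoint", "OutputValue": "https://example.com/out/v1/abc/index.m3u8"},
    {"OutputKey": "ExampleLogsBucket", "OutputValue": "example-logs-bucket"},
]
print(find_hls_urls(sample_outputs))
```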

When you are done with the live event, stop the MediaLive channel. If you want zero cost after a live event, delete the stack that you created in the CloudFormation console. The only items that should remain are a logs S3 bucket and a captions S3 bucket, which you can delete manually. Even if these buckets are left in place after a live stream, their cost is low.

Preview

Here is the solution using a free live broadcast as the input feed.

Pricing

You are responsible for the cost of the AWS services used while running this live streaming solution. As of the date of publication, the cost for running this solution in the US East (N. Virginia) Region is shown in the table below. The cost depends on the encoding profile you choose, and does not include data transfer fees, the cost for the demo HTML preview player, or Amazon CloudFront and AWS Elemental MediaPackage costs, which will vary depending on the number of users and the types of end user devices.

Encoding Profile | Total Cost/Hour
1080 (8 encodes) | $5.00
720 (7 encodes) | $4.00
540 (5 encodes) | $2.25

Added costs for caption generation.

Caption Language | Total Cost/Hour
English | $1.44
Each Additional Language | $0.50

Amazon Transcribe Pricing | Amazon Translate Pricing

Note that pricing is per minute, with a minimum of 10 minutes. Prices are subject to change. For full details, see the pricing webpage for each AWS service you will be using in this solution.
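The figures above can be combined into a rough estimate. The sketch below makes several assumptions that the tables do not spell out: the hourly rates are prorated by the minute, the 10-minute minimum is applied to the whole session, English transcription always runs, and each translated language adds the "additional language" rate. Data transfer, CloudFront, and MediaPackage costs are left out, as in the tables. The function and constant names are hypothetical.

```python
# Hourly rates copied from the tables above (USD, subject to change).
ENCODING_COST_PER_HOUR = {"1080": 5.00, "720": 4.00, "540": 2.25}
ENGLISH_CAPTIONS_PER_HOUR = 1.44
ADDITIONAL_LANGUAGE_PER_HOUR = 0.50
MINIMUM_BILLED_MINUTES = 10  # assumption: applied to the whole session


def estimate_cost(profile: str, translated_languages: int, minutes: float) -> float:
    """Estimate encoding plus captioning cost in USD for one live session."""
    if translated_languages < 0:
        raise ValueError("translated_languages must be zero or more")
    hourly = (
        ENCODING_COST_PER_HOUR[profile]
        + ENGLISH_CAPTIONS_PER_HOUR
        + ADDITIONAL_LANGUAGE_PER_HOUR * translated_languages
    )
    billed_minutes = max(minutes, MINIMUM_BILLED_MINUTES)
    return round(hourly * billed_minutes / 60, 2)


# 720 profile, captions in en, es, pt (two translated languages), 90 minutes.
print(estimate_cost("720", 2, 90))
```

For the example call, the hourly rate is 4.00 + 1.44 + 2 × 0.50 = 6.44, so 90 minutes comes to about $9.66.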

Encoding Profiles

The solution configures AWS Elemental MediaLive with one of three encoding profiles, based on the source resolution defined at launch as a CloudFormation parameter. The three options are 1080, 720, and 540, and they correspond to the following encoding profiles:

1080p Profile:

1080p@6500kbps, 720p@5000kbps, 720p@3300kbps, 540p@2000kbps, 432p@1200kbps, 360p@800kbps, 270p@400kbps, 234p@200kbps.

720p Profile:

720p@5000kbps, 720p@3300kbps, 540p@2000kbps, 432p@1200kbps, 360p@800kbps, 270p@400kbps, 234p@200kbps.

540p Profile:

540p@2000kbps, 432p@1200kbps, 360p@800kbps, 270p@400kbps, 234p@200kbps.
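Each profile is the full ladder with the renditions above the source resolution removed. A minimal sketch of that relationship follows. The `LADDER` list restates the renditions listed above, and `renditions_for` is a hypothetical helper for this example, not something taken from the solution's templates.

```python
# (height in pixels, bitrate in kbps), restated from the profiles above.
LADDER = [
    (1080, 6500), (720, 5000), (720, 3300), (540, 2000),
    (432, 1200), (360, 800), (270, 400), (234, 200),
]


def renditions_for(profile: str):
    """Return the renditions at or below the chosen profile height."""
    max_height = int(profile)
    if max_height not in (1080, 720, 540):
        raise ValueError(f"Unknown encoding profile: {profile!r}")
    return [(h, kbps) for h, kbps in LADDER if h <= max_height]


for profile in ("1080", "720", "540"):
    ladder = renditions_for(profile)
    total_kbps = sum(kbps for _, kbps in ladder)
    print(profile, len(ladder), "encodes,", total_kbps, "kbps total")
```

The rendition counts (8, 7, and 5) match the "encodes" figures in the pricing table.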

FAQ

PDF Guide to answer additional questions

Live Streaming AWS PDF Guide

Q. I want to run more than 2 AWS MediaLive channels?

There is a default limit of 5 AWS Elemental MediaLive channels per AWS account, which AWS Support can increase on request. To run more than 2 channels with subtitles generated by Amazon Transcribe, contact AWS Support to have your Amazon Transcribe limit increased.

Q. What ingest formats does the solution support?

The solution can use all input formats that AWS Elemental MediaLive supports including Real-Time Transport Protocol (RTP) push, Real-Time Messaging Protocol (RTMP) push or pull, and HLS streams pull.

Q. I want to use RTP PUSH or RTMP PUSH?

To use a push input such as RTP_PUSH, fill in the Input CIDR block field in the CloudFormation template. Use the public IP address that your RTP stream originates from. If you want to be able to send an RTP input into MediaLive from anywhere in the world, enter 0.0.0.0/0 in the Input CIDR block field.
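The Input CIDR block field expects CIDR notation, so a single source address is written with a /32 suffix. The sketch below uses Python's standard `ipaddress` module to build and check such values. It assumes IPv4 only, and the helper names are hypothetical; the sample address comes from a range reserved for documentation.

```python
import ipaddress


def input_cidr_for(source_ip: str) -> str:
    """Turn a single public IPv4 address into a /32 CIDR block."""
    addr = ipaddress.IPv4Address(source_ip)  # raises ValueError on bad input
    return f"{addr}/32"


def validate_cidr(cidr: str) -> str:
    """Return the normalized CIDR, or raise ValueError if it is malformed."""
    # strict=True rejects blocks with host bits set, such as 203.0.113.10/24.
    return str(ipaddress.IPv4Network(cidr, strict=True))


print(input_cidr_for("203.0.113.10"))
print(validate_cidr("0.0.0.0/0"))
```

Keep in mind that 0.0.0.0/0 lets any host push to the input, so a narrower block is preferable when the source address is known.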

Q. I want low latency?

To minimize latency, use an RTP or RTMP input. In addition, edit the MediaPackage channel HLS endpoint and change the segment size from 6 seconds to 2 seconds. With these changes, end-to-end latency can drop below 20 seconds with captions, compared with under 10 seconds without captions. These are best-case figures; different HTML5 players add different amounts of latency.

Q. Does this solution support digital rights management (DRM)?

The solution does not support DRM at this time but it can be customized to support DRM. For an example of how to integrate DRM using Secure Packager and Encoding Key Exchange with AWS Elemental MediaPackage, see this GitHub repository.

Q. What resolutions does this solution support?

The solution includes the following output resolutions: 1080p at 6500kbps, 720p at 5000kbps and 3300kbps, 540p at 2000kbps, 432p at 1200kbps, 360p at 800kbps, 270p at 400kbps, and 234p at 200kbps.

Additional Resources

License

This library is licensed under the Apache 2.0 License.

Navigate

Navigate to README | Workshop | Uninstall | TranscribeStreamingLambda | CaptionCreationLambda

