For developers building video applications, accessibility and user engagement are critical. Adding automated captions improves accessibility for hearing-impaired users, enhances SEO, and ensures compliance with regulations like the ADA (Americans with Disabilities Act). AWS provides a solution for this through Amazon Transcribe, a fully managed Automatic Speech Recognition (ASR) service.

## Understanding Amazon Transcribe for Video Captioning

Amazon Transcribe converts speech in audio and video files into text with high accuracy, supporting multiple languages, speaker identification, and custom vocabularies. For video applications, Transcribe can generate SRT (SubRip Subtitle) or WebVTT (Web Video Text Tracks) files, which can be embedded into players such as Video.js or delivered alongside HLS and DASH streams.

Key features of AWS Transcribe for video captioning include:

- Support for multiple spoken languages
- Speaker identification
- Custom vocabularies for domain-specific terminology
- Subtitle output in SRT and WebVTT formats

## Workflow: Automated Captioning for Video Files

A typical serverless workflow for adding captions to a video involves:

1. Uploading the video to S3 (e.g., user-generated content).
2. Extracting audio using AWS MediaConvert or FFmpeg.
3. Sending the audio to Amazon Transcribe for caption generation.
4. Storing the SRT/WebVTT file back in S3.
5. Embedding captions in a video player.

Here's how to implement this using AWS Step Functions, Lambda, and Transcribe:

### Step 1: Extract Audio Using AWS MediaConvert

MediaConvert can extract audio from a video file and save it as an MP3/WAV file in S3.

```python
import boto3

# Use your account-specific MediaConvert endpoint (see DescribeEndpoints).
mediaconvert = boto3.client('mediaconvert', endpoint_url='MEDIACONVERT_ENDPOINT')

response = mediaconvert.create_job(
    # IAM role MediaConvert assumes to read and write your S3 buckets (required).
    Role='arn:aws:iam::ACCOUNT_ID:role/MediaConvertRole',
    # Job settings trimmed to the audio-extraction essentials.
    JobSettings={
        'Inputs': [{
            'FileInput': 's3://your-bucket/input/video.mp4'
        }],
        'OutputGroups': [{
            'OutputGroupSettings': {
                'Type': 'FILE_GROUP_SETTINGS',
                'FileGroupSettings': {
                    'Destination': 's3://your-bucket/output/audio/'
                }
            },
            'Outputs': [{
                # Raw MP3 audio output, matching the 'mp3' format used in Step 2.
                'ContainerSettings': {'Container': 'RAW'},
                'AudioDescriptions': [{
                    'CodecSettings': {
                        'Codec': 'MP3',
                        'Mp3Settings': {
                            'RateControlMode': 'CBR',
                            'Bitrate': 96000,
                            'SampleRate': 48000
                        }
                    }
                }],
                'Extension': 'mp3'
            }]
        }]
    }
)
```

### Step 2: Transcribe Audio to Generate Captions

Once the audio is extracted, invoke Amazon Transcribe to generate the caption files.

```python
import boto3

transcribe = boto3.client('transcribe')

def start_transcription_job(audio_uri):
    response = transcribe.start_transcription_job(
        TranscriptionJobName='video-caption-job',
        Media={'MediaFileUri': audio_uri},
        MediaFormat='mp3',
        LanguageCode='en-US',
        OutputBucketName='your-bucket',
        OutputKey='captions/',
        Subtitles={
            # Request WebVTT alongside SRT: browsers' native <track> support expects VTT.
            'Formats': ['srt', 'vtt'],
            'OutputStartIndex': 1
        }
    )
    return response
```

### Step 3: Embed Captions in a Video Player

After the caption file is generated, it can be loaded into an HTML5 video player.
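A minimal sketch of this step is shown below; it uses JavaScript to attach the generated WebVTT file to a standard HTML5 `<video>` element. The caption URL and file name are assumptions for illustration: the `.vtt` object must be reachable by the browser, for example through CloudFront, a public object, or a presigned S3 URL.

```javascript
// Attach the generated captions to an HTML5 <video> element on the page.
const video = document.querySelector('video');

const track = document.createElement('track');
track.kind = 'captions';
track.label = 'English';
track.srclang = 'en';
// Placeholder URL: point this at wherever the Transcribe subtitle output is served from.
track.src = 'https://your-bucket.s3.amazonaws.com/captions/video-caption-job.vtt';
track.default = true;

video.appendChild(track);
```

The same result can be achieved declaratively by adding a `<track kind="captions" src="..." srclang="en" default>` element inside the `<video>` tag in the page markup.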
## Real-Time Captioning for Live Video Streams

For live streaming (e.g., with Amazon IVS or MediaLive), AWS Transcribe supports real-time transcription via WebSockets. This is useful for live events, webinars, or broadcasts.

### Architecture for Real-Time Captioning

1. Capture live audio from the stream (e.g., via Kinesis Video Streams).
2. Stream audio chunks to Amazon Transcribe in real time.
3. Broadcast the transcribed text to the frontend via a WebSocket API.

Here's a simplified snippet for real-time transcription:

```javascript
const AWS = require('aws-sdk');
const WebSocket = require('ws');

const transcribe = new AWS.TranscribeService();

// Simplified: in practice the streaming URL must be presigned with Signature V4
// and the audio sent as binary event-stream frames rather than JSON.
const ws = new WebSocket('wss://transcribestreaming.us-east-1.amazonaws.com:8443/stream-transcription-websocket');

ws.on('open', () => {
  const audioStream = getAudioStreamFromLiveSource(); // custom function
  const payload = {
    audio_stream: audioStream,
    language_code: 'en-US',
    media_encoding: 'pcm',
    sample_rate: 44100
  };
  ws.send(JSON.stringify(payload));
});

ws.on('message', (data) => {
  // Pull the latest transcript out of the response (shape simplified here).
  const transcript = JSON.parse(data).results[0].alternatives[0].transcript;
  broadcastToClients(transcript); // send to the frontend, e.g., via Socket.IO
});
```

## Optimizing Transcription Accuracy

To improve transcription quality, developers can leverage several advanced features of AWS Transcribe. One effective approach is using Custom Vocabularies, which allow the inclusion of industry-specific terms such as medical, legal, or technical jargon, significantly improving accuracy for specialized content. Another useful feature is Channel Identification, which helps distinguish between multiple speakers in audio files, making it well suited to podcasts, interviews, or conference recordings where speaker separation matters. Additionally, for applications requiring multilingual support, developers can post-process transcriptions with Amazon Translate to generate subtitles in other languages, broadening accessibility for global audiences. Together, these techniques give automated speech recognition workflows higher precision and adaptability.

Example of adding a custom vocabulary:

```python
transcribe.create_vocabulary(
    VocabularyName='TechnicalTerms',
    LanguageCode='en-US',
    Phrases=['TensorFlow', 'Kubernetes', 'AWS Lambda']
)
```

## Cost Considerations

Amazon Transcribe is billed by the amount of audio it processes, and optional capabilities such as custom language models and PII redaction carry additional charges. In practice, the biggest savings come from not paying for the same audio twice: transcribe each asset once, keep the generated caption files in S3 for reuse, and monitor usage with billing alarms so unexpected transcription volume is caught early.
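As a minimal illustration of that first practice, the sketch below (the bucket layout, job name, and helper function are assumptions, not part of the workflow above) checks S3 for an existing caption file before starting a new batch job, so unchanged media is never transcribed, and billed, twice:

```javascript
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const transcribe = new AWS.TranscribeService();

// Start a transcription job only if captions for this job don't already exist.
// Re-running jobs for unchanged media is a common source of avoidable charges.
async function transcribeIfNeeded(bucket, jobName, audioUri) {
  const captionKey = `captions/${jobName}.srt`;

  try {
    await s3.headObject({ Bucket: bucket, Key: captionKey }).promise();
    return `s3://${bucket}/${captionKey}`; // captions already exist, reuse them
  } catch (err) {
    // Not found (or not readable): fall through and transcribe.
  }

  await transcribe.startTranscriptionJob({
    TranscriptionJobName: jobName,
    Media: { MediaFileUri: audioUri },
    MediaFormat: 'mp3',
    LanguageCode: 'en-US',
    OutputBucketName: bucket,
    OutputKey: 'captions/',
    Subtitles: { Formats: ['srt', 'vtt'] },
  }).promise();

  return null; // job started; caption files will appear under captions/
}
```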