Generative AI Startup Twelve Labs Works with AWS to Make Videos as Searchable as Text
Twelve Labs, a startup specializing in multimodal AI for video content understanding, is partnering with AWS to enhance video searchability. The company's foundation models enable natural language searches within videos, analyzing actions, objects, and sounds. Using AWS technologies, Twelve Labs has achieved up to 10% faster training speeds while reducing costs by over 15%.
Their Marengo and Pegasus models can provide text summaries and audio translations in over 100 languages, making video content searchable and accessible. The technology allows developers to create applications for semantic video search across media, entertainment, gaming, and sports industries, enabling precise moment retrieval and automated highlight reel creation.
- 10% faster model training speed achieved through AWS technologies
- 15% reduction in training costs
- Capability to analyze and translate video content in over 100 languages
- Strategic Collaboration Agreement (SCA) with AWS for three years
- Access to AWS Marketplace for global customer reach
Insights
This strategic collaboration between Twelve Labs and AWS represents a significant technological advance in video content analysis. The implementation of multimodal AI foundation models that can process and understand video content with up to 10% faster training speeds and a cost reduction of more than 15% is particularly noteworthy. This partnership could significantly strengthen AWS's competitive position in the AI infrastructure market.
The technology's ability to analyze multiple data formats simultaneously (video, audio, text) and provide searchable content in over 100 languages positions AWS to capture a larger share of the rapidly growing video analytics market. The three-year Strategic Collaboration Agreement (SCA) suggests long-term revenue potential from both direct service fees and increased AWS compute usage.
For Amazon shareholders, this partnership strengthens AWS's position in the generative AI space, particularly in video processing - an area where competitors like Google and Microsoft are also investing heavily.
Leading startup makes ‘needle-in-a-haystack’ video searches possible using natural language, turning the world’s largest unsearchable data source—video—into a trove of accessible information
Developers can now find specific movie scenes from decades of video archives, or assess video footage of athletes’ performances, with conversational queries
Twelve Labs uses AWS to train its multimodal foundation models up to 10% faster while cutting training costs by more than 15%
Creating applications that can pinpoint any video moment or frame
Available on AWS Marketplace, these foundation models enable developers to create applications for semantic video search and text generation, serving media, entertainment, gaming, sports, and additional industries reliant on large volumes of video. For example, sports leagues can use the technology to streamline the process of cataloging vast libraries of game footage, making it easier to retrieve specific frames for live broadcasts. Additionally, coaches can use these foundation models to analyze a swimmer’s stroke technique or a sprinter’s starting block position, making adjustments that lead to better performance. Finally, media and entertainment companies can use Twelve Labs technology to create highlight reels from TV programs tailored to each viewer’s interests, such as compiling all action sequences in a thriller series featuring a favorite actor.
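Semantic video search of this kind is typically built on embeddings: the model encodes each video segment into a vector, the text query into another, and the segments closest to the query are returned as matching moments. The sketch below illustrates only that retrieval step, in pure Python with toy vectors; the clip embeddings and the query vector are placeholders standing in for model output, not Twelve Labs' actual API.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def search_moments(query_vec, clips, top_k=2):
    """Rank video segments by semantic similarity to a query embedding.

    `clips` is a list of (start_sec, end_sec, embedding) tuples, as a
    video-understanding model might produce for fixed-length segments.
    """
    scored = [(cosine(query_vec, emb), start, end) for start, end, emb in clips]
    scored.sort(reverse=True)
    return [(start, end, round(score, 3)) for score, start, end in scored[:top_k]]

# Toy embeddings standing in for model output.
clips = [
    (0.0, 5.0, [0.9, 0.1, 0.0]),    # e.g. "player shoots a goal"
    (5.0, 10.0, [0.1, 0.9, 0.1]),   # e.g. "crowd cheering"
    (10.0, 15.0, [0.8, 0.2, 0.1]),  # e.g. "replay of the goal"
]
query = [1.0, 0.0, 0.0]             # embedding of "show me the goal"
print(search_moments(query, clips))  # the two goal segments rank first
```

In production the vectors would come from a video foundation model and the ranking would run over a vector index rather than a Python list, but the "embed, compare, return timestamps" shape is the same.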
“Twelve Labs was founded on a vision to help developers build multimodal intelligence into their applications,” said Jae Lee, co-founder and CEO of Twelve Labs. “Nearly
“AWS has given us the compute power and support to solve the challenges of multimodal AI and make video more accessible, and we look forward to a fruitful collaboration over the coming years as we continue our innovation and expand globally,” added Lee. “We can accelerate our model training, deliver our solution safely to thousands of developers globally, and control compute costs—all while pushing the boundaries of video understanding and creation using generative AI.”
Generating accurate and insightful video summaries and highlights
Twelve Labs’ Marengo and Pegasus foundation models deliver groundbreaking video analysis that not only provides text summaries and audio translations in more than 100 languages, but also analyzes how words, images, and sounds relate to one another, such as matching what’s said in speech to what’s shown in video. Content creators can also access exact moments, angles, or events within a show or game using natural language searches. For example, major sports leagues use Twelve Labs technology on AWS to automatically and rapidly create highlight reels from their extensive media libraries to improve the viewing experience and drive fan engagement.
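Once a model has scored each segment of a game or show for relevance, assembling a highlight reel reduces to a small post-processing step: keep the high-scoring segments and merge cuts that sit close together in time. The sketch below shows one plausible version of that step; the segment scores are assumed to come from a video-understanding model and are hard-coded here for illustration.

```python
def build_highlight_reel(segments, threshold=0.7, gap=1.0):
    """Merge high-scoring video segments into a highlight reel.

    `segments` is a list of (start_sec, end_sec, score) tuples, where the
    score might come from a video-understanding model rating each moment's
    relevance (e.g. "action sequences featuring a favorite actor").
    Segments scoring at or above `threshold` are kept; kept segments less
    than `gap` seconds apart are merged into one continuous cut.
    """
    keep = sorted((s, e) for s, e, score in segments if score >= threshold)
    reel = []
    for start, end in keep:
        if reel and start - reel[-1][1] <= gap:
            # Close enough to the previous cut: extend it instead of
            # starting a new one, so the reel plays smoothly.
            reel[-1] = (reel[-1][0], max(reel[-1][1], end))
        else:
            reel.append((start, end))
    return reel

segments = [
    (0, 8, 0.2),      # build-up, low relevance
    (8, 12, 0.9),     # action
    (12.5, 16, 0.8),  # action continues after a brief cut
    (30, 35, 0.95),   # another highlight later in the footage
]
print(build_highlight_reel(segments))  # merged cuts: [(8, 16), (30, 35)]
```

Tuning `threshold` per viewer is one way the "tailored to each viewer's interests" behavior described above could be realized.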
“Twelve Labs is using cloud technology to turn vast volumes of multimedia data into accessible and useful content, driving improvements in a wide range of industries,” said Jon Jones, vice president and global head of Startups at AWS. “Video is a treasure trove of valuable information that has, until now, remained unavailable to most viewers. AWS has helped Twelve Labs build the tools needed to better understand and rapidly produce more relevant content.”
Accelerating and lowering the cost of model training
Twelve Labs uses Amazon SageMaker HyperPod to train its foundation models, which are capable of comprehending different data formats like videos, images, speech, and text all at once. This allows its models to unlock deeper insights compared to AI models focused on just one data type. The training workload is split across multiple AWS compute instances working in parallel, which means Twelve Labs can train its foundation models for weeks or even months without interruption. Amazon SageMaker HyperPod provides the infrastructure needed to train AI models quickly, fine-tune their performance, and scale up operations seamlessly.
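Splitting a training workload across instances usually means synchronous data parallelism: each worker computes gradients on its own shard of the batch, the gradients are averaged (an all-reduce), and every worker applies the same update. The sketch below simulates that loop sequentially in plain Python on a toy one-parameter linear model; it illustrates the pattern, not SageMaker HyperPod's actual interface.

```python
def grad(w, batch):
    # Gradient of mean squared error for a 1-D linear model y = w * x.
    n = len(batch)
    return sum(2 * (w * x - y) * x for x, y in batch) / n

def train_data_parallel(data, n_workers=4, lr=0.01, steps=50):
    """Simulate synchronous data-parallel SGD: shard the data across
    workers, compute per-shard gradients, then average them -- the same
    all-reduce pattern used when a training job spans many instances."""
    w = 0.0
    shards = [data[i::n_workers] for i in range(n_workers)]
    for _ in range(steps):
        grads = [grad(w, shard) for shard in shards]  # parallel in practice
        w -= lr * sum(grads) / n_workers              # averaged (all-reduced) update
    return w

# Toy dataset generated from y = 3x; training should recover w close to 3.
data = [(x, 3.0 * x) for x in range(1, 9)]
print(round(train_data_parallel(data), 2))  # ≈ 3.0
```

In a real multi-week run, each worker would also checkpoint periodically so the job can resume after a hardware fault rather than restart from scratch, which is the kind of resilience the uninterrupted weeks-long training described above depends on.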
Leveraging the scale of AWS to expand globally
As part of a three-year Strategic Collaboration Agreement (SCA), Twelve Labs will work with AWS to deploy its advanced video understanding foundation models across new industries and enhance its model training capabilities using Amazon SageMaker HyperPod. AWS Activate, a program that helps startups grow their business, has empowered Twelve Labs to scale its generative AI technology globally and unlock deeper insights from hundreds of petabytes of videos—down to split-second accuracy. This support includes hands-on expertise for optimizing machine learning performance and implementing go-to-market strategies. Additionally, AWS Marketplace enables Twelve Labs to seamlessly deliver its innovative video intelligence services to a global customer base.
About Amazon Web Services
Since 2006, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud. AWS has been continually expanding its services to support virtually any workload, and it now has more than 240 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, media, and application development, deployment, and management from 108 Availability Zones within 34 geographic regions, with announced plans for 18 more Availability Zones and six more AWS Regions in
About Amazon
Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Amazon strives to be Earth’s Most Customer-Centric Company, Earth’s Best Employer, and Earth’s Safest Place to Work. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Career Choice, Fire tablets, Fire TV, Amazon Echo, Alexa, Just Walk Out technology, Amazon Studios, and The Climate Pledge are some of the things pioneered by Amazon. For more information, visit amazon.com/about and follow @AmazonNews.
View source version on businesswire.com: https://www.businesswire.com/news/home/20241203833659/en/
Amazon.com, Inc.
Media Hotline
Amazon-pr@amazon.com
www.amazon.com/pr
Source: Amazon.com, Inc.