Cloudflare Enhances AI Inference Platform with Powerful GPU Upgrade, Faster Inference, Larger Models, Observability, and Upgraded Vector Database

Cloudflare has announced significant enhancements to its Workers AI platform, designed to help developers build faster and more powerful AI applications. Key improvements include:

  • Upgraded GPUs in over 180 cities worldwide for faster inference and support for larger models like Llama 3.1 70B
  • New persistent logs in AI Gateway for better monitoring and optimization
  • Enhanced Vectorize database, now supporting up to 5 million vectors per index with reduced query latency

These upgrades aim to minimize network latency and improve AI application performance globally. Cloudflare's CEO, Matthew Prince, emphasized the importance of network speeds and regional availability as AI becomes more integrated into daily life.

  • Expanded GPU network to over 180 cities worldwide, enhancing global AI accessibility
  • Support for larger AI models, including Llama 3.1 70B and Llama 3.2 series
  • Introduction of persistent logs in AI Gateway for improved monitoring and optimization
  • Vectorize database capacity increased from 200,000 to 5 million vectors per index
  • Reduced median query latency in Vectorize from 549ms to 31ms

Cloudflare's announcement of enhanced AI capabilities is a significant development in the AI infrastructure space. The upgrade to more powerful GPUs and support for larger models like Llama 3.1 70B demonstrates a substantial leap in processing power. This allows for more complex AI tasks and improved user experiences, potentially opening new markets for AI-driven applications.

The expansion of GPU availability to over 180 cities globally matters for AI inference because it brings processing closer to end users and reduces latency, addressing a critical bottleneck in AI deployment. This could accelerate the adoption of AI in everyday applications, particularly in regions previously underserved by centralized AI infrastructure.

The improvements to Vectorize, with increased vector capacity and reduced query latency, are particularly noteworthy. These enhancements can significantly improve the performance and cost-effectiveness of AI applications, especially in search and recommendation systems. The 94% reduction in median query latency to 31ms is impressive and could lead to more responsive AI-powered services.
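
The percentage quoted above follows directly from the two median latencies given in the announcement. The short Python check below reproduces it, along with the growth in index capacity. The function names are illustrative only and are not part of any Cloudflare tooling.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def growth_factor(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Figures taken from the announcement.
latency_cut = percent_reduction(549, 31)            # median query latency, ms
capacity_gain = growth_factor(200_000, 5_000_000)   # vectors per index

print(f"Median latency reduction: {latency_cut:.1f}%")
print(f"Index capacity growth: {capacity_gain:.0f}x")
```

Run as written, this reports a reduction of about 94.4% and a 25x increase in vectors per index, consistent with the rounded 94% figure.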

From a financial perspective, Cloudflare's AI enhancements position the company strongly in the rapidly growing AI infrastructure market. The expanded capabilities could drive increased adoption of Cloudflare's services, potentially leading to higher revenue streams from AI-related offerings.

The focus on efficiency and cost-effectiveness, particularly with the improvements in Vectorize, could attract cost-conscious enterprises looking to implement AI solutions. This may lead to increased customer acquisition and retention, positively impacting Cloudflare's financial performance.

However, investors should consider the capital expenditure required for the global GPU expansion and ongoing R&D investments. While these investments are likely necessary to maintain competitiveness, they may impact short-term profitability. The long-term potential for market share growth and revenue expansion in the AI sector could outweigh these near-term costs, making this a strategically sound move for Cloudflare's future growth prospects.

Workers AI is the easiest place to build and scale AI applications; developers can now deploy larger models and handle more complex AI tasks

SAN FRANCISCO--(BUSINESS WIRE)-- Cloudflare, Inc. (NYSE: NET), the leading connectivity cloud company, today announced powerful new capabilities for Workers AI, the serverless AI platform, and its suite of AI application building blocks, to help developers build faster, more powerful and more performant AI applications. Applications built on Workers AI can now benefit from faster inference, bigger models, improved performance analytics, and more. Workers AI is the easiest platform to build global AI applications and run AI inference close to the user, no matter where in the world they are.

As large language models (LLMs) become smaller and more performant, network speeds will become the bottleneck to customer adoption and seamless AI interactions. Cloudflare’s globally distributed network helps to minimize network latency, setting it apart from other networks that typically concentrate resources in a limited number of data centers. Cloudflare’s serverless inference platform, Workers AI, now has GPUs in more than 180 cities, providing low latency for end users around the world. With this network of GPUs, Workers AI has one of the largest global footprints of any AI platform, and it has been designed to run AI inference as close to the user as possible and help keep customer data closer to home.

“As AI took off last year, no one was thinking about network speeds as a reason for AI latency, because it was still a novel, experimental interaction. But as we get closer to AI becoming a part of our daily lives, the network, and milliseconds, will matter,” said Matthew Prince, co-founder and CEO, Cloudflare. “As AI workloads shift from training to inference, performance and regional availability are going to be critical to supporting the next phase of AI. Cloudflare is the most global AI platform on the market, and having GPUs in cities around the world is going to be what takes AI from a novel toy to a part of our everyday life, just like faster Internet did for smartphones.”

Cloudflare is also introducing new capabilities that make it the easiest platform to build AI applications with:

  • Upgraded performance and support for larger models: Cloudflare is enhancing its global network with more powerful GPUs for Workers AI to upgrade AI inference performance and run inference on significantly larger models like Llama 3.1 70B, as well as the Llama 3.2 collection of models at 1B, 3B, and 11B (with 90B coming soon). With support for larger models, faster response times, and larger context windows, AI applications built on Cloudflare’s Workers AI can handle more complex tasks with greater efficiency, creating natural, seamless end-user experiences.
  • Improved monitoring and optimization of AI usage with persistent logs: New persistent logs in AI Gateway, available in open beta, allow developers to store users’ prompts and model responses for extended periods to better analyze and understand how their application performs. With persistent logs, developers can gain more detailed insights from users’ experiences, including the cost and duration of requests, to help refine their application. Over two billion requests have traveled through AI Gateway since its launch last year.
  • Faster and more affordable queries: Vector databases make it easier for models to remember previous inputs, allowing machine learning to power search, recommendation, and text generation use cases. Cloudflare’s vector database, Vectorize, is now generally available and, as of August 2024, supports indexes of up to five million vectors each, up from 200,000 previously. Median query latency is now 31 milliseconds (ms), down from 549 ms. These improvements allow AI applications to find relevant information quickly with less data processing, which also means more affordable AI applications.
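
To make the vector database idea in the last bullet concrete, here is a toy Python sketch of what a vector query does: it ranks stored embedding vectors by their similarity to a query vector and returns the closest matches. This is not how Vectorize is implemented. It is a brute-force illustration using cosine similarity, whereas production vector databases typically rely on approximate nearest-neighbor indexes to stay fast at millions of vectors. The function names, document keys, and three-dimensional embeddings are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def query(index, vector, top_k=2):
    """Return the top_k (key, score) pairs most similar to `vector`.

    Brute force: scores every stored vector, so cost grows linearly
    with the size of the index.
    """
    scored = [(cosine_similarity(vector, v), key) for key, v in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:top_k]]


# Hypothetical embeddings; real ones come from an embedding model
# and usually have hundreds of dimensions.
index = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-tax": [0.0, 0.2, 0.9],
}

results = query(index, [1.0, 0.0, 0.0], top_k=2)
for key, score in results:
    print(f"{key}: {score:.3f}")
```

Because this sketch compares the query against every stored vector, its cost rises with index size, which is why both the per-index capacity and the median query latency cited above matter for applications built on a vector database.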

About Cloudflare

Cloudflare, Inc. (NYSE: NET) is the leading connectivity cloud company on a mission to help build a better Internet. It empowers organizations to make their employees, applications and networks faster and more secure everywhere, while reducing complexity and cost. Cloudflare’s connectivity cloud delivers the most full-featured, unified platform of cloud-native products and developer tools, so any organization can gain the control they need to work, develop, and accelerate their business.

Powered by one of the world’s largest and most interconnected networks, Cloudflare blocks billions of threats online for its customers every day. It is trusted by millions of organizations – from the largest brands to entrepreneurs and small businesses to nonprofits, humanitarian groups, and governments across the globe.

Forward-Looking Statements

This press release contains forward-looking statements within the meaning of Section 27A of the Securities Act of 1933, as amended, and Section 21E of the Securities Exchange Act of 1934, as amended, which statements involve substantial risks and uncertainties. In some cases, you can identify forward-looking statements because they contain words such as “may,” “will,” “should,” “expect,” “explore,” “plan,” “anticipate,” “could,” “intend,” “target,” “project,” “contemplate,” “believe,” “estimate,” “predict,” “potential,” or “continue,” or the negative of these words, or other similar terms or expressions that concern Cloudflare’s expectations, strategy, plans, or intentions. However, not all forward-looking statements contain these identifying words. Forward-looking statements expressed or implied in this press release include, but are not limited to, statements regarding the capabilities and effectiveness of Workers AI, AI Gateway, Vectorize, R2, and Cloudflare’s other products and technology, the benefits to Cloudflare’s customers from using Workers AI, AI Gateway, Vectorize, R2, and Cloudflare’s other products and technology, the timing of when Workers AI, AI Gateway, Vectorize, R2, or any of its related features will be generally available to all current and potential Cloudflare customers, Cloudflare’s technological development, future operations, growth, initiatives, or strategies, and comments made by Cloudflare’s CEO and others. Actual results could differ materially from those stated or implied in forward-looking statements due to a number of factors, including but not limited to, risks detailed in Cloudflare’s filings with the Securities and Exchange Commission (SEC), including Cloudflare’s Quarterly Report on Form 10-Q filed on August 1, 2024, as well as other filings that Cloudflare may make from time to time with the SEC.

The forward-looking statements made in this press release relate only to events as of the date on which the statements are made. Cloudflare undertakes no obligation to update any forward-looking statements made in this press release to reflect events or circumstances after the date of this press release or to reflect new information or the occurrence of unanticipated events, except as required by law. Cloudflare may not actually achieve the plans, intentions, or expectations disclosed in Cloudflare’s forward-looking statements, and you should not place undue reliance on Cloudflare’s forward-looking statements.

© 2024 Cloudflare, Inc. All rights reserved. Cloudflare, the Cloudflare logo, and other Cloudflare marks are trademarks and/or registered trademarks of Cloudflare, Inc. in the U.S. and other jurisdictions. All other marks and names referenced herein may be trademarks of their respective owners.

Cloudflare, Inc.
Daniella Vallurupalli
Vice President, Head of Global Communications

Source: Cloudflare, Inc.


What new capabilities has Cloudflare (NET) added to its Workers AI platform?

Cloudflare has added more powerful GPUs for faster inference, support for larger models like Llama 3.1 70B, persistent logs in AI Gateway for better monitoring, and improved Vectorize database with increased capacity and reduced latency.

How many cities does Cloudflare's (NET) GPU network now cover for AI inference?

Cloudflare's GPU network for AI inference now covers more than 180 cities around the world.

What improvements have been made to Cloudflare's (NET) Vectorize database?

Cloudflare's Vectorize database now supports up to 5 million vectors per index, up from 200,000, and has reduced median query latency from 549ms to 31ms.

When will Cloudflare (NET) host its first Builder Day Live Stream?

Cloudflare will host its first Builder Day Live Stream on September 26 at 11am PT.
