Datadog Launches New Product to Observe, Troubleshoot and Optimize Data Processing Jobs
Datadog launched Data Jobs Monitoring to help data teams and engineers detect job failures and latency spikes in their data pipelines, focusing on Spark and Databricks jobs. This new product offers real-time alerts, detailed trace views, and resource optimization to enhance job reliability and reduce costs. Users can immediately identify and resolve problematic jobs, improving overall data quality and operational efficiency.
According to Matt Camilli, Head of Engineering at Rhythm Energy, the tool has helped his team resolve Databricks job failures 20% faster. Michael Whetten, VP of Product at Datadog, emphasized the importance of visibility into expensive jobs to optimize pipelines and prioritize cost savings. Data Jobs Monitoring is now generally available.
- Data Jobs Monitoring enables 20% faster resolution of Databricks job failures.
- Provides real-time alerts for job failures and latency spikes, improving operational efficiency.
- Helps identify and optimize overprovisioned compute resources, reducing costs.
- Offers detailed trace views for faster troubleshooting and root cause analysis.
- Potential reliance on the new tool may mask underlying issues within data pipelines.
Insights
The introduction of Data Jobs Monitoring by Datadog marks a significant advancement in how cloud-based data processing jobs are managed. The new product offers an array of features that improve the detection and resolution of job failures and latency spikes, which are common pain points for data engineers and platform teams.
One of the standout features of this product is its ability to provide detailed trace views and compare multiple job runs. This helps engineers quickly pinpoint the root cause of failures and performance issues, significantly reducing downtime and improving overall workflow efficiency. The product's capability to correlate job telemetry with cloud infrastructure also enhances the debugging process, making it more precise and less time-consuming.
Another notable aspect is the focus on cost optimization. By identifying overprovisioned clusters and inefficient job runs, Data Jobs Monitoring can help organizations reduce their cloud infrastructure costs. This is particularly relevant in today's market, where cost efficiency is a priority for many businesses.
In summary, this new tool could lead to higher data quality and cost savings, making it a valuable addition to Datadog's portfolio.
From a market perspective, the launch of Data Jobs Monitoring is likely to bolster Datadog's standing in the cloud application monitoring space. The product addresses critical issues like job failure and resource overprovisioning, which are significant concerns for enterprises managing large-scale data pipelines. By tackling these pain points, Datadog can attract more customers, including those currently using other monitoring solutions.
The product also positions Datadog well against competitors by offering a comprehensive solution that integrates with existing cloud infrastructure. This integration simplifies the monitoring process for customers, potentially driving higher adoption rates. Moreover, the ability to provide real-time alerts and detailed job execution traces enhances the utility of Datadog’s platform, making it more appealing to both new and existing customers.
Additionally, the focus on cost optimization is a strategic move. In an era where companies are increasingly seeking to control and reduce operational expenses, a tool that helps in cost management will likely see strong market demand.
Overall, this launch is a positive development for Datadog, with potential for increased market share and customer acquisition.
Data Jobs Monitoring detects and helps resolve job failures and latency spikes across data pipelines
Data Jobs Monitoring immediately surfaces specific jobs that need optimization and reliability improvements while enabling teams to drill down into job execution traces so that they can correlate their job telemetry to their cloud infrastructure for fast debugging.
"Data Jobs Monitoring enables my organization to centralize our data workloads in a single place—with the rest of our applications and infrastructure—which has dramatically improved our confidence in the platform we are scaling," said Matt Camilli, Head of Engineering at Rhythm Energy. "As a result, my team is able to resolve our Databricks job failures 20% faster."
"When data pipelines fail, data quality is impacted, which can hurt stakeholder trust and slow down decision making. Long-running jobs can lead to spikes in cost, making it critical for teams to understand how to provision the optimal resources," said Michael Whetten, VP of Product at Datadog. "Data Jobs Monitoring helps teams do just that by giving data platform engineers full visibility into their largest, most expensive jobs to help them improve data quality, optimize their pipelines and prioritize cost savings."
Data Jobs Monitoring helps teams to:
- Detect job failures and latency spikes: Out-of-the-box alerts immediately notify teams when jobs have failed or are running beyond automatically detected baselines so that they can be addressed before there are negative impacts to the end user experience. Recommended filters surface the most important issues that are impacting job and cluster health, so that they can be prioritized.
- Pinpoint and resolve erroneous jobs faster: Detailed trace views show teams exactly where a job failed in its execution flow so they have the full context for faster troubleshooting. Multiple job runs can be compared to one another to expedite root cause analysis and identify trends and changes in run duration, Spark performance metrics, cluster utilization and configuration.
- Identify opportunities for cost savings: Resource utilization and Spark application metrics help teams identify ways to lower compute costs for overprovisioned clusters and optimize inefficient job runs.
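The three capabilities above can be illustrated with a minimal sketch. This is not Datadog's implementation or API: the `JobRun` record, the median-plus-deviation baseline, and the 30% CPU utilization threshold are all hypothetical choices made for illustration, since the release does not say how baselines or overprovisioning are computed.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class JobRun:
    """Hypothetical record of one job run (not a Datadog data type)."""
    job_name: str
    duration_s: float       # wall-clock run time in seconds
    succeeded: bool
    cpu_utilization: float  # mean cluster CPU utilization, 0.0 to 1.0


def duration_baseline(history: list[float], tolerance: float = 3.0) -> float:
    """Upper bound for a normal run duration.

    Assumed rule: median of past durations plus `tolerance` times their
    median absolute deviation (MAD).
    """
    mid = median(history)
    mad = median([abs(d - mid) for d in history])
    return mid + tolerance * mad


def review_run(run: JobRun, history: list[float], min_cpu: float = 0.30) -> list[str]:
    """Return the findings for one run: failed, slow, overprovisioned."""
    findings = []
    if not run.succeeded:
        findings.append("failed")
    # Only compare against a baseline when past durations exist.
    if history and run.duration_s > duration_baseline(history):
        findings.append("slow")
    # A successful run on a mostly idle cluster suggests spare capacity.
    if run.succeeded and run.cpu_utilization < min_cpu:
        findings.append("overprovisioned")
    return findings


if __name__ == "__main__":
    past_durations = [100.0, 110.0, 105.0, 95.0, 102.0]
    latest = JobRun("nightly_etl", duration_s=180.0, succeeded=True, cpu_utilization=0.12)
    print(review_run(latest, past_durations))
```

A median-based baseline is used here because a single unusually long run in the history shifts it less than it would shift a mean; a production system would likely also account for trends and seasonality.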
Data Jobs Monitoring is now generally available. To learn more, please visit: https://datadoghq.com/product/data-jobs-monitoring/.
About Datadog
Datadog is the observability and security platform for cloud applications. Our SaaS platform integrates and automates infrastructure monitoring, application performance monitoring, log management, user experience monitoring, cloud security and many other capabilities to provide unified, real-time observability and security for our customers' entire technology stack. Datadog is used by organizations of all sizes and across a wide range of industries to enable digital transformation and cloud migration, drive collaboration among development, operations, security and business teams, accelerate time to market for applications, reduce time to problem resolution, secure applications and infrastructure, understand user behavior and track key business metrics.
Forward-Looking Statements
This press release may include certain "forward-looking statements" within the meaning of Section 27A of the Securities Act of 1933, as amended, or the Securities Act, and Section 21E of the Securities Exchange Act of 1934, as amended, including statements on the benefits of new products and features. These forward-looking statements reflect our current views about our plans, intentions, expectations, strategies and prospects, which are based on the information currently available to us and on assumptions we have made. Actual results may differ materially from those described in the forward-looking statements and are subject to a variety of assumptions, uncertainties, risks and factors that are beyond our control, including those risks detailed under the caption "Risk Factors" and elsewhere in our Securities and Exchange Commission filings and reports, including the Quarterly Report on Form 10-Q filed with the Securities and Exchange Commission on November 7, 2023, as well as future filings and reports by us. Except as required by law, we undertake no duty or obligation to update any forward-looking statements contained in this release as a result of new information, future events, changes in expectations or otherwise.
Contact
Dan Haggerty
press@datadoghq.com
View original content to download multimedia: https://www.prnewswire.com/news-releases/datadog-launches-new-product-to-observe-troubleshoot-and-optimize-data-processing-jobs-302178204.html
SOURCE Datadog, Inc.