Azure – Azure Load Testing: Run tests for up to 24 hours
Azure Load Testing now enables running tests for durations up to 24 hours.
Read More for the details.
Now you can protect, monitor, and recover your WSFC clusters as a single unit across their DR lifecycle, while also generating cluster-consistent recovery points that are consistent across all the disks (including the shared disk) of the cluster.
Read More for the details.
We are excited to announce the launch of a new CloudWatch Logger feature with AWS Amplify, which is available now for Swift and Android developers. This feature empowers developers to log errors from the Amplify libraries to CloudWatch, enhancing the ability to detect production issues. It also enables developers to write custom logs to detect failures in different parts of their applications.
Read More for the details.
Amazon VPC IP Address Manager (IPAM) now supports three new CloudWatch metrics (VpcIPUsage, SubnetIPUsage, and PublicIPv4PoolIPUsage) that allow you to identify underutilized or near-capacity IP address ranges and optimize your IP address usage on AWS. These metrics proactively track IP address usage across resources such as Amazon Virtual Private Clouds (Amazon VPCs), subnets, and Public IPv4 Pools. You can also set alarms on these metrics in Amazon CloudWatch to receive notifications when an IP address usage threshold is breached. For a consolidated view of all IP address-related insights, these metrics are also available on the IPAM Dashboard.
Read More for the details.
AWS announces the general availability of Amazon EC2 M7i-flex and EC2 M7i instances powered by custom 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids). These custom processors, available only on AWS, offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers.
Read More for the details.
Amazon EventBridge Scheduler now supports setting schedules to delete automatically after the last invocation completes. This can be used for one-time, cron, and rate schedules with an end date.
Read More for the details.
AWS Fargate for Amazon Elastic Kubernetes Service (EKS) now lets customers configure the size of ephemeral storage for their workloads up to a maximum of 175 GiB. This enables customers with data intensive workloads to utilize AWS Fargate for Amazon EKS. AWS Fargate for Amazon EKS removes the need to provision and manage servers, lets you specify and pay for resources per application, and improves security through application isolation by design.
Read More for the details.
Today, AWS announced the opening of a new AWS Direct Connect location within the EdgeConnex Herzliya data center in Herzliya, Israel. By connecting your network to AWS at this location, you gain private, direct access to all public AWS Regions (except those in China), AWS GovCloud Regions, and AWS Local Zones.
Read More for the details.
AWS Config now supports 19 more resource types for services, including AWS Amplify, Amazon AppIntegrations, AWS App Mesh, Amazon Athena, Amazon Elastic Compute Cloud (Amazon EC2), Amazon CloudWatch Evidently, Amazon Forecast, AWS IoT Greengrass Version 2, AWS Ground Station, AWS Elemental MediaConnect, AWS Elemental MediaTailor, Amazon Managed Streaming for Apache Kafka (Amazon MSK), Amazon Personalize, Amazon Pinpoint, and AWS Resilience Hub.
Read More for the details.
AWS Batch now supports Linux ARM64 and Windows x86 containers in AWS Fargate via the AWS Batch console. This feature helps AWS Batch customers simplify the adoption of modern container technology by expanding their architecture options for scheduling Linux ARM64 and Windows x86 containers in Fargate compute environments. Support for the ARM64 architecture also gives customers the benefits of Graviton processors in Fargate, which can improve price/performance over comparable x86-based instances for a variety of workloads, including high performance computing.
Read More for the details.
Effectively managing your network’s IP addresses is essential to efficiently operating your enterprise. As a network administrator, observing the usage and growth of your IP address space is essential for capacity planning and proactively avoiding costly downtime. It is also key to identify where IP address allocation is non-optimal and could be resized for better resource utilization.
In order to efficiently manage your network's IP address resources, you should monitor:
1) The current allocation of IP addresses distributed across your subnetworks
2) High subnet IP Utilization to avoid resource exhaustion
Network Intelligence Center's Network Analyzer automatically monitors your VPC configurations to surface network and service issues. Network Analyzer proactively powers subnet IP address management workflows through two key insights:
1) IP address utilization summary
Streamlines identifying where IP addresses are nearing depletion and which ranges are under-utilized.
2) High IP address utilization of a subnet range
Enables proactive monitoring of subnet ranges nearing IP exhaustion
Our new Network Analyzer insight, IP utilization summary, is designed to help network administrators better understand their IP address utilization in Google Cloud. It lets you observe the IP address utilization of all the VPCs and subnet ranges in your Google Cloud project, empowering the identification of subnets that:
1) May reach full IP address utilization in the near future
2) May be oversized and underused
Select a Google Cloud project with a VPC network configured, and record the project ID. Replace the <PROJECT_ID> referenced below with the ID of your Google Cloud project.
1) Enabling the recommender.googleapis.com API in this project
2) Getting the IP utilization information for this project
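A minimal sketch of these two steps from Cloud Shell follows. The insight type name google.networkanalyzer.vpcnetwork.ipAddressInsight is an assumption based on Network Analyzer's insight naming, so verify it against the Recommender documentation.

# 1) Enable the Recommender API in the project
gcloud services enable recommender.googleapis.com --project=<PROJECT_ID>

# 2) Fetch the IP utilization insight for the project (insight type name assumed)
gcloud recommender insights list \
    --project=<PROJECT_ID> \
    --location=global \
    --insight-type=google.networkanalyzer.vpcnetwork.ipAddressInsight \
    --format=json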
Example of output:
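For illustration, a hypothetical response showing only the fields discussed below; the values and the wrapper field names are invented:

{
  "content": {
    "ipUtilizationSummary": [
      {
        "subnetRangePrefix": "10.128.0.0/20",
        "subnetUri": "projects/<PROJECT_ID>/regions/us-central1/subnetworks/default",
        "allocationRatio": 0.5
      }
    ]
  },
  "lastRefreshTime": "2023-08-01T00:00:00Z"
}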
The attribute allocationRatio represents the percentage of used IP addresses compared to the available IP addresses in your subnet range (identified by the subnetRangePrefix and subnetUri). The allocationRatio will contain a value between 0 and 1, with 1 representing 100% of IP utilization. For example, an allocationRatio of 0.5 represents 50% IP utilization by that subnet. Subnet ranges with 0% IP utilization are excluded from this insight.
This insight takes into account the four IP addresses that Google reserves in each IPv4 subnet.
Notice that you can see the last refresh date in the "lastRefreshTime" field. Network Analyzer automatically refreshes the IP utilization information every day. You can also trigger a manual refresh by navigating to Network Analyzer in the Google Cloud Console and clicking the "UPDATE" button.
Every time this IP utilization information is updated (whether automatically, or manually), a log entry will be created. You can use the following query to navigate to such logs:
jsonPayload.causeCode="IP_UTILIZATION_IP_ALLOCATION_SUMMARY"
You can also use this logging query to understand the historical IP utilization of your subnets at a prior point in time.
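For example, a minimal query from the command line (the limit and output format are our choices):

gcloud logging read \
    'jsonPayload.causeCode="IP_UTILIZATION_IP_ALLOCATION_SUMMARY"' \
    --project=<PROJECT_ID> --limit=10 --format=json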
The IP Utilization Summary insight is accessible programmatically via the Recommender API and/or Network Analyzer logs. In the future, this insight is planned to be integrated into the Network Analyzer UI.
This IP utilization summary insight supports both subnet primary range and secondary ranges. If you are using secondary ranges, the insight will allow you to monitor how many IP addresses are allocated in the secondary range. For example, if you use GKE, you will see how much of the secondary range for pods is allocated with your current GKE nodes.
To monitor GKE IP utilization, please take a look at additional Network Analyzer insights documentation.
A special thank you to Network Analyzer Engineering Lead Hongkun Yang for his contributions to this blog post.
Read More for the details.
Azure Red Hat OpenShift is now DoD IL4 certified, allows you to deploy infrastructure nodes without an OpenShift subscription fee, and provides increased security by allowing private clusters without public IP.
Read More for the details.
Amazon Elastic Kubernetes Service (Amazon EKS) now supports the Amazon Elastic File System (EFS) Container Storage Interface (CSI) driver as an EKS add-on, making it simpler and easier to use EFS shared file storage with your EKS clusters.
Read More for the details.
Amazon SageMaker Studio is a fully integrated development environment (IDE) for machine learning (ML) that supports data scientists and ML practitioners across their end-to-end machine learning workflow, from preparing data to building, training, tuning, and deploying models. In May 2023, we launched SageMaker Distribution, a pre-built Docker image that includes the most popular libraries for machine learning, as an open-source project at JupyterCon. Today, we are announcing support for SageMaker Distribution in Amazon SageMaker Studio.
Read More for the details.
Starting today, Amazon Relational Database Service (RDS) for Oracle supports read and mounted replicas for instances on the multitenant container database (CDB) architecture running in single-tenant configuration. Amazon RDS for Oracle replicas fully manage the configuration of Oracle Data Guard to create and maintain replicas in the same or different AWS Region as the primary DB instance.
Read More for the details.
Starting today, customers can specify the price-capacity-optimized allocation strategy in AWS Batch. Previously, AWS Batch supported the capacity-optimized Spot allocation strategy, which is designed to optimize Spot Instance placement based on capacity availability to help reduce the likelihood that workloads are interrupted. The new price-capacity-optimized allocation strategy is designed to provide a balance between price and capacity.
Read More for the details.
Database performance is essential for any business because it affects the efficiency and effectiveness of day-to-day operations. A slow database can cause delays in processing transactions, which can have a negative impact on customer satisfaction and profitability.
Cloud SQL for PostgreSQL offers database observability through Cloud SQL Insights and SQLcommenter, which help customers diagnose, detect, and prevent database performance issues using a developer-first approach.
Cloud SQL for PostgreSQL supports additional database logs that can include metadata about database connections and disconnections, as well as query execution. All of these logs can be configured using database flags. Cloud SQL also supports widely used PostgreSQL Extensions, such as pg_stat_statements. By enabling both Cloud SQL Insights and native options, customers can get the best of both worlds.
This blog post explains how customers who have migrated to Cloud SQL for PostgreSQL can still use familiar PostgreSQL tools such as pgBadger and pg_stat_statements for database observability, in addition to Cloud SQL Insights. Unlike Cloud SQL Insights, generating reports using pgBadger requires additional steps to activate all necessary logging. If logging is not enabled, a partial report will be generated. Additionally, pgBadger requires additional operational steps to set up a server or GCE instance to download and create a report.
This article describes how to configure logging and generate HTML reports using pgBadger. It also emphasizes the value of using pg_stat_statements to capture nested calls within procedural calls.
Cloud SQL for PostgreSQL provides database flags that control the information captured in database logs. First, we will configure flags to enable logging of slow queries and of every connection and disconnection. We will enable the following database flags, with values set as examples rather than as recommended values:
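The specific flags below are an illustrative assumption, chosen to be consistent with the 300 millisecond slow-query threshold used later in this post:

log_min_duration_statement = 300   # log statements that run longer than 300 ms
log_connections = on               # log each successful connection
log_disconnections = on            # log each session end, with duration
log_lock_waits = on                # log when a session waits on a lock
log_temp_files = 0                 # log every temporary file created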
Activating the above-mentioned parameters or setting aggressive logging values can put a strain on the database, so assess the performance impact before changing them in production. Refer to the complete list of flags that are available for configuration on Cloud SQL for PostgreSQL.
A sample command line for updating the database flags:
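A sketch using gcloud; the instance name is a placeholder, and note that --database-flags replaces the instance's entire flag set, so list every flag you want to keep:

gcloud sql instances patch <INSTANCE_NAME> \
    --database-flags=log_min_duration_statement=300,log_connections=on,log_disconnections=on,log_lock_waits=on,log_temp_files=0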
We are going to activate a new setting to record all nested calls as part of a top-level statement issued by clients. This will help us identify problematic nested calls, i.e. statements invoked within procedural code, in the pg_stat_statements view.
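On Cloud SQL this setting is the pg_stat_statements.track database flag; a sketch, again repeating the earlier flags because --database-flags replaces the whole set:

gcloud sql instances patch <INSTANCE_NAME> \
    --database-flags=pg_stat_statements.track=all,log_min_duration_statement=300,log_connections=on,log_disconnections=on,log_lock_waits=on,log_temp_files=0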
We can verify our set parameter from the database by querying the pg_settings view.
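For example, from a client with connectivity to the instance (host and user are placeholders):

psql -h <INSTANCE_IP> -U postgres -c \
    "SELECT name, setting FROM pg_settings WHERE name = 'pg_stat_statements.track';"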
We will use pgbench, a tool for benchmarking PostgreSQL databases, to simulate a performance baseline run on our Cloud SQL database. The loads generated during the mock benchmark run will be used to generate reports from database logs using pgBadger.
We will initialize a newly created database, "pgbench", using the pgbench command line tool that ships with the PostgreSQL client libraries.
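A sketch of the initialization; the scale factor is our choice:

# Create the target database on the Cloud SQL instance
gcloud sql databases create pgbench --instance=<INSTANCE_NAME>

# Populate the pgbench schema and sample data (-i initializes, -s sets the scale)
pgbench -i -s 50 -h <INSTANCE_IP> -U postgres pgbench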
pgBadger is a tool that processes database log files to generate HTML reports covering connections, sessions, vacuums, temporary files, top queries, and more. We have already configured the relevant database flags to log information about each database connection and about queries taking more than 300 milliseconds.
We will configure Cloud SQL for PostgreSQL logs to be stored in a Cloud Storage bucket, and then download them to Cloud Shell to process them using pgBadger to generate HTML reports.
As the first step, we will install and set up pgBadger in Cloud Shell.
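One way to do this, assuming the pgbadger Debian package is available to Cloud Shell (building from the pgBadger GitHub source is an alternative):

sudo apt-get update && sudo apt-get install -y pgbadger
pgbadger --version   # verify the installation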
Cloud SQL uses Cloud Logging to store all instance logs, and Cloud Logging can be used to view and query database instance logs. We will create a log routing sink to send all required Cloud SQL instance logs to a Google Cloud Storage (GCS) destination. The Logs Router is part of Cloud Logging, and the sink destination is defined as a Cloud Storage bucket.
We have set up the GCS bucket to be the destination for Cloud SQL logs, and we have filtered the logs to only include those from Cloud SQL databases with the specified ID label.
We have provided the required sink information and set up a new Cloud Storage bucket to which all logs are routed as an hourly batch process. We can also provide additional filters so that the sink includes only specific Cloud SQL instances.
Inclusion filter for our sample instance is:
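A plausible reconstruction of such a filter, followed by a sink-creation command that uses it; the sink, bucket, project, and instance names are placeholders:

resource.type="cloudsql_database"
resource.labels.database_id="<PROJECT_ID>:<INSTANCE_NAME>"
logName="projects/<PROJECT_ID>/logs/cloudsql.googleapis.com%2Fpostgres.log"

# Create the sink routing matching log entries to a Cloud Storage bucket
gcloud logging sinks create cloudsql-pg-sink \
    storage.googleapis.com/<BUCKET_NAME> \
    --log-filter='resource.type="cloudsql_database" AND resource.labels.database_id="<PROJECT_ID>:<INSTANCE_NAME>" AND logName="projects/<PROJECT_ID>/logs/cloudsql.googleapis.com%2Fpostgres.log"'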
Cloud SQL execution logs will be available in hourly batches for generating pgBadger reports. For real-time insight into load and SQL details, we can always use Cloud SQL Insights.
We have already initialized the pgbench sample database and will now simulate 50 client connections in 5 threads as a benchmark test to generate load on our Cloud SQL database.
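A sketch of the run; -c sets the client connections, -j the threads, and the 5-minute duration is our choice:

pgbench -h <INSTANCE_IP> -U postgres -c 50 -j 5 -T 300 pgbench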
We will download Cloud SQL for PostgreSQL logs from Cloud Storage to our cloud shell environment and run pgBadger to generate an HTML report for the mock benchmark run. We can also automate the process of downloading logs from Cloud Storage and executing pgBadger to generate reports on a variety of intervals, from hourly to weekly.
The Compute Engine service account must be granted the necessary roles in order to download logs from Cloud Storage.
Once Cloud SQL logs are downloaded, you can use the pgBadger tool to generate HTML reports from the JSON-based logs.
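A minimal sketch, assuming each exported line is a Cloud Logging JSON object whose textPayload field carries the raw postgres log line, and that a '%t ' prefix matches the reconstructed lines; inspect your export first, since the exact layout may differ:

# Download the hourly log batches from the sink bucket
gsutil -m cp -r "gs://<BUCKET_NAME>/cloudsql.googleapis.com/postgres.log/<DATE>/" ./logs/

# Unwrap the JSON entries into plain log lines, then build the HTML report
find ./logs -name '*.json' -exec cat {} + \
    | jq -r '"\(.timestamp) \(.textPayload)"' \
    | pgbadger - --prefix '%t ' -o report.html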
An initial look at the generated HTML report shows information on connections, temp file usage, SQL execution statistics, and more.
For example, we can check the slowest queries and the most resource-intensive queries under the TOP sections.
The pg_stat_statements extension provides a convenient way to view cumulative query execution statistics, which are exposed as a database view. It is preloaded via shared libraries and can be enabled in a database with the CREATE EXTENSION command.
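For example, from psql:

psql -h <INSTANCE_IP> -U postgres -d pgbench -c \
    "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"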
The official documentation for PostgreSQL covers the pg_stat_statements extension in great detail. In this blog, we will discuss how to track nested calls as part of procedural calls using the flag pg_stat_statements.track.
During a performance investigation, we might encounter a procedural call labelled as problematic or consuming considerable execution time. It then becomes difficult to debug the nested statements within the procedural code that cause or affect overall performance.
When pg_stat_statements.track is set to all, nested queries executed within a procedural statement can be captured as part of the pg_stat_statements view itself.
Let's create a faulty procedure and run it from SQL in a loop, as sketched below.
The procedure for checking the existence of records uses count aggregation on an unindexed filter column, bid. If the procedure is called multiple times, it will add overhead to the overall performance of the procedure.
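A hypothetical procedure matching this description; the name and body are our invention, built on the pgbench_accounts table created earlier:

psql -h <INSTANCE_IP> -U postgres -d pgbench <<'SQL'
CREATE OR REPLACE PROCEDURE check_accounts_exist(p_bid integer)
LANGUAGE plpgsql
AS $$
DECLARE
  v_count bigint;
BEGIN
  -- count(*) filtered on the unindexed bid column forces a full table scan
  SELECT count(*) INTO v_count FROM pgbench_accounts WHERE bid = p_bid;
  IF v_count = 0 THEN
    RAISE NOTICE 'no accounts found for bid %', p_bid;
  END IF;
END;
$$;
SQL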
To get a clean result, we will reset pg_stat_statements to capture all SQL as fresh runs.
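The reset is a single function call:

psql -h <INSTANCE_IP> -U postgres -d pgbench -c \
    "SELECT pg_stat_statements_reset();"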
Let’s execute the faulty procedural block repeatedly and query the pg_stat_statements extension to get the runtime execution time.
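A sketch of the run and the follow-up query; the column names assume PostgreSQL 13 or later, where cumulative time is total_exec_time:

psql -h <INSTANCE_IP> -U postgres -d pgbench <<'SQL'
-- Execute the faulty procedure repeatedly
DO $$
BEGIN
  FOR i IN 1..100 LOOP
    CALL check_accounts_exist(1);
  END LOOP;
END;
$$;

-- With pg_stat_statements.track = all, the nested SELECT inside the
-- procedure appears here alongside the top-level CALL
SELECT calls, round(total_exec_time::numeric, 2) AS total_ms, query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 5;
SQL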
The output highlights the nested SQL call within the faulty procedure as the problematic statement. Now that we know which nested calls are problematic, we can make recommendations and fix the overall procedural execution.
The pg_stat_statements.track setting should be changed to all only for testing and for finding problematic nested calls during the quality assurance phase; changing it in a high-workload environment may add overhead.
PostgreSQL-native observability options, such as pgBadger and the pg_stat_statements extension, enable database developers and administrators to continue leveraging tools they are familiar with when using Cloud SQL for PostgreSQL.
Unlike Cloud SQL Insights, pgBadger requires all necessary logging to be enabled in order to generate a complete report; otherwise, a partial report will be generated. Additionally, a server or GCE instance is required to download the logs and generate reports with pgBadger.
Cloud SQL for PostgreSQL gives customers the best of both worlds: native PostgreSQL features and Cloud SQL Insights. Cloud SQL Insights provides intuitive monitoring and root-cause analysis for performance issues. We will cover Cloud SQL Insights in the next part of this Database Observability series. In the meantime, you can learn more about this topic in our documentation.
Read More for the details.
Zone Redundant Storage (ZRS) for Azure Disks is now available on Azure Premium SSD and Standard SSD in East Asia
Read More for the details.
When an incident disrupts a cloud service that you rely on, an effective response starts with identifying the source of that disruption and evaluating the scope of impact. This is crucial to charting a course of action — whether that’s communicating with your stakeholders or deploying a disaster recovery procedure. But when you use a cloud service provider, your ability to mount an effective incident response is dependent on the transparency, timeliness, and actionability of the incident communications provided.
Today, we're excited to introduce Personalized Service Health, which provides fast, transparent, relevant, and actionable communication about Google Cloud service disruptions. Currently in Preview, Personalized Service Health delivers granular alerts about Google Cloud service disruptions; you can use it as a first stop in your incident response or integrate it with your incident response and monitoring tools.
Today, when Google detects an incident that could potentially impact you, we publish that information openly with Google Cloud Service Health, our highly reliable public dashboard that delivers information on active incidents that require wide distribution — typically those that tend to be larger in scope or severity. Organized by Google Cloud products and the regions they operate in, Google Cloud Service Health displays real-time information about incidents impacting Google Cloud products and provides mechanisms to download service disruption history.
Personalized Service Health takes these benefits a step further, and is the ideal destination for many customers to start their incident response journey. Personalized Service Health provides:
Controls to decide the service disruptions relevant to you: Google Cloud Service Health posts incidents that affect a broad set of customers, and is not an exhaustive list of incidents. If you prefer to see or be alerted of more incidents, earlier or more often — even smaller-scale ones — you can use Personalized Service Health to configure how and when you are alerted about incidents.
Ability to integrate with your incident management workflow: Personalized Service Health offers multiple integration options with your preferred incident management tools and workflows — for example, you can integrate alerts with PagerDuty to alert the appropriate incident responders when a service disruption begins.
Proactive incident discoverability: Personalized Service Health emits logs and can push customizable alerts to make incidents more discoverable in your workflow.
Let’s take a deeper look at these benefits.
Personalized Service Health can fire an alert to an extensive array of destinations when a Google Cloud service disruption is posted or updated. You can choose which of these you would like to be alerted on, where, and customize the alert content to include critical information about the incident — including the affected Google services and locations, current relevance to your project, observable symptoms, and known mitigations.
You can configure alerts directly in Personalized Service Health, in Cloud Monitoring, or via Terraform. Each alert can be fired to one or more destinations, including email, SMS, Pub/Sub, webhook, or PagerDuty. You can also create multiple alerts for a single project for a higher degree of granularity.
Personalized Service Health is designed to publish information related to disruptions that may affect your projects with various degrees of relevance. By definition, this approach may provide you more information than what you think is strictly necessary. To strike a balance, you can filter the incidents to only see what you may deem relevant, across a variety of integration points:
Dashboard: Filter the incident table by any displayed field and incident recency.
Alerts: You can create a conditional alerting policy with any incident field, including Google Cloud products, locations, or relevance to your project.
API: You can use request filters in your API requests to further filter events programmatically in your application.
Logs: Cloud Logging supports a robust query language to filter logs as they are routed to another destination through a log sink.
Incident response can span many people, teams, and tools in an organization. Personalized Service Health aims to fit into your existing incident response processes by offering several integration options depending on your preference for programmatic access, proactive versus reactive interactions, and existing tools.
You can use Personalized Service Health as a dashboard directly from the Google Cloud console, or fit it into any existing incident response or monitoring tool in your preferred workflow. The Service Health dashboard provides a list of active incidents relevant to your project, and, for each incident, you can see impact details about the incident or track updates from Google Cloud support. This is quick to set up and easy to maintain.
If you’re integrating Personalized Service Health with an external alerting, monitoring, or incident response tool, the Service Health API offers programmatic access to all incidents relevant to a specific project or for all projects across your organization. The API provides programmatic access to the complete list of all relevant incidents, updates from Google Cloud, and description of impact.
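For instance, a minimal sketch of listing a project's events over REST; the v1 endpoint path is our assumption, so check the Service Health API reference:

curl -s \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    "https://servicehealth.googleapis.com/v1/projects/<PROJECT_ID>/locations/global/events"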
When a service disruption begins, Cloud Logging collects Personalized Service Health logs for all updates to the event. To build up a historic record of events, you can retain logs in a storage bucket. You can also use Log Analytics with BigQuery to analyze past service disruptions.
As of today, we are excited to announce Personalized Service Health is integrated with 50+ Google Cloud products & services – including Compute Engine, Cloud Storage, all Cloud Networking offerings, BigQuery, Google Kubernetes Engine, and many more. If any integrated Google Cloud product detects a disruption that may impact you, Personalized Service Health provides an impact assessment, and shares updates including symptoms, known workarounds, or an ETA for resolution.
Some products may offer more advanced capabilities through Personalized Service Health, including faster initial posting, definitive impact signals, and may post small blast-radius incidents not posted on the public Google Cloud Service Health dashboard. Here is the complete list of integrated products and supported capabilities; we expect the list of supported Google Cloud products and capabilities will expand over time.
“The instinct for cloud providers is to be overly cautious about sharing outages too quickly. I’d rather proactively move a workload and learn there was no issue than the workload go down unknowingly. We’re happy to see Google Cloud make this step to be more transparent with customers and look forward to leveraging PSH.”
– Justin Watts, Director Information Services & Technology Strategy, Telus
“Proactive alerts from Personalized Service Health to responders are critical to any enterprise customer’s incident response process. The PagerDuty and Google Cloud partnership is able to provide our customers an essential platform for modern operations that helps them quickly respond to cloud disruptions and deliver seamless digital experiences.”
– Jonathan Rende, SVP Products, PagerDuty
Reliable infrastructure is essential for workloads in the cloud, and we're continuously raising the bar on reliability through technology, product, and process innovation. A key component of reliability is the speed and effectiveness of incident response. During a cloud service incident, however unlikely, excellent communications are vital. Personalized Service Health provides the information you need to take your incident response communications to the next level, so you can quickly assess what is happening, take action to minimize impact to your applications, and keep your stakeholders informed. To get started, enable Personalized Service Health for a project or across your organization.
Read More for the details.
Work with JSON-style documents synced across multiple regions
Read More for the details.