Cloud

2022 03 15

AWS – Amazon RDS for PostgreSQL now supports mysql_fdw extension for Amazon Aurora, MySQL and MariaDB Databases

Amazon RDS for PostgreSQL adds support for mysql_fdw which allows your PostgreSQL database to connect and retrieve data stored in separate Amazon Aurora MySQL-compatible, MySQL, and MariaDB databases.

Read More for the details.

2022 03 15

AWS – Amazon RDS for PostgreSQL supports PostgreSQL minor versions 14.2, 13.6, 12.10, 11.15, and 10.20

AWS, Cloud AWS

Amazon Relational Database Service (Amazon RDS) for PostgreSQL now supports PostgreSQL minor versions 14.2, 13.6, 12.10, 11.15, and 10.20. We recommend that you upgrade to the latest minor versions to fix known security vulnerabilities in prior versions of PostgreSQL, and to benefit from the numerous bug fixes, performance improvements, and new functionality added by the PostgreSQL community. Please refer to the PostgreSQL community announcement for more details about the release.

Read More for the details.

2022 03 15

Azure – Public preview: Trusted launch support for Virtual Machines using Ephemeral OS disks

Azure, Cloud Azure

Trusted launch virtual machine (VM) support for VMs using Ephemeral OS disks improves the security of generation 2 VMs in Azure.

Read More for the details.

2022 03 15

AWS – Announcing two new HERE map styles for Amazon Location Service

AWS, Cloud AWS

Today, Amazon Location Service added two new HERE map styles for developers: HERE Explore and HERE Explore Truck. With HERE Explore, developers have a new global map that features roads, buildings, landmarks, and water features, including a fully designed map of Japan. With HERE Explore Truck, developers can now display a global map containing truck restrictions and attributes (e.g., width, height, and HAZMAT), symbolized with highlighted segments and icons on top of HERE Explore, to support use cases within transportation and logistics. For example, a customer who manages a fleet of delivery trucks and needs to calculate the optimal driving route with appropriate restrictions can use HERE Explore Truck to see driving routes with restrictions shown on the map.

Read More for the details.

2022 03 15

AWS – Announcing AWS AppConfig Feature Flags General Availability

AWS, Cloud AWS

AWS is announcing general availability for AWS AppConfig Feature Flags. Feature flagging allows you to quickly roll out new features safely and with more confidence. AWS AppConfig is a feature of AWS Systems Manager.
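As a rough illustration of the general idea (not of the AWS AppConfig API), a feature flag can be treated as a named switch that the application reads at run time. The sketch below assumes a hypothetical in-memory `flags` object and a hypothetical `isEnabled` helper; a real deployment would retrieve the flag configuration from the service instead of hard-coding it.

```javascript
// Hypothetical flag configuration; in practice this would be fetched
// from a configuration service rather than hard-coded.
const flags = {
  newCheckout: { enabled: true },
  darkMode: { enabled: false },
};

// Returns true only when the flag exists and is switched on.
// Unknown flags default to off, so a missing entry never enables a feature.
function isEnabled(flagSet, name) {
  return Boolean(flagSet[name] && flagSet[name].enabled);
}

// Choose a code path based on the flag instead of redeploying.
const checkoutVersion = isEnabled(flags, 'newCheckout') ? 'v2' : 'v1';
console.log(checkoutVersion);
```

Defaulting unknown flags to off is one possible design choice; it keeps a typo in a flag name from accidentally exposing an unfinished feature.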

Read More for the details.

2022 03 15

AWS – Automate visual inspection of product defects at the edge with Amazon Lookout for Vision – now generally available

AWS, Cloud AWS

Amazon Lookout for Vision is now generally available to AWS customers to use at the edge. You can deploy your trained Amazon Lookout for Vision models to a hardware device of your choice and run inference locally without any cloud dependencies. Your trained models can be deployed on any NVIDIA Jetson edge appliance or x86 compute platform running Linux with an NVIDIA GPU accelerator. You can use AWS IoT Greengrass to deploy and manage your edge compatible customized models on your fleet of devices. AWS IoT Greengrass is an open-source edge runtime and cloud service for building, deploying, and managing device software.

Read More for the details.

2022 03 15

GCP – A handheld lab: How Cue Health is revolutionizing healthcare diagnostics for COVID and beyond

Cloud, Google Cloud gcp

Editor’s note: In response to the COVID-19 pandemic, Cue Health, inventors of the portable Cue Health Monitoring System and Cue COVID-19 At-Home Test, engaged Google Cloud to help quickly transform and scale their back-end operations. The company had largely been in research-and-development mode since its founding in 2010 and needed to shift to commercial mode, manufacturing and deploying millions of Cue units across the United States, almost overnight.

Cue’s primary goal has been to make simple, remote diagnostic testing more accessible and affordable. Yet an even more important mission was at work, especially during the pandemic: helping public health officials and researchers gain a fuller picture of community health. Cue leveraged Google Cloud’s secure infrastructure to immediately help scale, store, and protect critical data that offered officials a more timely and accurate picture of the COVID-19 pandemic. This data also provides constant feedback to help ensure Cue’s tests continue to be among the most accurate COVID-19 self-tests available to users.

This is the story of how they did it.

Modern microbiology began in the 17th century with the discovery of bacteria. Two centuries later, as Louis Pasteur applied the principles of microbiology to explore the relationship between germs and disease, he struggled to trace the cause of rabies—a virus too small to be detected using standard equipment of the day. Through four years of tireless experimentation, however, Pasteur found that a weakened extract of rabies-infected tissue might protect against it. What many consider the first vaccine was born. After this breakthrough, scientists continued to search experimentally for answers until the 1930s, when the invention of electron microscopy allowed them to see and study the contagium vivum fluidum, or virus, so they could develop vaccine improvements.

It’s hard to imagine that conquering COVID-19 might take decades or even centuries, when to a weary public, each additional day of the pandemic seems endless. For modern health warriors to make progress against a virus, they need to do more than see it, identify its properties, and run experiments to find treatments. Collecting and analyzing data is becoming one of the most powerful ways to track the trajectory of a pathogen and to understand which therapies neutralize it most effectively.

Thus far, timely and comprehensive data has proved elusive during the pandemic, helping it rage on for more than two years.

For all the advances made since Pasteur’s time, the reason our data remains disconnected, even when so much of it is available, is quite clear: “The majority of the healthcare diagnostics infrastructure we rely on today is built on systems that were established decades ago,” says Chris Achar, chief strategy officer at Cue Health.

Achar describes the traditional diagnostic process: patients make an appointment, visit their healthcare professional, and provide samples that their doctor’s office couriers to a centralized lab. It can take days and even a week for a test result to come back, at which point the doctor informs the patient and recommends a treatment. Diagnostic data is typically centralized with patients having limited to no access.

During the COVID-19 pandemic, time gaps have caused obvious problems for public health officials. The weeks it may take between initial symptoms and final test results became a chasm that left room for mutations to spread before the public even knew they existed. “The average turnaround time for sequencing data in this country is 28 days,” Achar explains. “Delta became the dominant variant within 30 days.”

By the time health officials recognized Delta, it had already taken hold.

As variants like Omicron emerged, people, desperate to stay healthy, scrambled for rapid antigen tests they could take at home. While faster than traditional lab diagnostics, these paper tests aren’t as reliable. Moreover, many people don’t report positive results, leaving public health officials blind to the true positivity rate, location, and velocity of spread. Thus, when mutations arise unexpectedly, the tools officials use to counter them rely on mostly outdated information. “It’s resulted in a huge blind spot across the country,” Achar says.

When people fall ill with COVID-like symptoms, they need fast and accurate testing and diagnosis. Likewise, health officials need fast, accurate and connected data reporting so that they can formulate a response in time to make a difference. Unless we cross the data chasm, society will remain at the mercy of every new phase of this pandemic—and of every pandemic that follows.

Pioneering at-home molecular testing

Cue Health has been pioneering technology to provide at-home molecular testing since 2010. The founders, Ayub Khattak and Clint Sever, started the company after observing the H1N1 pandemic, better known as the 2009 Swine Flu outbreak, and the lack of diagnostic capability that existed at the point of care at that time.

“They wanted to create something that was fast, field deployable, reliable, and had the ability to connect data back into the public healthcare system,” Achar says, “something that could be used by both healthcare professionals and ultimately consumers alike, with a platform capability to do a number of molecular assays.”

The result of their efforts is the Cue Health Monitoring System, which has three parts. The Cue Reader is a portable, Bluetooth-operated, rechargeable device. The Cue Test Cartridge is a single-use disposable unit that contains all the chemistry and components needed to run a nucleic acid amplification test (NAAT). Lastly, the Cue Health App is a downloadable mobile app that lets users manage their test results, conduct a supervised test if needed, connect to a doctor via in-app virtual care, and use e-prescription capabilities.

Usage is simple: insert a new cartridge into the Cue Reader, collect a sample with the provided wand, and insert the wand into the cartridge. The Cue App does the rest. “The Cue Reader is connected to the mobile app via Bluetooth and integrated into e-prescription and healthcare systems,” Achar explains. Cue Health de-identifies test results to analyze near real-time trends, providing valuable insight for public health officials.

Prior to the COVID-19 pandemic, Cue had already been working with the Biomedical Advanced Research and Development Authority (BARDA), a division of the U.S. Department of Health and Human Services that focuses on advancing healthcare technology. BARDA was interested in Cue’s technology as part of ongoing U.S. healthcare infrastructure with a focus on pandemic response.

Through this relationship, Cue Health was one of the few companies able to access the genome sequence for COVID-19 as soon as it became available in early 2020. Because the Cue Health Monitoring System was designed from the start to be adaptable, Cue Health engineers only needed to change the chemistry inside the cartridges to create a new set that could detect COVID-19. Within three weeks, they had a high-performing test, which would go on to become the first molecular diagnostic product for consumer home use to be authorized by the U.S. Food and Drug Administration.

Validated by an independent clinical study at the Mayo Clinic, the Cue Health COVID-19 test is 97.8% accurate when compared to lab-based PCR.

“People can now run a COVID-19 test anywhere using the Cue Reader,” Achar says, “and have lab-quality molecular test results delivered digitally to their mobile device in about 20 minutes.”

Meeting Pandemic-level Demand

While developing an accurate and reliable COVID-19 test was the most important step, it was only the first one. Demand for this test could extend far beyond a few people or populations in a few geographic regions. Communities and countries across the planet could all benefit from a new kind of test. The next—and far more daunting—hurdle would be to manufacture, distribute, and process the number of tests that fighting a global pandemic requires.

“We looked at a problem that was playing out in society and said there’s got to be a better way,” Ayub Khattak, one of the co-founders and CEO of Cue Health, says. “We set about bringing together and creating the right technologies that would solve the problem.” With a grant from the U.S. Department of Defense to build Cue Health’s pandemic infrastructure, the company sought help exploring and addressing all the issues they would face in transforming their operations from R&D to commercial scale.

“COVID-19 and swine flu shined a light on a very basic problem of healthcare information access,” Khattak says. “We needed to build a product that scaled.” Cue Health knew they couldn’t scale their on-premises infrastructure fast enough to meet the incredible demand headed their way, particularly given healthcare’s stringent requirements around privacy and security. To realize their vision, Cue Health engaged Google Cloud to help move their existing environment to the cloud. They began with a centralized healthcare data lake capable of storing the vast amounts of test result data that would be necessary to create an accurate picture of the pandemic.

“To earn the confidence of Health and Human Services and the Department of Defense, we had to share our data compliance and approach around security, HIPAA, scalability, and redundancy,” Achar explains. “Google Cloud was great for that, but also internationally, because you have to domicile health data in each of the different territories where you get approved. With Google Cloud, we don’t have to recreate the wheel every time. We’re able to separate data and domicile it within Google Cloud’s capability in, say, Canada or Singapore, or wherever we get authorizations.”

Cue Health also chose Google Cloud because of its native support for industry-specific data standards such as HL7v2 and FHIR, plus pre-built HIPAA-compliant environments, which made rapid scaling across environments vastly easier. Google Cloud also simplifies secure access for patients and caregivers to time-sensitive data, and the Google Cloud Healthcare API can de-identify that data so it’s accessible to researchers performing near real-time analysis of health-related trends.

“It’s the 21st century,” Achar said. “Our health data, something that’s so personal and that’s ours, should be more easily accessible. Google Cloud gave us the scalability we needed, and also the security we needed, because what happens to people’s data and where it gets reported is such an important topic.”

Using Data to Stay Ahead

What’s more, Cue now offers its customers the ability to have their positive COVID sample sequenced through a separate in-home Cue Sequencing Collection Kit. Once sequenced, genomic data, stored in Google Cloud, can be analyzed at a vast scale using advanced AI and machine learning tools.

“We have this idea that if you can bring in more sequencing information, and if you can connect it to other important data layers, then you can make more meaning out of the information to predict what’s going to happen in the future,” Khattak says.

Now that their portable testing solution is publicly available, Cue Health hopes that broad adoption will make lab-quality testing available to communities that don’t have access to diagnostic lab facilities, reduce the time it takes to identify a new variant to 10 days or fewer, and help the medical community determine which populations are at higher risk and which types of individuals respond best to specific types of treatments.

“We’re deployed in a number of underserved communities, rural communities, and even tribal locations,” Achar says. “That’s a huge step forward for people who may not otherwise have access to molecular PCR lab-quality testing.”

By crossing the data chasm, public health officials can take precise, preventative action that helps everyone, instead of swinging way too blindly, way too late, for far too many people. “As we’ve announced with Google Cloud, we’re building a dashboard that could be used by public health officials,” Achar says. “We envision a mini-mesh sonar network that can show when positivity is increasing in one region as opposed to another, which will help public health officials decide where to deploy more antiviral and surge response teams.”

To scientists who have walked in Pasteur’s footsteps, disease is a complex structure to unravel, a potentially life-altering human condition that demands relief. But studying disease requires more than learning its biological makeup. We must understand how a disease starts and spreads, and how it affects individuals and populations, so we can learn how to stop it. Every month, week, day, or minute sooner that pioneers like Cue Health can deliver crucial diagnostic data to caregivers and scientists, the more they can help an exhausted medical community get ahead of this public health crisis—and stay ahead for the next.

“The implication of having connectivity, data, and being able to take better action not just for the pandemic, but for health in general, is huge,” Achar says. “This is the tip of the iceberg right now.”

Read More for the details.

2022 03 15

GCP – Customer Care portfolio: Flexible, scalable, robust support

Cloud, Google Cloud gcp

Technical support is now more critical than ever. It’s crucial to keeping your business running smoothly, while rapidly adjusting to an increasingly hybrid workforce that needs to stay connected at all times. Although the scale may vary, organizations of all sizes face similar challenges. We launched the Cloud Customer Care portfolio, a significant evolution in our technical support services, to address your needs with more comprehensive, scalable, and flexible services that can help you focus on your core business and provide you the service you expect from Google Cloud – regardless of the size of your organization.

A reasonably priced technical support service for an unlimited number of users, Standard Support is intended for the general needs of small- to medium-sized organizations that have workloads in development. But as your business looks to build capacity and maintain workloads in production, you’ll need rapid critical-incident response, greater flexibility, and more specialized features. That’s where our Enhanced Support can provide exceptional value.

Enhanced Support

Unplanned downtime, especially during planned events, can be catastrophic. Our Enhanced Support service is designed to keep you up and running with faster response times 24/7, along with direct access to technical support cases, our Cloud Support API to optimize management, and workload-centric support for multitechnology environments.

But special circumstances demand special attention. That’s why we’ve created Value-Add Services for Enhanced Support that can give you the flexibility to:

Receive expert assistance with our Technical Account Advisor Service. This service includes guided onboarding and ongoing hands-on stewardship, as well as monthly, quarterly, and yearly reviews, trend analysis, optimization recommendations, and dedicated case-escalation management for critical incident response.

Get ahead of key business events that drive sudden high-traffic spikes like product launches, grand openings, or data migrations with Planned Event Support. Working with your team, we cover pre-event architecture reviews and accelerated response times, all followed by comprehensive post-event reporting that details pitfalls, successes, and lessons learned.

Add a layer of governance to your support experience with Assured Support. By restricting support services to personnel who meet geographical-location and attribute-based requirements, it helps you ensure compliance with local standards, maintain data integrity and sovereignty, and maximize operational efficiencies.

“The combination of Enhanced Support and the Technical Account Advisor Service is the ideal solution for us at Moloco. It is an inexpensive way to access the timely attention we need, when we need it. From the start, we’ve experienced noticeable improvements with response times, technical guidance, and service reviews critical to our business success,” says Changhoon Kim, VP of Engineering, Moloco.

In short, Enhanced Support helps you optimize your cloud experience with high-quality and robust support, fast response times, and additional services for businesses of all sizes. And if you sign up for Enhanced Support now, you’ll receive a 50% discount until March 31, 2022.

What’s next for customers?

Existing Silver, Gold, and Role-Based Support services will end for customers on May 31, 2022. Make the move now to our new Customer Care portfolio and keep your support services running seamlessly – with added capabilities.

What’s next for partners?

Existing Role-Based Support services will end for partners on May 31, 2022. To help ensure services continue to run seamlessly, be sure to move your organization – or, for resellers, your customer’s organization – to our new Customer Care portfolio prior to that date. For more information on partner programs and benefits, please refer to the Partner Advantage portal.

If customers and partners choose not to make the transition, current support services will automatically transition to Basic Support, a nontechnical service for admin and billing inquiries only.

What’s right for you?

To get started, compare support services – including Basic, Standard, Enhanced, and Premium Support – and explore our pricing calculator to find the level that’s best for your needs and budget. Once you’ve selected your service, making the switch is simple, but the process looks a little different depending on your current plan. Check out step-by-step instructions for transitioning from Role-Based Support or transitioning from Silver or Gold Support. You can also sign up through the Google Cloud Console or contact your sales rep.

Questions? Concerns? Suggestions? We want to hear from you.

Your input is critical to how we continue to grow and refine the entire Cloud Customer Care portfolio. That’s why we regularly assess the effectiveness of our support services and base future improvements directly on your feedback. If you have any questions regarding which service is right for you or need assistance making the move, please contact us at Cloud Customer Care Support.

Sign up for Enhanced Support through the Cloud Console or contact your sales rep, and receive a 50% discount until March 31, 2022.

Read More for the details.

2022 03 14

AWS – Announcing Windows support for containerd runtime on EKS starting with Kubernetes 1.21

AWS, Cloud AWS

Amazon Elastic Kubernetes Service now supports the containerd container runtime on Windows worker nodes. Containerd is a lightweight container runtime that manages the complete container lifecycle on its host system, from container image transfer to execution, as well as storage and network attachment. Customers with Windows workloads can now get similar performance, security, and stability benefits from containerd that are available to customers running Linux-based worker nodes.

Read More for the details.

2022 03 14

AWS – AWS Cost Anomaly Detection supports integration with AWS Chatbot

AWS, Cloud AWS

Starting today, you can receive AWS Cost Anomaly Detection alert notifications in Slack and Amazon Chime through AWS Chatbot. Integration with AWS Chatbot allows you to easily configure your cost anomaly alert subscriptions with a Slack or Amazon Chime chat channel. This allows you to receive individual AWS Cost Anomaly Detection alerts within your existing chat channels, supporting improved collaboration and timely resolution of the alerts.

Read More for the details.

2022 03 14

AWS – Amazon ECS now supports on-premises workload orchestration on Windows OS

AWS, Cloud AWS

Amazon Elastic Container Service (Amazon ECS) now supports managing on-premises workloads running on a Windows operating system with Amazon ECS Anywhere. Amazon ECS Anywhere is a capability of Amazon ECS that enables customers to more easily run and manage container-based applications on-premises, including virtual machines (VMs), bare metal servers, and other customer-managed infrastructure. Customers who need to manage containerized workloads running on Windows on-premises can now more easily orchestrate them using ECS Anywhere. Developers no longer need to run additional container orchestration software or convert their Windows-based workloads to Linux OS.

Read More for the details.

2022 03 14

AWS – AWS Network Firewall achieves FedRAMP High compliance

AWS, Cloud AWS

AWS Network Firewall has achieved FedRAMP High authorization for the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions. You can now use AWS Network Firewall to protect and control access to and from your Amazon Virtual Private Clouds (VPCs) in these regions for workloads that require FedRAMP High categorization level.

Read More for the details.

2022 03 14

AWS – Amazon Cognito launches support for in-region integration with Amazon SES and Amazon SNS

AWS, Cloud AWS

Amazon Cognito now enables you to use Amazon Simple Email Service (Amazon SES) and Amazon Simple Notification Service (Amazon SNS) in the same region where your Amazon Cognito user pools are configured. By integrating these services in the same region, you can more easily achieve lower latency, and follow best practices in regional configuration.

Read More for the details.

2022 03 14

AWS – Amazon Route 53 Resolver DNS Firewall significantly reduces service cost

AWS, Cloud AWS

Over the course of March 2022, Amazon is reducing the cost of using the Amazon Route 53 Resolver DNS Firewall in all regions. First, Amazon is launching new tiered pricing effective March 1 that provides for reduced query processing fees as your query volume increases. Secondly, Amazon is implementing internal optimizations to reduce the number of DNS queries for which you are charged. Note that these optimizations will not reduce the number of DNS queries that are inspected or introduce any other changes to your security posture. Amazon customers will see these cost reductions automatically reflected in their Route 53 bills going forward. To learn more, please visit the Amazon Route 53 pricing page.
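To make the tiered-pricing idea concrete: under a tiered scheme, each slice of query volume is billed at the rate of the tier it falls into, so the average cost per query drops as volume grows. The sketch below is only an illustration. The tier boundary and rates are made-up round numbers, not actual Amazon Route 53 prices, and `tieredCost` is a hypothetical helper.

```javascript
// Hypothetical tiers: NOT actual Amazon Route 53 prices.
// `upTo` is the cumulative query count where the tier ends;
// `ratePerMillion` is the charge in dollars per million queries in that tier.
const exampleTiers = [
  { upTo: 10 * 1e6, ratePerMillion: 2 },
  { upTo: Infinity, ratePerMillion: 1 },
];

// Bills each slice of the query volume at the rate of the tier it falls in.
function tieredCost(queries, tiers) {
  let remaining = queries;
  let lowerBound = 0;
  let cost = 0;
  for (const tier of tiers) {
    if (remaining <= 0) break;
    const tierCapacity = tier.upTo - lowerBound;
    const inTier = Math.min(remaining, tierCapacity);
    cost += (inTier / 1e6) * tier.ratePerMillion;
    remaining -= inTier;
    lowerBound = tier.upTo;
  }
  return cost;
}

// 15 million queries: the first 10 million at $2, the next 5 million at $1.
console.log(tieredCost(15 * 1e6, exampleTiers)); // 25
```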

Read More for the details.

2022 03 14

GCP – How the CARTO platform enables the creation of advanced data visualizations with Google Maps Platform and deck.gl

Cloud, Google Cloud gcp

Editor’s note: Today’s blog post is from Alberto Asuero, CTO of CARTO, the location intelligence platform. Today he shares more details about the source of the data for advanced data visualizations created with Google Maps Platform and deck.gl and how the CARTO platform enables this workflow.

During Google Cloud Next in October, the Google team announced the newest release of the deck.gl visualization library, thanks to a collaboration with our geospatial company CARTO and the vis.gl Technical Steering Committee (TSC). The deck.gl release includes a deep integration with the new WebGL-powered features in the Maps JavaScript API that allows deck.gl to render 2D and 3D visualizations directly on the Google basemap.

Our team built an example app that visualizes a variety of data sources that show the potential for electrification of truck fleets in Texas. This app showcases the different types of advanced data visualizations that can be created with Google Maps Platform and deck.gl. Today, I want to share more details about the source of the data for these visualizations and how the CARTO platform enables this workflow.

Google Cloud provides a strong serverless data warehousing solution, BigQuery, with support for geospatial queries. When you are dealing with spatial data, creating maps to explore and visualize these datasets is an important and common need. The CARTO Spatial Extension for BigQuery provides an easy way to create connections to the data warehouse, design a map with data coming from BigQuery tables, and then add these visualizations to a web app using deck.gl.

An example of retrieving a Map ID from CARTO Builder

Different custom styles can be applied to the map directly in CARTO Builder

Making a simple map

To create a simple map using the CARTO platform, you can sign up for a trial account. Once you have signed in, you can set up a connection to your BigQuery instance using a service account. Then, you can go to the Data Explorer and browse the available datasets to find the table you want to use as the datasource in your map. For more information, check out the CARTO documentation.

CARTO Builder’s Data Explorer allows you to preview different geospatial datasets

To create a visualization of power transmission lines in Texas, you can start with the Texas state boundary to provide some context. In the Data Explorer, you can preview the table and click the “Create map” button in the top-right corner to start designing your visualization.

Using the CARTO Builder map making tool, select one of the available Google vector basemap styles and customize the layer style.

The result of an executed geospatial query displayed in CARTO Builder

You can visualize tables and the results from queries executed in the data warehouse, which is a powerful feature because you can also execute spatial analysis functions using SQL, including those from the CARTO Analytics Toolbox. In this case, you can intersect the lines in the table containing all the U.S. transmission lines within the Texas boundary. Click on the “Add source from…” button and select the “Custom Query (SQL)” option to add the following query:

    SELECT *
    FROM cartobq.nexus_demo.transmission_lines
    WHERE ST_INTERSECTS(
      geometry,
      (SELECT geom FROM cartobq.nexus_demo.texas_boundary_simplified)
    );

An example of a geospatial query being executed in CARTO Builder

Click the “Run” button, and the query is executed in BigQuery. The results are sent back to the Builder tool. Perform some style customizations in the new layer, and your map is ready.

Results are automatically visualized when they are returned from BigQuery

Before adding the map to the Google Maps Platform application, you’ll need to make it public. Click on the “Share” button and select the “Developers” tab to copy the map ID.

Generate a Map ID for use with Google Maps Platform web and mobile SDKs from the share menu in CARTO Builder

Now, you can add the visualization into your Google Maps Platform application, which is as easy as adding these four lines of code:

    const cartoMapId = 'b502bf53-877d-4e89-b5ad-71982cac431d';
    deck.carto.fetchMap({cartoMapId}).then(({layers}) => {
      const overlay = new deck.GoogleMapsOverlay({layers});
      overlay.setMap(map);
    });

You can use the map ID copied from CARTO Builder to call the fetchMap function. This function connects to the platform and retrieves all the information needed for the visualization, including a collection of deck.gl layers with all the styling properties you’ve specified. Create an instance of the deck.gl GoogleMapsOverlay with this collection of layers and add it to the map.

You can see the full example in this fiddle.

Full example available in JSFiddle

Visualizing very large datasets

One of the main features of BigQuery is the ability to scale processing to massive datasets. With the CARTO platform, you can also visualize very large datasets using tilesets, an optimized data structure containing pre-generated vector tiles for fast visualization. Tilesets are generated within BigQuery using the Analytics Toolbox functions in a parallelized process that can handle billions of points.

For example, you can create a visualization using tilesets with the whole dataset of transmission lines for the U.S., more than 100MB of geometries.

The issue with these large datasets is that they do not fit in memory all at once, so you need to split them into tiles for them to be rendered progressively. CARTO takes care of this, allowing you to create tilesets directly in BigQuery or dynamically generate them on the fly.
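To illustrate what splitting into tiles means in practice, the sketch below uses the standard Web Mercator XYZ ("slippy map") formula to find which tile contains a given point at a given zoom level. This is the common web-mapping scheme and is shown as an assumption for illustration; the source does not describe how CARTO tilesets are indexed internally, and `lonLatToTile` is a hypothetical helper name.

```javascript
// Converts a longitude/latitude pair (in degrees) to XYZ tile indices at a
// zoom level, using the standard Web Mercator ("slippy map") tiling formula.
function lonLatToTile(lon, lat, zoom) {
  const n = 2 ** zoom; // number of tiles along each axis at this zoom
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  return { x, y, z: zoom };
}

// Example: a point near Austin, Texas, at zoom level 6.
console.log(lonLatToTile(-97.74, 30.27, 6));
```

A renderer only needs to request the tiles that overlap the current viewport, which is why the full dataset never has to fit in memory at once.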

A tileset generated in BigQuery and displayed in CARTO Builder

This method for data loading in maps can scale as much as needed; for example, take a look at this 17 billion point visualization of vessel data.

Dataset of 17 billion points, rendered using a custom tileset

What about live data?

BigQuery supports streaming data that is continuously updated. In these scenarios, you want to be able to update your visualization at regular intervals, as the data changes. It’s easy to update this visualization using deck.gl. You just need to set the autoRefresh parameter to true when fetching the map and specify the function you want to execute when new data is downloaded:

const {layers} = await deck.carto.fetchMap({
  cartoMapId,
  autoRefresh: true,
  onNewData: (parsedMap) => { … }
});

You can add points to a table with an INSERT statement in the BigQuery console and see the data updated on the map in real time.

Datasets in BigQuery can be updated on the fly and visualized with CARTO

Going further

In addition to the simple ways to create visualizations shown above, deck.gl has the flexibility to create a wide variety of visualizations. The CARTO platform provides you with the functionality to access data from your data warehouse and create these data visualizations with advanced cartographic capabilities, but you can extend it and go beyond that using any of the advanced visualizations available in the deck.gl layer catalog.

There are two additional options that give you more control over the deck.gl code. The first one is to use the CartoLayer directly without fetchMap. You’ll need to indicate which CARTO platform connection to use, the data source type, and the source name or query. Then you can specify the styling properties.

const overlay = new deck.GoogleMapsOverlay({
  layers: [
    new deck.carto.CartoLayer({
      connection: 'bqconn',
      type: deck.carto.MAP_TYPES.TABLE,
      data: `cartobq.public_account.retail_stores`,
      getFillColor: [238, 77, 90],
      pointRadiusMinPixels: 6,
    }),
  ],
});

The second option is to use the fetchLayerData function, which gives you more control over the format used for data transfer between BigQuery and your application. It can be used with advanced visualizations that require a specific data format, like ArcLayer, H3HexagonLayer, or TripsLayer.

deck.carto.fetchLayerData({
  type: deck.carto.MAP_TYPES.TABLE,
  source: `cartobq.geo_for_good_meetup.texas_pop_h3`,
  connection: 'bqconn',
  format: deck.carto.FORMATS.JSON,
  credentials: {
    accessToken: 'eyJhbGciOiJIUzI1NiJ9.eyJhIjoiYWNfbHFlM3p3Z3UiLCJqdGkiOiI1YjI0OWE2ZCJ9.Y7zB30NJFzq5fPv8W5nkoH5lPXFWQP0uywDtqUg8y8c'
  }
}).then(({data}) => {
  const layers = [
    new deck.H3HexagonLayer({
      id: 'h3-hexagon-layer',
      data,
      extruded: true,
      getHexagon: d => d.h3,
      getFillColor: [182, 0, 119, 150],
      getElevation: d => d.pop,
      elevationScale: 2.5,
      parameters: {
        blendFunc: [luma.GL.SRC_ALPHA, luma.GL.DST_ALPHA],
        blendEquation: luma.GL.FUNC_ADD
      }
    })
  ];
  const overlay = new deck.GoogleMapsOverlay({layers});
  overlay.setMap(map);
});

For complete code using both options, take a look at these examples.

Example of using the deck.gl Hexagon Layer visualization with Google Maps Platform and CARTO

Learn more

You can access demos and documentation on the deck.gl docs website and the CARTO Documentation Center. If you have questions, you can ping the CARTO team on the CARTO Users Slack workspace.

For more information on Google Maps Platform, visit the Google Maps Platform website.

Read More for the details.

2022 03 14

AWS – Amazon Connect now supports rich formatting in chat messages

AWS, Cloud AWS

Amazon Connect Chat now allows your agents and customers to use rich text formatting when composing a message, enabling them to quickly add emphasis and structure to messages, improving comprehension. The available formatting options include bold, italics, hyperlinks, bulleted lists, and numbered lists.

Read More for the details.

2022 03 14

GCP – Data Governance in the Cloud – part 2 – Tools

Cloud, Google Cloud gcp

This is part 2 of the Data Governance blog series published in January. This blog focuses on technology to implement data governance in the cloud.

Along with a corporate governance policy and a dedicated team of people, implementing a successful data governance program requires tooling. From securing data, retaining and reporting audits, enabling data discovery, and tracking lineage to automating monitoring and alerts, multiple technologies are integrated to manage the data life cycle.

Google Cloud offers a comprehensive set of tools that enable organizations to manage their data securely, ensure governance, and drive data democratization. These tools fall into the following categories:

Data Security

Data security encompasses securing data from the point it is generated or acquired, through transmission and permanent storage, to its retirement at the end of its life. Multiple strategies, supported by various tools, are used to ensure data security and to identify and fix vulnerabilities as data moves through the data pipeline.

Google Cloud’s Security Command Center is a centralized vulnerability and threat reporting service. Security Command Center is a built-in security management tool for Google Cloud platform that helps organizations prevent, detect, and remediate vulnerabilities and threats. Security Command Center can identify security and compliance misconfigurations in your Google Cloud assets and provides actionable recommendations to resolve the issues.

Data Encryption

All data in Google Cloud is encrypted by default, both in transit and at rest. All VM to VM traffic, client connections to BigQuery, serverless Spark, Cloud Functions, and communication to all other services in Google Cloud within a VPC as well as between peered VPCs is encrypted by default.

In addition to the default encryption provided out of the box, customers can manage their own encryption keys in Cloud KMS. Client-side encryption, where customers keep full control of the encryption keys at all times, is also available.

Data Masking and Tokenization

While data encryption ensures that data is stored and travels in an encrypted form, end users are still able to see the sensitive data when they query the database or read a file. Several compliance regulations require de-identifying or tokenizing sensitive data. For example, GDPR recommends data pseudonymization to “reduce the risk on data subjects”. De-identified data reduces the organization’s obligations on data processing and usage. Tokenization, another data obfuscation method, makes it possible to perform data processing tasks, such as verifying credit card transactions, without knowing the real credit card number. Tokenization replaces the original value of the data with a unique token. The difference between tokenization and encryption is that data encrypted using keys can be deciphered using the same keys, while tokens are mapped to the original data in the tokenization server. Without access to the token server, the original value cannot be recovered, even if a bad actor gets access to the token.
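The distinction between a token and a ciphertext can be sketched in a few lines. This is a toy in-memory vault for illustration only; it is not the Cloud DLP API, and the class name TokenVault, the tok_ prefix, and the sample card number (a widely used test value) are assumptions.

```javascript
const { randomBytes } = require('crypto');

// A toy token vault. In a real deployment the mapping lives in a separate,
// tightly controlled tokenization service, not in application memory.
class TokenVault {
  constructor() {
    this.tokenToValue = new Map();
    this.valueToToken = new Map();
  }

  // Return a random token for a value; the same value always gets the same token.
  tokenize(value) {
    if (this.valueToToken.has(value)) return this.valueToToken.get(value);
    const token = 'tok_' + randomBytes(16).toString('hex');
    this.tokenToValue.set(token, value);
    this.valueToToken.set(value, token);
    return token;
  }

  // Only a caller with access to the vault can recover the original value.
  detokenize(token) {
    if (!this.tokenToValue.has(token)) throw new Error('unknown token');
    return this.tokenToValue.get(token);
  }
}

const vault = new TokenVault();
const token = vault.tokenize('4111111111111111');
console.log(token);                   // random; derived from no key and no card data
console.log(vault.detokenize(token)); // original value, available only via the vault
```

Because the token is random rather than computed from the card number, there is no key that could decrypt it; the only path back to the original value is a lookup in the vault.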

Google’s Cloud Data Loss Prevention (DLP) automatically detects, obfuscates and de-identifies sensitive information in your data using methods like data masking and tokenization. When building data pipelines or migrating data into the cloud, integrate Cloud DLP to automatically detect and de-identify or tokenize sensitive data and allow data scientists and users to build models and reports while minimizing risk of compliance violations.

Fine Grained Access Control

BigQuery supports fine-grained access control for your data in Google Cloud. BigQuery access control policies can be created to limit access at the column and row level. Column and row level access control combined with DLP allows you to create datasets that have both a safe (masked or encrypted) version of the data and a clear version of the data. This promotes data democratization, where the CDO can trust the guardrails of Google Cloud to allow access correctly according to the user identity, accompanied by audit logs to ensure a system of record. Data can be shared across the organization to run analysis and build machine learning models while ensuring that sensitive data remains inaccessible to unauthorized users.

Data Discovery, Classification and Data Sharing

The ability to find data easily is crucial to an effective data-driven organization. Data governance programs leverage data catalogs to create an enterprise repository of all metadata. These catalogs allow data stewards and data users to add custom metadata and create business glossaries, and allow data analysts and scientists to search across the organization for data to analyze. Certain data catalogs also let users request access to data from within the catalog; requests can be approved or denied based on policies created by data stewards.

Google Cloud offers a fully managed and scalable Data Catalog to centralize metadata and support data discovery. Data Catalog adheres to the same access controls the user has on the data, so users cannot search for data they cannot access. Further, Data Catalog is natively integrated into the GCP data fabric, so there is no need to manually register new datasets in the catalog: the same “search” technology that scours the web auto-indexes newly created data.

In addition, Google partners with major data governance platforms, such as Collibra and Informatica, to provide unified support for your on-prem and multi-cloud data ecosystem.

Data Lineage

Data lineage allows tracing data back to its sources. It lets data scientists ensure their models are trained on carefully sourced data, lets data engineers build better dashboards from known data sources, and allows policies to be inherited from data sources by their derivatives (so if a sensitive data source is used to create an ML model, that ML model can be labeled sensitive as well).

The ability to trace data to the source and keep a log of all changes made as the data progresses in the data pipeline provides a clear picture of the data landscape to the data owners. It makes it easier to identify data not tracked in data lineage and take corrective action to bring it under established governance and controls. When data is scattered across on-prem, cloud or multi cloud environments, a centralized lineage tracking platform gives a single view on where data originated and how data is moving across the organization. Tracking lineage is imperative to control costs, ensure compliance, reduce data duplication, and improve data quality.

Google Cloud’s Data Fusion provides end-to-end data lineage to help governance and ensure compliance. A data lineage system for BigQuery can also be built using Cloud Audit Logs, Data Catalog, Pub/Sub, and Dataflow. The architecture of building such a lineage system is described here. Additionally, Google’s rich partner ecosystem includes market leaders providing data lineage capabilities for on-prem and hybrid clouds, e.g. Collibra. Open source systems, e.g. Apache Atlas, can also be implemented to collect metadata and track lineage in Google Cloud.

Auditing

It is important to keep all data access records for auditing purposes. Audits can be internal or external. Internal audits verify that the organization is meeting all compliance criteria and prompt corrective action if needed. If an organization is operating in a regulated industry or keeping personal information, then keeping audit records is a compliance requirement.

Google Cloud Audit Logs can be turned on to ensure compliance with audits in Google Cloud and answer “who did what, where, and when across Google Cloud services?”. Cloud Logging (formerly Stackdriver) aggregates all the log data from your infrastructure and applications in one place. Cloud Logging automatically collects data from Google Cloud services, and you can feed in application logs using the Cloud Logging agent, FluentD, or the Cloud Logging API. Logs in Cloud Logging can be forwarded to GCS for archival, to BigQuery for analysis, and also streamed to Pub/Sub to share logs with external third-party systems.

Finally, Cloud Log Explorer allows you to easily retrieve, parse, and analyze logs and build dashboards to monitor logging data in real time.

Data Quality

Before data can be embedded in the decision making process, organizations need to ensure data meets the established quality standards. These standards are created by data stewards for their data domains.

Google Dataprep by Trifacta provides a friendly user interface to explore data and visualize data distribution. Business users can use Dataprep to quickly identify outliers, duplicates, and missing values before using data for analysis.

GCP’s Dataplex enables data quality assessment through declarative rules that can be executed on Dataplex serverless infrastructure. Data owners can create rules to find duplicate records and to ensure completeness, accuracy, and validity (e.g., a transaction date cannot be in the future). Data owners can schedule these checks using Dataplex’s scheduler or include them in a pipeline by using the APIs. Data quality metrics are stored in a BigQuery table and/or are made available in Cloud Logging for further dashboarding and automation.

Additionally, Google’s rich partner ecosystem includes leading data quality software providers, e.g. Informatica and Collibra. Data quality tools are used to monitor on-prem, cloud, and multi-cloud data pipelines to identify quality issues and quarantine or fix poor quality data.
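The kinds of rules mentioned above (duplicates, completeness, validity) can be sketched as plain predicates over records. This is an illustration of the idea only, not Dataplex's rule syntax or API; the record fields (id, customerId, amount, transactionDate) and the function name runChecks are hypothetical.

```javascript
// Each rule is a name plus a predicate over a single record.
const rules = [
  {
    name: 'completeness: customerId present',
    check: (r) => r.customerId != null && r.customerId !== '',
  },
  {
    name: 'validity: amount is non-negative',
    check: (r) => typeof r.amount === 'number' && r.amount >= 0,
  },
  {
    // An unparseable date compares as false, so it is reported as a failure too.
    name: 'validity: transaction date not in the future',
    check: (r, now) => new Date(r.transactionDate) <= now,
  },
];

// Run every rule against every record and collect the failures.
function runChecks(records, now = new Date()) {
  const failures = [];
  const seenIds = new Set();
  for (const record of records) {
    // Uniqueness needs state across records, so it is handled outside the rule list.
    if (seenIds.has(record.id)) {
      failures.push({ id: record.id, rule: 'uniqueness: duplicate id' });
    }
    seenIds.add(record.id);
    for (const rule of rules) {
      if (!rule.check(record, now)) failures.push({ id: record.id, rule: rule.name });
    }
  }
  return failures;
}
```

Keeping rules as data rather than hard-coded branches is what makes them easy to schedule, report on, and extend, which is the appeal of a declarative approach.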

Analytics Exchange

Organizations looking to democratize data need a platform to easily share and exchange data analytics assets. A dashboard, report, or model that one team has built is often useful to other teams. In large organizations, in the absence of an easy way to discover and share these assets, work is replicated, leading to higher cost and lost time. Exchanging analytics assets also enables teams to discover data issues, improving reliability and data quality. Increasingly, organizations are also looking to exchange analytics assets with external partners. These can be used to negotiate better costs with vendors and even create a cash stream, depending on the use case.

Analytics Hub enables organizations to securely share and subscribe to analytics assets. Analytics Hub is a critical tool for organizations looking to democratize data and embed data in all decision making across the organization.

Compliance Certifications

Before organizations can migrate data to the cloud, they need to ensure all compliance requirements have been met. An organization may be required to comply with certain regulations because of the region it operates in, e.g. CCPA in California, GDPR in Europe, and LGPD in Brazil. Organizations are also subject to regulations because of their specific industry, e.g. PCI DSS in banking, HIPAA in healthcare, or FedRAMP when working with the US federal government.

Google Cloud has more than 100 compliance certifications that are specific to regions and industries. Google continues to add regulatory and compliance certifications to its portfolio. Dedicated compliance teams help customers ensure compliance as they migrate their data and onboard to Google Cloud.

Conclusion

Start your data governance journey by exploring Dataplex, Google’s solution for centrally managing and governing data across your organization. As you look towards implementing data democratization, consider Analytics Hub to build a data analytics exchange to share your analytics assets easily. Security is built into every Google product, and compliance certifications across the globe and industries ease data migrations to the cloud. If you have already started your cloud journey, ensure high-quality data and secure access to sensitive data attributes by using native Google Cloud and partner products in GCP.

Where to learn more:

Google Data Governance leaders have captured best practices and Data Governance learnings in an O’Reilly publication: Data Governance, The Definitive Guide

Read More for the details.

2022 03 14

GCP – Celebrating Pi Day with Cloud Functions

Cloud, Google Cloud gcp

March 14 is Pi Day, an annual celebration of the mathematical constant π. We’re doing a few experiments with the new Cloud Functions (2nd gen) this year to showcase the new serverless platform.

Serverless π calculation

Can we go serverless to calculate π? There is a relatively new algorithm called the Bailey–Borwein–Plouffe formula (BBP formula) to calculate digits of π without computing the preceding digits, which means we can run many calculations in parallel and gather the results later. We can build a MapReduce pipeline to get the numbers. Cloud Functions (2nd gen) supports larger instances (16 GB memory and 4 vCPUs) and extends the max function execution duration to 60 minutes from 9 minutes so we can push the limits further.
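For readers who want to see why the BBP formula allows jumping straight to an arbitrary position, here is a minimal sketch of BBP digit extraction. Note that the formula yields hexadecimal digits of π. This is a textbook double-precision version, not the code used in the experiment; the function names are hypothetical, and it is only reliable for modest positions because of floating-point limits.

```javascript
// Modular exponentiation: (base ** exp) % mod using square-and-multiply.
// Plain Numbers are exact here as long as mod * mod < 2 ** 53.
function modPow(base, exp, mod) {
  let result = 1 % mod;
  let b = base % mod;
  let e = exp;
  while (e > 0) {
    if (e % 2 === 1) result = (result * b) % mod;
    b = (b * b) % mod;
    e = Math.floor(e / 2);
  }
  return result;
}

// Fractional part of the sum over k >= 0 of 16^(d-k) / (8k + j).
function series(j, d) {
  let sum = 0;
  // Terms with k <= d: reduce the numerator modulo the denominator first,
  // which discards integer parts that do not affect the digit at position d.
  for (let k = 0; k <= d; k++) {
    const denom = 8 * k + j;
    sum = (sum + modPow(16, d - k, denom) / denom) % 1;
  }
  // Terms with k > d shrink by 16x each; a handful suffices for double precision.
  for (let k = d + 1; k <= d + 16; k++) {
    sum += 16 ** (d - k) / (8 * k + j);
  }
  return sum % 1;
}

// Hexadecimal digit of pi at 0-based position d after the point.
function piHexDigit(d) {
  let x = (4 * series(1, d) - 2 * series(4, d) - series(5, d) - series(6, d)) % 1;
  if (x < 0) x += 1; // JavaScript's % keeps the sign of the dividend
  return Math.floor(16 * x).toString(16);
}

// Digits for an arbitrary window; no earlier digits are needed.
function piHexDigits(start, count) {
  let out = '';
  for (let i = 0; i < count; i++) out += piHexDigit(start + i);
  return out;
}

console.log(piHexDigits(0, 8)); // 243f6a88 (pi is 3.243f6a88... in hexadecimal)
```

Because piHexDigits(8, 8) can be computed without first computing positions 0 to 7, separate windows can be handed to separate workers and concatenated afterwards.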

Here’s the architecture we built. We used Node to implement the functions.

The calculation runs as follows.

The processRequest function creates and submits tasks (e.g., calculate X digits from offset Y) to Pub/Sub.

The calculateOffset function gets triggered by Pub/Sub, computes digits, and stores the result in Firestore.

A separate controller function keep-watch monitors Firestore and triggers combineResults once all tasks are complete, storing the final output in Cloud Storage as a text file.
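The steps above follow a split, compute, and combine pattern, which can be sketched without any cloud services. In this simplified version plain arrays stand in for Pub/Sub and Firestore, and the function names splitIntoTasks and combineInOrder are hypothetical, not the functions from the experiment.

```javascript
// Split "calculate `total` digits" into tasks of at most `chunkSize` digits each.
function splitIntoTasks(total, chunkSize) {
  const tasks = [];
  for (let offset = 0; offset < total; offset += chunkSize) {
    tasks.push({ offset, digits: Math.min(chunkSize, total - offset) });
  }
  return tasks;
}

// Combine per-task results once all of them are present. Workers may finish
// in any order, so results are sorted by offset before concatenation.
function combineInOrder(results, expectedTaskCount) {
  if (results.length !== expectedTaskCount) return null; // still waiting
  return [...results]
    .sort((a, b) => a.offset - b.offset)
    .map((r) => r.value)
    .join('');
}
```

Returning null while results are incomplete mirrors the role of the controller function, which only triggers the combine step when every task has reported.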

How long does it take? It takes about 45 minutes to calculate 64 million digits.

Unfortunately, the BBP formula is computationally more complex than other algorithms used in record-breaking attempts (such as our previous 31.4 trillion digit record), so it is unlikely we can use this method to establish a new record. However, the extended runtime in Cloud Functions makes running many new applications possible, including this experiment.

Serverless Pi API

We’ve also built a new API to serve digits of π. We previously used Google Kubernetes Engine (GKE), but we have migrated the service to Cloud Functions 2nd gen. We also rewrote the API program to fetch data from Cloud Storage instead of Persistent Disk.

We’ve reduced the total monthly expenses from about 9,000 US dollars to 450 US dollars by changing the architecture.

Storage cost: The most significant cost factor is the disk. The previous configuration used a Zonal SSD Persistent Disk to store the digits of π. The 50 trillion digits in plain text take up 50 TB, which costs 8,500 US dollars a month. We’ve moved all the data to Cloud Storage. We’ve also changed the object format to a compressed file so that the 50 trillion digits consume 21 TB instead of 50 TB. With these changes, storage cost is about 400 US dollars a month, a saving of about 95%.

Compute cost: Cloud Functions also helps reduce cost. There is no base CPU cost for Cloud Functions as it only charges when processing requests. Previously with GKE, we always had four instances in two zones, which cost 391 US dollars per month with the e2-standard-4 instance type. We estimate that the monthly cost with Cloud Functions would be around 7 US dollars for 1 million requests, a saving of about 98%.
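As a quick sanity check, the savings can be recomputed from the figures quoted above. The snippet only uses numbers from the text; the itemized sums come out slightly different from the rounded totals of about 9,000 and 450 US dollars, which presumably include items not broken out here.

```javascript
// Figures quoted in the text, in US dollars per month.
const before = { storage: 8500, compute: 391 };
const after = { storage: 400, compute: 7 };

// Percentage saved when a cost drops from oldCost to newCost.
const saving = (oldCost, newCost) =>
  ((1 - newCost / oldCost) * 100).toFixed(1) + '%';

console.log(saving(before.storage, after.storage)); // 95.3%
console.log(saving(before.compute, after.compute)); // 98.2%
console.log(before.storage + before.compute);       // 8891
console.log(after.storage + after.compute);         // 407
```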

The new service architecture is also faster and more robust as it runs in multiple regions. The global HTTP(S) load balancer chooses the closest region automatically and improves latency. The API function has a small cache inside and returns a response without accessing the Cloud Storage bucket on a cache hit.
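The text does not say how the in-function cache works, so the following is only one plausible shape for it: a small least-recently-used cache placed in front of a slower loader. The LRU eviction policy, the class name LruCache, and the loadChunk callback standing in for a Cloud Storage read are all assumptions.

```javascript
// Minimal LRU cache built on the insertion order of Map.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert to mark the entry as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

// Wrap an async loader so that repeated requests skip the backing store.
function cached(loadChunk, cache) {
  return async (key) => {
    const hit = cache.get(key);
    if (hit !== undefined) return hit;
    const value = await loadChunk(key);
    cache.set(key, value);
    return value;
  };
}
```

Since function instances are short-lived and independent, a cache like this only helps requests that land on a warm instance, which is why it can stay small.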

Live demo

We have a live demo using the new API where you can listen to music created by the digits of π or watch an animation of transitions of the numerical sequence. We host the entire 50 trillion digits from the 2021 world record for the demo. You can also call the API directly from the command line:

curl "https://api.pi.delivery/v1/pi?start=0&numberOfDigits=100"

The source code of the API service and the demo are available on GitHub. We use Terraform to manage the infrastructure, and we’ve published the scripts as well, so it’s a great place to learn with actual examples.

Happy Pi Day!

Read More for the details.

2022 03 14

Azure – General availability: Support for Private Link in Azure Digital Twins

Azure, Cloud Azure

Support for Private Link is now generally available for Azure Digital Twins, a platform that enables you to create digital representations of real-world things, places, and business processes.

Read More for the details.

2022 03 14

Azure – Generally available: Pin analytics tile to dashboards in Azure IoT Central

Azure, Cloud Azure

Quickly build new visuals by simply pinning an analytics tile from data explorer to the dashboard directly.

Read More for the details.
