GCP – How Confidential Accelerators can boost AI workload security
As artificial intelligence and machine learning workloads become more popular, it's important to secure them with specialized data security measures. Confidential Computing can help protect sensitive data used in ML training, maintain the privacy of user prompts and AI/ML models during inference, and enable secure collaboration during model creation.
At Google Cloud Next, we announced two new Confidential Computing offerings specifically designed to protect the privacy and confidentiality of AI/ML workloads: Confidential VMs powered by NVIDIA H100 Tensor Core GPUs, and Confidential VMs with Intel Advanced Matrix Extensions (Intel AMX) support.
A3 Confidential VMs: Securing confidential AI/ML workloads
Google Cloud and NVIDIA have collaborated to bring Confidential Computing to GPUs. The H100 is NVIDIA's first GPU that supports Confidential Computing, and offers a dedicated Confidential Computing mode. This mode activates robust hardware-based security, ensuring only authorized users can access data and execute code within the Trusted Execution Environment (TEE).
With Confidential VMs with NVIDIA H100 Tensor Core GPUs, you'll be able to unlock use cases involving highly restricted datasets and sensitive models that need additional protection, and collaborate with multiple untrusted parties while mitigating infrastructure risks and strengthening isolation through Confidential Computing hardware.
Fine-tuning and training
When training AI models, you may wish to ensure that your data and code are protected at all times. With A3 Confidential VMs with NVIDIA H100 GPUs, training that involves input data with personally identifiable information (PII), proprietary data labeling, and trade secrets can be done in a TEE so fine-tuning or other AI/ML training is not visible outside the TEE.
For example, a retailer may want to create a personalized recommendation engine to better serve their customers, but doing so requires training on customer attributes and customer purchase history. By performing training in a TEE, the retailer can help ensure that customer data is protected end to end.
Serving
Often, AI models and their weights are sensitive intellectual property that needs strong protection. If the models are not protected in use, there is a risk of the model exposing sensitive customer data, being manipulated, or even being reverse-engineered. A3 Confidential VMs with NVIDIA H100 GPUs can help protect models and inference requests and responses, even from the model creators if desired, by allowing data and models to be processed in a hardened state, thereby preventing unauthorized access to or leakage of the sensitive model and requests.
This is especially pertinent for those running AI/ML-based chatbots. Users will often enter private data as part of their prompts into the chatbot running on a natural language processing (NLP) model, and those user queries may need to be protected due to data privacy regulations. If the model-based chatbot runs on A3 Confidential VMs, the chatbot creator could provide chatbot users additional assurances that their inputs are not visible to anyone besides themselves.
Collaborating
Many organizations need to train and run inferences on models without exposing their own models or restricted data to each other. Confidential VMs with NVIDIA H100 Tensor Core GPUs can help organizations collaborate confidentially by enforcing verifiable policies on how the data is processed and outcomes are shared.
This use case comes up often in the healthcare industry, where medical organizations and hospitals need to join highly protected medical data sets or records together to train models without revealing each party's raw data. With A3 Confidential VMs with NVIDIA H100 GPUs, healthcare companies can collaborate in a data clean room, like Confidential Space, to ensure security and performance.
A3 Confidential VMs extend the Trusted Execution Environment (TEE) beyond the VM itself to encompass the integrated NVIDIA H100 GPUs, ensuring comprehensive data protection. This is achieved through a privacy-centric approach: The CPU hardware can prevent direct GPU access to the Confidential VM’s memory, maintaining a strong security boundary.
To facilitate secure data transfer, the NVIDIA driver, operating within the CPU TEE, utilizes an encrypted “bounce buffer” located in shared system memory. This buffer acts as an intermediary, ensuring all communication between the CPU and GPU, including command buffers and CUDA kernels, is encrypted and thus mitigating potential in-band attacks. In essence, this architecture creates a secured data pipeline, safeguarding confidentiality and memory integrity even when sensitive information is processed on the powerful NVIDIA H100 GPUs.
The best part? The CUDA driver and GPU firmware handle encryption transparently, maintaining performance and ease of use.
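To make the bounce-buffer staging pattern concrete, here is a minimal, self-contained sketch. It is purely illustrative: the real NVIDIA driver uses hardware-backed AES with a key negotiated during attestation, whereas this toy uses a SHA-256-derived XOR keystream as a stand-in cipher, and the function and variable names are invented for this example.

```python
import hashlib
from itertools import count

# Toy stand-in for the driver's symmetric cipher: a SHA-256-derived
# keystream XORed over the payload. The real driver uses hardware-backed
# AES; this only illustrates the staging pattern, not the cryptography.
def _keystream(key: bytes, length: int) -> bytes:
    out = bytearray()
    for block in count():
        out.extend(hashlib.sha256(key + block.to_bytes(8, "big")).digest())
        if len(out) >= length:
            return bytes(out[:length])

def seal(key: bytes, plaintext: bytes) -> bytes:
    """CPU TEE side: encrypt the payload before placing it in the shared
    'bounce buffer' that lives in unprotected system memory."""
    return bytes(a ^ b for a, b in zip(plaintext, _keystream(key, len(plaintext))))

def unseal(key: bytes, ciphertext: bytes) -> bytes:
    """GPU side: decrypt after reading from the bounce buffer.
    XOR is its own inverse, so seal() doubles as unseal()."""
    return seal(key, ciphertext)

key = b"session-key-established-at-attestation"  # hypothetical session key
payload = b"CUDA kernel launch parameters"

bounce_buffer = seal(key, payload)            # what untrusted memory sees
assert bounce_buffer != payload               # opaque outside the TEEs
assert unseal(key, bounce_buffer) == payload  # recovered inside the GPU TEE
```

The key design point the sketch captures is that only ciphertext ever transits shared memory; plaintext exists solely inside the CPU and GPU TEEs.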
Confidential VMs with Intel TDX and Intel AMX
“AI will be transformational for almost every industry, but it is accompanied by critical security, privacy and regulatory requirements,” said Anand Pashupathy, vice president and general manager, Security Software and Solutions Group, Intel. “Google Cloud’s new confidential AI offerings protected with Intel TDX give AI practitioners the ability to protect their data and models, and enhance their compliance posture, whether they are using CPU-based AI enhanced by Intel AMX instructions or AI accelerated by an external GPU.”
The CPU of a machine series determines which security features are available for confidential virtual machines. For example, the general purpose C3 machine series runs on 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids). These processors are capable of a Confidential Computing technology called Intel Trust Domain Extensions (Intel TDX).
Intel TDX creates a hardware-based trusted execution environment that deploys each guest VM into its own cryptographically isolated “trust domain” to protect sensitive data and applications from unauthorized access. Confidential VMs on the C3 machine series with Intel TDX have been available in Preview since March.
However, there is another important CPU feature on C3, called Intel Advanced Matrix Extensions (Intel AMX). This is a new instruction set architecture (ISA) extension designed to accelerate AI/ML workloads, introducing new instructions for matrix multiplication and convolution operations, two of the most common operations in AI and ML. C3 instances with Intel AMX provide higher AI inference performance compared to previous generation instances.
To offer AI/ML workloads a higher level of security, all Confidential VMs on the C3 machine series support Intel AMX instruction sets by default. This means your AI/ML workloads can run confidentially to stay protected against unauthorized access from privileged administrators and cloud operators. Intel AMX is a built-in accelerator that can improve the performance of CPU-based training and inference and can be cost-effective for workloads like natural-language processing, recommendation systems, and image recognition. Using Intel AMX on Confidential VMs can help reduce the risk of exposing AI/ML data or code to unauthorized parties.
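On a Linux guest, you can confirm that the VM exposes Intel AMX by checking the CPU feature flags. The short sketch below parses `/proc/cpuinfo` and looks for the `amx_tile`, `amx_int8`, and `amx_bf16` flags (the tile architecture and the INT8/BF16 matrix units); the helper name is ours, not part of any SDK.

```python
# Sketch: detect Intel AMX support on a Linux guest by parsing the CPU
# feature flags reported in /proc/cpuinfo.
def amx_flags(cpuinfo_text: str) -> set:
    """Return the set of amx* flags found in a /proc/cpuinfo dump."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(f for f in line.split() if f.startswith("amx"))
    return flags

if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        present = amx_flags(f.read())
    print("AMX available" if "amx_tile" in present else "AMX not found", present)
```

If the flags are missing, the workload will still run, but AI/ML frameworks fall back to AVX-512 or scalar paths instead of the AMX tile units.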
Thales, a global leader in advanced technologies across three business domains: defense and security, aeronautics and space, and cybersecurity and digital identity, has taken advantage of Confidential Computing to further secure its sensitive workloads.
“As more enterprises migrate their data and workloads to the cloud, there is an increasing demand to safeguard the privacy and integrity of data, especially sensitive workloads, intellectual property, AI models and information of value. This collaboration enables enterprises to protect and control their data at rest, in transit and in use with fully verifiable attestation. Our close collaboration with Google Cloud and Intel increases our customers’ trust in their cloud migration,” said Todd Moore, vice president, Data Security Products, Thales.
Try it today
Confidential VMs with Intel TDX on the C3 machine series have Intel AMX on by default. To try out Intel AMX, simply create a Confidential VM on the C3 machine series and run your AI/ML workloads on it. See our public documentation here.
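As a starting point, the helper below assembles a plausible `gcloud` invocation for creating a TDX-backed Confidential VM on C3. The flag names and values (`--confidential-compute-type=TDX`, `--maintenance-policy=TERMINATE`, the `c3-standard-4` shape) are assumptions based on current gcloud conventions; verify them against the public documentation before running.

```python
# Sketch: build the gcloud command for a C3 Confidential VM with Intel TDX.
# Flag names are assumptions -- check the current gcloud reference.
def c3_tdx_create_cmd(name, zone, machine_type="c3-standard-4"):
    return [
        "gcloud", "compute", "instances", "create", name,
        "--zone=" + zone,
        "--machine-type=" + machine_type,    # any TDX-capable C3 shape
        "--confidential-compute-type=TDX",   # Intel Trust Domain Extensions
        "--maintenance-policy=TERMINATE",    # typically required for Confidential VMs
    ]

if __name__ == "__main__":
    print(" ".join(c3_tdx_create_cmd("amx-demo", "us-central1-a")))
```

Once the VM is up, an AMX-aware framework build (for example, one linked against oneDNN) can use the matrix units without code changes.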
To stay updated on the latest developments of Confidential VMs on the A3 machine series utilizing performant NVIDIA H100 GPUs, you can sign up on this interest form.