GCP – 8 new ways to bridge the gap to geospatial analysis with Earth Engine
Over 10 years ago we launched Earth Engine, Google’s cloud-based service for geospatial processing, to address the greatest sustainability challenges of our time. Since then, it has continually evolved alongside the growing urgency of environmental issues to include 90+ petabytes of analysis-ready geospatial data. With a wider range of options available for analyzing satellite data, we’ve focused our recent efforts on building smart connections between Earth Engine and other critical tools, datasets, and systems.
Here are eight improvements and integrations released in the past few months to make it easier for you to use Earth Engine:
1. BigQuery
BigQuery is Google’s petabyte-scale analytical database. Earth Engine focuses on image (raster) processing, whereas BigQuery is optimized for processing large tabular datasets. By using BigQuery and Earth Engine together, users get the best of both worlds. To find out more about the Earth Engine to BigQuery export connector, see Improving sustainability with our Earth Engine and BigQuery connector.
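As a rough illustration of the export connector, the sketch below starts an Earth Engine batch task that writes a FeatureCollection to a BigQuery table. It assumes the `earthengine-api` Python package is installed and that `ee.Initialize()` has already been called with a project that can write to BigQuery. The helper names `bigquery_table_id` and `export_to_bigquery`, and the project, dataset, and table names in the usage comment, are hypothetical.

```python
def bigquery_table_id(project: str, dataset: str, table: str) -> str:
    """Build the 'project.dataset.table' identifier that the export expects."""
    parts = (project, dataset, table)
    if not all(parts):
        raise ValueError("project, dataset and table must all be non-empty")
    return ".".join(parts)


def export_to_bigquery(collection, table_id: str,
                       description: str = "ee_to_bq_export"):
    """Start a batch task that writes an ee.FeatureCollection to BigQuery."""
    # Imported lazily so this module loads without Earth Engine credentials.
    import ee

    task = ee.batch.Export.table.toBigQuery(
        collection=collection,
        description=description,
        table=table_id,
    )
    task.start()  # runs asynchronously on Earth Engine's servers
    return task


# Usage (needs credentials and a Cloud project, so it is not run here):
#   import ee
#   ee.Initialize(project="my-project")            # hypothetical project
#   countries = ee.FeatureCollection("FAO/GAUL/2015/level0").limit(10)
#   export_to_bigquery(
#       countries, bigquery_table_id("my-project", "my_dataset", "countries"))
```

Once the task finishes, the rows can be queried with ordinary SQL alongside other tabular data in BigQuery.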
2. Python
Many Earth Engine users want to analyze data in Python because it’s the most commonly used language for machine learning and data analysis. The Python community has developed useful tools for ML and analysis, and these have expanded recently to geospatial workloads. For example, Cloud Optimized GeoTIFF is a file format optimized for remote sensing data, and GeoPandas extends Pandas dataframes with common geospatial functions and plotting.
Earth observation science is an inherently visual experience: panning, zooming, clicking to inspect image band values at a point, and drawing map polygons for zonal statistics are all important parts of science workflows. Until recently, Earth Engine only supported this experience in its JavaScript-based code editor. Today, we’re excited to announce official support for geemap, a Python library created in April 2020 by Dr. Qiusheng Wu, a Google Developer Expert, that delivers many code editor experiences in a Colab or Jupyter notebook environment. For more information, see Python Powers Up: The Rise of the Python API for Earth Engine.
3. Data extraction
If you’re training a TensorFlow model or if you want to run hydrology simulations outside Earth Engine, you might want to get data out of Earth Engine into another system. You’ve always been able to use the Earth Engine export API to let Earth Engine do the heavy lifting. But if you’ve ever run into scaling issues or you’re already familiar with a framework like Apache Beam, Spark, or Dask, check out our new data extraction methods. Our Python client library now comes bundled with client-side logic to convert between Earth Engine objects and NumPy, Pandas, and GeoPandas types. For more information, see Pixels to the people!
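One way to picture the client-side conversion is the sketch below, which requests a small patch of pixels as a NumPy array through `ee.data.computePixels`. It assumes the `earthengine-api` package is installed and initialized. The helper names `pixel_grid` and `fetch_patch` and the coordinates in the usage comment are illustrative, not part of the library.

```python
def pixel_grid(west, north, scale, width, height, crs="EPSG:4326"):
    """Describe a north-up pixel grid whose top-left corner is (west, north).

    `scale` is the pixel size in CRS units (degrees for EPSG:4326).
    """
    return {
        "dimensions": {"width": width, "height": height},
        "affineTransform": {
            "scaleX": scale, "shearX": 0, "translateX": west,
            "shearY": 0, "scaleY": -scale, "translateY": north,
        },
        "crsCode": crs,
    }


def fetch_patch(image, grid):
    """Compute `image` on `grid` and return a NumPy structured array.

    The result has one named field per image band.
    """
    # Imported lazily so this module loads without Earth Engine credentials.
    import ee

    return ee.data.computePixels({
        "expression": image,
        "fileFormat": "NUMPY_NDARRAY",
        "grid": grid,
    })


# Usage (needs credentials, so it is not run here):
#   import ee
#   ee.Initialize()
#   dem = ee.Image("USGS/SRTMGL1_003")
#   patch = fetch_patch(dem, pixel_grid(-122.5, 37.9, 0.001, 256, 256))
#   elevation = patch["elevation"]   # plain 2-D NumPy array
```

Because each request returns only one small patch, a framework such as Apache Beam or Dask can issue many such requests in parallel to cover a larger area.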
4. Xarray
Xarray is a popular open-source Python package for working with multidimensional arrays. We believe that Xarray offers the most convenient way to work with pixels from Earth Engine: it allows you to operate on an Earth Engine ImageCollection as an Xarray Dataset. We recently announced an Earth Engine integration with Xarray called Xee, which integrates closely with Dask to distribute work across multiple processors. Xarray does “lazy evaluation,” which means it only pulls down the data that’s necessary for a calculation, and it connects to many other systems. For instance, you can use Xee to export Earth Engine data into a Zarr file, a relatively new data format that is becoming more popular for weather and climate data.
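A minimal sketch of that workflow follows. It assumes the `earthengine-api`, `xarray`, `xee`, and `zarr` packages are installed and that Earth Engine authentication is already set up. The `engine="ee"`, `crs`, and `scale` arguments follow the Xee README at the time of its initial release and may differ in later versions. The helper names and the output path convention are hypothetical.

```python
HIGH_VOLUME_URL = "https://earthengine-highvolume.googleapis.com"


def zarr_store_name(collection_id: str) -> str:
    """Turn an Earth Engine collection ID into a local Zarr store name."""
    return collection_id.strip("/").replace("/", "_") + ".zarr"


def export_to_zarr(collection_id, start, end, crs="EPSG:4326", scale=0.25):
    """Open a date-filtered ImageCollection lazily and write it to Zarr."""
    # Imported lazily so this module loads without the optional packages.
    import ee
    import xarray as xr

    # The high-volume endpoint is intended for many small automated requests.
    ee.Initialize(opt_url=HIGH_VOLUME_URL)

    collection = ee.ImageCollection(collection_id).filterDate(start, end)

    # Opening is lazy: only metadata is read here, no pixels are transferred.
    ds = xr.open_dataset(collection, engine="ee", crs=crs, scale=scale)

    # Writing forces evaluation, so pixels are fetched at this point.
    path = zarr_store_name(collection_id)
    ds.to_zarr(path)
    return path


# Usage (needs credentials and network access, so it is not run here):
#   export_to_zarr("ECMWF/ERA5_LAND/HOURLY", "2020-01-01", "2020-01-02")
```

Keeping the date range narrow before opening the dataset limits how much data the final write has to pull down.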
5. Vertex AI
Once you have data outside Earth Engine, you may want to train a deep learning model on it and then get predictions from it. Earth Engine now integrates with Vertex AI (currently in Public Preview). This replaces a prior integration with Google Cloud AI Platform. You can host your model in Vertex AI and get predictions from within the Earth Engine code editor. Vertex AI supports much larger images for prediction than AI Platform and is also more extensible. For more information, please see Earth Engine brings Vertex AI to the geospatial party.
6. ERDAS LiveLink for Earth Engine
ERDAS IMAGINE is a very popular remote-sensing software package. We were excited to partner with Hexagon to launch LiveLink for Earth Engine this year. LiveLink allows you to combine Earth Engine catalog data with private data on your local computer. You can get the best of both worlds: leverage Earth Engine’s expansive data catalog and backend processing capabilities while developing in a familiar interactive environment.
7. New datasets to streamline analysis
We also routinely add new, broadly useful datasets to Earth Engine’s data catalog so you can join them to the rest of Earth Engine’s data or your own private data. In the last year, we have added over 100 new datasets, including JRC’s Global Map of Forest Cover for 2020, and NASA’s Harmonized Landsat and Sentinel-2 dataset, which creates a seamless surface reflectance record for the globe.
8. Cloud Score+
Anyone who has worked with Sentinel-2 data or remote sensing data in general will tell you that cloud cover is a major problem. The world just doesn’t look like the beautiful imagery in Google Earth all the time – it’s covered in clouds, making analysis challenging. This is why our team built Cloud Score+, which is the first comprehensive QA score for Sentinel-2, powered by a state-of-the-art deep learning approach. Cloud Score+ is now available for the entire Sentinel-2 collection. For more information, please see All Clear with Cloud Score+.
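To make the idea concrete, here is a hedged sketch of using Cloud Score+ to mask cloudy pixels before building a composite. It assumes the `earthengine-api` package is installed and initialized, and it uses the public catalog IDs for Cloud Score+ and harmonized Sentinel-2 surface reflectance. The 0.6 threshold and the helper names are illustrative choices rather than recommended settings.

```python
CS_PLUS_ID = "GOOGLE/CLOUD_SCORE_PLUS/V1/S2_HARMONIZED"
S2_ID = "COPERNICUS/S2_SR_HARMONIZED"
QA_BAND = "cs_cdf"        # Cloud Score+ also provides a "cs" band
CLEAR_THRESHOLD = 0.6     # illustrative; scores range from 0 (cloudy) to 1 (clear)


def check_threshold(value) -> float:
    """Validate a clear-sky threshold locally before sending work to the server."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return float(value)


def clear_sky_composite(region, start, end,
                        qa_band=QA_BAND, threshold=CLEAR_THRESHOLD):
    """Return a median Sentinel-2 composite using only sufficiently clear pixels."""
    # Imported lazily so this module loads without Earth Engine credentials.
    import ee

    threshold = check_threshold(threshold)
    cs_plus = ee.ImageCollection(CS_PLUS_ID)
    s2 = ee.ImageCollection(S2_ID).filterBounds(region).filterDate(start, end)

    def mask_cloudy(image):
        # Keep pixels whose QA score is at or above the threshold.
        return image.updateMask(image.select(qa_band).gte(threshold))

    # linkCollection attaches the matching Cloud Score+ band to each S2 image.
    return s2.linkCollection(cs_plus, [qa_band]).map(mask_cloudy).median()


# Usage (needs credentials, so it is not run here):
#   import ee
#   ee.Initialize()
#   roi = ee.Geometry.Point(-119.9, 37.7)
#   composite = clear_sky_composite(roi, "2023-01-01", "2023-02-01")
```

Raising the threshold keeps fewer but cleaner observations, while lowering it trades some residual cloud for better coverage.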
We know that your sustainability efforts make the most impact when you can connect remote sensing data to other parts of the Google Cloud ecosystem. Contact your Google Cloud sales representative for more information on these latest Earth Engine enhancements, and stay tuned for more in 2024.