AWS – AWS Glue DataBrew now allows you to configure the size of the dataset when auto-generating data quality statistics
When you run profile jobs in AWS Glue DataBrew to auto-generate 40+ data quality statistics, such as column-level cardinality, numerical correlations, unique values, and standard deviation, you can now configure how much of the dataset is analyzed. This lets you run a profile on a chosen percentage of a very large dataset, or on a smaller sub-sample, to get results faster.
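As a rough illustration of how a sample size might be set programmatically, the sketch below builds the parameters for a profile job request. It assumes the DataBrew `CreateProfileJob` API's `JobSample` setting, which takes a `Mode` (`FULL_DATASET` or `CUSTOM_ROWS`) and a row count in `Size`. Because that setting is expressed in rows, the percentage-to-rows conversion here is an assumption of this example, as are the helper names `build_job_sample` and `build_profile_job_request` and all job, dataset, role, and bucket values.

```python
import math


def build_job_sample(total_rows=None, percent=None):
    """Build a JobSample dict for a DataBrew profile job (illustrative helper).

    With no percent, the whole dataset is profiled. Otherwise the percentage
    is converted to a row count, since JobSample is assumed to take rows.
    """
    if percent is None:
        return {"Mode": "FULL_DATASET"}
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    if total_rows is None or total_rows <= 0:
        raise ValueError("total_rows must be a positive row count")
    # Round up so a small percentage never yields a zero-row sample.
    size = max(1, math.ceil(total_rows * percent / 100))
    return {"Mode": "CUSTOM_ROWS", "Size": size}


def build_profile_job_request(name, dataset_name, role_arn, bucket,
                              total_rows=None, percent=None):
    """Assemble keyword arguments for a create_profile_job call."""
    return {
        "Name": name,
        "DatasetName": dataset_name,
        "RoleArn": role_arn,
        "OutputLocation": {"Bucket": bucket},
        "JobSample": build_job_sample(total_rows, percent),
    }


# Hypothetical values: profile 5% of a dataset known to hold 1,000,000 rows.
request = build_profile_job_request(
    name="orders-profile",
    dataset_name="orders",
    role_arn="arn:aws:iam::123456789012:role/ExampleDataBrewRole",
    bucket="example-profile-output",
    total_rows=1_000_000,
    percent=5,
)
```

With AWS credentials configured, a request like this could be submitted with `boto3.client("databrew").create_profile_job(**request)`. The dataset's total row count has to come from elsewhere, since this sketch does not look it up.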