From 8b6745b3f6612d9e61a4cab9e28091ca1f95b3d5 Mon Sep 17 00:00:00 2001
From: max
Date: Wed, 12 Oct 2016 18:35:08 -0700
Subject: [PATCH] Added the automatic compression tip to the Redshift section

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 5e54688..62051ec 100644
--- a/README.md
+++ b/README.md
@@ -1224,6 +1224,7 @@ Redshift
 - [Top 10 Performance Tuning Techniques for Amazon Redshift](https://blogs.aws.amazon.com/bigdata/post/Tx31034QG0G3ED1/Top-10-Performance-Tuning-Techniques-for-Amazon-Redshift) provides an excellent list of performance tuning techniques.
 - [Amazon Redshift Utils](https://github.com/awslabs/amazon-redshift-utils) contains useful utilities, scripts and views to simplify Redshift ops.
 - [VACUUM](http://docs.aws.amazon.com/redshift/latest/dg/t_Reclaiming_storage_space202.html) regularly following a significant number of deletes or updates to reclaim space and improve query performance.
+- Redshift provides various [column compression](http://docs.aws.amazon.com/redshift/latest/dg/t_Compressing_data_on_disk.html) options to reduce the stored data size. AWS strongly encourages using [automatic compression](http://docs.aws.amazon.com/redshift/latest/dg/c_Loading_tables_auto_compress.html) at the COPY stage, where Redshift analyzes a sample of the data being ingested to choose the column compression encodings. However, automatic compression can only be applied to an empty table. Therefore, make sure the initial load batch is big enough to provide Redshift with a representative sample of the data (the default sample size is 100,000 rows per slice).

 ### Redshift Gotchas and Limitations