mirror of https://github.com/nickpoida/og-aws.git synced 2025-03-09 15:40:06 +00:00

S3 object prefix randomization now unnecessary (#613)

* S3 object prefix randomization now unnecessary
This commit is contained in:
Thanos Baskous 2018-07-17 15:33:06 -04:00 committed by GitHub
parent 119ee9436c
commit c81b5e41d9

@@ -717,9 +717,7 @@ S3
 - **Multi-part uploads:** For large objects you want to take advantage of the multi-part uploading capabilities (starting with minimum chunk sizes of 5 MB).
 - **Large downloads:** Also you can download chunks of a single large object in parallel by exploiting the HTTP GET range-header capability.
 - 🔸**List pagination:** Listing contents happens at 1000 responses per request, so for buckets with many millions of objects listings will take time.
-- ❗**Key prefixes:** In addition, latency on operations is [highly dependent on prefix similarities among key names](http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html). If you have need for high volumes of operations, it is essential to consider naming schemes with more randomness early in the key name (first 6 or 8 characters) in order to avoid “hot spots”.
-- We list this as a major gotcha since its often painful to do large-scale renames.
-- 🔸Note that sadly, the advice about random key names goes against having a consistent layout with common prefixes to manage data lifecycles in an automated way.
+- ❗**Key prefixes:** Previously randomness in the beginning of key names was necessary in order to avoid hot spots, but that is [no longer necessary](https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3-announces-increased-request-rate-performance/) as of July, 2018.
 - For data outside AWS, [**DirectConnect**](https://aws.amazon.com/directconnect/) and [**S3 Transfer Acceleration**](https://aws.amazon.com/blogs/aws/aws-storage-update-amazon-s3-transfer-acceleration-larger-snowballs-in-more-regions/) can help. For S3 Transfer Acceleration, you [pay](https://aws.amazon.com/s3/pricing/) about the equivalent of 1-2 months of storage for the transfer in either direction for using nearer endpoints.
 - **Command-line applications:** There are a few ways to use S3 from the command line:
 - Originally, [**s3cmd**](https://github.com/s3tools/s3cmd) was the best tool for the job. Its still used heavily by many.
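The multi-part upload bullet in the hunk above can be sketched with boto3. This is a minimal sketch, not part of the commit: the bucket, key, and file names are placeholders, and the part-count helper is an illustration of the documented service limits (5 MiB minimum part size, 10,000 parts per upload).

```python
import math

# Documented S3 multi-part upload limits.
MIN_PART_SIZE = 5 * 1024 * 1024   # 5 MiB minimum for every part except the last
MAX_PARTS = 10_000                # at most 10,000 parts per upload

def part_count(object_size, part_size=MIN_PART_SIZE):
    """How many parts a multi-part upload of `object_size` bytes needs."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    n = math.ceil(object_size / part_size)
    if n > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part size")
    return n

# boto3's high-level transfer manager does the part bookkeeping for you
# (bucket/key names below are hypothetical):
#
# import boto3
# from boto3.s3.transfer import TransferConfig
# config = TransferConfig(multipart_threshold=MIN_PART_SIZE,
#                         multipart_chunksize=MIN_PART_SIZE)
# boto3.client("s3").upload_file("big.bin", "my-bucket", "big.bin", Config=config)
```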
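The "Large downloads" bullet relies on HTTP `Range` headers, which S3 honors on GET. A sketch of how the ranges might be computed (the boto3 call shown in the comment is real, but the bucket/key and chunk size are assumptions):

```python
def byte_ranges(total_size, chunk_size):
    """HTTP Range header values that together cover an object of `total_size` bytes."""
    return ["bytes=%d-%d" % (start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]

# Each range can then be fetched independently (and in parallel), e.g.:
#
# import boto3
# s3 = boto3.client("s3")
# first_chunk = s3.get_object(Bucket="my-bucket", Key="big.bin",
#                             Range=byte_ranges(size, 8 * 1024 * 1024)[0])["Body"].read()
```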
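The "List pagination" bullet means a full listing of a large bucket takes one request per 1000 keys. A sketch of the request count, plus the boto3 paginator that handles the continuation tokens (bucket name is a placeholder):

```python
import math

def listing_requests(object_count, page_size=1000):
    """Minimum ListObjectsV2 requests needed to list `object_count` keys."""
    return max(1, math.ceil(object_count / page_size))

# boto3 hides the continuation-token loop behind a paginator:
#
# import boto3
# paginator = boto3.client("s3").get_paginator("list_objects_v2")
# total = 0
# for page in paginator.paginate(Bucket="my-bucket"):  # up to 1000 keys per page
#     total += len(page.get("Contents", []))
```

So a bucket with 2.5 million objects needs at least 2500 listing requests, which is why the text warns that listings "will take time".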