How it works:
s3m uploads a file in one of two ways, depending on its size:
- If the input file is smaller than the buffer size, the file is uploaded in one shot (not capable of resuming).
- If the input file is larger than the buffer size, the file is uploaded in multiple parts and the upload can be resumed.
The buffer size is 10MB by default and can be changed using the -b option.
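For example, a minimal invocation that relies on the default 10MB buffer (the source path and bucket below are placeholders):
s3m /path_to/file <s3>/<bucket>/file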
s3m calculates the buffer size automatically, but if you already know the size of the file you can choose a buffer size based on your needs.
For example, if you would like to upload a 5TB file (the current maximum object size), you will need a buffer size of at least 512MB:
s3m /path_to/5TB.file <s3>/<bucket>/file -b 536870912
Using a 30MB buffer, you could upload files of up to 300GB:
s3m /path_to_max/300GB.file <s3>/<bucket>/file -b 31457280
The current limits for AWS S3 are:
- Maximum object size: 5 TB
- Maximum number of parts per upload: 10,000
- Part size: 5 MB to 5 GB
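Putting these limits together, the largest object you can upload with a given buffer size is roughly the buffer size multiplied by the maximum number of parts, for example:
10485760 x 10000 = 104857600000 (10MB default buffer, ~100GB)
31457280 x 10000 = 314572800000 (30MB buffer, ~300GB)
536870912 x 10000 = 5368709120000 (512MB buffer, ~5TB)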
If you would like to upload objects of up to 500GB, divide the object size in bytes by the maximum number of parts to get the required buffer size:
524288000000 / 10000 = 52428800
(500GB in bytes) / (max number of parts) = 50MB buffer size
Then use a 50MB buffer:
s3m /path_to_max/500GB.file <s3>/<bucket>/file -b 52428800
As mentioned, this is optional, since s3m calculates the buffer size automatically.
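If you do want to compute it yourself, a minimal shell sketch could look like this (assuming GNU stat on Linux; the path and destination are placeholders, and the computed value must still fall within the 5MB to 5GB part size limits):
FILE=/path_to_max/500GB.file
SIZE=$(stat -c %s "$FILE")              # file size in bytes (on macOS/BSD use: stat -f %z)
BUF=$(( (SIZE + 9999) / 10000 ))        # ceiling of size / 10,000 parts
[ "$BUF" -lt 10485760 ] && BUF=10485760 # never go below the 10MB default
s3m "$FILE" <s3>/<bucket>/file -b "$BUF"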
When the size of the input is known in advance, a checksum of the file is created before uploading it in order to keep track of the uploaded objects. This helps prevent uploading the same file twice and also keeps state for multipart uploads, so that if an upload is interrupted it can be resumed later.
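For example, if a multipart upload is interrupted, re-running the same command should resume from the parts that were already uploaded, provided the file has not changed in the meantime:
s3m /path_to/5TB.file <s3>/<bucket>/file -b 536870912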
To clean up the local database, you can use the --clean option, for example:
s3m --clean
There is no checksum when piping / sending the file via STDIN (option -p/--pipe).
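For example, a hypothetical way to stream a tar archive straight to a bucket, assuming the destination is given the same way as in the examples above:
tar -cf - /path_to/dir | s3m -p <s3>/<bucket>/dir.tar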