Commit

Documentation changes and other minor tweaks.
nathanpeck committed Oct 20, 2014
1 parent 40f3abe commit f750707
Showing 6 changed files with 102 additions and 118 deletions.
8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -1,6 +1,14 @@
Changelog
=========

#### 1.0.6 (2014-10-20)

Removing global state, and adding pause and resume functionality.

#### 1.0.5 (2014-10-13)

Changing how buffers are subdivided, in order to provide support for in-browser operation.

#### 1.0.4 (2014-10-13)

Getting rid of the use of setImmediate. Also now the MPU is not initialized until data is actually received by the writable stream, and error checking verifies that data has actually been uploaded to S3 before trying to end the stream. This fixes an issue where empty incoming streams were causing errors to come back from S3 as the module was attempting to complete an empty MPU.
138 changes: 59 additions & 79 deletions README.md
@@ -6,13 +6,9 @@ A pipeable write stream which uploads to Amazon S3 using the multipart file upload API

### Changelog

#### 1.0.4 (2014-10-13)
#### 1.0.6 (2014-10-20)

Getting rid of the use of setImmediate. Also now the MPU is not initialized until data is actually received by the writable stream, and error checking verifies that data has actually been uploaded to S3 before trying to end the stream. This fixes an issue where empty incoming streams were causing errors to come back from S3 as the module was attempting to complete an empty MPU.

#### 1.0.3 (2014-10-12)

Some minor scope adjustments.
Removing global state, and adding pause and resume functionality.

[Historical Changelogs](CHANGELOG.md)

@@ -23,6 +19,7 @@ Some minor scope adjustments.
* This package is designed to use the official Amazon SDK for Node.js, helping keep it small and efficient. For maximum flexibility you pass in the aws-sdk client yourself, allowing you to use a uniform version of AWS SDK throughout your code base.
* You can provide options for the upload call directly to do things like set server side encryption, reduced redundancy storage, or access level on the object, which some other similar streams are lacking.
* Emits "part" events which expose the amount of incoming data received by the writable stream versus the amount of data that has been uploaded via the multipart API so far, allowing you to create a progress bar if that is a requirement.
* Support for pausing and later resuming in-progress multipart uploads.
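
For example, a rough progress read-out can be driven from those "part" events. The following is a minimal sketch, assuming an `upload` stream created as in the example below; the `receivedSize` and `uploadedSize` field names are illustrative assumptions, so inspect the emitted object in your own code:

```js
upload.on('part', function (details) {
  // How much data the writable stream has received vs. how much has actually
  // been uploaded to S3 so far (field names assumed for illustration).
  console.log('received ' + details.receivedSize + ' bytes, uploaded ' +
              details.uploadedSize + ' bytes');
});
```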

### Limits

@@ -32,14 +29,13 @@ Some minor scope adjustments.
## Example

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk'),
var AWS = require('aws-sdk'),
zlib = require('zlib'),
fs = require('fs');
s3Stream = require('s3-upload-stream')(new AWS.S3()),

// Set the client to be used for the upload.
AWS.config.loadFromPath('./config.json');
s3Stream.client(new AWS.S3());

// Create the streams
var read = fs.createReadStream('/path/to/a/file');
```

@@ -84,9 +80,7 @@ read.pipe(compress).pipe(upload);

## Usage

### package.client(s3);

Configures the S3 client for s3-upload-stream to use. Please note that this module has only been tested with AWS SDK 2.0 and greater.
Before uploading you must configure the S3 client for s3-upload-stream to use. Please note that this module has only been tested with AWS SDK 2.0 and greater.

This module does not include the AWS SDK itself. Rather you must require the AWS SDK in your own application code, instantiate an S3 client and then supply it to s3-upload-stream.

@@ -97,23 +91,25 @@ When setting up the S3 client the recommended approach for credential management
If you are following this approach then you can configure the S3 client very simply:

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');

s3Stream.client(new AWS.S3());
var AWS = require('aws-sdk'),
    s3Stream = require('s3-upload-stream')(new AWS.S3());
```

However, some environments may require you to keep your credentials in a file, or hardcoded. In that case you can use the following form:

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');
#!/usr/bin/env node
var AWS = require('aws-sdk');

// Make sure AWS credentials are loaded using one of the following techniques
AWS.config.loadFromPath('./config.json');
AWS.config.update({accessKeyId: 'akid', secretAccessKey: 'secret'});
s3Stream.client(new AWS.S3());

// Create a stream client.
var s3Stream = require('s3-upload-stream')(new AWS.S3());
```

### package.upload(destination)
### client.upload(destination)

Create an upload stream that will upload to the specified destination. The upload stream is returned immediately.

@@ -122,60 +118,63 @@ The destination details is an object in which you can specify many different [de
__Example:__

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');

s3Stream.client(new AWS.S3());
var AWS = require('aws-sdk'),
    fs = require('fs'),
    s3Stream = require('s3-upload-stream')(new AWS.S3());

var read = fs.createReadStream('/path/to/a/file');
var upload = new s3Client.upload({
"Bucket": "bucket-name",
"Key": "key-name",
"ACL": "public-read",
"StorageClass": "REDUCED_REDUNDANCY",
"ContentType": "binary/octet-stream"
var upload = s3Stream.upload({
Bucket: "bucket-name",
Key: "key-name",
ACL: "public-read",
StorageClass: "REDUCED_REDUNDANCY",
ContentType: "binary/octet-stream"
});

read.pipe(upload);
```

### package.upload(destination, session)
### client.upload(destination, [session])

Resume an incomplete multipart upload from a previous session by providing a `session` object containing the upload ID, plus an ETag and part number for each part that has already been uploaded. The `destination` details are the same as above.

__Example:__

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');

s3Stream.client(new AWS.S3());
var AWS = require('aws-sdk'),
    fs = require('fs'),
    s3Stream = require('s3-upload-stream')(new AWS.S3());

var read = fs.createReadStream('/path/to/a/file');
var upload = new s3Client.upload({
"Bucket": "bucket-name",
"Key": "key-name",
"ACL": "public-read",
"StorageClass": "REDUCED_REDUNDANCY",
"ContentType": "binary/octet-stream"
}, {
"UploadId": "f1j2b47238f12984f71b2o8347f12",
"Parts": [
{
"ETag": "3k2j3h45t9v8aydgajsda",
"PartNumber": 1
},
{
"Etag": "kjgsdfg876sd8fgk3j44t",
"PartNumber": 2
}
]
});
var upload = s3Stream.upload(
{
Bucket: "bucket-name",
Key: "key-name",
ACL: "public-read",
StorageClass: "REDUCED_REDUNDANCY",
ContentType: "binary/octet-stream"
},
{
UploadId: "f1j2b47238f12984f71b2o8347f12",
Parts: [
{
ETag: "3k2j3h45t9v8aydgajsda",
PartNumber: 1
},
{
Etag: "kjgsdfg876sd8fgk3j44t",
PartNumber: 2
}
]
}
);

read.pipe(upload);
```

### package.pause()
## Stream Methods

The following methods can be called on the stream returned from `client.upload()`.

### stream.pause()

Pause an active multipart upload stream.

@@ -187,7 +186,7 @@ Calling `pause()` will immediately:

When mid-upload parts are finished, a `paused` event will fire, including an object with `UploadId` and `Parts` data that can be used to resume an upload in a later session.
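
For example, a minimal sketch of pausing an active upload and capturing that session data (assuming an `upload` stream created with `client.upload()` as above):

```js
upload.on('paused', function (session) {
  // session.UploadId and session.Parts can be persisted (to disk, a database, etc.)
  // and passed to client.upload(destination, session) to resume the upload later.
  console.log('Paused upload', session.UploadId, 'with', session.Parts.length, 'parts');
});

upload.pause();
```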

### package.resume()
### stream.resume()

Resume a paused multipart upload stream.

Expand All @@ -199,19 +198,15 @@ Calling `resume()` will immediately:

It is safe to call `resume()` at any time after `pause()`. If the stream is between `pausing` and `paused`, then `resume()` will resume data flow and the `paused` event will not be fired.
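
In the same process, resuming is just a method call (a minimal sketch, again assuming the `upload` stream from above):

```js
// Safe to call at any point after pause(); if the stream is still "pausing",
// data flow simply continues and no paused event is emitted.
upload.resume();
```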

## Optional Configuration

### stream.maxPartSize(sizeInBytes)

Used to adjust the maximum amount of stream data that will be buffered in memory prior to flushing. The lowest possible value, and default value, is 5 MB. It is not possible to set this value any lower than 5 MB due to Amazon S3 restrictions, but there is no hard upper limit. The higher the value you choose the more stream data will be buffered in memory before flushing to S3.

The main reason for setting this to a higher value instead of using the default is if you have a stream with more than 50 GB of data, and therefore need larger part sizes in order to flush the entire stream while also staying within Amazon's upper limit of 10,000 parts for the multipart upload API.
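
The call itself is a one-liner (a minimal sketch, assuming an `upload` stream created with `client.upload()` as above; the 20 MB value is purely illustrative):

```js
// Buffer up to 20 MB of stream data per part before flushing it to S3.
upload.maxPartSize(20971520);
```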

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');

s3Stream.client(new AWS.S3());
var AWS = require('aws-sdk'),
    fs = require('fs'),
    s3Stream = require('s3-upload-stream')(new AWS.S3());

var read = fs.createReadStream('/path/to/a/file');
var upload = s3Stream.upload({
```

@@ -231,10 +226,8 @@ Used to adjust the number of parts that are concurrently uploaded to S3. By defa
Keep in mind that total memory usage will be at least `maxPartSize` * `concurrentParts` as each concurrent part will be `maxPartSize` large, so it is not recommended that you set both `maxPartSize` and `concurrentParts` to high values, or your process will be buffering large amounts of data in its memory.
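
As with `maxPartSize`, this is a single call on the upload stream (a minimal sketch; 5 concurrent parts is just an example value):

```js
// Upload up to 5 parts in parallel (buffering at least 5 * maxPartSize bytes).
upload.concurrentParts(5);
```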

```js
var s3Stream = require('s3-upload-stream'),
AWS = require('aws-sdk');

s3Stream.client(new AWS.S3());
var AWS = require('aws-sdk'),
    fs = require('fs'),
    s3Stream = require('s3-upload-stream')(new AWS.S3());

var read = fs.createReadStream('/path/to/a/file');
var upload = s3Stream.upload({
@@ -247,19 +240,6 @@ upload.concurrentParts(5);
read.pipe(upload);
```

### Migrating from pre-1.0 s3-upload-stream

The methods and interface for s3-upload-stream have changed since 1.0 and are no longer compatible with the older versions.

The differences are:

* This package no longer includes Amazon SDK, and now you must include it in your own app code and pass an instantiated Amazon S3 client in.
* The upload stream is now returned immediately, instead of in a callback.
* The "chunk" event emitted is now called "part" instead.
* The .maxPartSize() and .concurrentParts() methods are now methods of the writable stream itself, instead of being methods of an object returned from the upload stream constructor method.

If you have questions about how to migrate from the older version of the package after reviewing these docs feel free to open an issue with your code example.

### Tuning configuration of the AWS SDK

The following configuration tuning can help prevent errors when using less reliable internet connections (such as 3G data if you are using Node.js on the Tessel) by causing the AWS SDK to detect upload timeouts and retry.
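
One way to do that kind of tuning is via the standard aws-sdk v2 client options when constructing the S3 client (a sketch; the timeout and retry values below are illustrative assumptions, not recommendations from this package):

```js
var AWS = require('aws-sdk');

// Give slow connections more time per request and retry failed part uploads.
var s3Client = new AWS.S3({
  httpOptions: { timeout: 120000 }, // two-minute socket timeout
  maxRetries: 10                    // retry failed requests up to 10 times
});

var s3Stream = require('s3-upload-stream')(s3Client);
```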
13 changes: 6 additions & 7 deletions examples/upload.js
@@ -1,20 +1,19 @@
#!/usr/bin/env node
var s3Stream = require('../lib/s3-upload-stream.js'),
AWS = require('aws-sdk'),
var AWS = require('aws-sdk'),
zlib = require('zlib'),
fs = require('fs');

// JSON file containing AWS API credentials.
// Make sure AWS credentials are loaded.
AWS.config.loadFromPath('./config.json');

// Set the client to be used for the upload.
s3Stream.client(new AWS.S3());
// Initialize a stream client.
var s3Stream = require('../lib/s3-upload-stream.js')(new AWS.S3());

// Create the streams
var read = fs.createReadStream('./video.mp4');
var compress = zlib.createGzip();
var upload = new s3Stream.upload({
"Bucket": "storydesk",
var upload = s3Stream.upload({
"Bucket": "bucket",
"Key": "video.mp4.gz"
});

7 changes: 3 additions & 4 deletions lib/s3-upload-stream.js
@@ -16,7 +16,6 @@ function Client(client) {

// Generate a writeable stream which uploads to a file on S3.
Client.prototype.upload = function (destinationDetails, sessionDetails) {

var cachedClient = this.cachedClient;
var e = new events.EventEmitter();

@@ -29,7 +28,7 @@ Client.prototype.upload = function (destinationDetails, sessionDetails) {

// Data pertaining to the overall upload.
// If resumable parts are passed in, they must be free of gaps.
var multipartUploadID = sessionDetails.UploadId;
var multipartUploadID = sessionDetails.UploadId ? sessionDetails.UploadId : null;
var partNumber = sessionDetails.Parts ? (sessionDetails.Parts.length + 1) : 1;
var partIds = sessionDetails.Parts || [];
var receivedSize = 0;
@@ -363,11 +362,11 @@ Client.client = function (options) {
return Client.globalClient;
};

Client.upload = function (destinationDetails) {
Client.upload = function (destinationDetails, sessionDetails) {
if (!Client.globalClient) {
throw new Error('Must configure an S3 client before attempting to create an S3 upload stream.');
}
return Client.globalClient.upload(destinationDetails);
return Client.globalClient.upload(destinationDetails, sessionDetails);
};

module.exports = Client;
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "s3-upload-stream",
"description": "Writeable stream for uploading content of unknown size to S3 via the multipart API.",
"version": "1.0.5",
"version": "1.0.6",
"author": {
"name": "Nathan Peck",
"email": "nathan@storydesk.com"
