I have a ReadableStream of uncompressed text data that I need to store as a single zip-compressed file on S3.
What I don’t want to do:
- Load all the data into memory.
- Write the data on local disk.
What I would like to do:
- Incrementally read the data from the input stream, zip-compress it, and upload it to S3 directly (chunk by chunk). The result should be a single zip file (see the sketch after this list for the streaming shape I have in mind).
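For comparison, if the target format were gzip rather than zip, the streaming shape I am after would be straightforward. This is only a sketch: the Readable.from call stands in for my actual input stream, and the bucket/key names are placeholders.

import {S3} from 'aws-sdk';
import {createGzip} from 'zlib';
import {Readable} from 'stream';

const s3 = new S3({region: 'us-east-1'});

(async () => {
  // Stand-in for my actual readable stream of uncompressed text
  const inputStream = Readable.from(['some', 'uncompressed', 'text', 'chunks']);

  await s3.upload({
    Bucket: 'my-bucket',
    Key: 'stream.txt.gz',
    Body: inputStream.pipe(createGzip()), // s3.upload consumes the stream incrementally
  }).promise();
})();

The problem is that zip is a container format, so the data has to be appended to the archive as a named entry, and that is where my attempt below goes wrong.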
What I have tried so far:
import {S3} from 'aws-sdk';
import {PassThrough} from 'stream';
import archiver from 'archiver';

const s3 = new S3({region: 'us-east-1'});

(async () => {
  // PassThrough stream that receives the archive output and feeds the S3 upload
  const stream = new PassThrough();
  const archive = archiver('zip', {
    zlib: { level: 9 }
  });
  archive.pipe(stream);

  // Start the upload with the PassThrough stream as the body
  const upload = s3.upload({
    Bucket: 'my-bucket',
    Key: 'stream.zip',
    Body: stream,
  }).promise();

  // Stand-in for the readable stream we are reading from:
  // each chunk is appended to the archive as its own entry
  for (let i = 0; i < 100; i++) {
    const textData = `text-data-${i}`;
    archive.append(Buffer.from(`${textData}\n`, 'utf8'), { name: `file-${i}.txt` });
  }

  await archive.finalize();
  await upload;
})();
This is not correct because it generates multiple files inside the output archive stream.zip on S3 (file-1.txt, file-2.txt, etc.). On the other hand, if I use a single file name when appending data to archive, I need to buffer all the data in memory before appending, which defeats the purpose of streaming the data incrementally.
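To make that second point concrete, the only single-entry variant I can see is a buffering approach like the sketch below (inputStream again stands in for my actual stream), which collects the whole payload in memory before handing it to the archive:

// Buffering workaround I want to avoid: the entire input ends up in memory
const chunks = [];
for await (const chunk of inputStream) {
  chunks.push(Buffer.from(chunk));
}
archive.append(Buffer.concat(chunks), { name: 'data.txt' });
await archive.finalize();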
Does anyone know of a solution to this?