I have started a project which will duplicate Dropbox or Google Drive behavior but using Amazon S3 as a backend.
The idea is very simple: a Node.js server watches a directory for file changes and PUTs them to S3, or it looks at S3 for changes and applies them to the local file system. I’ve uploaded a very early version of my app to GitHub. You can find it here.
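Roughly, the upload side would look something like this minimal sketch (not the actual code from my repo; the bucket name and directory are placeholders, and it assumes the aws-sdk npm package):

```js
// Minimal sketch of the watcher → S3 upload path (placeholder bucket/paths).
const fs = require('fs');
const path = require('path');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();           // credentials are read from the environment
const WATCH_DIR = './sync';        // local directory to mirror (placeholder)
const BUCKET = 'my-sync-bucket';   // hypothetical bucket name

fs.watch(WATCH_DIR, (eventType, filename) => {
  if (!filename) return;                        // some platforms omit the name
  const fullPath = path.join(WATCH_DIR, filename);
  if (!fs.existsSync(fullPath) || !fs.statSync(fullPath).isFile()) return;

  // Stream the file body so large files are never buffered whole in memory.
  s3.upload(
    { Bucket: BUCKET, Key: filename, Body: fs.createReadStream(fullPath) },
    (err, data) => {
      if (err) return console.error('upload failed:', err);
      console.log('uploaded', data.Key);
    }
  );
});
```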
Because I am a web developer, I am using web technologies to solve the problem, and I’m afraid that my limited mindset will lead me to pick the wrong tools for the job. There are other solutions to this problem. One is S3FS, a FUSE file system for Unix systems; in my opinion it is very hard to use and tied to that platform. My solution uses Node.js to overcome cross-platform issues, and I can package my Node.js app with App.js to make it easy-to-use software.
To clarify, my questions are:
- Is HTTP/HTTPS good enough for file transfer?
- Is Node.js good enough for working with File System?
- Scalability: can this approach fail with large file sizes?
Because I am a web developer, I am using web technologies to solve the problem, and I’m afraid that my limited mindset will lead me to pick the wrong tools for the job.
You’re falling into the “when all you have is a hammer, everything looks like a nail” trap. But you’re able to recognize it, which is a very good thing.
Is HTTP/HTTPS good enough for file transfer?
Of course. Billions, if not trillions, of files are transferred via HTTP daily. Any file you transfer to S3 is going to go that way.
Is Node.js good enough for working with File System?
I don’t know a great deal about Node.js, but if your goal is just to have a standalone program that runs on a machine and keeps a directory and S3 in sync, it doesn’t seem like the right tool for the job.
If you’re willing to learn Java, the Apache Commons Virtual File System can talk to a variety of backends (including the local file system), and VFS-S3 is an add-on that makes it work with S3. VFS includes a class called AbstractSyncTask that takes care of 90% of the grunt work, leaving you to just extend the class and implement your transfer rules. Being Java, it will run anywhere Java does.
Scalability: can this approach fail with large file sizes?
That would depend on the limits of Node.js, although I’d be surprised if someone hasn’t already bumped into that problem and made sure it can transfer arbitrary-length files.
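For what it’s worth, the usual pattern in Node.js is to stream the file and let the SDK split it into a multipart upload, so file size is bounded by S3’s own limits rather than by memory. A hedged sketch, assuming the aws-sdk npm package and a placeholder bucket/file:

```js
// Sketch: a managed (multipart) upload of an arbitrarily large file.
const fs = require('fs');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const upload = s3.upload(
  { Bucket: 'my-sync-bucket', Key: 'big-file.bin', Body: fs.createReadStream('big-file.bin') },
  { partSize: 10 * 1024 * 1024, queueSize: 4 }   // 10 MB parts, 4 uploaded concurrently
);

upload.on('httpUploadProgress', (p) => console.log(p.loaded, 'bytes sent'));
upload.send((err) => {
  if (err) console.error('upload failed:', err);
  else console.log('upload complete');
});
```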
- Is HTTP/HTTPS good enough for file transfer?
Many applications rely on HTTP/HTTPS for file transfers; the best example is mega.com. So, as you mentioned, there may be better ways to keep things in sync, but it is definitely not wrong to use HTTP/HTTPS for file transfers.
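To illustrate how far plain HTTPS gets you with S3: a presigned URL turns the upload into an ordinary HTTPS PUT. This is only a sketch, assuming the aws-sdk npm package and placeholder bucket/file names:

```js
// Sketch: upload to S3 with nothing but an HTTPS PUT against a presigned URL.
const fs = require('fs');
const https = require('https');
const AWS = require('aws-sdk');

const s3 = new AWS.S3();
const signedUrl = s3.getSignedUrl('putObject', {
  Bucket: 'my-sync-bucket',   // hypothetical bucket
  Key: 'report.pdf',
  Expires: 300,               // URL valid for five minutes
});

const { size } = fs.statSync('report.pdf');
const req = https.request(
  signedUrl,
  { method: 'PUT', headers: { 'Content-Length': size } },
  (res) => console.log('S3 responded with status', res.statusCode)
);
req.on('error', (err) => console.error('transfer failed:', err));
fs.createReadStream('report.pdf').pipe(req);
```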
- Is Node.js good enough for working with File System?
Node.js is a really great system, and of course you can solve the problem with it, but again: is it really the best way to solve it? A dedicated sync application like S3FS definitely keeps you from making some big mistakes.
- Scalability: can this approach fail with large file sizes?
I don’t know how you’re doing the file exchange, but think of the two most common problems: too little memory (if you load the whole file into memory) and timeouts, if the download takes too long or the connection is interrupted.
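Both problems are usually handled by streaming instead of buffering: memory stays flat, and an interrupted connection shows up as a stream error you can catch and retry. A sketch of the download side, again assuming the aws-sdk npm package and placeholder names:

```js
// Sketch: stream an S3 object to disk instead of loading it into memory.
const fs = require('fs');
const AWS = require('aws-sdk');

const s3 = new AWS.S3({ httpOptions: { timeout: 5 * 60 * 1000 } }); // generous socket timeout

function download(bucket, key, dest, done) {
  const read = s3.getObject({ Bucket: bucket, Key: key }).createReadStream();
  const write = fs.createWriteStream(dest);
  read.on('error', done);                 // network errors and timeouts land here
  write.on('error', done);
  write.on('finish', () => done(null, dest));
  read.pipe(write);
}

download('my-sync-bucket', 'big-file.bin', './big-file.bin', (err) => {
  if (err) console.error('download failed, schedule a retry:', err);
  else console.log('download complete');
});
```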