Would simply using AWS to build an application make this application a distributed system?
For example, if someone uses
- RDS for the database server,
- EC2 for the application itself and
- S3 for hosting user-uploaded media,

does that make it a distributed system?
If not, then what should it be called and what is this application lacking for it to be distributed?
Update
Here is my take on the application to clarify my approach to building the system:
- The application I’m building is a social game for Facebook.
- I developed the application locally on a LAMP stack using Symfony2.
- For production I used a single EC2 Micro instance for hosting the app itself, RDS for hosting my database, S3 for the user-uploaded files and CloudFront for hosting static content.
I know this may sound like a naive approach, so don’t be shy about sharing your ideas.
There are (at least) two lines of thought regarding distributed systems, and depending on your environment your system may or may not qualify as such:
- Computer Science: in this case, a distributed system solves an algorithmic problem such that each node does part of the processing, in some instances even without a controller coordinating the task. Usually the goal is to find a distributed algorithm to solve a problem more efficiently.
- Information Systems: in this case, a distributed system is one which distributes presentation, application and database among multiple autonomous entities that communicate via a network (by passing messages to each other).
(Note that on a purely abstract level these are the same definition: any computational problem is in fact an algorithmic problem, but let’s save this argument for a different discussion.)
So yes, you can consider yours a distributed system, and we can argue so given the definition as found in Wikipedia:
1. There are several autonomous computational entities, each of which has its own local memory.
2. The entities communicate with each other by message passing.
In your case, the web server along with your application code, the database server and the image server are all autonomous computational entities with their own local memory, and they communicate by message passing (that is, sending messages back and forth via the network).
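To make the two criteria concrete, here is a minimal sketch of my own (not from the Wikipedia article). It assumes a hypothetical `database_node` that keeps its state private and can only be reached through messages. A thread and two in-process queues stand in for a separate server and the network, so this only illustrates the interaction pattern, not a real deployment.

```python
import queue
import threading


def database_node(inbox, outbox):
    """Hypothetical 'database' entity: owns its state, reachable only via messages."""
    store = {}  # local memory, never handed out directly
    while True:
        msg = inbox.get()
        if msg["op"] == "stop":
            break
        if msg["op"] == "put":
            store[msg["key"]] = msg["value"]
            outbox.put({"ok": True})
        elif msg["op"] == "get":
            outbox.put({"ok": True, "value": store.get(msg["key"])})


def run_demo():
    """Play the role of the application: talk to the node only by passing messages."""
    requests, replies = queue.Queue(), queue.Queue()
    node = threading.Thread(target=database_node, args=(requests, replies))
    node.start()

    requests.put({"op": "put", "key": "score:alice", "value": 42})
    replies.get()  # wait for the acknowledgement
    requests.put({"op": "get", "key": "score:alice"})
    answer = replies.get()

    requests.put({"op": "stop"})
    node.join()
    return answer["value"]


if __name__ == "__main__":
    print(run_demo())
```

In the setup from the question, the EC2 application plays the part of `run_demo`, and RDS or S3 plays the part of the node. The messages would be SQL queries or HTTP requests rather than dictionaries on a queue.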
The same Wikipedia article also lists several architectures for building distributed systems, one of which is the n-tier Architecture:
In software engineering, multi-tier architecture (often referred to as
n-tier architecture) is a client–server architecture in which
presentation, application processing, and data management functions
are logically separated.
Further, your system clearly has multiple tiers (the application, the database and the image store). It probably also has multiple layers, that is, if you followed Symfony’s MVC model. MVC by definition separates presentation and application logic, and since (at least part of) the presentation layer is most certainly run by a web browser, this is in fact a 3-tier architecture:
Three tier systems move the client intelligence to a middle tier so
that stateless clients can be used. This simplifies application
deployment. Most web applications are 3-Tier.
The resources are distributed but the logic is not. Both the database and the S3 repository are controlled by the application, and the task the application runs is not parallelised across different machines. According to Wikipedia there are two requirements for a distributed system, both of which are missing from the system you described (or not clarified):
- There are several autonomous computational entities, each of which has its own local memory
- The entities communicate with each other by message passing.
In that sense it is not a distributed system, and it would require different EC2 servers running this application to coordinate their actions on how to access and synchronise resources.
Edit: from the question it is not clear if only one EC2 instance is used or many. In my reply above I assume a single instance.
EDIT: updated question from the OP makes clear it’s a single EC2 instance.
Academically, such applications are considered multi-tiered. The client tier is the end user’s machine, the web tier is the EC2 server and the Enterprise Information System (EIS) tier is the database and the S3 repository. Again, for an application to be distributed you need to distribute the processing of the web tier across multiple servers. However, in industry such applications can be considered distributed because their components live in separate locations. So I would say that the app itself that runs on the EC2 server isn’t a distributed system, but the entire setup (client’s browser <-> web app <-> database/S3) can be considered a distributed architecture/application.
Please also check this question:
https://stackoverflow.com/questions/3542204/multi-tier-vs-distibuted
Your design is slightly distributed, but in very simple, minimal ways. You have functionally decomposed your workload across multiple servers. In 1980, that would have been considered a distributed system. But you do not distribute (that is, partition or shard) any of the individual parts of your workload across multiple servers. There is no substantial parceling of work, nor meaningful replication across multiple nodes or datacenters. Your storage (S3 and the CloudFront CDN) is the one exception; those services inherently distribute your storage access.
Feel free to call it a “cloud app” or “cloud service,” but by today’s standards, it’s not very distributed.
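For a rough idea of what partitioning or sharding one part of the workload could look like, here is a minimal sketch. The shard names and the `shard_for` / `partition` helpers are hypothetical and purely illustrative. A real system would also have to handle adding or removing shards (for example with consistent hashing), which this sketch does not attempt.

```python
import hashlib

# Hypothetical shard names; in practice each would be a separate database server.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(user_id, shards=SHARDS):
    """Map a user ID to one shard using a stable hash of the ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def partition(user_ids, shards=SHARDS):
    """Group user IDs by the shard that would store their data."""
    placement = {name: [] for name in shards}
    for uid in user_ids:
        placement[shard_for(uid, shards)].append(uid)
    return placement


if __name__ == "__main__":
    players = [f"player-{n}" for n in range(20)]
    for shard, owners in partition(players).items():
        print(shard, len(owners))
```

With a single RDS instance, as in the question, every user effectively lands on the same "shard", which is why the data tier is not distributed in this sense.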
Your example describes delegation, not distribution. Delegation assigns a specific task to an appropriate subsystem, whereas distribution divides a process into roughly similar parts that can be run in parallel. A simple litmus test: ask yourself, if I need to scale this system, do I scale it “up” (a bigger computer) or “out” (additional small computers)? Could I scale this 10x? 1000x? These are real-world concerns that are solved by distributed systems.
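As an illustration of "dividing a process into roughly similar parts", here is a small sketch. The `score_chunk`, `split` and `distributed_total` names are hypothetical, and a local thread pool stands in for a fleet of machines, so it shows the shape of the idea rather than real scale-out.

```python
from concurrent.futures import ThreadPoolExecutor


def score_chunk(scores):
    """Work done by one worker: every worker runs the same code on its own slice."""
    return sum(scores)


def split(items, parts):
    """Split items into `parts` roughly equal contiguous chunks."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def distributed_total(scores, workers=4):
    """Divide the job into similar parts, run them in parallel, then combine."""
    chunks = split(scores, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(score_chunk, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(distributed_total(list(range(100)), workers=4))
```

Scaling "out" here means raising `workers`; the code for each part stays the same. With delegation, by contrast, each subsystem does a different job, so adding more of one kind does not automatically share the load.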