Large file / data transfer in a Microservice Architecture

My company is currently working on adopting a microservice architecture but we are encountering some growing pains (shock!) along the way. One of the key contention points we are facing is how to communicate large quantities of data between our different services.

As a bit of background we have a document store that serves as a repository for any document we might need to handle across the company. Interacting with said store is done via a service which provides a client with a unique ID and a location to stream the document. The document’s location can later be accessed via a lookup with the provided ID.

The problem is this – Does it make sense for all our microservices to accept this unique ID as part of their API for the purposes of interacting with documents or not? To me this feels inherently wrong – the services are no longer independent and rely upon the document store’s service. While I do acknowledge this might simplify API design and perhaps even bring some performance gains, the resulting coupling outweighs those benefits.

Does anyone know how the rainbow unicorns (Netflix, Amazon, Google, etc.) handle large files / data exchange between their services?


Does anyone know how the rainbow unicorns (Netflix, Amazon, Google, etc.) handle large files / data exchange between their services?

Unfortunately I do not know how they deal with such problems.

The problem is this – Does it make sense for all our microservices to be accepting this unique ID as part of their API for the purposes of interacting with documents or not?

It violates the Single Responsibility Principle, which should be inherent in your microservice architecture. One microservice – logically one, physically many instances representing one – should deal with one topic.

In the case of your document store, you have one point where all queries for documents go (of course, you could split this logical unit up into multiple document stores for different kinds of documents).

  • If your “application” needs to work on a document, it asks the respective microservice and processes its result(s).

  • If another service needs an actual document or parts of it, it has to ask the document service.

One of the key contention points we are facing is how to communicate large quantities of data between our different services.

This is an architectural problem:

  1. Decrease the need to transfer big amounts of data

    Ideally, each service has all of its data and needs no transfer simply to serve requests.
    As an extension of this idea: if you do need to transfer data, think of redundancy (in a positive way). Does it make sense to keep the data redundantly in the places where it is needed? Think about how possible inconsistencies might harm your processes. No transfer is faster than one that never happens.

  2. Decrease the size of the data itself

    Think of how you could compress your data, from actual compression algorithms to smart data structures. The less that goes over the wire, the faster you are.
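To make point 2 concrete, here is a minimal sketch in Python. The payload, the field names (`status`, `owner`, `ids`), and the record count are invented for illustration; they are not taken from the question. It shows the two ideas side by side: running a general-purpose compressor over the payload, and restructuring the payload so repeated values are sent only once.

```python
import gzip
import json

# Hypothetical payload: many records that repeat the same field values.
records = [
    {"id": i, "status": "processed", "owner": "billing"} for i in range(1000)
]
raw = json.dumps(records).encode("utf-8")

# Option A: an actual compression algorithm (gzip from the standard library).
compressed = gzip.compress(raw)
# The receiver can restore the exact original bytes.
restored = gzip.decompress(compressed)

# Option B: a "smarter" structure that states the shared values once
# and lists only what differs between records.
compact = {"status": "processed", "owner": "billing", "ids": list(range(1000))}
compact_raw = json.dumps(compact).encode("utf-8")

print("raw bytes:      ", len(raw))
print("gzip bytes:     ", len(compressed))
print("compact bytes:  ", len(compact_raw))
```

How much either option saves depends heavily on the data. Already-compressed formats such as JPEG or most PDFs tend to gain little from a second compression pass.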

If the ID returned by your document store is the way to reference documents throughout the system, then it makes sense for all services to accept that ‘Document ID’ on their API when the service needs to know which document it needs to work with.

This does not necessarily create a tighter coupling between the services than needed. Services that need to access documents need to access the document-store service anyway and they need that ID to tell the store which document to access.
Services that don’t access documents directly might need to pass the Document ID along, but to those services it would be just an arbitrary string that does not create a dependency.
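A small Python sketch of that distinction, under stated assumptions: `DocumentStore`, `routing_service`, and `thumbnail_service` are hypothetical names, and the in-memory dictionary stands in for whatever the real document-store service does.

```python
import uuid


class DocumentStore:
    """Hypothetical stand-in for the document-store service."""

    def __init__(self) -> None:
        self._locations: dict[str, str] = {}

    def register(self, location: str) -> str:
        # Hand out an opaque ID; callers are not expected to parse it.
        doc_id = uuid.uuid4().hex
        self._locations[doc_id] = location
        return doc_id

    def locate(self, doc_id: str) -> str:
        # Look up where the document can be streamed from.
        return self._locations[doc_id]


def routing_service(doc_id: str) -> dict:
    # Only forwards the ID as an arbitrary string.
    # It never calls the store, so it has no dependency on it.
    return {"task": "thumbnail", "document_id": doc_id}


def thumbnail_service(message: dict, store: DocumentStore) -> str:
    # This service needs the document itself, so it has to talk to the
    # store regardless of how the document is referenced.
    location = store.locate(message["document_id"])
    return f"would stream {location}"
```

Only `thumbnail_service` takes the store as a parameter. `routing_service` would keep working unchanged if the store were replaced, which is the sense in which the ID alone does not create a dependency.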


Personally, I’d rather not use a separate document store service and document ID, but a URL to access the documents (with proper header authentication). With this approach, other services don’t need to rely on the document service; they can just use the full URL to access the document. It also makes sense for scaling: you can add more document stores as storage grows and hand out the appropriate URL.

However, you would still need a service (or services) to upload a document and obtain its URL.
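As a rough sketch of what a consuming service could look like with this approach, in Python using only the standard library. The bearer-token scheme, the function names, and the chunk size are assumptions made for illustration; the answer only says "proper header authentication".

```python
import urllib.request
from typing import Iterator


def build_document_request(document_url: str, token: str) -> urllib.request.Request:
    # Assumption: the document host accepts a bearer token in the
    # Authorization header. Swap in whatever scheme you actually use.
    return urllib.request.Request(
        document_url, headers={"Authorization": f"Bearer {token}"}
    )


def fetch_document(
    document_url: str, token: str, chunk_size: int = 64 * 1024
) -> Iterator[bytes]:
    # Stream the body in chunks so a large file is never held in memory at once.
    request = build_document_request(document_url, token)
    with urllib.request.urlopen(request) as response:
        while chunk := response.read(chunk_size):
            yield chunk
```

The consumer only needs the URL and a credential. Which store sits behind the URL is invisible to it, which is what allows several stores to coexist.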

Does anyone know how the rainbow unicorns (Netflix, Amazon, Google, etc.) handle large files / data exchange between their services?

Check out the Amazon S3 REST API specs (see the Amazon S3 response format documentation): it seems they return the full object as bytes. There don’t seem to be many other options when you are designing a microservice.

This is very late, but I will leave an answer for others who encounter the same problem.

Does it make sense for all our microservices to be accepting this unique ID as part of their API for the purposes of interacting with documents or not?

Yes. The only way to reduce data transfer is to store the data in the right place and refer to it by a reference ID.

I recommend having a microservice that works specifically for file sharing. (I’ll call this the file sharing microservice.)

Every other microservice can add the file sharing microservice as a dependency, but the file sharing microservice must not depend on any of the others.

You could use S3 or any other technology behind the file sharing microservice, but the microservice itself provides the abstraction that best fits your needs. Later, you could migrate the underlying tech to fit your app’s scale.
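One way to picture that abstraction is the following Python sketch. All names here (`StorageBackend`, `InMemoryBackend`, `FileSharingService`) are hypothetical, and the in-memory backend is only a placeholder for S3 or whatever storage you pick.

```python
import uuid
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """What the file sharing microservice needs from its underlying tech."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class InMemoryBackend(StorageBackend):
    # Placeholder backend; an S3-backed class would implement the same two methods.
    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def read(self, key: str) -> bytes:
        return self._blobs[key]


class FileSharingService:
    # The API other microservices depend on. It knows nothing about them.
    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def upload(self, data: bytes) -> str:
        ref_id = uuid.uuid4().hex  # the reference ID passed between services
        self._backend.write(ref_id, data)
        return ref_id

    def download(self, ref_id: str) -> bytes:
        return self._backend.read(ref_id)
```

Because callers only ever see `upload`, `download`, and the reference ID, migrating the backend later means writing a new `StorageBackend` subclass, with no change to the other services.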
