I need to write a program that performs LU decomposition, among other things. The problem is that I don't know the preferred way to distribute the loaded matrix from the root process to the other processes. I can write a simple algorithm for specific cases, but I really need a solution that works for an arbitrary number of processes.
I'd prefer a coarse-grain distribution, as described in this presentation (slide 18).
Example
Here I can see that we can use 1 or 2 processes (3 or 4 would be too fine-grain). But what's the correct way with 2 processes? Should the matrix be divided by rows or by columns?
Example 2
In this case, it's even more problematic: I can't distribute the values evenly among 2 processes. Even if I distributed them like this:
P1: 2 8 3 4
P2: 5 1 6 2 5
it would completely break the symmetry of the distribution.
Then, with 3 processes, it's exactly the same situation as in Example 1: rows or columns?
So I presume there is no way to do this for a completely arbitrary number of processes, since the whole point is an even distribution. What's the right approach here?
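To illustrate what I mean by an even distribution, here's how I would currently split n rows over p processes so the counts differ by at most one (just a sketch of my own idea in plain C; the variable names are mine, and I don't know whether rows are even the right unit):

```c
#include <stdio.h>

int main(void) {
    int n = 9, p = 2;  /* e.g. Example 2: 9 elements, 2 processes */
    for (int rank = 0; rank < p; rank++) {
        /* the first n % p ranks get one extra row */
        int count = n / p + (rank < n % p ? 1 : 0);
        int start = rank * (n / p) + (rank < n % p ? rank : n % p);
        printf("P%d: rows %d..%d (%d rows)\n",
               rank + 1, start, start + count - 1, count);
    }
    return 0;
}
```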
I hope I’ve described my problem clearly enough. If not, leave a comment, please, and I’ll improve my question.
Another solution generally used when performing tasks in parallel is the following (a code sketch follows this list):

- You start by giving the first element to the first process, the second element to the second process, …, and the nth element to the nth process.
- Once one of the processes finishes the task for its element, it is given the next unassigned element to deal with.
- Once there are no elements left and all processes have finished, the work is done.
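Here's a minimal sketch of that pattern in C with MPI. The task itself is just an integer index here; `process_task`, `NUM_TASKS` and the message tags are placeholders of my own, not anything specific to LU decomposition:

```c
#include <mpi.h>
#include <stdio.h>

#define NUM_TASKS 20
#define TAG_TASK 1
#define TAG_STOP 2

/* Placeholder for the real work (e.g. one block of the factorization). */
static void process_task(int task) {
    printf("task %d done\n", task);
}

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {            /* master: hands out tasks on demand */
        int next = 0, active = 0;
        MPI_Status st;
        /* seed every worker with one task (or stop it if tasks run out) */
        for (int w = 1; w < size; w++) {
            if (next < NUM_TASKS) {
                MPI_Send(&next, 1, MPI_INT, w, TAG_TASK, MPI_COMM_WORLD);
                next++; active++;
            } else {
                MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
            }
        }
        /* whenever a worker reports back, give it the next task or stop it */
        while (active > 0) {
            int done;
            MPI_Recv(&done, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &st);
            if (next < NUM_TASKS) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_TASK,
                         MPI_COMM_WORLD);
                next++;
            } else {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                         MPI_COMM_WORLD);
                active--;
            }
        }
    } else {                    /* worker: receive, work, report, repeat */
        MPI_Status st;
        int task;
        for (;;) {
            MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP) break;
            process_task(task);
            MPI_Send(&task, 1, MPI_INT, 0, TAG_TASK, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}
```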
This makes the number of elements irrelevant, and it specifically targets cases where you can't be sure that all tasks take exactly the same amount of time.
You may have two processes and seventeen tasks and end up giving eleven tasks to the first process and six to the other, either because some of the tasks given to the first process were simpler, or because the first process had more CPU cycles available to it.
The same issue (tasks of differing duration) applies to most situations where parallel computing is used, such as map-reduce, where individual tasks are given to different machines (what if one machine has much more powerful hardware than the others?).
Often, the solution is to have many parallel tasks. If you have two tasks and one takes longer than the other, there is nothing you can do. If you have twenty tasks and one takes longer, you can compensate by giving additional tasks to the processes or machines that aren't dealing with the long-running task.
This also works with tasks which always take the same amount of time, i.e. the situation where a symmetrical distribution would be suitable. Imagine, in your example, that each task takes one second. Then the distribution:
P1: 2 8 3 4
P2: 5 1 6 2 5
means that while the overall run takes five seconds, P1's processing power is wasted for one second, which may be unsuitable (1 wasted second per 9 seconds of useful work, i.e. 11.1% of processing power wasted). On the other hand, if you have nearly twice as many tasks:
P1: 2 8 3 4 1 9 8 8 4
P2: 5 1 6 2 5 6 3 1 4 1
then P1 wastes only one second out of a ten-second run, which is slightly better (1/19 ≈ 5.3% waste). Now imagine you have 1 501 equally fast tasks to split between two processes: the waste drops below 0.1%, and the distribution works pretty well even though it's not symmetrical.
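If you want to check these percentages yourself, here is a tiny C program that computes the wasted fraction for n equal one-second tasks on p processes. The `waste` function and the ceiling-based split are my own formulation of the reasoning above (waste is measured relative to useful work, which is how the 11.1% and 5.3% figures come out):

```c
#include <stdio.h>

/* Fraction of task-seconds wasted when n equal one-second tasks are
 * split as evenly as possible between p processes. */
static double waste(int n, int p) {
    int slowest = (n + p - 1) / p;        /* ceil(n / p): load of the fullest process */
    return (double)(slowest * p - n) / n; /* idle seconds per second of useful work */
}

int main(void) {
    printf("%.1f%%\n", 100.0 * waste(9, 2));    /* 11.1%  -- Example 2 */
    printf("%.1f%%\n", 100.0 * waste(19, 2));   /* 5.3%   -- nearly doubled task count */
    printf("%.3f%%\n", 100.0 * waste(1501, 2)); /* 0.067% -- below 0.1% */
    return 0;
}
```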