I’m learning Scala and am a little bewildered by all the methods (higher-order functions) available on the collections. Which ones produce more results than the original collection, which ones produce less, and which are most appropriate for a given problem? Though I’m studying Scala, I think this would pertain to most modern functional languages (Clojure, Haskell) and also to Java 8 which introduces these methods on Java collections.
Specifically, right now I’m wondering about map with filter vs. fold/reduce. I was delighted that using foldRight() can yield the same result as a map(…).filter(…) with only one traversal of the underlying collection. But a friend pointed out that foldRight() may force sequential processing while map() is friendlier to being processed by multiple processors in parallel. Maybe this is why mapReduce() is so popular?
More generally, I’m still sometimes surprised when I chain several of these methods together to get back a List(List()) or to pass a List(List()) and get back just a List(). For instance, when would I use:
collection.map(a => a.map(b => ...))
vs.
collection.map(a => ...).map(b => ...)
The for/yield construct does nothing to clear up this confusion. Am I asking about the difference between a “fold” and “unfold” operation?
Am I trying to jam too many questions into one? I think there may be an underlying concept that, if I understood it, might answer all these questions, or at least tie the answers together.
You can roughly emulate map/filter by using foldRight (or foldLeft) with cons as the given reduction function. For example, foldRight(f, L, initial), where L = [x,y,z], might be expanded to
f(x, f(y, f(z, initial)))
This means that you can’t finish processing x until you’ve processed f(z, initial) and then f(y, ...). You’re creating a dependency that doesn’t exist with map/filter.
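To make that concrete, here is a minimal Scala sketch of the emulation (the helper names mapViaFoldRight, filterViaFoldRight and mapFilterViaFoldRight are mine, not standard library methods):

def mapViaFoldRight[A, B](xs: List[A])(f: A => B): List[B] =
  xs.foldRight(List.empty[B])((x, acc) => f(x) :: acc)

def filterViaFoldRight[A](xs: List[A])(p: A => Boolean): List[A] =
  xs.foldRight(List.empty[A])((x, acc) => if (p(x)) x :: acc else acc)

// Both steps fused into one pass, mirroring map(...).filter(...):
def mapFilterViaFoldRight[A, B](xs: List[A])(f: A => B)(p: B => Boolean): List[B] =
  xs.foldRight(List.empty[B]) { (x, acc) =>
    val y = f(x)
    if (p(y)) y :: acc else acc
  }

mapFilterViaFoldRight(List(1, 2, 3, 4))(_ * 2)(_ > 4)  // List(6, 8)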
As for map ordering…
(A) collection.map(a => a.map(b => ...))
(A) takes a collection and applies the map function to each element, which implies that each element is itself a “mappable” collection. The inner map returns a processed list, so each element of collection remains a collection. This is how you would map a function onto each element of a list of lists, and the result would again be a list of lists.
(B) collection.map(a => ...).map(b => ...)
(B) processes each element of a list and forms those results into a new list. This list is then processed again with a second map function giving yet another list.
(A) is for processing the inner elements of a list (“sub-elements” if you like). (B) is for processing a list multiple times. If we write (B) concretely as
collection.map(a => f(a)).map(b => g(b))
we can see it is equivalent to
collection.map(a => g(f(a)))
It might help to write them out as for loops: (A) uses nested loops, whereas (B) uses two sequential loops.
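For example, a quick Scala sketch of (B), with f and g as arbitrary functions of my own choosing:

val xs = List(1, 2, 3)
val f: Int => Int = _ + 1
val g: Int => Int = _ * 10

xs.map(f).map(g)     // List(20, 30, 40): two traversals, one intermediate list
xs.map(f andThen g)  // List(20, 30, 40): the fused form, a single traversal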
This is not the difference between fold and unfold. Neither (A) nor (B) is a fold, since the list structure, however deeply nested, is preserved. A fold collapses a list into a single value, while an unfold (I’m not too familiar with it) takes a single seed value and produces a list according to some rule.
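A small sketch of the contrast, assuming Scala 2.13+ where the standard library provides List.unfold:

// fold: many values collapse to one value
List(1, 2, 3, 4).foldLeft(0)(_ + _)  // 10

// unfold: one seed value grows into many values
List.unfold(4)(n => if (n > 0) Some((n, n - 1)) else None)  // List(4, 3, 2, 1)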
EDIT: @Giorgio was right to suggest flatMap. It’s an interesting variation on map. Say we’re mapping a list of Xs to a list of Ys, so we pass map a function f:X->Y. Suppose we have another calculation g that takes an X but returns multiple Ys, g:X->[Y]. In this case we would use flatMap. map takes the results of f and puts them into a list; flatMap takes the results of g and concatenates them together.
Ex. Say we have a list of lists L = [[1,2,3],[4,5,6]]. Then L.map(a => a.map(b => b * 2)) gives [[2,4,6],[8,10,12]]. But say we want each number doubled in one flat list, with no sub-lists. Then we would call L.flatMap(a => a.map(b => b * 2)), which gives [2,4,6,8,10,12]. Note that our inner function a => a.map(b => b * 2) takes and returns a list.
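Written out as a runnable Scala snippet with the same values:

val L = List(List(1, 2, 3), List(4, 5, 6))

L.map(a => a.map(b => b * 2))      // List(List(2, 4, 6), List(8, 10, 12))
L.flatMap(a => a.map(b => b * 2))  // List(2, 4, 6, 8, 10, 12)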
Fold is more general than map, reduce, and filter. If you take a minute to think about it, their implementations in terms of a fold are simple. Still, consider using them first, because they convey more intention and limit what can happen: a map always produces a list of the same length, filter won’t change values but may remove some, and reduce turns a list into a single value. With a fold, all three could happen at the same time.
If you go from a List to a List of Lists, then a map (or a fold doing map-like things) is returning a list somewhere. Going from a List of Lists back to a List means a reduce is coming into play somewhere.
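A small sketch of both directions in Scala (the nested value is just my own example):

val nested = List(List(1, 2), List(3), List(4, 5))

nested.map(_.map(_ * 10))                 // still nested: List(List(10, 20), List(30), List(40, 50))
nested.reduce(_ ++ _)                     // List(1, 2, 3, 4, 5)
nested.foldLeft(List.empty[Int])(_ ++ _)  // List(1, 2, 3, 4, 5)
nested.flatten                            // List(1, 2, 3, 4, 5), the usual shortcut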
I would also look at the discussions about them on Stack Overflow; they should help with use cases for each.
Concerning the operations that produce an output smaller than their input, a common mistake concerns map itself. Intuitively, map takes each input element, computes a transformation of that element, and places the result in the output, so that input and output have the same size.
In fact, in Scala, a map of a set (say, of integers) is a set. So if you apply a predicate (that is, a boolean function) that checks whether each integer is odd or even, the result will be a set of booleans. Therefore, the only possible output sizes are 0, 1, and 2, whatever the input size is.
val a = Set(1,2,3,4,5)
a.map(_ % 2 == 0)
res0: scala.collection.immutable.Set[Boolean] = Set(false, true)
Note that this is a Scala convention. For instance, in Python, mapping over a set produces as many elements as the original set.
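For contrast within Scala itself, mapping the same predicate over a List keeps one result per input element, because lists do not deduplicate:

List(1, 2, 3, 4, 5).map(_ % 2 == 0)
// List(false, true, false, true, false)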