There isn’t any cool LINQ sugar for creating unions. The Enumerable.Union()
method is usually called like this:
var bigList = list1.Union(list2);
The alternative is to call Enumerable.Union() as a plain static method,
which can be more readable:
var bigList = Enumerable.Union(list1, list2);
However, neither of these approaches is very stylish (or, more importantly, readable) when scaling out to more lists. The following is probably the best option:
var reallyBigList = list1.Union(list2).Union(list3);
This can result in some messy method chaining. The alternatives need incidental variables:
var list1and2 = list1.Union(list2);
var reallyBigList = list1and2.Union(list3);
or
var list1and2 = Enumerable.Union(list1, list2);
var reallyBigList = Enumerable.Union(list1and2, list3);
Is there a clean way of setting up these more complex unions? Would an extension like Enumerable.Union(params IEnumerable<T>[] collections) (used like var reallyBigList = Enumerable.Union(list1, list2, list3)) be better?
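A static helper along those lines might look like the sketch below. New static methods cannot be added to Enumerable itself, so the class name UnionHelper and the method name UnionAll are hypothetical. Flattening with SelectMany and then calling Distinct is an assumed implementation; with the default equality comparer it should yield the same elements, in the same order, as chaining Union.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; not part of the .NET base class library.
public static class UnionHelper
{
    // Returns the distinct elements of all supplied sequences,
    // in order of first appearance, using the default equality comparer.
    public static IEnumerable<T> UnionAll<T>(params IEnumerable<T>[] collections)
    {
        if (collections == null)
        {
            throw new ArgumentNullException(nameof(collections));
        }

        // Flatten every sequence into one, then drop duplicates.
        return collections.SelectMany(c => c).Distinct();
    }
}
```

It would then be called as var reallyBigList = UnionHelper.UnionAll(list1, list2, list3);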
I don’t consider the method chaining option messy at all. Sure, something like
var reallyBigList = (from ... where ... select ...).Union(from ... where ... select)...
can easily get unreadable, but, on the other hand,
var mp3s = from ... where ... select ...;
var videos = from ... where ... select ...;
var alreadyProcessed = from ... where ... select ...;
var toDo = mp3s.Union(videos).Except(alreadyProcessed);
reads quite naturally. So, when using well-named intermediate variables, the method chaining approach is extremely readable.
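To make that concrete, here is a minimal sketch of the same shape with the elided queries filled in. The file-extension filters, the class name MediaQueueExample and the method BuildToDo are all made up for illustration; only the final Union/Except chain comes from the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example class; the queries are illustrative stand-ins.
public static class MediaQueueExample
{
    // Returns the mp3 and video files that have not been processed yet.
    public static List<string> BuildToDo(
        IEnumerable<string> files, IEnumerable<string> processed)
    {
        var mp3s = from f in files
                   where f.EndsWith(".mp3", StringComparison.Ordinal)
                   select f;
        var videos = from f in files
                     where f.EndsWith(".mp4", StringComparison.Ordinal)
                     select f;
        var alreadyProcessed = from f in files
                               where processed.Contains(f)
                               select f;

        // Each named query reads as one noun, so the chain reads as a sentence.
        return mp3s.Union(videos).Except(alreadyProcessed).ToList();
    }
}
```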
The method chaining style is pretty readable, in my opinion. Certainly you could write your own extension method to take many IEnumerable<T>.
Here’s an example using iteration. Recursion works too, but I didn’t like it as much.
public static IEnumerable<T> MyUnion<T>(
    this IEnumerable<T> original, params IEnumerable<T>[] toUnion)
{
    var enumerable = original;
    foreach (var other in toUnion)
    {
        enumerable = enumerable.Union(other);
    }
    return enumerable;
}
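For the method to compile as an extension it has to sit inside a static, non-generic, non-nested class. The sketch below shows one way it might be packaged; the class name UnionExtensions is hypothetical, and the loop is swapped for Aggregate purely as an assumed equivalent, not as a better implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class for the extension method.
public static class UnionExtensions
{
    // Unions the original sequence with each sequence in toUnion, left to right.
    public static IEnumerable<T> MyUnion<T>(
        this IEnumerable<T> original, params IEnumerable<T>[] toUnion)
    {
        // Aggregate folds the array into a chain:
        // original.Union(toUnion[0]).Union(toUnion[1])...
        return toUnion.Aggregate(original, (acc, next) => acc.Union(next));
    }
}
```

With that in scope, the three-list case from the question becomes var reallyBigList = list1.MyUnion(list2, list3);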