Sometimes you have to iterate over a reader where the number of items to read is unknown, and the only way to do it is to keep reading until you hit the end.
This is often where you need an endless loop.
- There is the always-true condition, while(true), which indicates that there must be a break or return statement somewhere inside the block.
int offset = 0;
while(true)
{
    Record r = Read(offset);
    if(r == null)
    {
        break;
    }
    // do work
    offset++;
}
- There is the double read for loop method.
Record r = Read(0);
for(int offset = 0; r != null; offset++)
{
    r = Read(offset);
    if(r != null)
    {
        // do work
    }
}
- There is the single read while loop. Not all languages support this method.
int offset = 0;
Record r = null;
// Post-increment so the first call is Read(0); ++offset would skip the first record.
while((r = Read(offset++)) != null)
{
    // do work
}
I’m wondering which approach is the least likely to introduce a bug, most readable and commonly used.
Every time I have to write one of these I think “there has to be a better way”.
I would take a step back here. You’re concentrating on the picky details of the code but missing the larger picture. Let’s take a look at one of your example loops:
int offset = 0;
while(true)
{
    Record r = Read(offset);
    if(r == null)
    {
        break;
    }
    // do work
    offset++;
}
What is the meaning of this code? The meaning is “do some work to each record in a file”. But that is not what the code looks like. The code looks like “maintain an offset. Open a file. Enter a loop with no end condition. Read a record. Test for nullity.” All that before we get to the work! The question you should be asking is “how can I make this code’s appearance match its semantics?” This code should be:
foreach(Record record in RecordsFromFile())
    DoWork(record);
Now the code reads like its intention. Separate your mechanisms from your semantics. In your original code you mix up the mechanism — the details of the loop — with the semantics — the work done to each record.
Now we have to implement RecordsFromFile(). What’s the best way of implementing that? Who cares? That’s not the code that anyone is going to be looking at. It’s basic mechanism code and it’s ten lines long. Write it however you want. How about this?
public IEnumerable<Record> RecordsFromFile()
{
    int offset = 0;
    while(true)
    {
        Record record = Read(offset);
        if (record == null) yield break;
        yield return record;
        offset += 1;
    }
}
Now that we are manipulating a lazily computed sequence of records all sorts of scenarios become possible:
foreach(Record record in RecordsFromFile().Take(10))
    DoWork(record);
foreach(Record record in RecordsFromFile().OrderBy(r=>r.LastName))
    DoWork(record);
foreach(Record record in RecordsFromFile().Where(r=>r.City == "London"))
    DoWork(record);
And so on.
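To make the laziness concrete, here is a minimal, self-contained sketch. The Record class, the in-memory Store array, and the ReadCalls counter are assumptions made up for illustration; the original code does not say what Record contains or where Read gets its data. The only contract carried over is that Read returns null once the offset is past the last record.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the post never shows what Record contains.
public sealed class Record
{
    public Record(string lastName, string city) { LastName = lastName; City = city; }
    public string LastName { get; }
    public string City { get; }
}

public static class RecordReader
{
    // In-memory stand-in for the file, purely for illustration.
    private static readonly Record[] Store =
    {
        new Record("Smith", "London"),
        new Record("Jones", "Paris"),
        new Record("Brown", "London"),
    };

    // Counts calls to Read so that lazy evaluation is observable.
    public static int ReadCalls { get; private set; }

    // Assumed contract: returns null once offset runs past the last record.
    public static Record Read(int offset)
    {
        ReadCalls++;
        return offset < Store.Length ? Store[offset] : null;
    }

    // The mechanism lives here, so callers never see offsets or null checks.
    public static IEnumerable<Record> RecordsFromFile()
    {
        for (int offset = 0; ; offset++)
        {
            Record record = Read(offset);
            if (record == null) yield break;
            yield return record;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Take(1) stops pulling after one record, so the rest are never read.
        foreach (Record record in RecordReader.RecordsFromFile().Take(1))
            Console.WriteLine(record.LastName);
        Console.WriteLine("Read calls so far: " + RecordReader.ReadCalls);

        // Where has to examine every record, so this pass reads to the end.
        foreach (Record record in RecordReader.RecordsFromFile().Where(r => r.City == "London"))
            Console.WriteLine(record.LastName);
    }
}
```

Note that each foreach starts a fresh enumeration from offset 0, so a sequence like this re-reads the source every time it is iterated.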
Any time you write a loop, ask yourself “does this loop read like a mechanism or like the meaning of the code?” If the answer is “like a mechanism”, then try to move that mechanism to its own method, and write the code to make the meaning more visible.
You don’t need an endless loop. You should never need one in C# read scenarios. This is my preferred approach, assuming that you really do need to maintain an offset:
Record r = Read(0);
int offset = 1;
while(r != null)
{
    // Do work
    r = Read(offset);
    offset++;
}
This approach acknowledges the fact that there is a setup step for the reader, so there are two read method calls. The while condition is at the top of the loop, just in case there is no data at all in the reader.
Well, it depends on your situation. But one of the more “C#-ish” solutions I can think of is to use the built-in IEnumerable interface and a foreach loop. IEnumerator only asks for MoveNext, which returns true or false, so the size can be unknown. Then your termination logic is written once, in the enumerator, and you don’t have to repeat it in more than one spot.
MSDN provides an example of IEnumerator<T>. You will also need to create an IEnumerable<T> to return the IEnumerator<T>.
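As a rough sketch of what that pair might look like when written by hand: the Record class, the RecordSequence name, and the Func<int, Record> delegate standing in for Read are all hypothetical and chosen only for illustration. The delegate is assumed to return null past the last record, as Read does in the question.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical record type; not defined anywhere in the thread.
public sealed class Record
{
    public Record(string name) { Name = name; }
    public string Name { get; }
}

// The IEnumerable<T> half: its only job is to hand out enumerators.
public sealed class RecordSequence : IEnumerable<Record>
{
    // Stand-in for Read(offset); assumed to return null past the last record.
    private readonly Func<int, Record> read;

    public RecordSequence(Func<int, Record> read) { this.read = read; }

    public IEnumerator<Record> GetEnumerator() { return new RecordEnumerator(read); }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    // The IEnumerator<T> half: holds the offset and the termination test.
    private sealed class RecordEnumerator : IEnumerator<Record>
    {
        private readonly Func<int, Record> read;
        private int offset;
        private Record current;

        public RecordEnumerator(Func<int, Record> read) { this.read = read; }

        public Record Current { get { return current; } }

        object IEnumerator.Current { get { return current; } }

        // The end-of-data check is written here and nowhere else.
        public bool MoveNext()
        {
            current = read(offset);
            if (current == null) return false;
            offset++;
            return true;
        }

        public void Reset() { offset = 0; current = null; }

        public void Dispose() { }
    }
}

public static class Program
{
    public static void Main()
    {
        string[] names = { "Ada", "Grace" };
        var records = new RecordSequence(i => i < names.Length ? new Record(names[i]) : null);

        // The caller sees only the records, never the offset or the null check.
        foreach (Record r in records)
            Console.WriteLine(r.Name);
    }
}
```

In practice a yield-based iterator method has the compiler generate an equivalent enumerator for you, so the hand-written form is mostly useful for seeing where the termination logic ends up.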
When I have initialization, condition and increment operations I like to use the for loops of languages like C, C++ and C#. Like this:
// A for initializer can declare variables of only one type, so offset is declared first.
int offset = 0;
for (Record r = Read(offset); r != null; r = Read(++offset))
{
    // loop here!
}
Or this if you think this is more readable. I personally prefer the first one.
int offset = 0;
for (Record r = Read(offset); r != null; offset++, r = Read(offset))
{
    // loop here!
}