Why isn’t software abstract on a grander scale?

Consider the following example:

The user wants a program to calculate a few Fibonacci numbers.
Sounds easy enough. Pseudocode:

stdout.write("How many fibonacci numbers do you want to calculate? ")
int count = int(stdin.readline())
while count>0:
    stdout.writeline(calculate_next_fibonacci_number())
    count--

Even a very simple program like this is already flawed:

The program writes to stdout. That is not actually what the programmer intended, is it? The intention is to display a bunch of numbers to the user, not to write some text to stdout – there’s no guarantee the user will ever see the text that’s written to stdout.
Similarly, the user input is read from stdin, which is a text (or file, if you prefer) interface, when in reality, the program requires a number, not text. This seems fundamentally wrong.

Let’s think about what we’re trying to do more abstractly. We want to:

Calculate a bunch of numbers. How many is up to the user.
Display the results to the user.

Why, then, don’t we write code exactly like that?
(Most) programming languages provide this thing called a “function”, which accepts parameters and uses them to do something. Does that not sound exactly like what we’re trying to do?

void display_fibonacci_numbers(
        int number "how many fibonacci numbers to calculate"):

        Sequence<int> numbers
        while number>0:
            numbers.append(calculate_next_fibonacci_number())
            number--

        notify(numbers)

display_fibonacci_numbers()

This code is, of course, incomplete – the notify function isn’t implemented anywhere, and the user needs some way of inputting a number. I would imagine the user’s operating system or desktop manager or whatever to take care of that – it could display a terminal, it could generate a GUI, it could tell the user to write the number into the air with his nose; either way it does not (should not) concern me, the programmer.
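As a rough illustration of that idea in real code, here is a minimal Python sketch. The function only states what it needs (a count) and what it produces (numbers to show); how the count is obtained and how the result is shown are passed in. The names `ask_for_count` and `notify` are hypothetical stand-ins for whatever the operating system or desktop environment would supply; no such system facility is assumed to exist.

```python
from typing import Callable, List


def fibonacci_numbers(count: int) -> List[int]:
    """Return the first `count` Fibonacci numbers, starting 0, 1, 1, 2, ..."""
    numbers: List[int] = []
    a, b = 0, 1
    for _ in range(count):
        numbers.append(a)
        a, b = b, a + b
    return numbers


def display_fibonacci_numbers(
    ask_for_count: Callable[[str], int],
    notify: Callable[[List[int]], None],
) -> None:
    """Ask for a count and show the result, without knowing how either is done.

    Both callables are hypothetical stand-ins for an environment-provided UI.
    """
    count = ask_for_count("How many Fibonacci numbers to calculate?")
    notify(fibonacci_numbers(count))


def terminal_ask(prompt: str) -> int:
    # One possible front end: a plain terminal prompt (not called here).
    return int(input(prompt + " "))


# A canned answer stands in for the user so the sketch runs unattended.
display_fibonacci_numbers(lambda prompt: 5, print)  # prints [0, 1, 1, 2, 3]
```

Swapping `terminal_ask` in for the lambda gives the terminal version; a GUI dialog would be just another callable with the same shape.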

It is my belief that software should be more abstract.

A few more examples.

Inter-process communication. How do you do it? Sockets? Signals? DBus? Files? The goal isn’t to use sockets, signals or DBus, it’s to communicate with another process. Why not something like Process.from_name("Music player").play_random_song()?
Downloading files. local_file.write(remote_file.read())? Why not download(url), which could, depending on the system configuration, start the download right away, but with a low priority so as not to slow down other downloads, or add it to the download queue to be downloaded later, or whatever?
File paths. Why on earth are file paths still strings (in most languages, at least)? Why do we have to deal with whether the path separator is / or \, whether it’s ., .., or whatever? Why is there no FilePath class that takes care of this?
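For the file path item, it may help to see what such a class looks like where one exists. Python's standard library has `pathlib`; the sketch below uses its "pure" path classes, which do no file system access, so it behaves the same on any platform. The directory and file names are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The "/" operator joins components; the class decides which separator to use.
posix = PurePosixPath("/home/user") / "music" / "song.mp3"
windows = PureWindowsPath("C:/Users/user") / "music" / "song.mp3"

print(posix)    # /home/user/music/song.mp3
print(windows)  # C:\Users\user\music\song.mp3

# Components are available without any string splitting.
print(posix.name)    # song.mp3
print(posix.suffix)  # .mp3
print(posix.parent)  # /home/user/music
```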

Going a step further, why not apply the duck-typing concept to modules? For example, in Python, HTML is often parsed using the BeautifulSoup module:

from urllib.request import urlopen
from bs4 import BeautifulSoup

page = BeautifulSoup(urlopen('http://test.at'))

Again, the programmer’s goal wasn’t to use BeautifulSoup, but to parse HTML. Why, then, would he explicitly tell his code to use the BeautifulSoup module? What he really wants is to

import HTMLParser
page= HTMLParser.parse(urlopen('http://test.at'))

Duck-typing for modules: Who cares what module it is as long as it does what I want?
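Something close to this can already be sketched with structural typing. In the minimal Python example below, the caller depends only on a small interface, and any object with a matching method qualifies. `TitleExtractor`, `StdlibTitleExtractor` and `page_title` are hypothetical names invented for this sketch; the only real library used is the standard `html.parser` module, and the interface is deliberately tiny (it only extracts the page title).

```python
from html.parser import HTMLParser
from typing import Optional, Protocol


class TitleExtractor(Protocol):
    """Hypothetical interface: anything with this method counts as a parser."""

    def title(self, markup: str) -> Optional[str]: ...


class _TitleCollector(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.found: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.found is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.found = (self.found or "") + data


class StdlibTitleExtractor:
    """One implementation, backed by the standard library's html.parser."""

    def title(self, markup: str) -> Optional[str]:
        collector = _TitleCollector()
        collector.feed(markup)
        collector.close()
        return collector.found


def page_title(parser: TitleExtractor, markup: str) -> Optional[str]:
    # The caller depends only on the interface, not on a specific module.
    return parser.title(markup)


html = "<html><head><title>Test page</title></head><body></body></html>"
print(page_title(StdlibTitleExtractor(), html))  # Test page
```

Any other class with a matching `title` method could be passed to `page_title` unchanged, which is the duck-typing idea applied one level up.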

Why is programming not like this? Am I overlooking something, something that makes all this impossible (or impractical)?

Why is programming not like this?

Why do you think that? Actually, programming is like that. Of course, you can’t do that in assembly language. That is why we invented languages that come closer to the way humans think.

Good programmers write code like the examples you gave. But it is not that easy. First you have to understand what the requirement is. This is hard enough already, and most people can’t spell out correctly what they need – so a good software designer has to understand what people actually want, not what they say.

After that, you need to see the concepts behind the real-world things. That is quite philosophical and takes a lot of experience. And then you have to translate these results into a language of your choice, working around the limitations that every language has, and choose the correct abstraction level (you don’t want to overengineer anything, right?).

Not all people are good at these steps, or even aware of them. Even small examples show that it is not trivial. Let’s take your display_fibonacci_numbers function and see what we can improve there (even though this isn’t codereview.stackexchange).

void display_fibonacci_numbers(

The naming is actually not very precise, because you are very specific about the kind of numbers you will display, right? That should be described by the function name.

    int number "how many fibonacci numbers to calculate"):

Why is it an int? That is way too concrete and even wrong, because ints can of course have negative values, right? So what you might really want is a type that can be used for calculating the Fibonacci numbers. Maybe a natural number?

    Sequence<int> numbers
    while number>0:
        numbers.append(calculate_next_fibonacci_number())
        number--

Does that really describe what you intend to do? In natural language (which is often a good approximation) we would say “create a list of the specified number of fibonacci numbers”. I can’t read anything about decrementing a number here, I can’t read anything about a static 0 value here. Also nothing about a “calculate_next_fibonacci_number()” function.

Therefore, in Scala I would e.g. write the code as List.fill(number)(randomFibonacci())

It does not read exactly as in natural language, but it at least contains the same information.

I wanted to give you these examples to show that it is hard to write software that way. You need experience, you need to spend a good amount of time thinking things through, you need to find the correct concepts and abstractions, and THEN you need a language that does not hinder you from expressing them concisely.
This is why not many people program like that. And even worse: humans get used to things. So someone who always programs in a language that hinders that last step may well keep programming the same way in a new language.

Why, then, don’t we write code exactly like that?

We do. Your Fibonacci example is a clear violation of the single responsibility principle. One method should be responsible for being a generator that provides the Fibonacci sequence. Other parts are then responsible for taking N values from it, or skipping N values, or getting input, etc.
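A minimal Python sketch of that split, assuming the sequence starts at 0 and using the hypothetical helper name `take`: one function only generates the sequence, and a generic tool (`itertools.islice` from the standard library) handles taking or skipping values.

```python
from itertools import islice
from typing import Iterator, List


def fibonacci() -> Iterator[int]:
    """Yield the Fibonacci sequence indefinitely: 0, 1, 1, 2, 3, ..."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


def take(n: int, sequence: Iterator[int]) -> List[int]:
    """Return the first n items of any iterator."""
    return list(islice(sequence, n))


first_ten = take(10, fibonacci())
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Skipping values is a separate concern, handled by the same generic tool.
print(list(islice(fibonacci(), 5, 8)))  # [5, 8, 13]
```

Input and output are left out on purpose; they belong to whatever calls these functions.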

I would imagine the user’s operating system or desktop manager or whatever to take care of that – it could display a terminal, it could generate a GUI, it could tell the user to write the number into the air with his nose; either way it does not (should not) concern me, the programmer.

Sometimes it does not concern you the programmer. In those cases, you use some UI abstraction that does that glue to turn “get a number” and “print these numbers” into platform specific implementations. These already exist and are commonly used.

But you’re missing the point.

The vast majority of a programmer’s job isn’t making these whole programs, but gluing disparate bits of existing code together. There is already code to make the Fibonacci sequence. There is already code to take the first N of a sequence. There’s already code to print numbers on this platform.

The only reason you’re there is because someone wants all of those bits of code together. Why wasn’t it the first N primes instead? The specifics of how to get input and print output are just as vital requirements as what sequence to use.

Why not something like Process.from_name(“Music player”).play_random_song()

AppleScript already does stuff like this. WCF in many ways already abstracts away the mechanics of the communication from the semantics of the communication.

Why not download(url)

C# (and others) already have libraries to do this. Granted, they have fixed, well-defined semantics rather than configurable ones, because configuration is vile: it is code without any of the framework or process that gives code its quality. Worse yet, it drains quality (and often performance) from your real code.

Why is there no FilePath class that takes care of this?

C# (and others) already have libraries to do this.

Going a step further, why not apply the duck-typing concept to modules?

Some (more esoteric) languages do. Modules are treated as large objects, which can have their own interfaces, be instantiated, parameterized, etc.

The problem is that the vast majority of “how it works” is defined by the interface to the module. One implementation will have auto-closing file streams, and another will use Open and Close. These are strictly different interfaces.
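To make that concrete, here is a small Python sketch with two made-up classes standing in for two modules that both "read text". `AutoClosingReader` and `ManualReader` are hypothetical names, and `io.StringIO` stands in for a real file so the example is self-contained. The calling code for one cannot simply be pointed at the other.

```python
import io


class AutoClosingReader:
    """Hypothetical module A: the resource is released when the block exits."""

    def __init__(self, text: str) -> None:
        self._buffer = io.StringIO(text)

    def __enter__(self) -> io.StringIO:
        return self._buffer

    def __exit__(self, exc_type, exc, tb) -> None:
        self._buffer.close()


class ManualReader:
    """Hypothetical module B: the caller must pair open() with close()."""

    def __init__(self, text: str) -> None:
        self._text = text
        self._buffer = None

    def open(self) -> None:
        self._buffer = io.StringIO(self._text)

    def read(self) -> str:
        return self._buffer.read()

    def close(self) -> None:
        self._buffer.close()


# Code written against module A ...
with AutoClosingReader("hello") as stream:
    first = stream.read()

# ... has to be rewritten before it can use module B.
reader = ManualReader("hello")
reader.open()
try:
    second = reader.read()
finally:
    reader.close()

print(first == second)  # True
```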

Why is programming not like this?

There are a boatload of reasons.

Probably the most compelling is this sort of scenario: even if letting the OS control input and output helps programmers write better code faster, some business person will want you to do something different to “differentiate”.

Why aren’t programs more abstract?

Simply put: because when you get down to it, abstractions don’t do anything. Implementations do. And the more abstract your program is–the more divorced it is from what’s actually happening–the harder it is to understand what the code is doing.

At first, to a naive developer, it certainly seems like focusing on the high-level goal of what the code is supposed to be doing and “abstracting away the implementation details” makes the code easier to understand, but I respectfully submit that anyone who believes this has very little experience at debugging. When something is going wrong in your code, especially if it’s code that’s been released already and you have clients who are unhappy about it and want a fix now, the faster you’re able to find where the problem is, the better. And having to dig through layers upon layers upon layers of abstractions makes this more difficult, not easier.

Since the modern software development lifecycle spends far more time in maintenance than in original development, anything that favors quick original development and high abstraction at the expense of maintenance, readability and debuggability should be considered a meta-example of premature optimization.

Don’t get me wrong. I’m not against abstraction where it makes sense. But all too often you see people adding more and more abstraction just for abstraction’s sake, and it turns the code into a mess of complexity rather than reducing complexity. Remember the advice famously attributed to Einstein: Everything should be as simple as possible, but not simpler.

The biggest thing you’re missing is that abstraction has costs as well as benefits.

“All problems in computer science can be solved by another level of indirection, except of course for the problem of too many indirections.” -David Wheeler

Sometimes the cost is really small, like in your fibonacci example, so even if it doesn’t gain us anything, it doesn’t much matter, because it didn’t cost us much either. Wrapping a few lines of code in a function isn’t something that takes a lot of time or effort to do. Sometimes the cost is large, and then we need to be sure we’re getting something significant out of it.

If we were to decide to make an abstract HTML parsing interface, spend time designing it, and then implement an instance of that interface that used BeautifulSoup, would we have wasted our time? Probably, if we only ever use our fancy interface to call BeautifulSoup. (Note that the proposed replacement for BeautifulSoup in your question essentially did nothing for you except change the name from BeautifulSoup to HTMLParser.)

Wrapping something that doesn’t need to be wrapped in the first place is a waste of time.

On the other hand, if we’re going to write a program that uses BeautifulSoup for most parsing, but sometimes really needs the parsing capabilities of UglyOatmeal or SloppySandwich, an interface that wraps them might save us a lot of time and effort.

You seem to think that details should not concern you as a programmer. This is only sometimes true; some details are already handled by the hardware, OS, programming language, and libraries. Other details matter fundamentally in ways that make them impossible to handle automatically — in your fibonacci example, it matters to the behavior of the program whether the number type used is an int or a type that can handle larger numbers. Sometimes the details of the representation of something make a big difference to performance.
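The int detail is easy to check. Python's own integers grow as needed, so the sketch below simply looks for the first Fibonacci number that would no longer fit in a signed 32-bit integer, the usual size of `int` in many languages. The function name is made up for this example, and the indexing assumes F(0) = 0 and F(1) = 1.

```python
from typing import Tuple

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def first_fibonacci_exceeding(limit: int) -> Tuple[int, int]:
    """Return (index, value) of the first Fibonacci number greater than limit.

    Indexing assumes F(0) = 0 and F(1) = 1.
    """
    index, a, b = 0, 0, 1
    while a <= limit:
        a, b = b, a + b
        index += 1
    return index, a


print(first_fibonacci_exceeding(INT32_MAX))  # (47, 2971215073)
```

So a user who asks for fifty numbers already gets a different program depending on which number type was chosen.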

It’s nice when things get handled automagically for us, but there is usually a cost of some kind or another. Often the cost is performance, and often it’s lack of control over something.

Even with beautiful languages and awesome libraries, not every detail can be handled automatically, and software fundamentally is about handling enormous numbers of details, so we shouldn’t be too surprised or annoyed to find out that some of them couldn’t be handled for us.

It isn’t that we need more abstraction, we need the right abstractions in the right places. Abstraction is one of our most powerful tools in programming, but there are ways of misusing it or unrealistically expecting it to do magic.

You’ve discovered good factoring. Congratulations.

Why do more people not program like this? Because it’s hard to see the advantage of doing things in a way that seems more effort and will only pay off later, when requirements change. People are short-sighted; even the ones who can think abstractly enough to become programmers are short-sighted. Some of them are a little less short-sighted, and they tend to be the ones inventing the rare trick or pattern that eventually manages to get established and raise overall software quality a tiny bit.

Then someone invents a shiny new distribution channel with a weird ad-hoc kludgy programming language to go with it, and all progress goes promptly out of the window. Don’t feel bad, that’s just the way the world is 🙂

The law of diminishing returns applies to abstraction just like anything else. Is it better to configure a device using jumper wires or DIP switches? Probably DIP switches, which are an abstraction over connecting jumper wires. Is it better to program in machine language or assembly language? The answer is almost certainly assembly language. It’s difficult for me to think of some scenario where you wouldn’t want to at least use an assembler to generate your machine language.

However, we’re well beyond the mild abstractions I just suggested, and I would posit that many, maybe most, programmers are working at a level of abstraction beyond the point of diminishing returns already.

An example: I work with several PAAS systems that attempt to make the difficult task of integration with other systems more abstract and easier. I continually find myself wondering what’s really going on beneath the clean-looking façade presented by such systems:

What does the phrase “blank value” in that checkbox label mean? Does it mean that the source row is present but the column value is null? Does it mean that the source row is absent? Does it mean both? Does this particular PAAS tool even have a concept of true / false / null? If so, which fields are nullable (and is this setting buried deep in a dialog somewhere, in an effort to make the user experience more abstract)?
When I select the “add records to target database” option, will this update records as well? I don’t see an “update” option… so “add records” must do updates, too… right? (This may seem like a ridiculous example, but it’s a real one. And yes, I did end up on Google trying to determine whether “add” really means “add/update” for one particular setting in one particular PAAS tool- is this your idea of a good developer experience?)

It’s difficult to answer these questions, since all of that programmer stuff has been studiously hidden away, locked into black box implementations, etc., in the name of presenting an abstract user interface. I end up having a hard time even formulating the necessary questions (whether directed at Google or at tech support). I’m wondering about things like nullability and “upsert” operations, but my interlocutors are actively avoiding the use of such low-level technical terms. But these sorts of questions need to be answered to do integration…

And my last point hints at a possibility that, as we develop more of a shared language of computing, we may be able to work at a more abstract level. If non-programmers (including GUI designers) would use standard terms like nullable, “upsert”, foreign key, etc., then maybe a lot of things currently done using code could be done graphically, or in an otherwise more abstract manner. But when people say “add” instead of “upsert”, because “upsert” is scary programmer language, well, I have to fall back on my old, low-level skills and tools to actually get the job done.
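To pin down the two ambiguities from this example, here is a minimal Python sketch that uses a plain dictionary as a stand-in for a database table. The function names `add` and `upsert` are illustrative only and do not correspond to any PAAS tool's API.

```python
from typing import Dict, Optional

Table = Dict[int, Optional[str]]  # primary key -> nullable column value


def add(table: Table, key: int, value: Optional[str]) -> None:
    """Strict insert: refuses to touch a row that already exists."""
    if key in table:
        raise KeyError(f"row {key} already exists")
    table[key] = value


def upsert(table: Table, key: int, value: Optional[str]) -> None:
    """Insert the row if it is absent, otherwise update it in place."""
    table[key] = value


target: Table = {1: "old"}
upsert(target, 1, "new")  # updates the existing row
upsert(target, 2, None)   # inserts a row whose column is null

# "Blank" can mean two different things:
print(2 in target, target[2])  # True None  (row present, value null)
print(3 in target)             # False      (row absent)
```

A label that says “add” gives no hint which of the two functions sits behind it.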

What you’re asking is, “why can’t we use Z or UML to fully-prototype and design a system?”

And the answer to that is, “Because, while you could build something that would build a system based on a UML or Z spec, verifying the spec becomes just as hard as (if not harder than) implementing the spec.”

You can buy a lot of pre-made parts at the store, but craftsman machinists are still needed to fabricate parts that do not exist. The abstractions are there, but not for every environment. At some point, someone is going to need to flip a bit in a way for which no abstraction in the form of a function is available.

I’m going to use your file path string as an example:

File paths. Why on earth are file paths still strings (in most
languages, at least)? Why do we have to deal with whether the path
separator is / or \, whether it’s ., .., or whatever? Why is there no
FilePath class that takes care of this?

Sometimes these abstractions exist, but not in the programming language itself. Microsoft Windows, the .NET Framework and the family of programming languages that use .NET and run on Windows have this capability: http://msdn.microsoft.com/en-us/library/system.io.path%28v=vs.110%29.aspx

It’s not really part of the programming languages (VB.NET, C#, F#, etc.), but of a framework. I see the .NET framework as an abstraction of all the Windows plumbing that is necessary but not really part of the logic or algorithms we need to create in our code. The programmer can choose to use this object from the .NET framework or create his own. Your argument seems to be that creating your own is a bad idea because something better is already available, so why give the developer the choice? I’m not sure how you take that choice away. Prevent a file location string from ever being able to interact with the file system in any way, shape or form? You can only program for Windows if you use the appropriate framework and family of languages? How do you get the URL from the web? Does it have to come in the form of a .NET IO object?

Many of your examples seem to be problems in search of a solution. In a real-world scenario, your Fibonacci number function would actually be a function that takes 1 or 2 (for arbitrary subsets of the Fibonacci sequence) int parameters and returns a list of integers, and the calling process would handle getting the input and displaying the output. If you have the code you provided in the real world, it’s because the developer never intended it to be anything other than something (s)he ran from the command line.

For many of your other examples, the answer is often simply “because eventually you’re going to need to instantiate something and get things done”. At some point, for code to run, something has to be an actual implementation. In that case, it’s a filesystem library (in addition to the C# library Telastyn listed, there’s the os module in Python, and the Java File-related libraries in java.io and java.nio). Adding another layer of abstraction isn’t going to make anything in the examples you listed better.

There’s a balancing act that goes into the decision-making process of whether to use some type of abstraction – How much more complicated does this make maintaining the code? (as discussed by Mason Wheeler) How much easier does this make writing the code? Do the benefits I expect to get from this really justify the extra time it takes to put this in place? And perhaps the most important question – Do I really anticipate there ever being more than 1 implementation here?

You find you come to a point, often much sooner than you seem to think, where adding more abstraction just isn’t worth the effort.


Filed under: softwareengineering - @ 20:54
