Torvalds’ quote about good programmers [closed]

I accidentally stumbled upon the following quote by Linus Torvalds:

“Bad programmers worry about the code. Good programmers worry about
data structures and their relationships.”

I’ve thought about it for the last few days and I’m still confused (which is probably not a good sign), hence I wanted to discuss the following:

  • What interpretation of this is possible/makes sense?
  • What can be applied/learned from it?

13

It might help to consider what Torvalds said right before that:

git actually has a simple design, with stable and reasonably well-documented data structures. In fact, I’m a huge proponent of designing your code around the data, rather than the other way around, and I think it’s one of the reasons git has been fairly successful […] I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important.

What he is saying is that good data structures make the code very easy to design and maintain, whereas the best code can’t make up for poor data structures.

If you’re wondering about the git example, a lot of version control systems change their data format relatively regularly in order to support new features. When you upgrade to get the new feature, you often have to run some sort of tool to convert the database as well.

For example, when DVCS first became popular, a lot of people couldn’t figure out what about the distributed model made merges so much cleaner than centralized version control. The answer is absolutely nothing, except distributed data structures had to be much better in order to have a hope of working at all. I believe centralized merge algorithms have since caught up, but it took quite a long time because their old data structures limited the kinds of algorithms they could use, and the new data structures broke a lot of existing code.

In contrast, despite an explosion of features in git, its underlying data structures have barely changed at all. Worry about the data structures first, and your code will naturally be cleaner.
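As a rough illustration of how small and stable such a data structure can be, here is a minimal sketch of how git names a file's contents (a "blob") in its default SHA-1 object format: the ID is a hash of a short header plus the raw bytes. The function name `git_blob_id` is made up for this example, and the sketch covers blobs only, not trees or commits.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object ID git's SHA-1 format assigns to a blob.

    The stored object is the header b"blob <size>\\0" followed by the
    raw content; the ID is the SHA-1 hex digest of those bytes.
    `git_blob_id` is a hypothetical helper name, not part of git.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Identical content always maps to the same ID, which is why
    # features can be layered on top without changing the format.
    print(git_blob_id(b"hello\n"))
```

Because the ID depends only on the content, any tool that agrees on this one rule can read and write the same object store.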

13

Algorithms + Data Structures = Programs

Code is just the way to express the algorithms and the data structures.

3

This quote is very similar to one of the rules in “The Art of Unix Programming”, a subject that is Torvalds’ forte, since he is the creator of Linux. The book is available online.

From the book is the following quote that expounds on what Torvalds is saying.

Rule of Representation: Fold knowledge into data so program logic can be stupid and robust.

Even the simplest procedural logic is hard for humans to verify, but quite complex data structures are fairly easy to model and reason about. To see this, compare the expressiveness and explanatory power of a diagram of (say) a fifty-node pointer tree with a flowchart of a fifty-line program. Or, compare an array initializer expressing a conversion table with an equivalent switch statement. The difference in transparency and clarity is dramatic. See Rob Pike’s Rule 5.

Data is more tractable than program logic. It follows that where you see a choice between complexity in data structures and complexity in code, choose the former. More: in evolving a design, you should actively seek ways to shift complexity from code to data.

The Unix community did not originate this insight, but a lot of Unix code displays its influence. The C language’s facility at manipulating pointers, in particular, has encouraged the use of dynamically-modified reference structures at all levels of coding from the kernel upward. Simple pointer chases in such structures frequently do duties that implementations in other languages would instead have to embody in more elaborate procedures.
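The comparison in the quoted passage between an array initializer and an equivalent switch statement can be sketched in Python roughly as follows. The unit names and both function names are invented for this illustration; they do not come from the book.

```python
# Logic-heavy version: the knowledge lives in the branches.
def to_seconds_branching(amount: float, unit: str) -> float:
    if unit == "s":
        return amount
    elif unit == "min":
        return amount * 60
    elif unit == "h":
        return amount * 3600
    elif unit == "d":
        return amount * 86400
    else:
        raise ValueError(f"unknown unit: {unit!r}")


# Data-heavy version: the knowledge lives in a table,
# and the logic that uses it is "stupid and robust".
SECONDS_PER_UNIT = {
    "s": 1,
    "min": 60,
    "h": 3600,
    "d": 86400,
}


def to_seconds(amount: float, unit: str) -> float:
    try:
        factor = SECONDS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return amount * factor


if __name__ == "__main__":
    print(to_seconds(2, "h"))
```

Adding a new unit to the second version means adding one table entry; the function body never changes, and the table can be checked at a glance.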

2

Code is easy, it’s the logic behind the code that is complex.

If you are worrying about code, that means you don’t yet get the basics and are likely lost on the complex part (i.e. data structures and their relationships).

2

To expand on Morons’ answer a bit, the idea is that understanding the particulars of the code (syntax, and to a lesser extent, structure/layout) is easy enough that we build tools that can do it. Compilers can understand all that needs to be known about code in order to turn it into a functioning program/library. But a compiler can’t actually solve the problems that programmers do.

You could take the argument one step further and say “but we do have programs that generate code”, but the code they generate is based on some sort of input that is almost always hand-constructed.

So, whatever route you take to get to code, be it via some sort of configuration or other input that then produces code via a tool, or writing it from scratch, it’s not the code that matters. What matters is the critical thinking about all the pieces that are required to get to that code. In Linus’ world that’s largely data structures and relationships, though in other domains it may be other pieces. But in this context, Linus is just saying “I don’t care if you can write code, I care that you can understand the things that will solve the problems I’m dealing with”.

1

Linus means this:

Show me your flowcharts [code], and conceal your tables [schema], and
I shall continue to be mystified; show me your tables [schema] and I
won’t usually need your flowcharts [code]: they’ll be obvious.

— Fred Brooks, “The Mythical Man Month”, ch 9.

I think he’s saying that the overall high-level design (data-structures and their relationships) is much more important than the implementation details (code). I think he values programmers who can design a system over those who can only focus on details of a system.

Both are important, but I would agree that it’s generally much better to get the big picture and have issues with the details than the other way around. This is closely related to what I was trying to express about breaking up big functions into little ones.

2

Well, I can’t entirely agree, because you have to worry about all of it. And for that matter, one of the things I love about programming is the switches through different levels of abstraction and size that jump quickly from thinking about nanoseconds to thinking about months, and back again.

However, the higher things are more important.

If I’ve a flaw in a couple of lines of code that causes incorrect behaviour, it probably isn’t too hard to fix. If it’s causing it to under-perform, it probably doesn’t even matter.

If I’ve a flaw in the choice of data structure in a sub-system, that causes incorrect behaviour, it’s a much bigger problem and harder to fix. If it’s causing it to under-perform, it could be quite serious or if bearable, still appreciably less good than a rival approach.

If I’ve a flaw in the relationship between the most important data structures in an application, that causes incorrect behaviour, I’ve a massive re-design in front of me. If it’s causing it to under-perform, it might be so bad that it would almost be better if it was behaving wrong.

And it’ll be what makes finding those lower-level problems difficult (fixing low-level bugs is normally easy, it’s finding them that can be hard).

The low-level stuff is important, and its remaining importance is often seriously understated, but it does pale compared to the big stuff.

Someone who knows code sees the “trees.” But someone who understands data structures sees the “forest.” Therefore a good programmer will focus more on data structures than on code.

2

Knowing how the data will flow is all-important. Knowing flow requires that you design good data structures.

If you go back twenty years, this was one of the big selling points for the object-oriented approach using either Smalltalk, C++, or Java. The big pitch (at least with C++, because that’s what I learned first) was to design the class and the methods, and then everything else would fall into place.

Linus undoubtedly was talking in broader terms, but poorly designed data structures often require extra rework of code, which can also lead to other problems.

What can be applied/learned from it?

If I may, I’ll share my experience from the last few weeks. The preceding discussions clarified the answer to my question: “what did I learn?”

I rewrote some code and, reflecting upon the results, I kept seeing and saying that “structure, structure…” was why there was such a dramatic difference. Now I see that it was the data structure that made all the difference. And I do mean all.

  • Upon testing my original delivery, the business analyst told me it was not working. We said “add 30 days” but what we meant was “add a month” (the day in the resulting date doesn’t change). Add discrete years, months, days; not 540 days for 18 months for example.

  • The fix: in the data structure, replace a single integer with a class containing multiple integers. The change to its construction was limited to one method. Change the actual date arithmetic statements – all 2 of them.

The Payoff

  • The new implementation had more functionality but the algorithm code was shorter and clearly simpler.

In Fixing the code behavior/results:

  • I changed data structure, not algorithm.
  • NO control logic was touched anywhere in code.
  • No API was changed.
  • The data structure factory class did not change at all.
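A minimal sketch of the kind of change described above, assuming Python rather than the original (unspecified) language. The `Period` class and `add_period` function are hypothetical names, and clamping to the last day of a shorter month is an assumption of this sketch, not something stated in the answer.

```python
import calendar
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Period:
    """Replaces a single 'number of days' integer with discrete parts."""
    years: int = 0
    months: int = 0
    days: int = 0


def add_period(start: date, period: Period) -> date:
    # Shift by whole calendar months first so the day of month is kept.
    month_index = (start.year * 12 + (start.month - 1)
                   + period.years * 12 + period.months)
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # Assumption: clamp to the month's last day (e.g. Jan 31 + 1 month).
    last_day = calendar.monthrange(year, month)[1]
    shifted = date(year, month, min(start.day, last_day))
    # Then add any leftover days as plain day arithmetic.
    return shifted + timedelta(days=period.days)


if __name__ == "__main__":
    start = date(2023, 1, 15)
    print(add_period(start, Period(months=18)))   # day of month is preserved
    print(start + timedelta(days=540))            # the "540 days" version drifts
```

The callers only ever pass a `Period` around, so switching from "30 days" to "one month" touches the structure and the one function that does the arithmetic, not the control logic.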

I like to imagine a very clever team of librarians in a beautifully made library with a million random and brilliant books, it would be quite a folly.

Can’t agree more with Linus. Focusing on the data greatly helps distill a simple and flexible solution to a given problem. Git itself is a prime example: given how many features have been added over the years of development, the core data structures remain largely unchanged. That’s magic! –2c

I’ve seen this in numerous areas.

Think about business analysis… Let’s say you’re analyzing the best way to support Marketing at a consumer products company like Colgate. If you start with fancy windows, or the latest technology, you won’t help the business nearly as much as if you think through the data needs of the business first, and then worry about presentation later. The data model outlasts the presentation software.

Consider doing a webpage. It’s much better to think about what you want to show (the HTML) first, and worry about style (CSS) and scripting (pick your tool) after.

This isn’t to say coding isn’t important too. You need programming skills to get what you need in the end. It’s that data is the foundation. A poor data model reflects either an overly complex or a poorly thought-out business model.

I find myself writing new functions and updating existing ones a lot more often than having to add new columns or tables to my database schema.
This is probably true for all well-designed systems. If you need to change your schema every time you need to change your code, it’s a clear sign you are a very bad developer.

quality of code indicator = [code changes] / [database schema changes]

“Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.” (Fred Brooks)

It seems like this idea has various interpretations in the various types of programming. It holds true for systems development and also holds true for enterprise development. For example, one could argue that the sharp shift in focus toward the domain in domain-driven design is much like the focus on data structures and relationships.

Here’s my interpretation of it: You use code to create data structures, so the focus should be on the latter. It’s like building a bridge – you should set out to design a solid structure rather than one that looks appealing. It just so happens that well written data structures and bridges alike look good as a result of their efficient designs.
