Dienstag, 30. Juni 2009

Classic papers of software engineering (2)

One of the most important distinctions when developing software is the one between accidental and essential complexity. These terms were popularized by the landmark paper "No Silver Bullet" (Brooks, 1986).

While essential complexity is the intrinsic complexity of the problem to be solved, accidental complexity is the complexity you take on because of the technology used to solve it, not because of the problem itself. This distinction is especially useful when evaluating different tools for a particular job. Should you solve a given problem in C, in Perl, or in Java? Look at the complexities! How much accidental complexity would each choice introduce? String processing, for example, is a breeze in Perl thanks to regular expressions and easy string handling, while in C it is a real chore.
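To make that concrete, here is a rough sketch of what a single Perl match ( ($key, $value) = $line =~ /^(\w+)=(\d+)$/ ) costs you in C with the POSIX regex API; the key=value format is just an invented example:

    #include <regex.h>
    #include <stdio.h>

    int main(void) {
        const char *line = "timeout=30";   /* invented sample input */
        regex_t re;
        regmatch_t m[3];

        /* Compile the pattern; in Perl this step is implicit. */
        if (regcomp(&re, "^([a-zA-Z_]+)=([0-9]+)$", REG_EXTENDED) != 0)
            return 1;

        if (regexec(&re, line, 3, m, 0) == 0) {
            char key[64], value[64];
            /* Copy out the submatches by hand; Perl hands you $1 and $2. */
            snprintf(key, sizeof key, "%.*s",
                     (int)(m[1].rm_eo - m[1].rm_so), line + m[1].rm_so);
            snprintf(value, sizeof value, "%.*s",
                     (int)(m[2].rm_eo - m[2].rm_so), line + m[2].rm_so);
            printf("key=%s value=%s\n", key, value);
        }

        regfree(&re);
        return 0;
    }

Nothing in this listing is about the actual problem (pick a key and a value out of a line); it is all accidental complexity imposed by the language.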

In a 2004 blog post, Joel Spolsky called automatic memory management one of the biggest productivity improvements of recent years (I agree):

"A lot of us thought in the 1990s that the big battle would be between procedural and object oriented programming, and we thought that object oriented programming would provide a big boost in programmer productivity. I thought that, too. Some people still think that. It turns out we were wrong. Object oriented programming is handy dandy, but it's not really the productivity booster that was promised. The real significant productivity advance we've had in programming has been from languages which manage memory for you automatically. It can be with reference counting or garbage collection; it can be Java, Lisp, Visual Basic (even 1.0), Smalltalk, or any of a number of scripting languages. If your programming language allows you to grab a chunk of memory without thinking about how it's going to be released when you're done with it, you're using a managed-memory language, and you are going to be much more efficient than someone using a language in which you have to explicitly manage memory. Whenever you hear someone bragging about how productive their language is, they're probably getting most of that productivity from the automated memory management, even if they misattribute it."

Before you decide which technology or programming language to use, always think about how much accidental complexity each choice would introduce.

"No Silver Bullet - Essence and Accident in Software Engineering", Brooks, F. P., Proceedings of the IFIP Tenth World Computing Conference, pp. 1069-1076, 1986.

Dienstag, 5. Mai 2009

Why is good UI design so hard for some Developers?

A user asked on stackoverflow:
"Some of us just have a hard time with the softer aspects of UI design (myself especially). Are "back-end coders" doomed to only design business logic and data layers? Is there something we can do to retrain our brain to be more effective at designing pleasing and useful presentation layers?"

My reply:

"Let me say it directly:

Improving on this does not begin with guidelines. It begins with reframing how you think about software.

Most hardcore developers have practically zero empathy for the users of their software. They have no clue how users think, how users build mental models of the software they use, and how they use a computer in general.

It is the typical problem of an expert colliding with a layman: how on earth could a normal person be so dumb as not to understand what the expert understood 10 years ago?

One of the first facts to acknowledge, and one that is unbelievably difficult to grasp for almost all experienced developers, is this:

Normal people have a vastly different concept of software than you have. They have no clue whatsoever of programming. None. Zero. And they don't even care. They don't even think they have to care. If you force them to, they will delete your program.

Now that's unbelievably harsh for a developer. He is proud of the software he produces. He loves every single feature. He can tell you exactly how the code behind it works. Maybe he even invented an unbelievably clever algorithm that made it work 50% faster than before.

And the user doesn't care.

What an idiot.

Many developers can't stand working with normal users. They get depressed by their nonexistent knowledge of technology. And that's why most developers shy away and think users must be idiots.

They are not.

If a software developer buys a car, he expects it to run smoothly. He usually does not care about tire pressures or the mechanical fine-tuning that was needed to make it run that way. Here he is not the expert. And if he buys a car that lacks that fine-tuning, he gives it back and buys one that does what he wants.

Many software developers like movies. Well-made movies that spark their imagination. But they are not experts in producing movies, in creating visual effects, or in writing good scripts. Most nerds are very, very, very bad at acting because it is all about displaying complex emotions and little about analysis. If a developer watches a bad film, he just notices that it is bad as a whole. Nerds have even built up IMDb to collect information about good and bad movies so they know which ones to watch and which to avoid. But they are not experts in creating movies. If a movie is bad, they won't go to see it (or won't download it from BitTorrent ;)

So it boils down to this: shunning normal users because you are the expert is ignorance. In all the areas (and there are so many) where developers are not the experts, they expect the experts of those other areas to have already thought about the normal people who use their products or services.

What can you do to remedy it? The more hardcore you are as a programmer, the less open you will be to normal user thinking. It will seem alien and incomprehensible to you. You will think: I can't imagine how people could ever use a computer with this lack of knowledge. But they can. For every UI element, ask yourself: Is it necessary? Does it fit the concept the user has of my tool? How can I make him understand it? Please read up on usability; there are many good books, and it is a whole field of research, too.

Ah and before you say it, yes, I'm an Apple fan ;)"

Don't forget to have a look at the other wonderful replies, like this one, which has a lot of further pointers on UI matters.

Freitag, 30. Januar 2009

Kill productivity now!

Kill productivity now by browsing through this thread: What's your favorite programmer cartoon?

I'm happy to see the great Exploits of a Mom (a.k.a. Little Bobby Tables) cartoon at the top. Of course xkcd is your place to go for more of that brand of humour.

Oh and if you understand German don't forget this classic IBM ad with the wonderful line "Where are the web designers?".

Deconstructing Steve Jobs

Tom Junod has written a very unusual article for Esquire called Steve Jobs and the Portal to the Invisible.

It's not your average tech writing but a deep-field view of the way Jobs works and thinks.

One really gripping and frankly heartbreaking element is Jobs bashing the Amazon Kindle with the argument that people don't read books anymore anyway, even though his biological sister, Mona Simpson, is a writer. Kudos to Junod for writing such a holistic and insightful piece.

Sonntag, 25. Januar 2009

Toughest bug

Some time ago a Stack Overflow thread was started with the interesting question of what the toughest bug was that you ever fixed. I added this incident:

"This was on Linux but could have happened on virtually any OS. Now most of you are probably familiar with the BSD socket API. We happily use it year after year, and it works.

We were working on a massively parallel application that would have many sockets open. To test its operation, we had a testing team that would open hundreds and sometimes over a thousand connections for data transfer. With the highest channel numbers, our application began to show weird behavior. Sometimes it just crashed. At other times we got errors that simply could not be true (e.g. accept() returning the same file descriptor on subsequent calls, which of course resulted in chaos).

We could see in the log files that something went wrong, but it was insanely hard to pinpoint. Tests with Rational Purify found nothing wrong. But something WAS wrong. We worked on this for days and got increasingly frustrated. It was a showstopper, because the test that had already been agreed upon would cause havoc in the app.

As the error only occurred in high-load situations, I double-checked everything we did with sockets. We had never tested high-load cases in Purify because it was not feasible in such a memory-intensive situation.

Finally (and luckily) I remembered that the massive number of sockets might be a problem for select(), which waits for state changes on sockets (may read / may write / error). Sure enough, our application began to wreak havoc exactly the moment it reached the socket with descriptor 1024. The problem is that select() works with bit-field parameters. The bit fields are filled by the macro FD_SET() and friends, which DON'T CHECK THEIR PARAMETERS FOR VALIDITY.

So every time we got over 1024 descriptors (each OS has its own limit; vanilla Linux has 1024, and the actual value is defined as FD_SETSIZE), the FD_SET macro would happily write past its bit field and scribble garbage into the next structure in memory.
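In code, the trap looks roughly like this; a simplified sketch, not our original framework code:

    #include <stdio.h>
    #include <sys/select.h>

    /* Simplified sketch of the trap, not the original framework code. */
    void watch_socket(int fd, fd_set *readfds) {
        /* FD_SET performs no bounds check: for fd >= FD_SETSIZE (1024 with
         * glibc) the bit is written outside *readfds and silently corrupts
         * whatever happens to live next to it in memory. */
        if (fd >= FD_SETSIZE) {
            fprintf(stderr, "fd %d exceeds FD_SETSIZE (%d), not adding it\n",
                    fd, FD_SETSIZE);
            return;
        }
        FD_SET(fd, readfds);
    }

The guard is exactly the check the macro itself refuses to do for you.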

I replaced all select() calls with poll(), which is a well-designed alternative to the arcane select() call, and high-load situations have never been a problem ever since. We were lucky because all socket handling was in one framework class, so 15 minutes of work solved the problem. It would have been a lot worse if select() calls had been sprinkled all over the code.
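Roughly sketched, the poll() version looks like this (the function and the commented-out callback are made-up names for illustration):

    #include <poll.h>

    /* Sketch: wait until any of the given sockets becomes readable. */
    int wait_for_readable(int *socket_fds, int count, int timeout_ms) {
        struct pollfd fds[count];            /* no FD_SETSIZE ceiling here */
        for (int i = 0; i < count; i++) {
            fds[i].fd = socket_fds[i];
            fds[i].events = POLLIN;
            fds[i].revents = 0;
        }

        int ready = poll(fds, count, timeout_ms);
        if (ready <= 0)
            return ready;                    /* 0 = timeout, -1 = error */

        for (int i = 0; i < count; i++) {
            if (fds[i].revents & (POLLIN | POLLERR | POLLHUP)) {
                /* handle_socket(socket_fds[i]);  -- hypothetical callback */
            }
        }
        return ready;
    }

Because poll() takes an array of descriptors instead of a fixed-size bit field, the 1024 limit simply does not exist.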

Lessons learned:
  • even if an API function is 25 years old and everybody uses it, it can have dark corners you don't know yet

  • unchecked memory writes in API macros are EVIL

  • a debugging tool like Purify can't help with all situations, especially when a lot of memory is used

  • Always have a framework for your application if possible. Using it not only increases portability but also helps in case of API bugs

  • many applications use select() without thinking about the socket limit. So I'm pretty sure you can cause bugs in a LOT of popular software by simply using many many sockets. Thankfully, most applications will never have more than 1024 sockets.

  • Instead of providing a safe API, OS developers like to put the blame on the application developer. The Linux select() man page says:
    "The behavior of these macros is undefined if a descriptor value is less than zero or greater than or equal to FD_SETSIZE, which is normally at least equal to the maximum number of descriptors supported by the system."
    That's misleading. Linux can open far more than 1024 sockets, and the behavior is anything but undefined: out-of-range values reliably wreck the running application. Instead of making the macros resilient to illegal values, they are left to overwrite other structures. FD_SET is implemented as inline assembly(!) in the Linux headers and evaluates to a single assembler write instruction, with not the slightest bounds checking anywhere.

To test your own application, you can artificially inflate the number of descriptors in use by programmatically opening FD_SETSIZE files or sockets right at the start of main() and then running your application."
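A tiny sketch of such a test harness; the helper name is made up, and the rlimit call is there because the default soft limit for open files is often exactly 1024:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/resource.h>
    #include <sys/select.h>

    /* Hypothetical test helper: burn the first FD_SETSIZE descriptors so that
     * every socket the real application opens afterwards gets fd >= 1024. */
    static void inflate_descriptors(void) {
        struct rlimit rl;
        if (getrlimit(RLIMIT_NOFILE, &rl) == 0) {
            rl.rlim_cur = rl.rlim_max;       /* raise soft limit to the hard limit */
            setrlimit(RLIMIT_NOFILE, &rl);
        }

        for (int i = 0; i < FD_SETSIZE; i++) {
            if (open("/dev/null", O_RDONLY) < 0) {
                perror("open");
                break;
            }
        }
    }

    int main(void) {
        inflate_descriptors();
        /* ... start the actual application logic here ... */
        return 0;
    }

If the application still behaves correctly after that, its select() usage at least does not fall into the FD_SETSIZE trap.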