Erik Engheim
Jan 26, 2019

The Case for Code Comments

The debate over commenting code is one I think we programmers will never resolve. My personal opinion is that those who are against code comments often hold that view to a degree that borders on religiosity. To me it is rather simple: if you think comments are bad, simply don’t read them. Instead, some comment haters will remove them from code even when they benefit others. I think there should be a recognition that we are not all created equal. Some people actually benefit greatly from comments, and I think they should be accommodated.

Lots of comments are usually a sign that code needs to be refactored for clarity.

I honestly think this is a strawman argument. I’ve been a software developer for close to 25 years now. The only time I saw excessive amounts of comments was in some student code at university. It is simply not a regular problem. A far more prevalent problem is too few comments.

It is far more common to see shitty code which needs refactoring but which has no comments at all. Now you have the worst of both worlds.

In fact, one of the first things you’ll learn when working with legacy code is to distrust comments.

Most of my professional life has been spent with legacy code, and while comments are sometimes off, I’ve never found them to be a negative. For every case where there is a bad comment, there are 20 cases where a comment would have been immensely helpful.

Especially with legacy code, comments are of immense value, because there are often no clear technical reasons for why the code is the way it is. Things are often done in strange ways for historical reasons. Comments help document how the code came to be that way. They can explain technical choices which are no longer obvious.

For instance a large part of the code I work on has no const correctness. Why? Are there sound technical reasons for that, which somebody maintaining the code has to be aware of? No, it is simply because the code was automatically converted from Java to C++ years ago. That is nice to know.

A lot of the code involves complicated math, based on scientific papers. With only access to the code, it is next to impossible to understand it. No amount of refactoring can change that if you are not familiar with the math behind it. A comment giving you a link to, or the name of, the scientific paper it was based on helps.
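As a rough illustration of that last point, here is a small sketch in Python. The function and variable names are made up for this example and have nothing to do with the code base described above. The technique is Kahan (compensated) summation, and the reference in the docstring is the kind of pointer a reader would otherwise have to hunt for.

```python
def compensated_sum(values):
    """Sum floats while tracking the rounding error lost at each step.

    Technique: Kahan (compensated) summation.
    Source: W. Kahan, "Further remarks on reducing truncation errors",
    Communications of the ACM, 1965.
    """
    total = 0.0
    lost = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        corrected = value - lost                # re-apply what was lost earlier
        new_total = total + corrected           # low-order bits may be dropped here
        lost = (new_total - total) - corrected  # measure what was just dropped
        total = new_total
    return total
```

Without the docstring, the subtraction on the `lost` line looks like a no-op that somebody might helpfully "simplify" away. No amount of renaming makes its purpose obvious; the name of the technique does.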

I once worked on software with a complicated database. There was no documentation or comment on why that choice was made. It took me a long time to determine that the code and database were entirely useless. I was able to reduce the code and replace it with simple file reading.

In the wild, I’ve rarely seen single letter variable names outside of maybe a counter. Even then, I suspect there’s a clearer name lurking

What is clarity? If clarity were proportional to the length of the variable name, then we would make variable names which are whole sentences long. Sometimes you may actually need a whole sentence to describe what a variable is. But in that case, you are better off writing a sentence-long comment and using a short variable name.

Clarity is not just about understanding individual variables, but also about understanding how they interact. Mathematical equations are not written with single letters without reason. Short names clarify relationships between variables and make it easier to see how they can be rearranged.
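A small sketch of the trade-off, using the textbook formula for displacement under constant acceleration. Both functions are hypothetical examples written for this illustration, and they compute exactly the same thing.

```python
def displacement(v0, a, t):
    # v0: initial velocity (m/s)
    # a:  constant acceleration (m/s^2)
    # t:  elapsed time (s)
    # Textbook kinematics: s = v0*t + (1/2)*a*t^2
    return v0 * t + 0.5 * a * t**2


def displacement_verbose(initial_velocity_in_meters_per_second,
                         constant_acceleration_in_meters_per_second_squared,
                         elapsed_time_in_seconds):
    # Same formula, with the descriptions moved into the names.
    return (initial_velocity_in_meters_per_second * elapsed_time_in_seconds
            + 0.5 * constant_acceleration_in_meters_per_second_squared
            * elapsed_time_in_seconds**2)
```

In the first version the shape of the equation survives, so it is easy to check against a textbook. In the second, every variable is individually well described, but the relationship between them is harder to see.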

Abstraction at its heart is about shortening. By factoring out code in a large function into several smaller functions, we can more easily see the interaction between smaller, more well-defined chunks of code. However, it is not without cost. Abstraction means learning new concepts. It is not possible to look at code using map, reduce and filter and understand what is going on without first having learned those concepts. When I write the function name map in my code, I am not explaining to the reader of the code exactly what I am doing. Rather, I am using a mnemonic device. The code reader can use map as a key into his memory to recall the concept of mapping a set of values to another set of values.
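To make that concrete, here is a minimal sketch in Python of the same computation written twice: once with filter, map and reduce, and once as a plain loop. The function names are invented for this example.

```python
from functools import reduce


def sum_of_even_squares(numbers):
    # Keep only the even numbers.
    evens = filter(lambda n: n % 2 == 0, numbers)
    # Map each remaining number to its square.
    squares = map(lambda n: n * n, evens)
    # Reduce the squares to a single value by adding them up.
    return reduce(lambda acc, n: acc + n, squares, 0)


def sum_of_even_squares_loop(numbers):
    # The same computation spelled out step by step.
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total
```

The first version is shorter to scan once the three concepts are familiar, but the words filter, map and reduce explain nothing to a reader who has not met them before. The loop needs no such vocabulary, at the price of making the reader reconstruct the intent.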

The way I look at code is that you solve your problem by defining a set of concepts. Then you use those concepts to solve your problem. The reader of your code then has to familiarize himself with these concepts to understand the code. Natural language is a helpful aid in explaining these concepts, just as natural language is used together with mathematical equations to explain the concepts involved in the equations.

Perhaps you can read my thoughts on abstraction, to see my point of view.

Erik Engheim

Geek dad, living in Oslo, Norway with passion for UX, Julia programming, science, teaching, reading and writing.