Making your Programs Readable
One of the important features of a program, in addition to exhibiting the expected behaviour
in all possible situations, is readability. There are two strong reasons for this. First, in a
real-life situation, many different programmers interact with a given program: perhaps you
wrote the program, but some time later another person has to maintain it (modify it, add new
features, fix newly discovered problems, etc.), or you get to use, for your application,
bits and pieces that other people wrote. Second, even if you are the only
one who is ever going to interact with a particular program, what you write today may become
less than obvious when you look at it two or three weeks (or months, or years) later.
The argument can be made a lot more dramatic, to the point of claiming that readability is more
important than functionality (and no, this is not a joke). The reasoning is that it is easier to
fix a program that is readable and easy to understand than to modify or maintain a program that
works perfectly but is written in a cryptic way. You may ask: «if the program is
working perfectly, why would it need to be modified?» The reality is that
programs almost always need to change: requirements change, new versions
are developed with new or modified features, problems (bugs) are found, and so on.
Of course, by the time that you “ship” your programs, they must be working properly
above everything else. But there is a lot of time between the moment that you write
a program (possibly a small part of a bigger software system or some other product) and the moment that
the product is shipped. During that time, programs are tested, modified, fixed or re-written,
and you want these tasks to be as easy and as error-free as possible. One
of the most critical conditions for that is that the programs be readable.
But there is another subtlety here: as programs become more complex (and they necessarily
will, if you plan to do anything useful with programming), making them work correctly becomes
increasingly tough. Readability is essential to reduce that complexity, or rather, to make
complex programs a little less hard to understand and thus easier to get working properly.
I will briefly discuss a few of the issues involved in writing readable programs. For a
more advanced discussion, I would recommend that you get a copy of Steve McConnell's book, Code
Complete. This book is, in my opinion, a masterpiece, even if it may be considered a little
bit dated; I consider it a must-have for everyone who wants to be a programmer
or a software designer/architect. As you progress in the area of software, you'll find that there
are many books on more specific issues of software engineering and software quality,
but this book is definitely a good starting point.
Comments
Comments are fragments of text in the program that are marked so that they are disregarded by the
compiler (that is, they are not really part of the program); they serve the purpose of explaining
(to a human reader) what the program (or more specifically, what that section of the program)
is doing, or why.
In C++, comments are indicated in two possible ways:
- End-of-line comments, indicated with //
Whenever the compiler encounters the sequence //,
the rest of that line is disregarded by the compiler, and has no effect on the
behaviour of the program.
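- Block comments, indicated by enclosing the text between /* and */
Everything between the two markers is disregarded by the compiler, even if it
spans several lines.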
Brainteaser: There is an obvious exception to the above
rules (that is, a situation where the sequence // or the sequence
/* appears, and yet what follows is not ignored by the compiler).
Can you think of such a situation?
The fragment below shows some examples of comments in a C++ program:
#include <iostream>
using namespace std;

/* Author: Carlos Moreno
   Description:
       This is just a demo program.
       Don't take it too seriously
*/

int main()
{
    // Just print a quick message:
    cout << "Done!" << endl;

    return 0;
}
In the above fragment, I used teal(ish) to indicate the new portions, that is, the new concepts being
illustrated. In the future (including other tutorials), I will use green for the comments.
In fact, many text editors oriented to programming offer so-called syntax highlighting,
in which different elements of the program are shown in different colors (literal strings in one
color, variables in another, comments in another, etc.), and green or gray is often
used for comments. When you think about it, syntax highlighting is a feature that derives from the
notion of readability and its importance in programs.
Very important: make good and generous use of comments, but equally important: do not
write excessive comments. An excessive amount of comments makes it hard to read the actual
program. Typical mistakes in this category include writing comments that say exactly the same
as the statement that they're addressing:
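A minimal sketch of the kind of comments meant here (the function and variable names are hypothetical, chosen only for illustration, and the statements are wrapped in a function just to make the fragment complete):

```cpp
// Hypothetical fragment: every comment below merely restates its statement.
int count_after_reset()
{
    int a;
    int count;

    a = 0;             // Assign 0 to a
    count = a + 1;     // Add 1 to a and store the result in count

    return count;      // Return count
}
```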
Comments of that kind are redundant, as they just repeat what is absolutely clear from looking at the
statement. If anything, a complex statement would have to be explained (but explained,
not just “spelled out”). Also, a comment that could make sense (even for a simple and
obvious statement) is one that, instead of repeating what is being done,
says why it needs to be done. For example, there may be a non-obvious reason why
a variable a needs to be assigned 0 at some point in a program;
in such a case, a good and useful comment would be one that explains why a
is being assigned 0.
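As a rough sketch of such a "why" comment, consider the hypothetical function below, in which the variable a is reused to add up two separate batches of values. The function name and the numbers are assumptions made up for this illustration.

```cpp
// Hypothetical example: a accumulates two separate batches of values.
int sum_of_second_batch()
{
    int a = 0;
    a = a + 10;
    a = a + 20;    // a now holds the sum of the first batch (30)

    // The second batch is reported separately, and a still holds the
    // first batch's sum, so the sum has to start again from zero.
    a = 0;
    a = a + 5;
    a = a + 7;

    return a;
}
```

The comment next to the second a = 0 does not say what the statement does; it says why the statement is there, which is the part a reader could not have worked out from the statement alone.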
Good Variable Names
This aspect can be seen as “do not speak in code” (no pun intended!). If a variable means
something, then you'll want the program to “speak” to you in the right terminology: that
variable should have a name that means that something. If a variable holds a price, it would be
ridiculous to call it product, or a, or
variable (hopefully, you are not one of those people who use the
word password for their passwords; and if you are, hopefully you won't
bring that truly horrible habit to the world of programming).
Perhaps a good guideline (and I'm sure this is going to sound like I'm
joking) is that variable names should make comments unnecessary, by
making each statement self-explanatory. If you look at the example of conversion from inches
to centimeters, I think there is no room for discussion: the program is perfectly clear, and
no line requires a comment to clarify what is being done or why. If, however, instead of
inches and cm the variables had been called
a and b, or measure1
and measure2, then comments would have been needed to explain the
meaning of something mysterious like a = 2.54 * b;.
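The contrast can be sketched as follows. Both functions perform the same conversion; the function names convert and inches_to_cm are hypothetical and were picked only for this illustration.

```cpp
// Cryptic version: correct, but the reader has to guess what a and b mean.
double convert(double b)
{
    double a = 2.54 * b;
    return a;
}

// Self-explanatory version: the names say what the values represent.
double inches_to_cm(double inches)
{
    double cm = 2.54 * inches;
    return cm;
}
```

Nothing about the calculation changed between the two versions; only the names did, and that alone removes the need for a comment.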
More generally (this applies to variable names and to other readability aspects as well), we
should aim at writing programs that are “self-commenting”: programs that are so
well-written and so clear that comments are really not needed to explain what the program is
doing. It's not that comments should be avoided; simply aim at
writing a program as if you were trying to make comments unnecessary.
Indentation and Spacing
Proper layout helps distinguish the various pieces, sections and blocks of a program. In C++,
spaces and even line-breaks are optional (well, with some obvious exceptions); the entire program
could be written in a single line and with no spaces between statements:
int main(){cout<<"Done!"<<endl;return 0;}
Even this extremely short program becomes sufficiently painful to read and understand to
illustrate the point (even if the example is taken to a somewhat ridiculous extreme).
Indentation helps to identify and separate the different blocks
in a program. This will become clearer in the upcoming topics (covered in the other tutorials,
on conditional execution, loops, functions, etc.), but do keep in mind that it is an important
feature in the category of readability.
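To make the point about blocks a little more concrete, here is a minimal sketch that uses conditional execution, a construct covered in a later tutorial. The function name describe_sign and its parameter are hypothetical, chosen only for this illustration.

```cpp
#include <string>

// Hypothetical example: each nested block is indented one level
// further than the block that contains it.
std::string describe_sign(int number)
{
    std::string description;

    if (number < 0)
    {
        description = "negative";
    }
    else
    {
        if (number == 0)
        {
            description = "zero";
        }
        else
        {
            description = "positive";
        }
    }

    return description;
}
```

Even without knowing the details of if and else, the layout shows at a glance which statements belong together and which block is nested inside which.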
Constants and Constant Values
The use of named constants has a dual benefit: it can increase the robustness of
a program, and it also helps with readability. Suppose that you see a program, perhaps written by
someone else, containing the following fragment:
total = price + price * 0.07;
total = total + total * 0.08;
If you happen to be familiar with the local tax rates, you may immediately understand
what that fragment is doing. But one should not expect all programmers
to recognize those numbers. Multiplying by 0.07 and then by 0.08 might
strike the reader as mysterious and puzzling. Sure, a comment next to those numbers
would solve the mystery, but then again, we don't want to have to write a comment for
something as trivial as this.
It would be better if we could give a name to those mysterious “magic
numbers”, so that when we read them, it becomes quite obvious what the program is
doing. Using variables that contain those particular values would be a possibility; however,
it's not a very good one, since a variable may require storing the values in memory, and
may require the processor to fetch those values from memory every time they are used. That is no
big deal, but it does feel wrong (and more importantly, there is a better solution).
Named constants are syntactically similar to variables, except that, in general, they
do not require memory storage: the compiler usually treats them as if you had written
the numbers directly. They are like variables in that they are a named piece of
information. Names for these constants follow
the exact same rules as for variables; they also have an associated data type, and must be
declared with a syntax that is identical to a variable declaration, except for one detail:
the const qualifier. (In case you have not guessed it by now:
strictly speaking, named constants are variables, a more specific kind of
variable that has const-qualification.)
The example below should illustrate all of the above (it is not a complete program;
just a fragment showing the relevant details):
const double gst_rate = 0.07;
const double pst_rate = 0.08;
double gst = price * gst_rate;
double pst = price * pst_rate;
When you read the formulas, it is quite clear what the program is doing, even if we
are not seeing the exact value by which we are multiplying — we know that it is the
GST rate, whatever the actual number happens to be.
You could argue that in this case the meaning would have been obvious anyway, provided that we do
things the right way and declare a variable to hold the GST: naming that variable gst makes
it quite obvious that the “magic number” has to be the GST rate. This does not
negate the general principle, in that it is still preferable to read gst_rate
rather than 0.07.
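As a minimal sketch, the earlier two-line fragment with 0.07 and 0.08 could be rewritten with named constants as follows. The function name total_with_taxes is a hypothetical wrapper, added only to make the fragment complete; the arithmetic follows that earlier fragment, in which the second tax is applied to the price plus the first tax.

```cpp
// Hypothetical wrapper around the tax calculation.
double total_with_taxes(double price)
{
    const double gst_rate = 0.07;
    const double pst_rate = 0.08;

    double total = price + price * gst_rate;

    // The PST is charged on the amount that already includes the GST.
    total = total + total * pst_rate;

    return total;
}
```

Read this way, neither formula needs the reader to recognize a number: the names of the constants carry the meaning.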
There are additional reasons why the use of named constants is convenient. Suppose
that a year from now (from the time you wrote the program), the government changes
the tax rates. That means that you have to modify the program. If there are, say,
10 or 20 places in your program where you do tax calculations, you are going to have
a lot of unnecessary work in changing all those 10 or 20 fragments of the program!
But the extra work that you could have saved is not the strongest argument; the
real issue here is: what if you overlook one of them? You read through
the program changing all the occurrences of 0.07 and 0.08, and you might end up
changing 19 of them, because you overlooked one (trust me, it can and
does happen!). If you have a named constant, all you have to do is change
that constant at the point of its declaration — it is hard to make a mistake in
such a simple and to-the-point maneuver, and the change will be automatically
reflected in every statement that uses the named constant!
It is often stated as good programming practice that a program should never have
any “magic numbers”. That is, a program should never contain a literal number
other than 0 or 1, except where that number initializes a named constant. This is not to be taken
too literally or too strictly, but it is a good guideline. Virtually every number that
we use in our programs has a meaning: we always want to read the meaning of the
number, rather than the actual number.
Protecting Against Mistakes
Another use of the const qualifier is to add a non-mutable constraint to
certain variables. Quite often, we use a variable to store a temporary or partial
result; we want to use the content of that variable later in the program, but we
know that the variable is not going to be modified. It is a good idea to
const-qualify such a variable: we make the promise (to the compiler) that the
variable is not going to be modified. If we accidentally write some
statement that attempts to modify the variable (again I must say: trust
me, this really does happen!), the compiler will report it as an error.
It can do so only if we const-qualified the variable; otherwise,
the compiler cannot know whether we legitimately want to modify the variable or not.
As a general rule, simply add a const in front of any declaration of
a variable that you know will not be modified. This, of course, requires that
the variable be initialized in the declaration (otherwise the first assignment
to that variable would constitute a violation of the constness of that
variable — yes, I know constness is not a correct English word, but it is
used in C++ to refer to “non-mutability” of a variable).
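A small sketch of this protection, under the assumption of a hypothetical function price_with_gst (the name and the commented-out mistake are made up for this illustration):

```cpp
// Hypothetical example of const protecting a partial result.
double price_with_gst(double price)
{
    const double gst_rate = 0.07;
    const double gst = price * gst_rate;   // initialized in the declaration, as const requires

    // gst = gst * 2;    // if uncommented, the compiler rejects this line:
    //                   // gst is const and cannot be assigned to

    return price + gst;
}
```

Without the const in the declaration of gst, the commented-out line would compile without complaint and silently produce a wrong total.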
The following program illustrates the use of const-qualification in a relatively
simple programming situation:
#include <iostream>
using namespace std;

int main()
{
    const double gst_rate = 0.07;
    const double pst_rate = 0.08;

    double price;
    cout << "Enter product's price: " << flush;
    cin >> price;

    const double gst = price * gst_rate;
    const double pst = (price + gst) * pst_rate;
    const double total = price + gst + pst;

    cout << "Sub-Total:\t" << price
         << "\nGST:\t" << gst
         << "\nPST:\t" << pst
         << "\n\nTOTAL:\t" << total << endl;

    return 0;
}
Perhaps surprisingly, most of the time, variables get one initial value and never
require modification. In the example above, you can see that, of a total of four
variables, three get an initial value and do not require further modification
(and that is without counting the named constants; counting them, it is five out of
six). Actually, price follows that same rule, in a sense,
but in that case there is no way to make it const,
since it must be modified after its declaration, given that it receives its value from
user input.