Everyone has opinions about coding style. This document contains some high-level advice. It doesn't go into minutiae like how many spaces per indentation level or whether curly braces belong on the same line as a conditional or on their own line. Details like that don't matter, so long as they are consistent. (If they are inconsistent, then the code becomes much harder to read!) Rather, this document focuses on more important issues.
Many other coding convention documents are available. Even more valuable are descriptions of good ways to design and write code. For Java programmers, I highly recommend Josh Bloch's book Effective Java.
When you add a new procedure to a file, don't just type it wherever your cursor happens to be. Instead, place related procedures together. It is often helpful to put a block comment (e.g., starting with a row of 75 asterisks, or whatever style you use, so long as you are consistent) at the beginning of each group of related procedures. Such block comments divide the file into sections that are readily apparent to readers.
In general, put public methods before private ones in your files. Organize your file with helper methods (whether public or private) after the main entry points. This permits readers to read your code top-down, which is more comprehensible: the purpose of each piece of code, and how it fits into the whole, is obvious. A reader can forward-reference to just the specification, not the whole implementation, of a helper method. This doesn't mean you necessarily have to write your code in top-down order, but do organize it that way for readers.
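As a rough illustration of this ordering, here is a minimal sketch. The class WordStats and its methods are hypothetical names invented for the example; the point is only the layout: entry points first, helpers last, with block comments marking the sections.

```java
/** Computes simple statistics about the words in a line of text. */
final class WordStats {

  // ***********************************************************************
  // Main entry points
  //

  /** Returns the number of whitespace-separated words in {@code line}. */
  public static int wordCount(String line) {
    return splitIntoWords(line).length;
  }

  /**
   * Returns the length of the longest word in {@code line}, or 0 if the
   * line contains no words.
   */
  public static int longestWordLength(String line) {
    int longest = 0;
    for (String word : splitIntoWords(line)) {
      longest = Math.max(longest, word.length());
    }
    return longest;
  }

  // ***********************************************************************
  // Helper methods
  //

  /** Splits {@code line} at runs of whitespace; a blank line has no words. */
  private static String[] splitIntoWords(String line) {
    String trimmed = line.trim();
    if (trimmed.isEmpty()) {
      return new String[0];
    }
    return trimmed.split("\\s+");
  }
}
```

A reader who starts at the top sees what the class offers before meeting splitIntoWords, and needs only its one-line specification to follow the entry points.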
Every code file that you write (Java classes, Perl scripts, etc.) needs to have a comment at the top explaining exactly what it does and, if applicable, how to run it. Otherwise it will be a mystery to others, and perhaps to you when you return to it. In some cases one or two sentences will do; in many other cases, the description needs to be more complete. Every non-trivial procedure should also contain a brief comment saying what it does. In the case of Java, there should be valid Javadoc comments for every method (both public and private). Each parameter should be described as well, unless its meaning is completely obvious. Don't add useless comments that just repeat the name of the parameter and its type, or in which the @return clause is essentially identical to the procedure summary. Comments should enlighten, not merely repeat.
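To make the contrast concrete, here is a small sketch. The class ArraySearch is a hypothetical example; the unhelpful comment is shown inside a line comment so that only the useful one is live Javadoc.

```java
/** Utility methods for searching arrays of ints. */
final class ArraySearch {

  // An unhelpful comment merely restates the signature:
  //
  //   /**
  //    * Index of.
  //    * @param values the values
  //    * @param target the target
  //    * @return the index
  //    */

  /**
   * Finds the first occurrence of {@code target} in {@code values}.
   *
   * @param values the array to search; it need not be sorted
   * @param target the value to look for
   * @return the smallest index i such that values[i] == target, or -1 if
   *     target does not occur in values
   */
  public static int indexOf(int[] values, int target) {
    for (int i = 0; i < values.length; i++) {
      if (values[i] == target) {
        return i;
      }
    }
    return -1;
  }
}
```

The useful version tells the reader things the signature cannot: that the input may be unsorted, that the first match wins, and what happens when there is no match.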
When a comment is a sentence, start it with a capital letter, end it with a period, and use correct grammar. Strive to keep lines of comments and code to 80 characters whenever possible. (Don't be slavish: an exception here and there is OK, but lots of violations lead to less-readable code. Don't assume that everyone uses the same width screen as you do; I assure you they do not. 80 columns is a generally accepted industry standard.) This makes it possible to print the code in a readable fashion and also to read the code in a standard-sized window.
Do not comment out large blocks of code with "/* ... */"; instead, prefix each line by "//". Among other things, this does not lose the indentation of the original code; without that indentation, the commented-out code is much too hard to read and understand. It also makes it clear what is commented out and what is not, even when the code is printed or is viewed without color highlighting.
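For instance, a disabled block might look like the following sketch. The class Greeter and the variable useDefaultName are hypothetical, made up for the illustration.

```java
/** Builds greeting messages. */
final class Greeter {

  /** Returns a greeting addressed to {@code name}. */
  static String greet(String name) {
    // Disabled: every line carries its own "//" prefix, so the nesting of
    // the original code is still visible and nothing is ambiguous.
    // if (name.isEmpty()) {
    //   if (useDefaultName) {
    //     name = "world";
    //   }
    // }
    return "Hello, " + name + "!";
  }
}
```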
In general, you should not copy code. It is easy to make a mistake when copying, even easier to forget to update some of the copies when editing other copies, and difficult for readers to understand the distinction (or lack thereof) among the versions. Rather than copying, it is often better to use hooks or to generalize the original version.
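One way to generalize instead of copying is to pass the part that varies as a parameter. The sketch below is a hypothetical example (the class Sums and its methods are invented here): rather than copying a summation loop and editing one expression in the copy, both callers share a single loop and supply a hook.

```java
import java.util.function.IntUnaryOperator;

/** Sums of transformed array elements. */
final class Sums {

  /**
   * Returns the sum of transform(v) over every element v of values.
   * The transform is the hook that distinguishes one caller from another.
   */
  static int sumOf(int[] values, IntUnaryOperator transform) {
    int total = 0;
    for (int v : values) {
      total += transform.applyAsInt(v);
    }
    return total;
  }

  /** Returns the sum of the squares of the elements of values. */
  static int sumOfSquares(int[] values) {
    return sumOf(values, v -> v * v);
  }

  /** Returns the sum of the cubes of the elements of values. */
  static int sumOfCubes(int[] values) {
    return sumOf(values, v -> v * v * v);
  }
}
```

A fix to the loop (say, guarding against overflow) now needs to be made in only one place.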
If you are forced to copy code, then it is essential that you indicate where you got the original version from; this is important for understanding the code and for giving credit where credit is due, and to not do so is intellectually dishonest. Furthermore, you should indicate the reason for the copying and how this version differs from the original, and clearly indicate every change that you have made, perhaps with a distinctive comment that is indicated in the prefatory comment, or perhaps by giving a command that can be run to get a diff of your version of code against the original. If the original code is still being maintained, you should periodically update your code with respect to the upstream version, and should document how to do this.
Local variables should have the most restrictive scope possible. For instance, don't do this:
int x; ... for () { ... x = ...; ... }
Instead, do this:
for () { ... int x = ...; ... }
A loop-carried dependence is when a variable is (sometimes) set on one loop iteration and used on the next iteration. A loop-carried dependence is the only reason to declare, external to a loop, a variable that is set in the loop. Reducing scopes makes it clear that there are no loop-carried dependences. Likewise, if two loops both use a temporary variable, you should declare two separate variables rather than reusing the same one, to indicate that there are no inter-loop dependences.
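The following sketch shows both cases side by side. The class Runs and the method countIncreases are hypothetical names; the variable previous has a genuine loop-carried dependence and so is declared outside the loop, while current does not and is declared inside it.

```java
/** Simple analyses of sequences of ints. */
final class Runs {

  /** Returns the number of indices i > 0 with values[i] > values[i-1]. */
  static int countIncreases(int[] values) {
    if (values.length == 0) {
      return 0;
    }
    int increases = 0;
    // Loop-carried: set on one iteration and read on the next, so it must
    // be declared outside the loop.
    int previous = values[0];
    for (int i = 1; i < values.length; i++) {
      // Not loop-carried: used only within a single iteration, so it is
      // declared in the narrowest scope possible.
      int current = values[i];
      if (current > previous) {
        increases++;
      }
      previous = current;
    }
    return increases;
  }
}
```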
Every variable and field should be explicitly initialized (set to an initial value). However, it should never be redundantly initialized to a temporary value that will not be read.
If a variable is initialized after its declaration but before it is used, it should not be initialized to a temporary value that will never be read. An example is
int x; // it would be bad style to initialize x to a dummy value
if (p) {
x = someValue;
} else {
x = otherValue;
}
It is clearer, when possible, not to reassign values immediately. Prefer the above construct, with an else clause, over this one:
int x = otherValue; // this is confusing; put it in an else clause instead
if (p) {
x = someValue;
}
Fields should also be initialized exactly once. If a field is initialized by the constructor, then its declaration should not initialize it to a temporary value that will never be read.
In some languages (for example, Java), it is possible to omit the initializer for a field: "boolean myField;" is equivalent to "boolean myField = false;". The short version is no more efficient, but it is more confusing. A reader must waste time searching the code (including in subclasses) to determine the initial value. The code is clearer if the initializer is explicit. Do so for all datatypes, including objects whose default value (in the absence of an initializer) is null.
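A minimal sketch of these field rules follows. The class Connection is hypothetical and performs no real I/O; it exists only to show one field set by the constructor and two fields with explicit initializers.

```java
/** Tracks the state of a (hypothetical) connection; performs no I/O. */
final class Connection {

  /** Set exactly once, by the constructor, so there is no initializer. */
  private final String host;

  /** Explicitly initialized, even though false is also the default. */
  private boolean open = false;

  /** Explicitly initialized to null: no error has been recorded yet. */
  private String lastError = null;

  /** Creates a closed connection to {@code host}. */
  Connection(String host) {
    this.host = host;
  }

  void markOpen() {
    open = true;
  }

  void recordError(String message) {
    lastError = message;
  }

  String host() {
    return host;
  }

  boolean isOpen() {
    return open;
  }

  /** Returns the most recent error message, or null if there is none. */
  String lastError() {
    return lastError;
  }
}
```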
When code has a consistent style, particularly within a single file but also over an entire project, the code is much easier to read and understand. I don't wish to spend an excessive amount of time or energy promulgating coding guidelines, but here are a few things you should pay attention to.
Use a consistent indentation style. When editing an existing file, adopt its indentation style rather than writing your additions in an incompatible style. This means that you must set your editor to respect the current indentation style. You can do this by hand, but that's error-prone and programmers hate to perform tasks manually. You should be able to find a customization package that does this for you. For example, Emacs users can use dtrt-indent, which causes Emacs to set its indentation parameters to whatever the file already happens to use (for C and Java code).
In general, do not re-format existing files to suit your own personal indentation style. That does you very little good (you should be comfortable using any consistent indentation style), it destroys the version control history information by modifying every line of code, and it annoys others who wrote or maintain the code.
Ensure that whitespace makes keywords easy to read: do not jam punctuation against other entities, which makes the program hard to read. Here are some examples of this rule:
Write "if (foo) bar;" rather than "if(foo)bar;".
Write "} else {" rather than "}else{".
Put a space after the // that starts a comment.
Additionally, make keywords and procedure calls visually distinct. Do not place whitespace between a procedure name and its arguments. Thus, you would write "while (x != 0)" but "myProcedure(x != 0)".
Do not use tabs in code files. They display differently in different editors, and they often print differently than they display in an editor. Always use spaces instead.
Emacs users can arrange this by putting the following in their ~/.emacs file:
(defun unset-indent-tabs-mode ()
  (setq indent-tabs-mode nil))
(add-hook 'java-mode-hook 'unset-indent-tabs-mode)
(add-hook 'c-mode-hook 'unset-indent-tabs-mode)
(add-hook 'perl-mode-hook 'unset-indent-tabs-mode)
(add-hook 'cperl-mode-hook 'unset-indent-tabs-mode)
Before you commit a change, you should always run the status and diff commands (such as "svn status" or "hg status", and "svn diff" or "hg diff") to see exactly what changes you have made. It is far too easy to inadvertently check in changes that you didn't intend to, and it is far too easy to fail to check in part of a larger change. (This advice is as applicable to papers as it is to code.)
When editing a file of LaTeX source that is under version control, you should ordinarily not refill paragraphs (e.g., M-q in Emacs), particularly toward the end of the edit cycle or when others might want to see what you have done. Refilling paragraphs makes the diffs large, and readers must examine whole paragraphs that may or may not contain a change.
If you must refill paragraphs, make a separate checkin that changes no content except for paragraph formatting, so that readers can ignore that checkin (only).
If it is early in the editing cycle, or if you are not collaborating with anyone, or if every line of a given paragraph has changed, or if for some other reason readers will have to reread the entire document (rather than viewing the diffs), then refilling paragraphs is fine.
Java code should compile without warnings using "javac -g -Xlint".
In general, do not use "\n" in strings in Java code; when output, "\n" produces a line separator on Unix, but "\n" is not the line separator on other platforms. If you wish to output a line separator, use println, or use printf with the %n specifier. If you need the platform-specific line separator (e.g., because you are building a string that will be output), use String.format with the %n specifier, or use
private static final String lineSep = System.lineSeparator();
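The two ways of building a string with platform-specific line separators can be sketched as follows. The class LineSeparators and its method names are hypothetical; String.format, the %n specifier, and System.lineSeparator() are standard Java.

```java
/** Builds multi-line strings without hard-coding "\n". */
final class LineSeparators {

  private static final String lineSep = System.lineSeparator();

  /** Returns the two lines, each terminated by %n via String.format. */
  static String twoLinesWithFormat(String first, String second) {
    return String.format("%s%n%s%n", first, second);
  }

  /** Returns the two lines, each terminated by the lineSep constant. */
  static String twoLinesWithConstant(String first, String second) {
    return first + lineSep + second + lineSep;
  }
}
```

Both methods produce the same string on any given platform; which one reads better depends on whether the surrounding code already uses format strings.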
Use "#!/usr/bin/env perl" as the first line of a Perl script, to permit independence from the specific location where perl is installed. If you need a particular version of Perl, then use the appropriate command, such as "use 5.6.0;" or "require 5.6.0;". Do not hard-code a path to the perl executable; that path may not exist on other systems and may change even on a given system.
Put the following near the top of each Perl script:
use English;
use strict;
$WARNING = 1;
system_or_die.pm, which automatically do the checking. To use them, put "use system_or_die;" near the top of your Perl script (after the "use strict;" block).
Use
@files = glob("*.c");
in preference to
@files = split('\n', `ls *.c`);
checkargs facility to ensure that you have passed the correct number of them. Put "use checkargs;" near the top of your Perl script, then, as the first line of user-written subroutine foo, do one of the following:
my ($arg1, $arg2) = check_args(2, @_);
my ($arg1, @rest) = check_args_range(1, 4, @_);
my ($arg1, @rest) = check_args_at_least(1, @_);
my @args = check_args_at_least(0, @_);
Michael Ernst