wiki:Development/Code/CodingStyle

Coding style on Quattor modules

Introduction

All major software projects have a set of guidelines for their code. It helps people to understand each other's code, and even your own code a few months (or days!) after you wrote it for the first time.

There are lots of documents motivating this, feel free to review them. In short:

  • I want to read your code.
  • I want you to read my code.
  • I want to fix your bugs.
  • I want you to fix my bugs.

Further reading

The basics

Indentation

Use 4 white spaces. No tabs, no 8 white spaces, no funny things. Fix your editor's configuration, or use a decent one. There are plenty of them.

Capitalisation

Use lowercase for local variables, uppercase for constants.

We have no convention for function names, and we have functionNamesLikeThis (whichIHate), FunctionNamesLikeThis (WhichIDontLikeEither) or function_names_like_this_one (which I prefer). Do we need such a convention?

Use meaningful names for globals, short names for locals

i is a perfectly valid identifier for a loop variable. However, you deserve nasty punishment if you use it as a function name.

On the opposite, this_is_a_temporary_counter is plain bad for a loop variable.

So, some good examples:

 sub write_user_credentials
 {
     # ...
 }
 for my $i (0..10) {
     # ...
 }

And very bad examples:

 sub wuc
 {
     # WTH???
 }
 for my $counter_from_0_to_10_included (0..10) {
     # Excuse me???
 }

Be modular

Really, it matters. Have you ever tried to understand a module which is 700 lines long? Did you ever try to understand what's the purpose of a variable that was defined 5 screens ago? Or the meaning of the 30th variable on this block? Or the purpose of that line that starts and finishes beyond the screen?

Here are the classic metrics for modularity:

  • No lines longer than 80 columns.
    • Split or rearrange longer strings
    • Split longer statements
  • If you have more than 3 levels of indentation, split your block
  • If your function goes beyond the screen, split it
    • And don't try to reduce the font. ;)
  • If you have more than 7 local variables, split your function.
    • Sometimes it's OK to have 10, but if you have 15 your code is a problem.

Don't use magic numbers

The constant pragma will give you meaningful names for any values other than 0 and 1 you need. They'll help you to understand why you chose such values on the past.

Good example:

 use constant PI => 3.141592;
 my $circle_area = $radio * PI * PI;

And the bad example

 # Oops! I missed a decimal somewhere!
 my $circle_area = $radio * 3.14159 * 3.141592;

Comments

When in doubt, don't use them.

Comment all Pan data structures, or at least provide a link to their full description. Try to use the new annotation syntax, which will be (some day) imported by Doxygen, Javadoc or something similar.

Comment the purpose of each file, probably using POD syntax. Write a comment before the beginning of each function, telling what it does and how to use it. But don't annoy the reader with the internals.

Don't comment function bodies. If your code is so complex that it needs further explanations, you should probably split it in several functions and comment those functions. Of course, sometimes you have to work around some broken API, or some corner case. In such case, please comment why you are doing it, but not how. And don't comment the obvious.

Bad examples:

 ############################################################
 #
 # Increment by 1
 #
 ############################################################
 $i++;
 sub do_something
 {
    my @args = @_;
 
    # I do foo and bar here
    ...
    ...
    ...
    # And now, let's do bar again, but with a small difference
    ...
    ...
 }

These are best done like this:

 $i++;
 # Performs task foo...
 sub foo {...}
 # Performs task bar...
 sub bar {...}
 # Performs task baz, which is quite similar to bar
 sub baz {...}

 # Performs a set of tasks, returning blah blah with arguments
 # $arg1:
 # $arg2:
 sub do_something
 {
     my @args = @_;
     foo (@args);
     bar (@args);
     baz (@args);
 }

Finally, don't put any credits on your comments, excepting on file headers.

Don't use vars

This pragma has been deprecated since Perl 5.6, and that's a long time ago. Instead, use the our declaration for package-wide variables:

Good:

 our @ISA = ...;

Bad:

 use vars qw (@ISA);
 @ISA = ...;

Curly bracket position

Follow Kernighan-Ritchie's convention: open curly brackets on the same line as the sentence they belong to and close them on a line for their own, excepting when it's an else or a do-while block:

 if ($foo) {
     ...
 }
 while ($bar) {
     ...
 }
 if ($foo) {
     ...
 } else {
     ...
 }
 do {
     ...
 } while ($bar);    

This way it is perfectly clear where each block starts, finishes and continues.

The only exception to this are the curly brackets that open a function. They should be on a different line, and have nothing else on the same line:

 sub foo
 {
     my @args = @_;
 }

The reason for this is that decent editors (f.i, Emacs) are aware that such curly braces mean something special, and allow you to move to the beginning and the end of a function with a single key stroke. This is a great help for navigating code.

Parenthesis

Use them a lot. When in doubt, use them. Especially:

  • Always use parenthesis on function calls
  • Always use them on function calls even when there are no arguments

The reason is that:

 foo();

Is easier to understand than:

 foo;

Prefer methods over plain functions

Especially when writing components, you want to log unexpected things. It's free to have a $self object, and let it log.

Only use plain functions if you want them exported to some other module.

The Quattor library (AKA perl-CAF and perl-LC)

Running commands

Do not use backticks or system, nor use open for pipes. The CAF::Process module has all you need, and it won't spawn new subshells, which is much safer.

Another advantage of the CAF::Process module is that your command line will be automatically logged at verbose level, which saves you snippets like this:

 $self->debug (5, "Going to run ls -l");
 system ("ls", "-l");

This will become inconsistent with different uses. Instead, do:

 # Or any reporter object.
 my $proc = CAF::Proces->new (["ls", "-l"], log => $self);
 $proc->run();
 # One-line version:
 CAF::Process->new (["ls", "-l"], log => $self)->run();

The log option is any CAF::Logger object, for instance the component you are writing.

Confidential information on the command line

Sometimes you pass confidential data to your commands. For instance, an encrypted password to usermod. In this cases, you don't want your command logged. Just don't pass any log argument to CAF::Process::new:

 my $proc = CAF::Process->new (["ls", "-l"]);
 $proc->run();

Examples

The following are extracted from CAF::Process man page, and assume you don't want to log the commands.

Piping to a process' stdin

Create the contents to be piped:

 my $contents = "Hello, world";

Create the command, specifying $contents as the input, and "execute" it:

my $proc = CAF::Process->new (["cat", "-"], stdin => $contents);
$proc->execute();
Piping in and out

Suppose we want a bi-directional pipe: we provide the command's stdin, and need to get its output and error:

 my ($stdin, $stdout, $stderr) = ("Hello, world", undef, undef);
 my $proc = CAF::Process->new (["cat", "-"], stdin => $stdin,
                                stdout => \$stdout
                                stderr => \$stderr);
 $proc->execute();

And we'll have the command's standard output and error on $stdout and $stderr.

Emulating backticks to get a command’s output

Create the command:

 my $proc = CAF::Process->new (["ls", "-lh"]);

And get the output:

 my $output = $proc->output();

File handling

Writing to files is not as simple as one could think: there are risks that you should be aware of. For instance, the following code is an example of what shouldn't be done:

 open (FH, ">/tmp/foo");

If /tmp/foo already exists and is a symbolic link to /etc/shadow, you just lost all accounts on your system.

Writing to a file

Don't open files directly. Use the CAF::FileWriter module to do the work for you:

 my $fh = CAF::FileWriter->open ("/my/file", log => $self);
 print $fh "Hello, world!\n";
 $fh->close();

This module will perform all security checks for you, and will log what files you open for writing, so that your configuration is easier to debug.

You can specify the permissions you want at instantiation, too:

 my $fh = CAF::FileWriter->open ("/my/path", mode => 0400, log => $self );
 print $fh "Hello, world!\n";
 $fh->close();

If you find any errors and have to discard your contents, just call the cancel method, and the existing file will be preserved:

 my $fh = CAF::FileWriter->open ("/my/path", mode => 0400, log => $self);
 print $fh "Hello, world!\n";
 if ($error) {
     $fh->cancel();
 }
 $fh->close();

Finally, if the contents you generate are the same as on the original file, nothing will be done, which will keep timestamps, among other properties.

The man page has all the details and more examples.

Temporary files

Don't use them. If you don't use File::Temp, you'll use predictable filenames, and that's just bad. Then, most implementations make temporary files world readable, and you usually don't want that. If you need temporary storage for some text, use an array, IO::String, in-memory files, a CAF::FileWriter or anything like that.

So you want to run a command which needs a file name as an argument, right? Easy. Just pipe to that command, as shown above. And pass /dev/fd/0 as the file name.

Finally, if all these options are not good enough (what are you trying to do?), use File::Temp::tmpfile, which will provide you an anonymous file handle. But please, use this only if you are convinced there is no other way to keep your temporary data.

Modifying an existing file

Sometimes you just want to add a couple of lines to an existing file, without destroying its current contents. Or to remove them. The module CAF::FileEditor is just for that. It is a subclass of CAF::FileWriter, and when you call the string_ref method you'll get a reference to the file's contents so that you can freely update them.

For instance, you only want to change the first "a" for a b on the first line of a file:

 my $fh = CAF::FileEditor->open ("/foo/bar",
                                 log => $self);
 my $ref_to_contents = $fh->string_ref;
 $$ref_to_contents =~ s{a}{b};
 $fh->close();

If you want to add text at the beginning of the file, the head_print method is what you are looking for.

Other tasks with files

When in doubt, use LC:: modules and functions, over standard ones. For instance, use LC::File::copy, LC::File::move instead of File::Copy and friends.

LC:: functions perform more security checks and are less prone to symlink attacks.

Input handling

Make your code run in tainted mode.

This is, don't trust any inputs, even if they are from the profile. During the workshop, Nick proposed to sign profiles (it was proposed 2 years ago, too). While this gets (or not) implemented, profiles are not to be trusted. Sanitise everything just after reading it, and that way you'll be able to use it freely.

Also, when you print files that will be sourced by shell scripts, be sure to print all values between single quotes. This is true for almost every file you have under /etc/sysconfig.

Quality of messages

Don't annoy the user

The main principle is not to annoy the user. Use info and ok methods only for very important messages. A small description of the tasks you are going to perform, if they are complex enough.

Unless he asks to

Provide a detailed trace of any task you perform with verbose messages. Trace DNS queries, URLs being retrieved, services you have to enable or disable...

But you don't need to log files you open or commands you run. The appropriate CAF objects will do so for you.

Debugging output

Don't print user-relevant messages with debug. This should be used only for developer-relevant stuff, such as tracking temporary contents or so.

We don't have any convention for debug levels, is it needed?

Use error only for fatal errors

Use the error method only if this error will make the entire component fail. If you can handle it, or the fail is not really important for the component's results, use info instead. If failing to download a file is OK because you can work around it, or you know you may have no rights to write on AFS, but that's OK, use an info message.

Otherwise, the component will be re-run with no need, and will keep failing and triggering alarms on the user's monitoring system.

Conclusions

Code quality matters. It will reduce bugs, and will make everybody's life easier.

All these conventions can be improved, so feedback will be appreciated. Especially, the CAF library can be extended with whatever task we repeat over and over (f.i, do we need any especial code to do DNS queries or download files from the Internet?).

Last modified 12 years ago Last modified on Apr 23, 2012, 9:25:00 PM