Living without Getopt(s)

Parsing command lines can be tricky sometimes. There are arguments and options (the things that begin with a dash or two). Options can have arguments themselves.[1] For some options, the arguments may be optional. Options come in long (more than one letter) and short (just one letter) forms. Usually, the long form starts with two dashes. But some programs also accept long options that start with one dash. And so on.

Some programming languages have luxurious command line parsing packages. Python, for example, has argparse. The package makes it comparatively easy to give your programs a sophisticated command line that even includes subcommands and lets you mix options and arguments freely.

The shell has getopt and getopts. I won’t go into details here about which one is better and which one you should use. They both support parsing of command lines, although in slightly different ways. To find out if you like them, just try them out.

As I always try to improve my shell skills[2] by looking at code written by real pros, I took a look at the source code of Git.[3] To my surprise, I found out that Git doesn’t use these two, except in a few cases. The Git code parses the command line manually. It does this by giving up a bit of the variability of the command line syntax, to gain predictability.

On the one hand, Git[4] scripts use long options as flags. That is, the option does not take an argument; it just switches something on or off. An example of this is the flag --interactive, which triggers an interactive rebase.

On the other hand, if an option has an argument, it must be connected to the option by an equals sign. This makes parsing a lot easier. When an option and its argument are separate items, you have to look ahead to the next item on the command line and handle the different errors that can occur. If the argument is missing, you only notice because the next item is an option again instead of the argument you expected. If an option is followed by more arguments than expected, you get an argument where the next option should be. All this means that you have to jump around while parsing, and you can detect errors only by looking at several items on the command line at once. That can become tricky quite quickly.
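To make the contrast concrete, here is a rough sketch of what the separate-argument style might look like. This is not Git’s code; the function name parse_separated and the error messages are made up for illustration. Note how the branch has to inspect $2 as well as $1:

```shell
#!/bin/sh
# Hypothetical sketch: parsing "--strategy VALUE" (two separate items).
parse_separated() {
    strategy=
    while [ $# -gt 0 ]; do
        case "$1" in
            --strategy)
                # The value is the *next* item, so first check that it exists.
                if [ $# -lt 2 ]; then
                    echo "--strategy requires an argument" >&2
                    return 1
                fi
                # If the next item looks like an option, the value is missing.
                case "$2" in
                    -*)
                        echo "--strategy requires an argument, got option $2" >&2
                        return 1
                        ;;
                esac
                strategy="$2"
                shift   # consume the value in addition to the option
                ;;
            *)
                break
                ;;
        esac
        shift
    done
}

parse_separated --strategy ours
echo "strategy=$strategy"    # prints: strategy=ours
```

Even in this small sketch, one option needs two extra checks and an extra shift.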

If the option and the argument must be connected by an equals sign, parsing it in a case statement simply reduces to

 --strategy=*)
   strategy="${1#--strategy=}"
   ...
   ;;

You just swallow the option --strategy=somestrategy as a whole.

The second line of the snippet contains a nice trick: a pattern matching operator, which is a form of parameter expansion. Although it looks fancy, it is not a feature exclusive to newer shells like Bash or Zsh. The operator is available in every POSIX-compatible shell. ${1#--strategy=} removes the prefix --strategy= from the front of $1 and thus leaves only the strategy. In the example above, this would be somestrategy.
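If you want to try the operator in isolation, a few lines like the following should do. The variable names opt and name and the value recursive are just placeholders I picked for the demonstration:

```shell
#!/bin/sh
# Hypothetical option string, as it would arrive in $1.
opt="--strategy=recursive"

# ${var#pattern} removes the shortest match of pattern from the front.
strategy="${opt#--strategy=}"
printf '%s\n' "$strategy"    # prints: recursive

# The counterpart ${var%%pattern} removes the longest match from the end.
# Here it strips the first "=" and everything after it, leaving the option name.
name="${opt%%=*}"
printf '%s\n' "$name"        # prints: --strategy
```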

Apart from that, command line parsing is done as usual in shell scripts. You look at the variable $1, which holds the first item on the command line ($0 is the name of your script, as always). After handling it, you shift the command line by one. This fills $1 with the next item on the command line. When $1 no longer begins with a dash, you know you’re finished with the options. Now you can handle the arguments.[5]
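Put together, the whole loop might look roughly like this. It is a minimal sketch in the spirit of what I saw, not a copy of Git’s code: the function name parse_args, the variables nargs and first_arg, and the handling of -- as an explicit end-of-options marker are my own additions.

```shell
#!/bin/sh
# Hypothetical sketch of a getopt-free option loop.
parse_args() {
    interactive=false
    strategy=
    while [ $# -gt 0 ]; do
        case "$1" in
            --interactive)
                # A flag: no argument, it just switches something on.
                interactive=true
                ;;
            --strategy=*)
                # Option and argument arrive as one item; strip the prefix.
                strategy="${1#--strategy=}"
                ;;
            --)
                # Explicit end of options (my addition).
                shift
                break
                ;;
            -*)
                echo "unknown option: $1" >&2
                return 1
                ;;
            *)
                # First item without a dash: the options are finished.
                break
                ;;
        esac
        shift
    done
    # Whatever is left in "$@" are the script's arguments.
    nargs=$#
    first_arg=${1-}
}

parse_args --interactive --strategy=ours file1 file2
echo "interactive=$interactive strategy=$strategy args=$nargs first=$first_arg"
# prints: interactive=true strategy=ours args=2 first=file1
```

Each branch only ever looks at $1, which is exactly what the equals-sign convention buys you.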

As you can see, a little less flexibility in the command line syntax makes parsing a lot easier. That’s pragmatism at work. And it proves that you can have decent command line handling without using getopt(s).


1. Not to be confused with the arguments to a script. Those are all items on the command line that are not preceded by a dash and that don’t belong to an option.
2. And, of course, other programming skills.
3. Version 2.12.0.rc1, to be exact.
4. To be precise, I should say: The scripts that I examined.
5. Which is the reason why options have to come before the script’s arguments in shell scripts. This also applies when you use getopt or getopts. Command line parsing libraries that allow you to mix options and arguments are a lot trickier to implement. For an example, see Python’s argparse.
