Perl CLI: Subcommands With Getopt::Long

Recently, I’ve been working on a CLI tool for Webmin (cleverly called webmin, to match the existing virtualmin and cloudmin CLI tools) that supports subcommands. This is an extremely common user interface concept, and most developers have probably used one (git being an extremely popular example). But a cursory googling didn’t make it obvious how to do it in Perl using only core modules. There are several CPAN modules that’ll do it, including Getopt::Long::Subcommand and App::Cmd, and they both look good for their intended audience. But I didn’t want to pull in any extra dependencies, and I had somewhat different requirements that made it simpler to just use Getopt::Long (which has been a core Perl module for many years).

The docs only hint at it, but Getopt::Long already has the ability to handle subcommand-style options, though you have to write a tiny bit of extra code and call GetOptions more than once. Frankly, it’s not as complicated as I expected. The Getopt::Long developers provided a very good baseline set of functionality, plus a couple of clever escape hatches for folks who need something more.

Requirements

As mentioned, I have somewhat uncommon requirements for my program:

  • Short (-h) and long (--help) options, obviously
  • Subcommands with independent subcommand options
  • Subcommands are not hard-coded, they are discovered at run-time
  • Subcommands are independent programs (runnable directly or loadable as a library by the top-level command)
  • Subcommands receive the global options, for things like config file paths
  • No non-core modules, though sometimes a pure-Perl module may be included in the Webmin distribution if there’s no other good option (and Webmin does rely on a few other modules that have to be installed separately, like Net::SSLeay for SSL/TLS support, but those are widely available from most native package managers)
  • It needs to work in quite old Perl versions, at least back to 5.10 (the version available from the standard yum repositories on CentOS/RHEL 6)

The “no non-core modules” requirement often seems weird to seasoned Perl folks, since the CPAN contains so much great stuff. But Webmin has a huge installed base, across a huge number of Linux and UNIX versions, and needs to be robust to things like upgrading Perl and library versions, migrations, etc. Every new dependency potentially has a pretty big impact when spread across a few million systems we have no control over, so Jamie is slow to add dependencies when they can be avoided.

Basic Subcommand Handling

My command needs to be able to handle something like this:

# webmin --config /usr/local/etc/webmin list-config --module apache --option start_cmd

Here, --config is a global option that applies to all commands and is parsed by the top-level command. list-config is the command to execute, and the --module and --option options are passed into the list-config command, where they will hopefully be validated and acted on.

This is surprisingly complicated to think about and deal with, but Getopt::Long is up to the task! First we need to include a couple of options when we load the module.

use Getopt::Long qw(:config permute pass_through);

The permute option just allows mixing options and arguments. It is enabled by default, but it is disabled if the POSIXLY_CORRECT environment variable is set, so we request it explicitly. We want to be able to mix it up and let the subcommand deal with whatever comes after it on the command line (this particular example doesn’t need this option, but it’s useful to have the ability to handle it).

The pass_through option is the really important one for allowing subcommands. Normally an unrecognized option is an error; with pass_through, anything Getopt::Long doesn’t recognize is not flagged as an error and is instead handed to a catch-all handler if one is defined (more on that below), or otherwise left in @ARGV. In our case, that means the first bareword can escape the options parsing and the program continues. We can then check the subcommand (validating that it exists), bundle up the global options and the rest of the command line, and send it all to the subcommand for further validation and processing.
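
To make the effect of these two settings concrete, here is a minimal sketch that parses the example command line from an array (using GetOptionsFromArray so it can be run without real arguments) and defines no catch-all handler. The variable names are only for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config permute pass_through);

# The example command line from above, as a plain array.
my @args = qw(--config /usr/local/etc/webmin list-config
              --module apache --option start_cmd);

my %opt;
GetOptionsFromArray(
    \@args,
    'help|h'     => \$opt{'help'},
    'config|c=s' => \$opt{'config'},
);

# --config was recognized and removed. Everything else, including the
# options we did not define, was left in place instead of raising an error.
print "config: $opt{'config'}\n";
print "left:   @args\n";
```

Without a catch-all handler, the parser walks the whole command line, so the subcommand name is simply the first thing left over.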

Passing Arguments to the Subcommand

I decided to pack all of the globals into an %opt hash, so those options can be acted on within the webmin command itself, and optionally passed on to the subcommand (if it’s loaded as a module, it could receive the whole %opt hash, or if it’s a standalone command, some or all of it can be passed along). In my case, the config option needs to be known to both commands, since it tells the command where Webmin is installed and where its configuration and log files can be found. The webmin command needs to know how to find subcommand paths, and the subcommand likely needs to be able to load the Webmin standard library and the library for the module it works with.

Since there are only a tiny number of global options, it wouldn’t be crazy to pass them as individual scalar variables in the cases where we load the subcommand as a module, but if we add more global options in the future, we’d probably need to change all of the commands, too. So, it is more future-proof to pass a hash ref.

Another option, which I’ve considered and may still revert to, is to put the --config value, and any other variables the subcommands might need, into environment variables rather than passing them as command line arguments. This has the benefit of leaving the subcommand options unmodified, which may make it easier to port existing Webmin CLI commands to run under the new webmin command runner.
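
A rough sketch of the environment variable approach might look like the following. The helper name run_with_env and the use of WEBMIN_CONFIG as the variable name are assumptions made for this example, and a Perl one-liner stands in for a real subcommand so the sketch can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: run a subcommand with the global options exported
# as environment variables, leaving its own arguments untouched.
sub run_with_env {
    my ( $opt, $cmd, @args ) = @_;
    # local() restores the previous value when this sub returns.
    local $ENV{'WEBMIN_CONFIG'} = $opt->{'config'};
    return system( $cmd, @args );
}

# Stand-in for a real subcommand: a one-liner that exits 0 only if it
# sees the expected value in its environment.
my $status = run_with_env(
    { config => '/usr/local/etc/webmin' },
    $^X, '-e', 'exit( ($ENV{WEBMIN_CONFIG} || "") eq $ARGV[0] ? 0 : 1 )',
    '/usr/local/etc/webmin',
);
print "child exit status: ", $status >> 8, "\n";
```

The subcommand’s argument list is passed through exactly as the user typed it, which is the main attraction of this variant.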

So, let’s write some code to deal with our global options and subcommand:

#!/usr/bin/env perl
# Webmin CLI - Allows performing a variety of common Webmin-related
# functions on the command line.
use strict;
use warnings;
BEGIN { $Pod::Usage::Formatter = 'Pod::Text::Color'; }
use 5.010; # Version in CentOS 6

use Getopt::Long qw(:config permute pass_through);
use Pod::Usage;
use Term::ANSIColor qw(:constants);

sub main {
    my ( %opt, $subcmd );
    GetOptions(
        'help|h' => \$opt{'help'},
        'config|c=s' => \$opt{'config'},
        'list-commands|l' => \$opt{'list'},
        '<>' => sub {
            # Handle unrecognized options, inc. subcommands.
            my($arg) = @_;
            if ($arg =~ m{^-}) {
                say "Usage error: Unknown option $arg.";
                pod2usage(1); # Non-zero exit status: this is a usage error.
            } else {
                # It must be a subcommand.
                $subcmd = $arg;
                die "!FINISH";
            }
        }
    );

    $opt{'config'} ||= "/etc/webmin";

    my @remain = @ARGV;
    # List commands?
    if ($opt{'list'}) {
        list_commands(\%opt);
        exit 0;
    } elsif ($subcmd) {
        run_command( \%opt, $subcmd, \@remain );
    }

    exit 0;
}
exit main( \@ARGV ) if !caller(0);

# ...some other functions to handle listing and running...

At this point, I have nearly everything I need from my command parsing except the parsing that happens in the subcommands themselves (which, so far, are very standard Getopt::Long usage). So, let’s break it down.

GetOptions first parses any options (stuff that starts with - or --). Then, after the defined options, there’s an escape hatch of sorts, indicated by the special option <>, which says “for anything we don’t recognize, do this”. It hands the unrecognized argument to an anonymous function, which could do just about anything, but our case is pretty simple. If it gets here, it means there’s either an option it doesn’t know about, or there’s a subcommand (a bareword).

If it’s an unknown option, it prints the usage summary using pod2usage and exits (there’s POD at the end of the file, of course). But if it finds a non-option (something that doesn’t start with - or --), it puts it into the $subcmd variable and calls die with another magic value, !FINISH, which is specially handled by Getopt::Long: it stops option processing immediately and leaves the rest of the command line untouched in @ARGV.

Now we may have some options in the %opt hash, and we may have a subcommand in $subcmd, and we may also have some remaining options or whatever in @ARGV. This is exactly the flexible sort of handling I needed. The main command can perform actions and accept options, but it can also hand off control and the rest of the command line to a subcommand. And, we didn’t need any non-core modules. Success!
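
To see those three pieces side by side, here is a small self-contained sketch. It wraps the same option spec in a hypothetical parse_global function that works on an array instead of @ARGV, and it records unknown options rather than calling pod2usage, so it can be run without a POD section.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config permute pass_through);

# Hypothetical wrapper around the same option spec as main(), parsing an
# arbitrary array so it is easy to experiment with.
sub parse_global {
    my @args = @_;
    my ( %opt, $subcmd, @unknown );
    GetOptionsFromArray(
        \@args,
        'help|h'          => \$opt{'help'},
        'config|c=s'      => \$opt{'config'},
        'list-commands|l' => \$opt{'list'},
        '<>'              => sub {
            my ($arg) = @_;
            if ( $arg =~ m{^-} ) {
                # Unknown option before the subcommand: just record it.
                push @unknown, $arg;
            }
            else {
                # First bareword: remember it and stop parsing.
                $subcmd = $arg;
                die "!FINISH";
            }
        },
    );
    return ( \%opt, $subcmd, \@args, \@unknown );
}

my ( $opt, $subcmd, $rest, $unknown ) = parse_global(
    qw(--config /usr/local/etc/webmin list-config
       --module apache --option start_cmd)
);

print "config:     $opt->{'config'}\n";
print "subcommand: $subcmd\n";
print "remaining:  @$rest\n";
```

The global option ends up in the hash, the subcommand name is captured, and --module apache --option start_cmd is left over for the subcommand to parse.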

The full code for the Webmin CLI, and some of its subcommands, is available in the Webmin GitHub repository (https://github.com/webmin/webmin/tree/master/bin).

And, for completeness, I’ll mention that if you know in advance all of your subcommands (i.e. you aren’t pulling them in at run-time like I am) you can just call GetOptions again. @ARGV contains whatever was left after the first call, so, you can validate $subcmd however you like and then accept whatever other options you need after.
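
As a sketch of that simpler case, the following parses the example command line in two passes. The %known table of subcommands and their option specs is hypothetical. Turning pass_through off before the second call is my own choice, so that unknown subcommand options are reported as errors.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config permute pass_through);

# Hypothetical table of subcommands known in advance, with their options.
my %known = (
    'list-config' => [ 'module|m=s', 'option|o=s' ],
);

my @args = qw(--config /usr/local/etc/webmin list-config
              --module apache --option start_cmd);

# First call: global options only, stopping at the subcommand name.
my ( %opt, $subcmd );
GetOptionsFromArray(
    \@args,
    'config|c=s' => \$opt{'config'},
    '<>'         => sub { $subcmd = $_[0]; die "!FINISH"; },
);
die "Unknown subcommand\n" unless defined $subcmd && $known{$subcmd};

# Second call: whatever is left belongs to the subcommand. With
# pass_through off, unknown subcommand options are errors again.
Getopt::Long::Configure('no_pass_through');
my %sub_opt;
GetOptionsFromArray( \@args, \%sub_opt, @{ $known{$subcmd} } )
    or die "Usage error in $subcmd options\n";

print "$subcmd: module=$sub_opt{'module'} option=$sub_opt{'option'}\n";
```

Passing a hash reference as the first argument after the array makes Getopt::Long store each value under its option name, which keeps the second call short.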