Starling Software K.K.

QAM Principles and Purpose

Introduction

QAM is a framework designed to aid in developing, building, installing, testing (including automated testing) and running a software package or group of packages and their associated open-source dependencies. This document provides an overview of what QAM does and why it does it that way.

The standard use case is that you run the top-level script called Test, which installs code from extsrc and src into release, leaving intermediate files in build, runs all of the unit tests, and then loads up databases, starts up servers, and runs a functional test of the whole system.

QAM is intended to support five main areas:

Supported Platforms and Requirements

QAM is primarily designed for use on Unix systems such as BSD and Linux. However, we are trying to provide as much Windows support as reasonably possible; depending on the software in the system, a project can generally be compiled and installed on Windows, and we intend to maintain that.

QAM is written purely in Ruby, in order to maximize portability whilst still using a language that’s reasonably powerful, concise and clear. There are places where Unix system utilities (such as sh or tar) are invoked by Ruby code; these should be kept to a minimum.
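As a minimal sketch of what keeping such invocations to a minimum can look like (this is not QAM's actual code, and the helper names `copy_tree` and `unpack_tarball` are hypothetical), Ruby's standard FileUtils library can stand in for many calls to sh, and where an external utility such as tar is still needed, passing its arguments to `system` as a list avoids going through a shell at all:

```ruby
require 'fileutils'

# Hypothetical helper: copy the contents of one directory tree into
# another without shelling out to cp(1).
def copy_tree(src, dest)
  FileUtils.mkdir_p(dest)
  # The "src/." form copies the contents of src rather than src itself.
  FileUtils.cp_r(File.join(src, '.'), dest)
end

# Hypothetical helper: unpack a tarball with the external tar utility.
# The arguments are passed as a list, so no shell is involved and no
# quoting of file names is needed.
def unpack_tarball(tarball, dest)
  FileUtils.mkdir_p(dest)
  system('tar', '-xf', tarball, '-C', dest) or
    raise "tar failed for #{tarball}"
end
```

Only the second helper depends on a Unix utility being present, which makes it easy to see what would need attention on a platform without tar.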

Directories

The remainder of this overview is organized around the various directories in a QAM project. You may find it convenient, when reading this document, to take a checkout of a QAM-based project, build and test it using the top-level test script (“./Test” in the base directory) and examine the various directories and files as we go along.

One good example to use is qam itself, which can be checked out with:

git clone git://git.cynic.net./qam

This has several examples of servers which you would not normally include when importing QAM into your own project. Because of this, you will need to have lighttpd installed on your system to be able to run the tests in this checkout. If you don’t have lighttpd installed, running “./Install” at the top level should produce most of the files you’d want to inspect.

Here is the set of directories we discuss:

/            Project root, or "base" directory
/release     Installed code and data
/build       Intermediate build files and data
/extsrc      Source code for programs supporting the project
/src         Source code for the project
/instance    Data used by running instances of programs
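For orientation, here is a small sketch that maps these names onto paths under a given base directory. The constant `QAM_DIRS` and the function `project_dirs` are hypothetical names chosen for illustration; QAM itself may locate its directories quite differently.

```ruby
# Hypothetical helper (not part of QAM): the directory names discussed in
# this document, in the order listed above.
QAM_DIRS = %w[release build extsrc src instance].freeze

# Return a hash such as { release: "/path/to/base/release", ... } for the
# given base directory.
def project_dirs(base)
  base = File.expand_path(base)
  QAM_DIRS.map { |name| [name.to_sym, File.join(base, name)] }.to_h
end
```

For example, `project_dirs('.')[:release]` gives the absolute path of the release directory of the checkout you are currently in.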

The Base Directory

The base directory, or project directory, is where everything related to the project is kept, including source, intermediate build files, installed files, and application data. Normally this directory and its contents would be checked into a version control system, such as Subversion or Darcs. Checkouts of this are used in the same way by both developers of the system and production versions of it; a production version, a staging version, and several different developer versions can all exist on a single machine without conflict (provided that no two servers try to listen on the same port; more on this later).
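Since the port is the one resource that separate checkouts on one machine can collide over, a quick availability check can be useful before starting a server. The following is only a sketch using Ruby's standard socket library; `port_free?` is a hypothetical name and is not something QAM is known to provide.

```ruby
require 'socket'

# Hypothetical helper: report whether a TCP port can currently be bound on
# the given host. Binding and immediately closing is a simple probe; the
# port could still be taken by another process a moment later.
def port_free?(port, host = '127.0.0.1')
  TCPServer.new(host, port).close
  true
rescue Errno::EADDRINUSE
  false
end
```

Other failures (for instance, lacking permission to bind a privileged port) are deliberately left to raise, since they indicate a different problem from a conflict with another checkout.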

The Release Directory

The release directory (under the base directory) contains the installed version of the system, all of its supporting software, and some data. It is laid out in the same way that a Unix /usr or /usr/local directory tree is laid out, with programs under release/bin, libraries under release/lib, and so on. Note that there is no release/var directory, for reasons described in the next paragraph.

The data stored under the release directory are those that are specific to a particular release of the program and unchanging for that release. For example, the template files used to build configuration files for web (and other) servers are stored under release/lib/server/server-name/. Data that change, or that are specific to one particular instance of an application rather than common to every instance, are stored in the instance directory (see below).

The idea behind the principle above is to allow for “binary” releases. The reason we do not insist that such a release also work when moved to a different location in the filesystem is that it’s very difficult to support ELF’s RPATH mechanism if the absolute path to a shared library changes, and ELF is an extremely popular format for shared libraries.

The Instance Directory

The instance directory is used for storing information specific to a particular instance of a running (or runnable) server or other program. (You may, in one project, run multiple instances of a server, say, on different ports; the test system takes advantage of this to avoid disturbing test servers you may be using for manual testing.) Typically a directory under the instance directory will be named something like instance/web.8080, indicating that this is an instance of src/server.web (see below) that runs on port 8080. This number need not be a port number; in particular, daemons that don’t listen on a port may just use 0, particularly on production servers.
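The naming convention can be illustrated with a short sketch. The function `parse_instance_name` is hypothetical, and the assumption that the name always ends in a dot followed by digits is ours, based on the web.8080 example above.

```ruby
# Hypothetical helper: split an instance directory name such as "web.8080"
# into the server name and the trailing number, and derive the matching
# source directory (src/server.web in this example).
def parse_instance_name(dirname)
  m = /\A(?<server>.+)\.(?<number>\d+)\z/.match(File.basename(dirname))
  raise ArgumentError, "not an instance name: #{dirname}" unless m
  {
    server: m[:server],
    number: m[:number].to_i, # often a port, but may simply be 0
    source: File.join('src', "server.#{m[:server]}")
  }
end
```

Note that the number is returned as-is; whether it denotes a port is something only the server's own configuration can say.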

Each server instance directory, as we call things such as instance/web.8080, has a fairly standard layout, though particular projects may modify this as they see fit. This layout is designed to partition carefully what data need to be backed up and restored when recreating a server instance (e.g., after loss of a host on which the server instance was running).

When an instance directory is initially created, the owner and group (which will default to the user running the program) will be given read/write permissions, and others will be given no permissions. This default avoids exposure of sensitive data, control sockets, and the like, so long as you ensure that the user’s primary group is not shared with unrelated users. However, the conf directory under the instance directory, since it often contains especially sensitive information (e.g., passwords), will always have ‘other’ permissions removed whenever ‘server setup’ is run, which includes every time the server is started or stopped.
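In terms of Unix mode bits, the behaviour described above might be sketched as follows. Both function names are hypothetical, and the concrete mode 0770 is our assumption: the text says only that owner and group get read/write access and others get none.

```ruby
require 'fileutils'

# Hypothetical sketch: create an instance directory with full access for
# owner and group and no access for others. Mode 0770 is an assumption.
def create_instance_dir(path)
  FileUtils.mkdir_p(path)
  File.chmod(0o770, path)
  path
end

# Remove every 'other' permission bit from a path, leaving the owner and
# group bits exactly as they were.
def strip_other_permissions(path)
  mode = File.stat(path).mode & 0o7777
  File.chmod(mode & ~0o007, path)
end
```

Calling something like `strip_other_permissions` on the conf directory on every setup, start and stop means that a permission change made by hand in between is corrected the next time the server is touched.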

In a particular subdirectory of the instance directory you will normally find:

* `db`: A directory for "permanent" information generated by the
server, such as the users who have registered at a web site. In a
production environment, this directory should always be backed up,
and restored when recreating the server instance.

* `conf`: Locally modifiable configuration files. As with the `db`
directory, this directory should always be backed up, and restored
when recreating the server instance. See below for further details
on how these files are set up. Note that this directory is always
reset to have no permissions for "other"; see above.

* `conf/qserver.conf`: The configuration file for QAM's server
configuration/startup/shutdown program. This is used to determine
which server (e.g., lighttpd or apache) to run, and configure
the generation of the files under `run/genconf`. For the latter,
a typical entry might indicate whether the server should be
password-protected or not, or set up DBMS server passwords.

* `log`: For storage of log and other files. This would normally be
backed up (http server logs, for example, are valuable) but it is
not necessary to restore it in order to restore a server after a
disaster.

* `run`: Transient files, such as PID files, that need not be
backed up.

* `run/genconf`: Configuration files generated by the server
configure/start/stop system. The server startup script automatically
generates these from data in the `conf/qserver.conf` file and
templates in `release/libdata/server/{name}/genconf-template` (which
in turn is copied from `src/server.{name}/genconf-template`). These
files are regenerated every time, and as with anything else under
`run`, need not be backed up.
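The backup and restore rules in the list above can be summarised as a small table. This is a sketch for illustration only; `BACKUP_POLICY` and the two functions are hypothetical names, and a real backup script would of course be project-specific.

```ruby
# Hypothetical summary of the list above: which top-level subdirectories
# of a server instance directory to back up, and which must be restored
# when recreating the instance.
BACKUP_POLICY = {
  'db'   => { backup: true,  restore: true  },
  'conf' => { backup: true,  restore: true  },
  'log'  => { backup: true,  restore: false }, # valuable, but not needed to restore
  'run'  => { backup: false, restore: false }  # transient; includes run/genconf
}.freeze

# Paths under the given instance directory that should be backed up.
def dirs_to_back_up(instance_dir)
  BACKUP_POLICY.select { |_, policy| policy[:backup] }
               .keys.map { |dir| File.join(instance_dir, dir) }
end

# Paths that must be restored to bring the instance back after a disaster.
def dirs_to_restore(instance_dir)
  BACKUP_POLICY.select { |_, policy| policy[:restore] }
               .keys.map { |dir| File.join(instance_dir, dir) }
end
```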

The files in the conf directory are rather special in that they are installed from default versions when not present, but are never modified or overwritten by the system thereafter. The default versions are installed from src/server.{name}/conf-default into release/libdata/server/{name}/conf-default. The default versions must be the correct configuration of the system for use in automated tests; the automated test system relies on this property.
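The "install if absent, never overwrite" rule can be sketched in a few lines. This is not QAM's implementation; `install_missing_conf` is a hypothetical name, and the sketch assumes the default directory contains plain files only.

```ruby
require 'fileutils'

# Hypothetical sketch: copy each default configuration file into the
# instance's conf directory only if no file of that name exists there yet.
# Existing files are never touched. Returns the names that were installed.
def install_missing_conf(default_dir, conf_dir)
  FileUtils.mkdir_p(conf_dir)
  Dir.children(default_dir).sort.select do |name|
    source = File.join(default_dir, name)
    target = File.join(conf_dir, name)
    next false if File.exist?(target) || !File.file?(source)
    FileUtils.cp(source, target)
    true
  end
end
```

Because nothing is ever overwritten, running this repeatedly is harmless: a second run installs nothing and leaves any local changes in place.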

When these files have been modified by an administrator (say, because they are to be used in production, and so password protection should be disabled), the server config/start/stop system will inform the administrator of differences between the default configuration and the actual configuration whenever the server is configured or started. This helps the sysadmin keep track of how the server is configured, and also reveals when configuration parameters have been added or changed in new versions of the software.
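A very rough sketch of such a comparison follows. It works at the level of whole files, which is our simplification: the real system may well report differences line by line. The name `conf_differences` and the status symbols are hypothetical, and both directories are assumed to contain plain files only.

```ruby
# Hypothetical sketch: compare the default configuration directory with an
# instance's conf directory and report, per file name, how they differ.
#   :modified   - present in both, but contents differ
#   :missing    - present in the defaults only
#   :local_only - present in the instance only
# Files that are identical in both places are omitted from the result.
def conf_differences(default_dir, conf_dir)
  names = (Dir.children(default_dir) | Dir.children(conf_dir)).sort
  names.each_with_object({}) do |name, report|
    default = File.join(default_dir, name)
    actual  = File.join(conf_dir, name)
    if !File.exist?(actual)
      report[name] = :missing
    elsif !File.exist?(default)
      report[name] = :local_only
    elsif File.read(default) != File.read(actual)
      report[name] = :modified
    end
  end
end
```

An empty result means the instance is running with exactly the default (that is, test) configuration.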

The Build Directory

The build directory is for storage of intermediate files used during compilation. These may be used by developers for debugging and other purposes [1] but should never be used by anything expected to be running in a production system (see the principle under The Release Directory section above).

Generated files should never be stored under any directories but release, instance and build; removal of these three directories followed by a rebuild should result in a clean rebuild of every part of the system.
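The clean-up half of that statement is simple enough to sketch. `GENERATED_DIRS` and `remove_generated` are hypothetical names; the rebuild itself is left to the project's own top-level scripts.

```ruby
require 'fileutils'

# The three directories that hold generated files, per the text above.
GENERATED_DIRS = %w[release instance build].freeze

# Hypothetical helper: remove all generated directories under the base
# directory, so that the next build starts from source only.
# CAUTION: instance/ also holds db and conf data for server instances, so
# on anything but a throwaway checkout, back those up first.
def remove_generated(base)
  GENERATED_DIRS.each { |dir| FileUtils.rm_rf(File.join(base, dir)) }
end
```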

The Extsrc Directory

The extsrc directory is used for (generally unmodified) software packages and libraries that the project uses. Software from the lib, bin, mod, haskell, perl, python, ruby and perhaps other subdirectories will be built and installed in release before going on to building the project itself. Examples might be lib/libpng, bin/lighttpd, mod/nagiosplugins, haskell/QuickCheck, and ruby/maruku.
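To make the layout concrete, the following sketch lists the packages found under extsrc in the form used in the examples above (lib/libpng and so on). The function `extsrc_packages` is hypothetical, and walking the categories in the order the text names them is our assumption; the text does not say in what order QAM actually builds them.

```ruby
# Subdirectories of extsrc named in the text; other projects may have more.
EXTSRC_CATEGORIES = %w[lib bin mod haskell perl python ruby].freeze

# Hypothetical helper: return package paths such as "lib/libpng", relative
# to the extsrc directory, skipping categories that do not exist.
def extsrc_packages(extsrc_dir)
  EXTSRC_CATEGORIES.flat_map do |category|
    dir = File.join(extsrc_dir, category)
    next [] unless File.directory?(dir)
    Dir.children(dir).sort
       .select { |name| File.directory?(File.join(dir, name)) }
       .map { |name| File.join(category, name) }
  end
end
```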

This is intended in particular to minimize configuration of a host and disruption of other applications running on the host. For example, by including Apache Httpd, PHP and some non-standard PHP libraries in extsrc, one avoids having to install any of that on a new host one wishes to use for development or production, and the project can co-exist with any existing systems using different versions and configurations, such as a legacy web system using Apache Httpd 1.x and PHP with a customized system-wide configuration file.

What you do and don’t choose to put into extsrc is a matter of judgement and the particular situations in which you find yourself deploying your project.

In the future we intend to implement a mechanism that can check the versions and configurations of software already installed by the operating system’s native package mechanism (or just installed by hand in /usr/local) and use those if available and appropriate, thus decreasing build time.

The Src Directory

The src directory is where the project itself lives, with various modules each having their own subdirectories under src. This is also frequently used for modified versions of open source or other applications and libraries. src/qam contains the copy of QAM used by the project, and is updated with the qu tool. The update procedure has not yet been written up.

Not Covered Yet

This document does not yet cover:

* The systems for starting and stopping servers.
* The automated test framework.
* Probably a bunch of other stuff.

You can contact cjs@starling-software.com if you have comments or suggestions.


  1. For example, the wrappers for the Glasgow Haskell Compiler’s runghc and ghci programs will use compiled versions of files from this directory if they are available and are not older than the source files from which they were presumably generated.
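The freshness rule in this note (use the compiled file if it exists and is not older than its source) can be sketched as a comparison of modification times. `use_compiled?` is a hypothetical name; the actual wrappers may apply additional checks.

```ruby
# Hypothetical helper: decide whether a compiled file under build/ may be
# used in place of its source. It qualifies only if it exists and its
# modification time is not earlier than that of the source file.
def use_compiled?(source, compiled)
  File.exist?(compiled) && File.mtime(compiled) >= File.mtime(source)
end
```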