Developers:BuildBot
Generalities
BuildBot is a Python program that automates software builds and tests. It operates a server that triggers jobs on a number of slaves. These slaves may run on the same machine as the server or on different ones, which makes it possible to compile and check on different architectures and operating systems.
We now have BuildBot running on www.tddft.org. Currently, it is configured for two cases:
- It tracks trunk of the Subversion repository and on each commit, it triggers several slaves to do a quick configure and compilation (without optimizations) of the code. This way, we get an instant response if a commit broke something in a different environment.
- Each night, the code is built in different settings (with a lot of debugging turned on) and the testsuite is run.
For a failing build, an e-mail is sent to octopus-notify@tddft.org, as with the old nightly builds. This e-mail contains a link to the output of all commands run by the build. In addition, there is a so-called waterfall page in our Trac that lists all BuildBot activities and can be used to look at the build and test results.
Installation of BuildBot
To install BuildBot the packages
and Python are required, at least version 2.3.
Either install this software via your distribution's package system or simply by hand: After unpacking run
$ python ./setup.py bdist
$ python ./setup.py install --prefix=DIR
in the package's source directory. This installs the files in DIR in a hierarchy like the one found below /usr/lib/python. Note that it might be necessary to set the PYTHONPATH environment variable to DIR/lib/pythonX.Y/site-packages (no slash at the end!) so that Python can find the libraries.
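To avoid typos in the PYTHONPATH value, the path can be assembled by a few lines of Python. This is only a sketch: the function name site_packages_dir and the prefix /opt/buildbot are made up for illustration, and some distributions use a different layout (for example lib64 or dist-packages), so check what the install step actually created.

```python
import sys


def site_packages_dir(prefix, version=None):
    """Return PREFIX/lib/pythonX.Y/site-packages without a trailing slash."""
    if version is None:
        # Default to the version of the interpreter running this script.
        version = "%d.%d" % sys.version_info[:2]
    # Strip trailing slashes from the prefix so the result has no double slash.
    return "%s/lib/python%s/site-packages" % (prefix.rstrip("/"), version)


# Print a line that can be pasted into a shell startup file.
# "/opt/buildbot" is a hypothetical value for DIR.
print("export PYTHONPATH=" + site_packages_dir("/opt/buildbot"))
```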
Setup of the BuildBot master
The BuildBot master is installed in /home/buildbot/python on www.tddft.org and runs under the account of the buildbot user. It is started by cron via a @reboot line in this user's crontab. Its configuration and data files are in /server/www/tddft.org/programs/octopus/buildbot-master so that all developers have access to the configuration and can add new slaves (see below).
I had to hack the Trac plugin (http://www.trac-hacks.org/attachment/ticket/339/TracBB-0.1.2.tar.gz) a little bit to make it work. There are still some glitches, but for simply browsing the waterfall it is OK. I also added a bbheader.cs Trac template that includes the CSS file buildbot.cs to tune the appearance, and changed tracbb.cs in the plugin's source to use the new bbheader.cs. The modified sources are in the buildbot user's home directory.
Setup of a BuildBot slave
There are two steps necessary to set up a new slave:
- install BuildBot on the machine in question, and
- add the appropriate entries to the master configuration.
Installing a slave
Run the steps to install BuildBot as described above. Then run
$ DIR/bin/buildbot create-slave BASEDIR MASTER NAME PASSWORD
where BASEDIR is the directory in which all builds of this slave take place, MASTER is www.tddft.org:9989 in our case, and NAME and PASSWORD are taken from the master configuration (see below).
Now insert your name and e-mail address into BASEDIR/info/admin and some machine info, e.g. uname -a output, into BASEDIR/info/host.
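If you prefer to script this step, the two info files can be written from Python. The following is a sketch under assumptions: the helper write_slave_info and the example admin string are hypothetical, and platform.uname() is used as a stand-in for the uname -a output. The demo writes into a temporary directory rather than a real BASEDIR.

```python
import os
import platform
import tempfile


def write_slave_info(basedir, admin, host_info=None):
    """Write BASEDIR/info/admin and BASEDIR/info/host."""
    info_dir = os.path.join(basedir, "info")
    os.makedirs(info_dir, exist_ok=True)
    if host_info is None:
        # Roughly the information that "uname -a" prints.
        host_info = " ".join(platform.uname())
    with open(os.path.join(info_dir, "admin"), "w") as handle:
        handle.write(admin + "\n")
    with open(os.path.join(info_dir, "host"), "w") as handle:
        handle.write(host_info + "\n")


# Demo in a throwaway directory; replace demo_dir by your real BASEDIR.
demo_dir = tempfile.mkdtemp()
write_slave_info(demo_dir, "Jane Doe <jane@example.org>")
```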
As the last step, you have to start the slave:
$ DIR/bin/buildbot start BASEDIR
If you want to be sure that it is restarted when the machine reboots, the simplest thing is a crontab entry like
@reboot DIR/bin/buildbot start BASEDIR
but this one only works if you have Vixie cron running, usually the case on Linux distributions but commonly not on System V (check out man 5 crontab).
Please also note that the environment set up by cron might be very different from your login environment. On the slaves that I have set up, I had to include the line
PYTHONPATH=DIR/lib/python2.4/site-packages
in the crontab.
Registering a slave with the master
The entire configuration of the BuildBot master is done via the file master.cfg that lives in /server/www/tddft.org/programs/octopus/buildbot-master/. It is a Python file, so for those of you that unlike me know this language, it will be easy to understand what's going on.
The following steps have to be undertaken:
Add a slave entry to the c['bots'] list. The list looks like this at the moment:
c['bots'] = [("octopus", "secret1"), # AMD Opteron, Fedora
("pepita", "secret2"), # Sparc, Solaris
("g22", "secret3") # i386, Debian
]
After adding your new slave, it might look like
c['bots'] = [("octopus", "secret1"), # AMD Opteron, Fedora
("pepita", "secret2"), # Sparc, Solaris
("g22", "secret3"), # i386, Debian
("wheel", "secret4") # Alpha, OSF1
]
with wheel being the new slave's name and secret4 the password you have to give on the create-slave command line (see above). The name of the slave should not contain dots (at least I had trouble with that) and thus should not be the fully qualified domain name but something else which is also unique.
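A quick sanity check of the slave list before saving the file might look like the sketch below. The function check_bots is not part of BuildBot or of our master.cfg; it only encodes the two rules mentioned above (no dots in the name, names unique).

```python
def check_bots(bots):
    """Return a list of problems found in a c['bots']-style list."""
    problems = []
    seen = set()
    for name, password in bots:
        if "." in name:
            problems.append("slave name %r contains a dot" % name)
        if name in seen:
            problems.append("slave name %r is listed twice" % name)
        seen.add(name)
    return problems


bots = [("octopus", "secret1"),
        ("pepita", "secret2"),
        ("g22", "secret3"),
        ("wheel", "secret4")]

# An empty list means that no problems were found.
print(check_bots(bots))
```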
Next, you have to add a builder that knows how to compile (and test if you want) the code on the new slave. Go to the HOW TO BUILD THE CODE section of the file and add an entry like
x86_64_gfortran_build = octopusVpathBuild(
var={"FC" : "/home/lorenzen/opt/gcc-4.2.0/bin/gfortran",
"FCFLAGS" : "-Wall -I/home/lorenzen/opt/gfortran-4.2.0/netcdf/include"},
flags=["--with-blas=-lblas -L/home/lorenzen/opt/gfortran-4.2.0/blas/lib",
"--with-lapack=-llapack -L/home/lorenzen/opt/gfortran-4.2.0/lapack/lib",
"--with-fft-lib=-lfftw3 -L/home/lorenzen/opt/gfortran-4.2.0/fftw/lib",
"--with-sparskit=-lskit -L/home/lorenzen/opt/gfortran-4.2.0/sparskit/lib",
"--with-arpack=-larpack -L/home/lorenzen/opt/gfortran-4.2.0/arpack/lib",
"--with-netcdf=-lnetcdf -L/home/lorenzen/opt/gfortran-4.2.0/netcdf/lib",
"--with-gsl-prefix=/home/lorenzen/opt/gsl",
"--disable-gdlib"])
This creates a builder named x86_64_gfortran_build. octopusVpathBuild (actually just a Python function defined above the current section) creates a builder that uses the VPATH capabilities of make to find more bugs in the build system. The nomenclature for builders is arch_compiler_options_type with
- arch being the target architecture like sparc, i386, ppc,
- compiler the Fortran 90 compiler used like gfortran, ifort, nag, pgi, g95, sunf90,
- options special build characteristics like mpich2, and
- type the kind of build: build, test, and full for the moment. build only compiles and links the code, test additionally runs the short testsuite, full the long one.
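To illustrate the naming scheme, here is a small sketch that splits a builder name into its four parts. It is not code from master.cfg; parse_builder_name is a hypothetical helper, and it relies on the compiler and type names listed above to find the boundaries, because the architecture itself may contain underscores (x86_64).

```python
KNOWN_COMPILERS = ("gfortran", "ifort", "nag", "pgi", "g95", "sunf90")
KNOWN_TYPES = ("build", "test", "full")


def parse_builder_name(name):
    """Split arch_compiler_options_type into a dictionary of its parts."""
    parts = name.split("_")
    if parts[-1] not in KNOWN_TYPES:
        raise ValueError("unknown build type in %r" % name)
    build_type = parts.pop()
    for index, part in enumerate(parts):
        if part in KNOWN_COMPILERS:
            return {
                # Everything before the compiler is the architecture.
                "arch": "_".join(parts[:index]),
                "compiler": part,
                # Everything after the compiler (possibly nothing) are options.
                "options": "_".join(parts[index + 1:]),
                "type": build_type,
            }
    raise ValueError("no known compiler in %r" % name)


print(parse_builder_name("x86_64_gfortran_mpich2_test"))
```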
The two keyword arguments var and flags are for the configure invocation:
Setting the arguments to
var={"A" : "a", "B" : "b"},
flags=["--with-bar=/usr/lib/bar",
"--without-foo"]
results in the configure line
$ A=a B=b ./configure --with-bar=/usr/lib/bar --without-foo
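How the two arguments map onto the command line can be sketched in a few lines of Python. This is not the actual octopusVpathBuild code: configure_command is a hypothetical function, and both the sorting of the variables and the shell quoting via shlex.quote are my assumptions about sensible behaviour, not a description of what our builder does.

```python
import shlex


def configure_command(var=None, flags=None):
    """Build a configure command line from environment variables and flags."""
    words = []
    # Environment assignments go in front of ./configure.
    for key, value in sorted((var or {}).items()):
        words.append("%s=%s" % (key, shlex.quote(value)))
    words.append("./configure")
    # Flags follow; quoting keeps flags with spaces together as one word.
    for flag in (flags or []):
        words.append(shlex.quote(flag))
    return " ".join(words)


print(configure_command(var={"A": "a", "B": "b"},
                        flags=["--with-bar=/usr/lib/bar", "--without-foo"]))
# prints: A=a B=b ./configure --with-bar=/usr/lib/bar --without-foo
```

Note that flags containing spaces, like the --with-blas entry in the builder above, come out wrapped in single quotes with this sketch.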
In order to run the testsuite, it might be sufficient to add the line
octopusAddTest(x86_64_gfortran_build)
This causes a make -C testsuite check to be issued after the compilation. For special builds and environments this might not be enough, e.g. MPI builds, which require additional setup.
For those cases, the testing step can be added by
x86_64_gfortran_mpich2_test.addStep(shell.ShellCommand,
description="testing",
descriptionDone="test",
command="cd _build && COMMAND")
with COMMAND being a shell command line that performs the required steps. The cd _build is needed to get into the build tree because a VPATH build is being performed. The exit code of COMMAND determines the test result. So, if you chain several commands with && or ;, make sure that the correct exit code is returned, e.g. by
command="cd _build && do_preparations && do_tests; err=$?; do_cleanup; exit $err"
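Why the exit code has to be stored in a variable before the cleanup runs can be tried out with a small sketch. It assumes a POSIX shell is available; false stands in for a failing test run and true for a successful cleanup, and the helper run is only there for this demonstration.

```python
import subprocess


def run(script):
    """Run a shell command line via /bin/sh and return its exit code."""
    return subprocess.call(script, shell=True)


# Here $? refers to the cleanup step, so the test failure is lost.
lost = run("false; true; exit $?")

# Saving the code first and exiting with it keeps the test failure.
kept = run("false; err=$?; true; exit $err")

print(lost, kept)
# prints: 0 1
```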
After specifying the build, you have to say in the AND WHERE section which slave shall perform the build.
Add an entry
bot_BUILDNAME = {'name' : "BUILDNAME",
'slavename' : "SLAVE",
'builddir' : "BUILDNAME",
'factory' : BUILDNAME}
for the new build.
If several slaves are able to do the build, instead of 'slavename' : "SLAVE" a list 'slavenames' : ["SLAVE1", "SLAVE2"] can be given. This allows for some load-balancing in the scheduler.
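If you add many builds, a small helper can produce these dictionaries and pick the right key depending on whether one slave or several are given. The helper make_builder_entry below is hypothetical and not part of our master.cfg; the placeholder factory object stands in for a real build such as x86_64_gfortran_build.

```python
def make_builder_entry(buildname, factory, slaves):
    """Return a builder dictionary for one slave (str) or several (list)."""
    entry = {"name": buildname,
             "builddir": buildname,
             "factory": factory}
    if isinstance(slaves, str):
        # A single slave name uses the 'slavename' key.
        entry["slavename"] = slaves
    else:
        # Several slaves use the 'slavenames' key with a list.
        entry["slavenames"] = list(slaves)
    return entry


# Placeholder for a real factory object from the HOW TO BUILD THE CODE section.
factory = object()

single = make_builder_entry("x86_64_gfortran_build", factory, "octopus")
shared = make_builder_entry("i386_gfortran_test", factory, ["g22", "wheel"])
print(sorted(single), sorted(shared))
```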
The bot_BUILDNAME has to be added to the c['builders'] list:
c['builders'] = [bot_x86_64_gfortran_build,
bot_x86_64_ifort_build,
bot_x86_64_gfortran_mpich2_build,
bot_x86_64_ifort_mpich2_build,
bot_sparc_sunf90_build,
bot_x86_64_ifort_test,
bot_x86_64_ifort_mpich2_test,
bot_i386_gfortran_test,
bot_BUILDNAME
]
As a last step, the build has to be added to one or more schedulers. Schedulers are defined in the SCHEDULERS section and are responsible for actually triggering the builds. Currently, there is the buildonly scheduler that triggers rather quick compilations on each commit, the nightly scheduler to which all BUILDNAME_tests go, and the every2days scheduler that runs the full builds four times a week. Simply add your build to the builderNames entry of the appropriate scheduler specification.
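Which scheduler a build belongs to usually follows from the type suffix of its name. The sketch below spells out that convention; the mapping and both functions are my reading of the description above, not code from master.cfg, and nothing stops you from adding a build to a different scheduler.

```python
# Assumed convention: the last part of the builder name selects the scheduler.
SCHEDULER_FOR_TYPE = {"build": "buildonly",
                      "test": "nightly",
                      "full": "every2days"}


def scheduler_for(builder_name):
    """Return the scheduler name suggested by the builder's type suffix."""
    build_type = builder_name.rsplit("_", 1)[-1]
    return SCHEDULER_FOR_TYPE[build_type]


def group_by_scheduler(builder_names):
    """Group builder names into candidate builderNames lists per scheduler."""
    groups = {}
    for name in builder_names:
        groups.setdefault(scheduler_for(name), []).append(name)
    return groups


print(group_by_scheduler(["x86_64_gfortran_build",
                          "sparc_sunf90_build",
                          "i386_gfortran_test"]))
```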
After saving the configuration file, it takes a little while (up to an hour, because of the @hourly crontab entry I have set) until the master rereads its configuration and the new build appears on the waterfall page. If it does not appear, have a look at the twistd.log file in the master's directory; perhaps you rendered the Python file invalid. In such cases, the master simply ignores the change in configuration and continues with the old one.
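To avoid waiting for the next reread only to find a typo, the file can be checked for syntax errors beforehand. The following sketch uses Python's built-in compile(); the function config_syntax_error is hypothetical, and the check only catches syntax errors, not mistakes like a misspelled builder name that show up when the file is executed.

```python
def config_syntax_error(source, filename="master.cfg"):
    """Return None if source is syntactically valid Python, else a message."""
    try:
        # Compile only; the configuration is not executed.
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return "%s:%s: %s" % (filename, exc.lineno, exc.msg)
    return None


good = "c = {}\nc['bots'] = [('octopus', 'secret1')]\n"
bad = "c = {}\nc['bots'] = [('octopus', 'secret1')\n"  # closing bracket missing

print(config_syntax_error(good))
print(config_syntax_error(bad))
```

For the real file, read its contents with open(...).read() and pass them to the function.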
Also add your new slave to the table below.
List of slaves
We currently have the following slaves running:
| Name | Architecture | OS | Location |
|---|---|---|---|
| g75 | i386 | Debian GNU/Linux | FU Berlin |
| aramis | x86_64 | Debian GNU/Linux | FU Berlin |
| corvo | x86_64 | Debian GNU/Linux | EHU San Sebastian |
| nowii | powerpc64 | openSUSE | EHU San Sebastian |

