Typical use of Paver or Fabric to automate repeating tasks

by (September 26, 2016)

Posted in Tools  Tags:Paver, Fabric

Why it's important

The core idea is to save the time a developer would otherwise spend doing routine operations by hand. It is hard to imagine a C/C++ developer working without Makefiles, yet Python, Ruby, and JavaScript developers working without any automation is pretty common among the teams we’ve seen.

This is where tools like Paver and Fabric find their place and shine. The bigger your team, the more reasons there are to automate. Take the task of obtaining the latest database as an example. Here is how it is usually done:

  1. Find out where the latest database dump is stored.
  2. Download it to your computer.
  3. Unpack it, then drop and re-create the database.
  4. Restore the database.

Let's assume this takes 20 minutes on average (assuming everyone has already been trained to do it, which is often not the case). If your team consists of 5 developers, that adds up to 100 minutes of developer time on every day a database refresh is needed for development.

Let's compare the four manual steps above with this simple Fabric command:

fab get_db import_db

which does not force a context switch and keeps things running smoothly.

What are typical tasks

Usually the following operations are automated:

  • setup - Various project setup tasks (such as creating a virtualenv for a Python project)
  • run - Running the ‘development’ server
  • get_db - Getting the latest development database dump (and sometimes other files)
  • import_db - Importing or re-importing the latest downloaded database dump
  • test - Running the automated test suite locally
  • quality - Running code quality checks
  • docs - Building documentation
  • … and many other tasks that suit your project.

Typical implementation

Let's examine how this works using a typical Django project as an example.

Initial configuration and common functions

Before writing the tasks themselves, we should define a set of common variables and functions to be used throughout the automation file.

Fabric:

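import os

from fabric.api import local, cd, run, get, env, lcd, task

# env.hosts is the list of remote hosts Fabric connects to
env.hosts = ['some.host.com']
env.remote_dir = 'my_project'
# Assumption: the fabfile lives in the project root, next to manage.py
env.local_project_root = os.path.dirname(os.path.abspath(__file__))


@task
def local_manage(runstring):
    """
    Run a Django management command locally
    """
    with lcd(env.local_project_root):
        local('./manage.py %s' % runstring, capture=False)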

Some notes on the Fabric syntax:

  • @task - a decorator that marks the function as a Fabric task. In a fabfile that uses @task, functions without this decorator are not available as fab commands.
  • lcd - a context manager that changes the local working directory. Fabric's ‘cd’ does the same on the remote host.
  • local - runs a command on the local machine.
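
To illustrate the idea behind @task, here is a minimal pure-Python sketch of a decorator that registers functions as named commands and runs them in the order given, much like `fab get_db import_db`. It is not how Fabric is implemented; `TASKS`, `task`, `run_tasks` and the task bodies are hypothetical and exist only for illustration.

```python
# Illustrative only: a tiny registry that mimics the idea of @task.
TASKS = {}


def task(func):
    """Register func as a command under its own name."""
    TASKS[func.__name__] = func
    return func


@task
def get_db():
    return 'downloaded dump'


@task
def import_db():
    return 'imported dump'


def _helper():
    # Not decorated, so it is never exposed as a command.
    return 'internal'


def run_tasks(names):
    """Run the named tasks in order and collect their results."""
    results = []
    for name in names:
        if name not in TASKS:
            raise KeyError('unknown task: %s' % name)
        results.append(TASKS[name]())
    return results
```

Calling `run_tasks(['get_db', 'import_db'])` runs both registered functions in sequence, while asking for `_helper` raises an error because it was never registered.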

Paver:

from paver.easy import task, consume_args, options, sh, path, Bunch, needs, no_help
from paver.virtual import virtualenv
import paver.doctools

options(
    dict(
        virtualenv_dir='env',
        project_name='my_project',
        install_path=path(__file__).abspath().dirname(),
        default_port=9001,
        sphinx=Bunch(builddir="_build"),
    )
)


def cmd(*args):
    """
    Join arguments into a single command string.
    Assumed project helper, not part of Paver.
    """
    return ' '.join(str(arg) for arg in args)


def call_manage(*args):
    """
    A convenient way to call any of the Django manage.py commands
    """
    manage_py = path.joinpath(options.install_path, options.project_name, 'manage.py')
    try:
        sh(cmd('python', manage_py, *args))
    except KeyboardInterrupt:
        # Ctrl-C is the normal way to stop a long-running command
        pass

Runserver command

This runs the Django development server. It usually makes sense to give each project in your organisation its own port, so that running several projects on the same machine does not cause port clashes.

Fabric:

@task
def runserver(server_address='127.0.0.1', port='9001'):
    """
    Django development runserver
    """
    local_manage('runserver %s:%s' % (server_address, port))

Paver:

@task
@virtualenv(dir=options.virtualenv_dir)
@consume_args
@needs('setup_local_settings')
def runserver(args):
    """
    Passes all arguments to manage.py runserver command. If there
    are no arguments: server is started on localhost:default_port
    """
    command = cmd('runserver', *args if args else ['localhost:{}'.format(options.default_port)])
    call_manage(command)

This task first creates local settings if there are none, then runs the server on the default port unless other arguments are given.
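
The `setup_local_settings` task that `runserver` depends on is not shown here. As a rough idea of what such a step might do, the sketch below copies a template into place only when no local settings file exists. The function name `ensure_local_settings` and the file names `local_settings.py` and `local_settings.py.example` are assumptions made for illustration, not taken from the project.

```python
import os
import shutil


def ensure_local_settings(project_dir,
                          target='local_settings.py',
                          template='local_settings.py.example'):
    """Create local settings from a template unless they already exist.

    Returns True if the file was created, False if it was left untouched.
    """
    target_path = os.path.join(project_dir, target)
    if os.path.exists(target_path):
        return False  # never overwrite a developer's own settings
    shutil.copyfile(os.path.join(project_dir, template), target_path)
    return True
```

Because the function does nothing when the file is already present, it is safe to run before every `runserver` call.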

Getting latest development database copy

How you get and install a database dump largely depends on:

  • Where backups are stored and how developers fetch them. Usually it is SSH or S3 access to a folder with recent database dumps. Sometimes these are plain dumps of the production database; sometimes security-sensitive data (customer emails, credit card numbers, etc.) is stripped out.
  • Database type. If it's SQLite3, getting the file locally is all you need to do. With other databases you will need to run the dump/restore tools from your database toolbox.

Fabric example for SQLite database:

@task
def get_db():
    """
    Getting database dump
    """
    with lcd(env.local_project_root):
        get('%s/%s' % (env.project_path, env.project_name),
            env.local_project_root)

For Paver we usually write a custom Django management command, which takes S3 credentials and downloads the database dump from an Amazon S3 bucket:

@task
@virtualenv(dir=options.virtualenv_dir)
@consume_args
def get_pgdump(args):
    """
    Getting database dump from S3 using custom project management command
    """
    call_manage(cmd('get_pgdump', *args))

Installing database

SQLite needs no specific installation steps. For MySQL and PostgreSQL you can either pipe the dump into a Django management command or load it directly with a custom project command. The second method is usually better, since Django has all the information on database credentials, but for MySQL the simple method also works well, like this:

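import os  # belongs with the other imports at the top of the fabfile


def _sqldumps_path(filename=None):
    """
    Return the dumps path; as a side effect, create the directory
    if it does not exist
    """
    path = os.path.normpath(os.path.join(env.local_project_root, '../dumps/sql/'))
    if not os.path.exists(path):
        os.makedirs(path)
    if filename:
        return os.path.join(path, filename)
    return path


@task
def get_db():
    """
    Download the latest database dump from the remote server
    """
    # ls -tr lists oldest first, so the last line is the newest dump
    filename = run('ls -1tr {backup}/sql/*.bz2|tail -n1'.format(
        backup=env.remote_dir))
    get(filename, _sqldumps_path())


# Assumption: placeholder SQL, adjust the database name to your project
dropdb_cmd = 'DROP DATABASE IF EXISTS my_project; CREATE DATABASE my_project;'


@task
def import_db(filename=None):
    """
    Import the given dump, or the most recently downloaded one.
    Assumes a dbshell task that wraps manage.py dbshell.
    """
    if filename:
        import_dump = filename
    else:
        import_dump = local('ls -1tr %s/*.sql.bz2|tail -n1' %
                            _sqldumps_path(), capture=True)
    local('echo "{dropcreate}"|fab dbshell && bzcat {dump}|fab dbshell'.format(
        dropcreate=dropdb_cmd,
        dump=import_dump),
          shell='/bin/bash')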
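
The `ls -1tr ... | tail -n1` pipeline is there to pick the most recently modified dump. The same selection can be made in plain Python, which avoids parsing shell output. This is only a sketch: `latest_dump` is a hypothetical helper name, not part of Fabric, and it assumes that file modification time is the right notion of "latest" for your dumps.

```python
import glob
import os


def latest_dump(directory, pattern='*.sql.bz2'):
    """Return the most recently modified dump in directory, or None."""
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None
    # max() with getmtime as the key picks the newest file
    return max(candidates, key=os.path.getmtime)
```

Returning None for an empty directory lets the calling task report a clear "no dump downloaded yet" message instead of passing an empty file name to bzcat.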

Running tests

Running tests is usually a pretty simple task, but not always. Sometimes you need to run tests only for certain named applications, or use specific testing tools to run the test suite. So the Fabric or Paver task is normally defined under this name, but its content can differ considerably:

@task
def tests(what=''):
    """Run the test suite, optionally only for the named apps"""
    local_manage('test %s' % what)

What tool to use

We have only briefly described Fabric and Paver, which are mostly used for Python development. Other language platforms have their own build and configuration tools.

Fabric                                | Paver
--------------------------------------+--------------------------------------
Easy to start                         | More complex
Running commands on several hosts     | -
Difficult to implement complex logic  | Easy to implement complex logic: Paver provides mechanisms for sharing options, task dependencies and much more.

Fabric provides tools to manage several remote hosts and commands to upload and download data, while Paver is stronger at task dependencies and modules.
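
As an illustration of what task dependencies mean in practice, the pure-Python sketch below runs each task's prerequisites first and runs every task at most once. It only mimics the idea behind Paver's @needs and is not Paver's implementation; the names `needs`, `run`, and the example tasks are hypothetical.

```python
# Illustrative only: a dependency-aware task runner in plain Python.
TASKS = {}
DEPENDENCIES = {}


def needs(*names):
    """Register a task and declare the tasks that must run before it."""
    def decorator(func):
        TASKS[func.__name__] = func
        DEPENDENCIES[func.__name__] = list(names)
        return func
    return decorator


def run(name, done=None, log=None):
    """Run `name` after its dependencies; each task runs at most once."""
    done = set() if done is None else done
    log = [] if log is None else log
    if name in done:
        return log
    done.add(name)  # mark first so repeated or circular references are skipped
    for dep in DEPENDENCIES.get(name, []):
        run(dep, done, log)
    log.append(TASKS[name]())
    return log


@needs()
def setup():
    return 'setup'


@needs('setup')
def get_db():
    return 'get_db'


@needs('setup', 'get_db')
def import_db():
    return 'import_db'
```

Here `run('import_db')` executes `setup`, then `get_db`, then `import_db`, even though `setup` is listed as a dependency twice.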
