Taskgraph

Taskgraph is a Python library to generate DAGs of tasks for Taskcluster, the task execution framework which underpins Mozilla’s CI. The nodes of the DAG represent tasks, while the edges represent the dependencies between them.
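To make the graph structure concrete, here is a minimal sketch in plain Python that does not use Taskgraph's own API. The task labels are hypothetical. Each label maps to the set of labels it depends on, and the standard-library graphlib module produces an order in which every task comes after its dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical task labels, for illustration only.
# Each key is a task (node); each value is the set of tasks it depends on (edges).
dependencies = {
    "build-docker-image": set(),
    "build": {"build-docker-image"},
    "test": {"build"},
    "lint": {"build-docker-image"},
}


def schedule_order(deps):
    """Return the labels in an order where dependencies always come first."""
    return list(TopologicalSorter(deps).static_order())


order = schedule_order(dependencies)
print(order)
```

Several valid orders exist for this graph; the only guarantee is that a task never appears before something it depends on.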

Taskgraph is designed to scale to any level of complexity, from a handful of tasks to the more than 30,000 tasks (and counting) that make up Firefox’s CI.

Installation

Taskgraph is on PyPI and can be installed via:

pip install taskcluster-taskgraph

This provides the taskgraph binary. See taskgraph --help for available commands.

Alternatively, you can install it by cloning the repo. This is useful if you need to test against a specific revision:

git clone https://github.com/taskcluster/taskgraph
cd taskgraph
python setup.py develop

Getting Started

Once installed, you can quickly bootstrap a new Taskgraph setup by running the following command within an existing repository:

taskgraph init

Warning

Taskgraph currently only supports repositories hosted on GitHub or hg.mozilla.org.

You should notice a couple of changes:

  1. A new file called .taskcluster.yml. This file is rendered via JSON-e with context passed in from a GitHub webhook event (if your repository is hosted on GitHub). The rendered result contains a task definition for a special task called the Decision Task.

  2. A new directory called taskcluster. This contains the kind definitions and transform files that will define your tasks.

In short, the Decision Task will invoke taskgraph in your repository. This command will then process the taskcluster directory, turn it into a graph of tasks, and submit them all to Taskcluster. But you can also test out the taskgraph command locally! From the root of your repo, try running:

taskgraph full

You’ll notice that taskgraph init has created a couple of tasks for us already, namely build-docker-image-linux and hello-world.

Note

By default the taskgraph command will only output task labels. Try adding --json to the command to see the actual definitions.
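Once you have the JSON output, a few lines of Python are enough to explore it. The sketch below assumes the output is a single JSON object keyed by task label, where each entry carries a dependencies mapping from edge name to the label it points at. The sample data is a hand-written stand-in, not real output, and summarize is a hypothetical helper.

```python
import json


def summarize(graph_json):
    """Map each task label to the sorted labels it depends on."""
    graph = json.loads(graph_json)
    return {
        label: sorted(task.get("dependencies", {}).values())
        for label, task in graph.items()
    }


# Hand-written stand-in for real output, trimmed to the fields used here.
# The field names and values are assumptions for illustration.
sample = json.dumps({
    "build-docker-image-linux": {"kind": "docker-image", "dependencies": {}},
    "hello-world": {
        "kind": "hello",
        "dependencies": {"docker-image": "build-docker-image-linux"},
    },
})

result = summarize(sample)
for label, deps in result.items():
    print(label, "<-", ", ".join(deps) or "(no dependencies)")
```

To try it on a real graph, save the output of taskgraph full --json to a file and pass the file's contents to summarize in place of sample.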

See if you can create a new task by editing taskcluster/kinds/hello/kind.yml, then re-run taskgraph full to verify.

How It Works

Taskgraph starts by loading kinds, which are logical groupings of similar tasks. Each kind is defined in a taskcluster/kinds/<kind name>/kind.yml file.
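The directory layout alone is enough to enumerate kinds. The following sketch illustrates that layout only. Taskgraph's real loader also parses each kind.yml and does considerably more, and find_kinds is a hypothetical helper, not part of Taskgraph.

```python
import tempfile
from pathlib import Path


def find_kinds(root):
    """Return {kind name: path to kind.yml} for each kind directory under root."""
    kinds_dir = Path(root) / "taskcluster" / "kinds"
    return {
        path.parent.name: path
        for path in sorted(kinds_dir.glob("*/kind.yml"))
    }


# Demonstrate against a throwaway directory tree rather than a real repository.
with tempfile.TemporaryDirectory() as repo:
    for name in ("docker-image", "hello"):
        kind_dir = Path(repo) / "taskcluster" / "kinds" / name
        kind_dir.mkdir(parents=True)
        (kind_dir / "kind.yml").write_text("# placeholder\n")
    kind_names = list(find_kinds(repo))

print(kind_names)
```

In a real repository you would call find_kinds(".") from the repository root.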

Once a kind has been loaded, Taskgraph passes each task defined therein through a series of transforms. These are functions that can modify tasks, or even split one task into many! Transforms are listed in the kind.yml file and are applied in order, one after the other. Along the way, many transform files define schemas, allowing task authors to easily reason about the state of a task as it marches towards its final definition.
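The idea of chaining transforms can be sketched with plain generator functions. This is a simplified illustration, not Taskgraph's implementation: real transforms are registered through Taskgraph's own API, and the names add_env, split_by_platform, and run_transforms, as well as the task fields, are hypothetical. Each function takes a config and an iterable of tasks and yields tasks, so one input task may become several.

```python
import copy


def add_env(config, tasks):
    """Modify each task: set a default environment variable."""
    for task in tasks:
        task.setdefault("env", {})["CI"] = "1"
        yield task


def split_by_platform(config, tasks):
    """Split each task into one task per entry in its 'platforms' list."""
    for task in tasks:
        platforms = task.pop("platforms", [])
        if not platforms:
            yield task
            continue
        for platform in platforms:
            # Deep copy so the resulting tasks do not share nested dicts.
            new_task = copy.deepcopy(task)
            new_task["name"] = f"{task['name']}-{platform}"
            new_task["platform"] = platform
            yield new_task


def run_transforms(transforms, config, tasks):
    """Apply each transform in order, feeding its output to the next one."""
    for transform in transforms:
        tasks = transform(config, tasks)
    return list(tasks)


tasks = [{"name": "hello", "platforms": ["linux", "macos"]}]
result = run_transforms([add_env, split_by_platform], {}, tasks)
for task in result:
    print(task)
```

Because each stage is a generator, tasks flow through the chain one at a time, and the order of the list passed to run_transforms determines the order in which changes are applied.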

Transforms can be defined within a project (like the hello.py transforms the init command created for us), or they can live in an external module (like Taskgraph itself). By convention, most tasks in Taskgraph end with the transforms defined at taskgraph.transforms.task, or the “task” transforms. These special transforms perform the final changes necessary to format tasks according to Taskcluster’s task definition schema.

Taskgraph’s combination of static configuration with logic layered on top allows a project’s CI to grow to arbitrary complexity.

Next Steps and Further Reading

After you have a working Taskgraph setup, you’ll still need to integrate it with Taskcluster. See Configuring your Project for more details.
