Taskgraph
Taskgraph is a Python library to generate DAGs of tasks for Taskcluster, the task execution framework which underpins Mozilla’s CI. The nodes of the DAG represent tasks, while the edges represent the dependencies between them.
Taskgraph is designed to scale to any level of complexity, from a handful of tasks to the more than 30,000 tasks (and counting) that make up Firefox’s CI.
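To make the graph idea concrete, here is a minimal sketch of a task DAG using only Python's standard library. It does not use Taskgraph itself; the task labels and their dependencies are made up for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical task labels, each mapped to the labels it depends on.
# These names are illustrative and not taken from a real project.
dependencies = {
    "docker-image-linux": set(),
    "build": {"docker-image-linux"},
    "lint": {"docker-image-linux"},
    "test": {"build"},
}

# static_order() yields every task only after all of its dependencies,
# which is a valid order in which the tasks could be run.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Because the graph is acyclic, such an ordering always exists; a dependency cycle would make `TopologicalSorter` raise `graphlib.CycleError`.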
Installation
Taskgraph is on PyPI and can be installed via:
pip install taskcluster-taskgraph
This provides the taskgraph binary. See taskgraph --help for available commands.
Alternatively, you can install it by cloning the repo. This is useful if you need to test against a specific revision:
git clone https://github.com/taskcluster/taskgraph
cd taskgraph
python setup.py develop
Getting Started
Once installed, you can quickly bootstrap a new Taskgraph setup by running the following command within an existing repository:
taskgraph init
Warning
Taskgraph currently only supports repositories hosted on GitHub or hg.mozilla.org.
You should notice a couple of changes:
A new file called .taskcluster.yml. This file is rendered via JSON-e with context passed in from a GitHub webhook event (if your repository is hosted on GitHub). The rendered result contains a task definition for a special task called the Decision Task.
A new directory called taskcluster. This contains the kind definitions and transform files that will define your tasks.
In short, the Decision task will invoke taskgraph in your repository. This command will then process the taskcluster directory, turn it into a graph of tasks, and submit them all to Taskcluster. But you can also test out the taskgraph command locally! From the root of your repo, try running:
taskgraph full
You’ll notice that taskgraph init has created a couple of tasks for us already, namely docker-image-linux and hello-world.
Note
By default the taskgraph command will only output task labels. Try adding --json to the command to see the actual definitions.
See if you can create a new task by editing taskcluster/kinds/hello/kind.yml, and re-run taskgraph full to verify.
How It Works
Taskgraph starts by loading kinds, which are logical groupings of similar tasks. Each kind is defined in a taskcluster/kinds/<kind name>/kind.yml file.
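The directory convention above can be sketched in a few lines of Python. This is not Taskgraph's actual loader, only an illustration of the layout: the helper name `find_kinds`, the kind names, and the placeholder file contents are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def find_kinds(root):
    """Map each kind name to its kind.yml path under <root>/kinds.

    Hypothetical helper for illustration; not part of Taskgraph.
    """
    kinds_dir = Path(root) / "kinds"
    return {path.parent.name: path for path in sorted(kinds_dir.glob("*/kind.yml"))}


# Build a throwaway directory layout to demonstrate the convention.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("docker-image", "hello"):
        kind_dir = Path(tmp) / "kinds" / name
        kind_dir.mkdir(parents=True)
        # Placeholder content; a real kind.yml holds the kind's configuration.
        (kind_dir / "kind.yml").write_text("# placeholder\n")

    kind_names = sorted(find_kinds(tmp))
    print(kind_names)
```

The kind name is simply the name of the directory that contains the kind.yml file.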
Once a kind has been loaded, Taskgraph passes each task defined therein through a series of transforms. These are functions that can modify tasks, or even split one task into many! Transforms are defined in the kind.yml file and are applied in order, one after the other. Along the way, many transform files define schemas, allowing task authors to easily reason about the state of a task as it marches towards its final definition.
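The idea of chaining transforms can be sketched with plain Python generators. This is a simplified illustration rather than Taskgraph's implementation: the `(config, tasks)` generator shape, the `apply_transforms` helper, and the task fields (`name`, `platforms`, `description`) are assumptions chosen for the example.

```python
def split_by_platform(config, tasks):
    """Split one task into one task per platform (hypothetical field)."""
    for task in tasks:
        platforms = task.get("platforms")
        if not platforms:
            # Nothing to split; pass the task through unchanged.
            yield task
            continue
        for platform in platforms:
            new_task = {k: v for k, v in task.items() if k != "platforms"}
            new_task["name"] = f"{task['name']}-{platform}"
            new_task["platform"] = platform
            yield new_task


def add_description(config, tasks):
    """Modify each task by attaching a description."""
    for task in tasks:
        task["description"] = f"{config['prefix']}: {task['name']}"
        yield task


def apply_transforms(transforms, config, tasks):
    """Feed the output of each transform into the next, in order."""
    for transform in transforms:
        tasks = transform(config, tasks)
    return list(tasks)


config = {"prefix": "demo"}
tasks = [{"name": "hello", "platforms": ["linux", "macos"]}]
result = apply_transforms([split_by_platform, add_description], config, tasks)
for task in result:
    print(task)
```

Order matters here: because `split_by_platform` runs first, `add_description` sees two tasks and describes each one using its already-expanded name.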
Transforms can be defined within a project (like the hello.py transforms the init command created for us), or they can live in an external module (like Taskgraph itself). By convention, most tasks in Taskgraph end with the transforms defined at taskgraph.transforms.task, or the “task” transforms. These special transforms perform the final changes necessary to format them according to Taskcluster’s task definition schema.
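As a rough sketch of what a final, schema-enforcing transform might look like, the example below fills in a default and then checks each task against a tiny schema. It is an assumption-laden illustration: the schema, its field names, and the `validate` and `finalize` helpers are hypothetical and do not reflect Taskcluster's real task definition schema or the contents of taskgraph.transforms.task.

```python
# Hypothetical schema: required field name -> expected Python type.
TASK_SCHEMA = {"label": str, "description": str, "dependencies": dict}


def validate(task, schema):
    """Return the task, or raise ValueError listing schema problems."""
    problems = []
    for key, expected in schema.items():
        if key not in task:
            problems.append(f"missing {key!r}")
        elif not isinstance(task[key], expected):
            problems.append(f"{key!r} should be {expected.__name__}")
    if problems:
        raise ValueError("; ".join(problems))
    return task


def finalize(config, tasks):
    """Final transform: fill in defaults, then validate each task."""
    for task in tasks:
        task = {"dependencies": {}, **task}
        yield validate(task, TASK_SCHEMA)


good = list(finalize({}, [{"label": "hello-world", "description": "Say hello"}]))
print(good)

try:
    list(finalize({}, [{"label": "broken"}]))
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

Validating at the end of the chain means a malformed task fails locally with a readable message instead of being rejected after submission.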
Taskgraph’s combination of static configuration with logic layered on top allows a project’s CI to grow to arbitrary complexity.
Next Steps and Further Reading
After you have a working Taskgraph setup, you’ll still need to integrate it with Taskcluster. See Configuring your Project for more details.
Here are some more resources to help get you started:
Learn about task graphs and how they are generated.
Create a new Taskgraph setup from scratch to better understand what taskgraph init accomplished for us by following the tutorials.
Read more details about kinds and transforms.
Learn more advanced tips for running and debugging Taskgraph locally.