taskgraph.util package

Submodules#

taskgraph.util.archive module#

class taskgraph.util.archive.HackedType#: Bases: bytes

class taskgraph.util.archive.TarInfo(name='')#

Bases: TarInfo

chksum#: Header checksum.

devmajor#: Device major number.

devminor#: Device minor number.

gid#: Group ID of the user who originally stored this member.

gname#: Group name.

linkname#: Name of the target file name, which is only present in TarInfo objects of type LNKTYPE and SYMTYPE.

mode#: Permission bits.

mtime#: Time of last modification.

name#: Name of the archive member.

offset#: The tar header starts here.

offset_data#: The file’s data starts here.

pax_headers#: A dictionary containing key-value pairs of an associated pax extended header.

size#: Size in bytes.

sparse#: Sparse member information.

tarfile#

type#: File type. type is usually one of these constants: REGTYPE, AREGTYPE, LNKTYPE, SYMTYPE, DIRTYPE, FIFOTYPE, CONTTYPE, CHRTYPE, BLKTYPE, GNUTYPE_SPARSE.

uid#: User ID of the user who originally stored this member.

uname#: User name.

taskgraph.util.archive.create_tar_from_files(fp, files)#

Create a tar file deterministically.

Receives a dict mapping names of files in the archive to local filesystem paths or mozpack.files.BaseFile instances.

The files will be archived and written to the passed file handle opened for writing.

Only regular files can be written.

FUTURE accept a filename argument (or create APIs to write files)
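To illustrate what "deterministically" can mean for a tar archive, here is a minimal sketch built only on the standard library tarfile module. It is not the taskgraph implementation: the helper name deterministic_tar, the in-memory bytes input, and the particular normalized fields (sorted names, zero mtime, zero uid/gid, empty owner names) are assumptions made for illustration.

```python
import io
import tarfile


def deterministic_tar(fp, files):
    """Write in-memory files to a tar stream with normalized metadata.

    `files` maps archive names to bytes. This is an assumption of the sketch;
    the documented function accepts filesystem paths or file objects instead.
    """
    with tarfile.open(fileobj=fp, mode="w") as tf:
        # Sort by archive name so dict insertion order never affects the output.
        for name in sorted(files):
            data = files[name]
            info = tarfile.TarInfo(name)
            info.size = len(data)
            info.mode = 0o644
            # Pin every field that would otherwise vary between machines or runs.
            info.mtime = 0
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            tf.addfile(info, io.BytesIO(data))


first, second = io.BytesIO(), io.BytesIO()
deterministic_tar(first, {"b.txt": b"two", "a.txt": b"one"})
deterministic_tar(second, {"a.txt": b"one", "b.txt": b"two"})
```

Both buffers hold identical bytes even though the input dicts were built in a different order.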

taskgraph.util.archive.create_tar_gz_from_files(fp, files, filename=None, compresslevel=9)#

Create a tar.gz file deterministically from files.

This is a glorified wrapper around create_tar_from_files that adds gzip compression.

The passed file handle should be opened for writing in binary mode. When the function returns, all data has been written to the handle.

taskgraph.util.archive.gzip_compressor(fp, filename=None, compresslevel=9)#

Create a deterministic GzipFile writer.

This is a glorified wrapper around GzipFile that adds some determinism.

The passed file handle should be opened for writing in binary mode.
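As a rough illustration of the determinism involved, the sketch below pins the modification time that the standard library GzipFile writes into the gzip header. The helper name deterministic_gzip and the choice of mtime=0 are assumptions; the real function may normalize more than this.

```python
import gzip
import io


def deterministic_gzip(data, compresslevel=9):
    """Compress `data` so that repeated calls yield identical bytes."""
    buf = io.BytesIO()
    # mtime=0 fixes the header timestamp, which otherwise defaults to "now".
    with gzip.GzipFile(
        filename="", mode="wb", compresslevel=compresslevel, fileobj=buf, mtime=0
    ) as gz:
        gz.write(data)
    return buf.getvalue()


compressed_once = deterministic_gzip(b"hello world")
compressed_twice = deterministic_gzip(b"hello world")
```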

taskgraph.util.attributes module#

taskgraph.util.attributes.attrmatch(attributes, **kwargs)#

Determine whether the given set of task attributes matches.

The conditions are given as keyword arguments, where each keyword names an attribute. The keyword value can be a literal, a set, or a callable:

A literal must match the attribute exactly.

Given a set or list, the attribute value must be contained within it.

A callable is called with the attribute value and returns a boolean.

If an attribute is specified as a keyword argument but not present in the task’s attributes, the result is False.

Parameters:

attributes (dict) – The task’s attributes object.
kwargs (dict) – The conditions the task’s attributes must satisfy in order to match.

Returns:

Whether the task’s attributes match the conditions or not.

Return type:

bool
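The matching rules above can be sketched as a small standalone function. This is a reimplementation of the documented behavior for illustration only, not the library source; the name attrmatch_sketch and the sample attributes are hypothetical.

```python
def attrmatch_sketch(attributes, **kwargs):
    """Return True if every keyword condition is satisfied by `attributes`."""
    for name, condition in kwargs.items():
        # A condition on a missing attribute never matches.
        if name not in attributes:
            return False
        value = attributes[name]
        if callable(condition):
            if not condition(value):
                return False
        elif isinstance(condition, (set, list)):
            if value not in condition:
                return False
        elif value != condition:
            return False
    return True


attrs = {"kind": "build", "platform": "linux64"}
literal_ok = attrmatch_sketch(attrs, kind="build")
set_ok = attrmatch_sketch(attrs, platform={"linux64", "win64"})
callable_ok = attrmatch_sketch(attrs, platform=lambda p: p.startswith("linux"))
missing = attrmatch_sketch(attrs, project="example")
```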

taskgraph.util.attributes.keymatch(attributes, target)#: Determine if any keys in attributes are a match to target, then return a list of matching values. First exact matches will be checked. Failing that, regex matches and finally a default key.
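The lookup order described for keymatch (exact, then regex, then default) might look like the following sketch. Treating each key as a pattern that must match the whole target is an assumption here, as are the name keymatch_sketch and the sample table.

```python
import re


def keymatch_sketch(attributes, target):
    # 1. An exact key match wins outright.
    if target in attributes:
        return [attributes[target]]
    # 2. Otherwise treat every key as a regular expression over the whole target.
    matches = [v for k, v in attributes.items() if re.fullmatch(k, target)]
    if matches:
        return matches
    # 3. Finally fall back to a "default" key, if one exists.
    if "default" in attributes:
        return [attributes["default"]]
    return []


table = {"linux64": "exact", "win.*": "regex", "default": "fallback"}
```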

taskgraph.util.attributes.match_run_on_git_branches(git_branch, run_on_git_branches)#: Determine whether the given git branch is included in the run-on-git-branches parameter. Allows ‘all’.

taskgraph.util.attributes.match_run_on_projects(key, run_on)#: Determine whether the given parameter is included in the corresponding run-on-attribute.

taskgraph.util.attributes.match_run_on_tasks_for(key, run_on)#: Determine whether the given parameter is included in the corresponding run-on-attribute.

taskgraph.util.attributes.sorted_unique_list(*args)#: Join one or more lists, and return a sorted list of unique members

taskgraph.util.cached_tasks module#

taskgraph.util.cached_tasks.add_optimization(config, taskdesc, cache_type, cache_name, digest=None, digest_data=None)#

Allow the results of this task to be cached. This adds index routes to the task so it can be looked up for future runs, and optimization hints so that cached artifacts can be found. Exactly one of digest and digest_data must be passed.

Parameters:

config (TransformConfig) – The configuration for the kind being transformed.
taskdesc (dict) – The description of the current task.
cache_type (str) – The type of task result being cached.
cache_name (str) – The name of the object being cached.
digest (bytes or None) – A unique string identifying this version of the artifacts being generated. Typically this will be the hash of inputs to the task.
digest_data (list of bytes or None) – A list of bytes representing the inputs of this task. They will be concatenated and hashed to create the digest for this task.

taskgraph.util.dependencies module#

taskgraph.util.dependencies.get_dependencies(config: TransformConfig, task: dict) → Iterator[Task]#

Iterate over all dependencies as Task objects.

Parameters:

config (TransformConfig) – The TransformConfig object associated with the kind.
task (Dict) – The task dictionary to retrieve dependencies from.

Returns:

Returns a generator that iterates over the Task objects associated with each dependency.

Return type:

Iterator[Task]

taskgraph.util.dependencies.get_primary_dependency(config: TransformConfig, task: dict) → Task | None#

Return the Task object associated with the primary dependency, which is assumed to be available in the primary-dependency attribute. (Which is always the case for tasks created with from_deps.)

Parameters:

config (TransformConfig) – The TransformConfig object associated with the kind.
task (Dict) – The task dictionary to retrieve the primary dependency from.

Returns:

The Task object associated with the primary dependency, or None.

Return type:

Optional[Task]

taskgraph.util.dependencies.group_by(name, schema=None)#

taskgraph.util.dependencies.group_by_all(config, tasks)#

taskgraph.util.dependencies.group_by_attribute(config, tasks, attr)#

taskgraph.util.dependencies.group_by_single(config, tasks)#

taskgraph.util.docker module#

class taskgraph.util.docker.HashingWriter(writer)#

Bases: object

A file object with write capabilities that hashes the written data at the same time it passes down to a real file object.

hexdigest()#

tell()#

write(buf)#
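A writer of this kind can be sketched as a thin wrapper that updates a hash on every write before forwarding the data. The class below is illustrative only: the name HashingWriterSketch and the use of SHA-256 are assumptions, chosen to be consistent with the SHA-256 digests mentioned elsewhere in this module.

```python
import hashlib
import io


class HashingWriterSketch:
    """Forward writes to `writer` while hashing everything that passes through."""

    def __init__(self, writer):
        self._writer = writer
        self._hash = hashlib.sha256()
        self._pos = 0

    def write(self, buf):
        self._hash.update(buf)
        self._pos += len(buf)
        return self._writer.write(buf)

    def tell(self):
        # Number of bytes written so far through this wrapper.
        return self._pos

    def hexdigest(self):
        return self._hash.hexdigest()


sink = io.BytesIO()
writer = HashingWriterSketch(sink)
writer.write(b"hello ")
writer.write(b"world")
```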

class taskgraph.util.docker.VoidWriter#

Bases: object

A file object with write capabilities that does nothing with the written data.

write(buf)#

taskgraph.util.docker.create_context_tar(topsrcdir, context_dir, out_path, args=None)#

Create a context tarball.

A directory context_dir containing a Dockerfile will be assembled into a gzipped tar file at out_path.

We also scan the source Dockerfile for special syntax that influences context generation.

If a line in the Dockerfile has the form # %include <path>, the relative path specified on that line will be matched against files in the source repository and added to the context under the path topsrcdir/. If an entry is a directory, we add all files under that directory.

If a line in the Dockerfile has the form # %ARG <name>, occurrences of the string $<name> in subsequent lines are replaced with the value found in the args argument. Exception: this doesn’t apply to VOLUME definitions.

Returns the SHA-256 hex digest of the created archive.
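The Dockerfile preprocessing described above can be sketched roughly as follows. This is not the library code: the function name preprocess_dockerfile, dropping the # %ARG lines from the output, keeping the # %include lines, and raising KeyError for a missing argument are all assumptions made for the example.

```python
import re


def preprocess_dockerfile(lines, args=None):
    """Return (rewritten lines, include paths) for a list of Dockerfile lines."""
    args = args or {}
    includes = []       # paths requested via "# %include"
    substitutions = []  # (compiled pattern, replacement) pairs from "# %ARG"
    output = []
    for line in lines:
        if line.startswith("# %ARG "):
            name = line[len("# %ARG "):].strip()
            if name not in args:
                raise KeyError(f"missing argument: {name}")
            pattern = re.compile(r"\$" + re.escape(name) + r"\b")
            substitutions.append((pattern, args[name]))
            continue
        # VOLUME definitions are exempt from substitution.
        if not line.startswith("VOLUME"):
            for pattern, value in substitutions:
                # A function replacement avoids interpreting backslashes in `value`.
                line = pattern.sub(lambda _m, v=value: v, line)
        if line.startswith("# %include "):
            includes.append(line[len("# %include "):].strip())
        output.append(line)
    return output, includes


rewritten, include_paths = preprocess_dockerfile(
    [
        "FROM debian",
        "# %ARG VERSION",
        "RUN echo $VERSION",
        "VOLUME /$VERSION",
        "# %include scripts/setup.sh",
    ],
    {"VERSION": "1.2"},
)
```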

taskgraph.util.docker.docker_image(name: str, by_tag: bool = False) → str | None#

Resolve in-tree prebuilt docker image to <registry>/<repository>@sha256:<digest>, or <registry>/<repository>:<tag> if by_tag is True.

Parameters:

name (str) – The image to build.
by_tag (bool) – If True, will apply a tag based on VERSION file. Otherwise will apply a hash based on HASH file.

Returns:

Image if it can be resolved, otherwise None.

Return type:

Optional[str]

taskgraph.util.docker.generate_context_hash(topsrcdir, image_path, args=None)#: Generates a sha256 hash for context directory used to build an image.

taskgraph.util.docker.image_path(name, graph_config)#

taskgraph.util.docker.image_paths(graph_config)#: Return a map of image name to paths containing their Dockerfile.

taskgraph.util.docker.parse_volumes(image, graph_config)#: Parse VOLUME entries from a Dockerfile for an image.

taskgraph.util.docker.stream_context_tar(topsrcdir, context_dir, out_file, args=None)#: Like create_context_tar, but streams the tar file to the out_file file object.

taskgraph.util.hash module#

taskgraph.util.hash.hash_path(path)#

Hash a single file.

Returns the SHA-256 hash in hex form.

taskgraph.util.hash.hash_paths(base_path, patterns)#

Given a list of path patterns, return a digest of the contents of all the corresponding files, similarly to git tree objects or mercurial manifests.

Each file is hashed. The list of all hashes and file paths is then itself hashed to produce the result.
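The two-level hashing scheme can be sketched as below. The sketch takes explicit relative paths rather than patterns, and the record layout fed to the outer hash ("<file hash> <path>" per line, in sorted order) is an assumption; only the overall "hash each file, then hash the list" structure comes from the description above.

```python
import hashlib
import os
import tempfile


def hash_file(path):
    """SHA-256 of a single file, in hex form."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def hash_tree(base_path, relative_paths):
    """Digest over the (file hash, path) pairs of the given files."""
    outer = hashlib.sha256()
    # Sorting makes the digest independent of the order paths are listed in.
    for rel in sorted(relative_paths):
        inner = hash_file(os.path.join(base_path, rel))
        outer.update(f"{inner} {rel}\n".encode("utf-8"))
    return outer.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    for name, body in (("a.txt", b"one"), ("b.txt", b"two")):
        with open(os.path.join(tmp, name), "wb") as fh:
            fh.write(body)
    digest_ab = hash_tree(tmp, ["a.txt", "b.txt"])
    digest_ba = hash_tree(tmp, ["b.txt", "a.txt"])
    digest_a = hash_tree(tmp, ["a.txt"])
```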

taskgraph.util.keyed_by module#

taskgraph.util.keyed_by.evaluate_keyed_by(value, item_name, attributes, defer=None, enforce_single_match=True)#

For values which can either accept a literal value, or be keyed by some attributes, perform that lookup and return the result.

For example, given item:

by-test-platform:
    macosx-10.11/debug: 13
    win.*: 6
    default: 12

a call to evaluate_keyed_by(item, 'thing-name', {'test-platform': 'linux96'}) would return 12.

Items can be nested as deeply as desired:

by-test-platform:
    win.*:
        by-project:
            ash: ..
            cedar: ..
    linux: 13
    default: 12

Parameters:

value (str) – Name of the value to perform evaluation on.
item_name (str) – Used to generate useful error messages.
attributes (dict) – Dictionary of attributes used to lookup ‘by-<key>’ with.
defer (list) – Allows evaluating a by-* entry at a later time. In the example above it’s possible that the project attribute hasn’t been set yet, in which case we’d want to stop before resolving that subkey and then call this function again later. This can be accomplished by setting defer=["project"] in this example.
enforce_single_match (bool) – If True (default), each task may only match a single arm of the evaluation.
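The core lookup can be sketched in a few lines. This simplified version ignores item_name, defer and enforce_single_match, takes the first regex match, and raises KeyError when nothing matches; those simplifications and the name evaluate_keyed_by_sketch are assumptions, not the library's behavior.

```python
import re


def evaluate_keyed_by_sketch(value, attributes):
    """Resolve nested {"by-<key>": {...}} dicts against `attributes`."""
    while isinstance(value, dict) and len(value) == 1:
        by_key, alternatives = next(iter(value.items()))
        if not by_key.startswith("by-") or not isinstance(alternatives, dict):
            break
        attr = attributes.get(by_key[len("by-"):])
        # Exact match first.
        if attr in alternatives:
            value = alternatives[attr]
            continue
        # Then keys interpreted as regular expressions over the whole attribute.
        matches = [
            v
            for k, v in alternatives.items()
            if k != "default" and attr is not None and re.fullmatch(k, attr)
        ]
        if matches:
            value = matches[0]
        elif "default" in alternatives:
            value = alternatives["default"]
        else:
            raise KeyError(f"no match for {by_key} = {attr!r}")
    return value


item = {"by-test-platform": {"macosx-10.11/debug": 13, "win.*": 6, "default": 12}}
nested = {
    "by-test-platform": {
        "win.*": {"by-project": {"ash": 1, "cedar": 2}},
        "linux": 13,
        "default": 12,
    }
}
```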

taskgraph.util.keyed_by.iter_dot_path(container: dict[str, Any], subfield: str) → Generator[tuple[dict[str, Any], str], None, None]#

Given a container and a subfield in dot path notation, yield the parent container of the dotpath’s leaf node, along with the leaf node name that it contains.

If the dot path contains a list object, each item in the list will be yielded.

Parameters:

container (dict) – The container to search for the dot path.
subfield (str) – The dot path to search for.
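A generator with this behavior might be written as in the sketch below. It follows the description above (walk the dot path, fan out over lists, yield the parent container and leaf name) but is not guaranteed to match the library in edge cases; the name iter_dot_path_sketch and the sample task are hypothetical.

```python
def iter_dot_path_sketch(container, subfield):
    """Yield (parent_container, leaf_name) pairs for a dotted path."""
    while "." in subfield:
        head, subfield = subfield.split(".", 1)
        if head not in container:
            return
        if isinstance(container[head], list):
            # Fan out: resolve the remaining path inside every list item.
            for item in container[head]:
                yield from iter_dot_path_sketch(item, subfield)
            return
        container = container[head]
    if isinstance(container, dict) and subfield in container:
        yield container, subfield


task = {
    "worker": {"env": {"A": "1"}},
    "jobs": [{"name": "x"}, {"name": "y"}],
}
```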

taskgraph.util.parameterization module#

taskgraph.util.parameterization.resolve_task_references(label: str, task_def: dict[str, Any], task_id: str, decision_task_id: str, dependencies: dict[str, str]) → dict[str, Any]#: Resolve all instances of {'task-reference': '..<..>..'} and {'artifact-reference': '..<dependency/artifact/path>..'} in the given task definition, using the given dependencies.
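To make the task-reference substitution concrete, here is a reduced sketch that walks a task definition and replaces each <name> placeholder with the corresponding entry from dependencies. It assumes dependencies maps dependency names to task ids, and it deliberately ignores artifact-reference and any special placeholder names; the function name and sample data are hypothetical.

```python
import re

REFERENCE = re.compile(r"<([^>]+)>")


def resolve_task_references_sketch(value, dependencies):
    """Recursively replace {"task-reference": "..<dep>.."} with plain strings."""
    if isinstance(value, list):
        return [resolve_task_references_sketch(v, dependencies) for v in value]
    if isinstance(value, dict):
        if set(value) == {"task-reference"}:
            # Substitute every <dep> with the task id of that dependency.
            return REFERENCE.sub(
                lambda m: dependencies[m.group(1)], value["task-reference"]
            )
        return {
            k: resolve_task_references_sketch(v, dependencies)
            for k, v in value.items()
        }
    return value


task_def = {
    "payload": {
        "env": {"UPSTREAM": {"task-reference": "<build>"}},
        "command": ["fetch", {"task-reference": "--from=<build>/<docker-image>"}],
    }
}
resolved = resolve_task_references_sketch(
    task_def, {"build": "abc123", "docker-image": "def456"}
)
```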

taskgraph.util.parameterization.resolve_timestamps(now, task_def)#: Resolve all instances of {‘relative-datestamp’: ‘..’} in the given task definition

taskgraph.util.path module#

Like os.path, with a reduced set of functions, and with normalized path separators (always use forward slashes). Also contains a few additional utilities not found in os.path.

taskgraph.util.path.abspath(path)#

taskgraph.util.path.ancestors(path)#

Emit the parent directories of a path.

Parameters:

path (str) – Path to emit parents of.

Yields:

str – Path of parent directory.

taskgraph.util.path.basedir(path, bases)#: Given a list of directories (bases), return which one contains the given path. If several matches are found, the deepest base directory is returned.

basedir('foo/bar/baz', ['foo', 'baz', 'foo/bar']) returns 'foo/bar' (‘foo’ and ‘foo/bar’ both match, but ‘foo/bar’ is the deepest match)
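The "deepest containing base" rule can be sketched as a prefix check followed by picking the longest candidate. The name basedir_sketch and the choice to return None when no base contains the path are assumptions of this example.

```python
def basedir_sketch(path, bases):
    """Return the deepest entry of `bases` that contains `path`, or None."""
    # A base contains the path if it equals it or is a directory prefix of it.
    candidates = [b for b in bases if path == b or path.startswith(b + "/")]
    # All candidates are prefixes of the same path, so the longest is the deepest.
    return max(candidates, key=len) if candidates else None


deepest = basedir_sketch("foo/bar/baz", ["foo", "baz", "foo/bar"])
```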

taskgraph.util.path.basename(path)#

taskgraph.util.path.commonprefix(paths)#

taskgraph.util.path.dirname(path)#

taskgraph.util.path.join(*paths)#

taskgraph.util.path.match(path, pattern)#

Return whether the given path matches the given pattern. An asterisk can be used to match any string, including the null string, in one part of the path:

foo matches *, f* or fo*o

However, an asterisk matching a subdirectory may not match the null string:

foo/bar does not match foo/*/bar

If the pattern matches one of the ancestor directories of the path, the path is considered matching:

foo/bar matches foo

Two adjacent asterisks can be used to match files and zero or more directories and subdirectories.

foo/bar matches foo/**/bar, or **/bar

taskgraph.util.path.normpath(path)#

taskgraph.util.path.normsep(path)#: Normalize path separators, by using forward slashes instead of whatever os.sep is.

taskgraph.util.path.realpath(path)#

taskgraph.util.path.rebase(oldbase, base, relativepath)#: Return relativepath relative to base instead of oldbase.

taskgraph.util.path.relpath(path, start)#

taskgraph.util.path.split(path)#: Return the normalized path as a list of its components.

split('foo/bar/baz') returns ['foo', 'bar', 'baz']

taskgraph.util.path.splitext(path)#

taskgraph.util.python_path module#

taskgraph.util.python_path.find_object(path: str)#

Find a Python object given a path of the form <modulepath>:<objectpath>. Conceptually equivalent to

def find_object(modulepath, objectpath):
    import <modulepath> as mod
    return mod.<objectpath>
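The pseudocode above is not valid Python as written, since import statements cannot take a variable module name. A runnable equivalent can be sketched with importlib; the name find_object_sketch is hypothetical and error handling is omitted.

```python
import importlib
import json
import os.path


def find_object_sketch(path):
    """Resolve "<modulepath>:<objectpath>" to the object it names."""
    modulepath, objectpath = path.split(":")
    obj = importlib.import_module(modulepath)
    # Walk dotted attribute access, e.g. "decoder.JSONDecoder".
    for attr in objectpath.split("."):
        obj = getattr(obj, attr)
    return obj


join_func = find_object_sketch("os.path:join")
decoder_cls = find_object_sketch("json:decoder.JSONDecoder")
```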

taskgraph.util.python_path.import_sibling_modules(exceptions=None)#

Import all Python modules that are siblings of the calling module.

Parameters:

exceptions (list) – A list of file names to exclude (caller and __init__.py are implicitly excluded).

taskgraph.util.readonlydict module#

class taskgraph.util.readonlydict.ReadOnlyDict(*args, **kwargs)#

Bases: dict

A read-only dictionary.

update([E, ]**F) → None#

Update D from mapping/iterable E and F.

If E is present and has a .keys() method, then does: for k in E.keys(): D[k] = E[k]

If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v

In either case, this is followed by: for k in F: D[k] = F[k]
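One common way to build a read-only dict is to subclass dict and make every mutating method raise. The sketch below shows that pattern; it is not the library class, and the name ReadOnlyDictSketch and the use of TypeError are assumptions.

```python
class ReadOnlyDictSketch(dict):
    """A dict whose contents are fixed at construction time."""

    def _readonly(self, *args, **kwargs):
        raise TypeError("object is read-only")

    # Route every mutating method to the same failing implementation.
    __setitem__ = __delitem__ = _readonly
    update = pop = popitem = clear = setdefault = _readonly


frozen = ReadOnlyDictSketch(a=1)
try:
    frozen["b"] = 2
    blocked = False
except TypeError:
    blocked = True
```

Reads such as frozen["a"] and iteration keep working, since only the mutating methods are overridden.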

taskgraph.util.schema module#

class taskgraph.util.schema.ArtifactReferenceSchema(artifact_reference: str)#

Bases: Schema

Reference to a task artifact (msgspec version).

artifact_reference: str#

class taskgraph.util.schema.IndexSearchOptimizationSchema(index_search: list[str])#

Bases: Schema

Search the index for the given index namespaces.

index_search: list[str]#

class taskgraph.util.schema.LegacySchema(*args, check=True, **kwargs)#

Bases: Schema

Operates identically to voluptuous.Schema, but applying some taskgraph-specific checks in the process.

extend(*args, **kwargs)#

Create a new Schema by merging this and the provided schema.

Neither this Schema nor the provided schema are modified. The resulting Schema inherits the required and extra parameters of this, unless overridden.

Both schemas must be dictionary-based.

Parameters:

schema – dictionary to extend this Schema with
required – if set, overrides required of this Schema
extra – if set, overrides extra of this Schema

class taskgraph.util.schema.OptimizationTypeSchema(index_search: list[str] | None = None, skip_unless_changed: list[str] | None = None)#

Bases: Schema

Schema that accepts various optimization configurations.

index_search: list[str] | None#

skip_unless_changed: list[str] | None#

class taskgraph.util.schema.Schema#

Bases: Struct

Base schema class that extends msgspec.Struct.

This allows schemas to be defined directly as:

class MySchema(Schema):
    foo: str
    bar: int = 10

Instead of wrapping msgspec.Struct types. Most schemas use kebab-case renaming by default.

By default, forbid_unknown_fields is True, meaning extra fields will cause validation errors. Child classes can override this by setting forbid_unknown_fields=False in their class definition:

class MySchema(Schema, forbid_unknown_fields=False):
    foo: str

classmethod validate(data)#: Validate data against this schema.

class taskgraph.util.schema.SkipUnlessChangedOptimizationSchema(skip_unless_changed: list[str])#

Bases: Schema

Skip this task if none of the given file patterns match.

skip_unless_changed: list[str]#

class taskgraph.util.schema.TaskRefTypeSchema(task_reference: str | None = None, artifact_reference: str | None = None)#

Bases: Schema

Schema that accepts either task-reference or artifact-reference (msgspec version).

artifact_reference: str | None#

task_reference: str | None#

class taskgraph.util.schema.TaskReferenceSchema(task_reference: str)#

Bases: Schema

Reference to another task (msgspec version).

task_reference: str#

taskgraph.util.schema.UnionTypes(*types)#: Use functools.reduce to simulate Union[*allowed_types] on older Python versions.

taskgraph.util.schema.check_schema(schema)#

taskgraph.util.schema.optionally_keyed_by(*arguments, use_msgspec=False)#

Mark a schema value as optionally keyed by any of a number of fields.

Parameters:

*arguments – Field names followed by the schema
use_msgspec – If True, return msgspec type hints; if False, return voluptuous validator

taskgraph.util.schema.resolve_keyed_by(item, field, item_name, defer=None, enforce_single_match=True, **extra_values)#

For values which can either accept a literal value, or be keyed by some other attribute of the item, perform that lookup and replacement in-place (modifying item directly). The field is specified using dotted notation to traverse dictionaries.

For example, given item:

task:
    test-platform: linux128
    chunks:
        by-test-platform:
            macosx-10.11/debug: 13
            win.*: 6
            default: 12

a call to resolve_keyed_by(item, 'task.chunks', item['thing-name']) would mutate item in-place to:

task:
    test-platform: linux128
    chunks: 12

The item_name parameter is used to generate useful error messages.

If extra_values are supplied, they represent additional values available for reference from by-<field>.

Items can be nested as deeply as the schema will allow:

chunks:
    by-test-platform:
        win.*:
            by-project:
                ash: ..
                cedar: ..
        linux: 13
        default: 12

Parameters:

item (dict) – Object being evaluated.
field (str) – Name of the key to perform evaluation on.
item_name (str) – Used to generate useful error messages.
defer (list) – Allows evaluating a by-* entry at a later time. In the example above it’s possible that the project attribute hasn’t been set yet, in which case we’d want to stop before resolving that subkey and then call this function again later. This can be accomplished by setting defer=["project"] in this example.
enforce_single_match (bool) – If True (default), each task may only match a single arm of the evaluation.
extra_values (kwargs) – If supplied, represent additional values available for reference from by-<field>.

Returns:

item which has also been modified in-place.

Return type:

dict

taskgraph.util.schema.validate_schema(schema, obj, msg_prefix)#

Validate that object satisfies schema. If not, generate a useful exception beginning with msg_prefix.

Parameters:

schema – A voluptuous.Schema or msgspec-based StructSchema type
obj – Object to validate
msg_prefix – Prefix for error messages

taskgraph.util.shell module#

taskgraph.util.shell.quote(*strings)#

Given one or more strings, returns a quoted string that can be used literally on a shell command line.

>>> quote('a', 'b')
"a b"
>>> quote('a b', 'c')
"'a b' c"

taskgraph.util.taskcluster module#

class taskgraph.util.taskcluster.TaskDefinitionsCache#

Bases: object

get_task_definition(task_id)#

get_task_definitions(task_ids)#

taskgraph.util.taskcluster.cancel_task(task_id)#: Cancels a task given a task_id. In testing mode, just logs that it would have cancelled.

taskgraph.util.taskcluster.find_task_id(index_path)#

taskgraph.util.taskcluster.find_task_id_batched(index_paths)#

Gets the task id of multiple tasks given their respective index.

Parameters:

index_paths (List[str]) – A list of task indexes.

Returns:

A dictionary object mapping each valid index path to its respective task id.

Return type:

Dict[str, str]

See the endpoint here: https://docs.taskcluster.net/docs/reference/core/index/api#findTasksAtIndex

taskgraph.util.taskcluster.get_ancestors(task_ids: list[str] | str) → dict[str, str]#

Gets the ancestor tasks of the given task_ids as a dictionary of taskid -> label.

Parameters:

task_ids (str or [str]) – A single task id or a list of task ids to find the ancestors of.

Returns:

A dict whose keys are task ids and values are task labels.

Return type:

dict

taskgraph.util.taskcluster.get_artifact(task_id, path)#

Returns the artifact with the given path for the given task id.

If the path ends with “.json” or “.yml”, the content is deserialized as, respectively, json or yaml, and the corresponding python data (usually dict) is returned. For other types of content, a file-like object is returned.

taskgraph.util.taskcluster.get_artifact_from_index(index_path, artifact_path)#

taskgraph.util.taskcluster.get_artifact_path(task, path)#

taskgraph.util.taskcluster.get_artifact_prefix(task)#

taskgraph.util.taskcluster.get_artifact_url(task_id, path, use_proxy=False)#

taskgraph.util.taskcluster.get_current_scopes()#: Get the current scopes. This only makes sense in a task with the Taskcluster proxy enabled, where it returns the actual scopes accorded to the task.

taskgraph.util.taskcluster.get_index_url(index_path, multiple=False, use_proxy=False)#

taskgraph.util.taskcluster.get_purge_cache_url(provisioner_id, worker_type)#

taskgraph.util.taskcluster.get_root_url(block_proxy=False)#

taskgraph.util.taskcluster.get_session()#

taskgraph.util.taskcluster.get_task_definition(task_id)#

taskgraph.util.taskcluster.get_task_definitions(task_ids)#

taskgraph.util.taskcluster.get_task_url(task_id)#

taskgraph.util.taskcluster.get_taskcluster_client(service: str)#

taskgraph.util.taskcluster.list_artifacts(task_id)#

taskgraph.util.taskcluster.list_task_group_incomplete_tasks(task_group_id)#: Generate the incomplete tasks in a task group

taskgraph.util.taskcluster.list_tasks(index_path)#: Returns a list of task_ids where each task_id is indexed under a path in the index. Results are sorted by expiration date from oldest to newest.

taskgraph.util.taskcluster.parse_time(timestamp)#: Turn a “JSON timestamp” as used in TC APIs into a datetime
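Timestamps in this style typically look like 2017-10-26T01:03:59.971Z. Assuming that layout (UTC, fractional seconds, trailing Z), a parser can be sketched with datetime.strptime; the name parse_time_sketch and the exact format string are assumptions, and the result here is a naive datetime.

```python
from datetime import datetime


def parse_time_sketch(timestamp):
    """Parse an ISO-8601 UTC timestamp with fractional seconds and a trailing Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")


parsed = parse_time_sketch("2017-10-26T01:03:59.971Z")
```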

taskgraph.util.taskcluster.purge_cache(provisioner_id, worker_type, cache_name)#: Requests a cache purge from the purge-caches service.

taskgraph.util.taskcluster.requests_retry_session(retries, backoff_factor=0.1, status_forcelist=(500, 502, 503, 504), concurrency=50, session=None, allowed_methods=None)#

taskgraph.util.taskcluster.rerun_task(task_id)#: Reruns a task given a task_id. In testing mode, just logs that it would have rerun.

taskgraph.util.taskcluster.send_email(address, subject, content, link)#: Sends an email using the notify service

taskgraph.util.taskcluster.state_task(task_id)#

Gets the state of a task given a task_id.

In testing mode, just logs that it would have retrieved state. This is a subset of the data returned by status_task().

Parameters:

task_id (str) – A task id.

Returns:

The state of the task, one of: pending, running, completed, failed, exception, unknown.

Return type:

str

taskgraph.util.taskcluster.status_task(task_id)#

Gets the status of a task given a task_id.

In testing mode, just logs that it would have retrieved status and returns an empty dict.

Parameters:

task_id (str) – A task id.

Returns:

A dictionary object as defined here: https://docs.taskcluster.net/docs/reference/platform/queue/api#status

Return type:

dict

taskgraph.util.taskcluster.status_task_batched(task_ids)#

Gets the status of multiple tasks given task_ids.

In testing mode, just logs that it would have retrieved statuses.

Parameters:

task_ids (List[str]) – A list of task ids.

Returns:

A dictionary object as defined here: https://docs.taskcluster.net/docs/reference/platform/queue/api#statuses

Return type:

dict

taskgraph.util.taskgraph module#

Tools for interacting with existing taskgraphs.

taskgraph.util.taskgraph.find_decision_task(parameters, graph_config)#: Given the parameters for this action, find the taskId of the decision task

taskgraph.util.taskgraph.find_existing_tasks_from_previous_kinds(full_task_graph, previous_graph_ids, rebuild_kinds)#: Given a list of previous decision/action taskIds and kinds to ignore from the previous graphs, return a dictionary of labels-to-taskids to use as existing_tasks in the optimization step.

taskgraph.util.templates module#

taskgraph.util.templates.deep_get(dict_, field)#

taskgraph.util.templates.merge(*objects)#

Merge the given objects, using the semantics described for merge_to, with objects later in the list taking precedence. From an inheritance perspective, “parents” should be listed before “children”.

Returns the result without modifying any arguments.

taskgraph.util.templates.merge_to(source, dest)#

Merge dict and arrays (override scalar values)

Keys from source override keys from dest, and elements from lists in source are appended to lists in dest.

Parameters:

source (dict) – to copy from
dest (dict) – to copy to (modified in place)
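The merge semantics (nested dicts merged, lists concatenated, scalars overridden) can be sketched as below. This is an illustrative reimplementation with hypothetical names merge_to_sketch and merge_sketch; how the library treats mismatched types such as a list meeting a dict is not covered here.

```python
import copy


def merge_to_sketch(source, dest):
    """Merge `source` into `dest` in place and return `dest`."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(dest.get(key), dict):
            merge_to_sketch(value, dest[key])
        elif isinstance(value, list) and isinstance(dest.get(key), list):
            # Elements from source are appended after those already in dest.
            dest[key] = dest[key] + value
        else:
            # Scalars (and mismatched types) from source override dest.
            dest[key] = value
    return dest


def merge_sketch(*objects):
    """Merge left to right into a fresh dict; later objects take precedence."""
    result = {}
    for obj in objects:
        merge_to_sketch(copy.deepcopy(obj), result)
    return result


parent = {"env": {"A": "1"}, "args": ["x"], "tier": 1}
child = {"env": {"B": "2"}, "args": ["y"], "tier": 2}
merged = merge_sketch(parent, child)
```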

taskgraph.util.templates.substitute(item: Any, **subs: dict[str, Any]) → Any#

taskgraph.util.templates.substitute_task_fields(task: dict[str, Any], fields: list[str], **subs: Any) → None#

taskgraph.util.time module#

exception taskgraph.util.time.InvalidString#: Bases: Exception

exception taskgraph.util.time.UnknownTimeMeasurement#: Bases: Exception

taskgraph.util.time.current_json_time(datetime_format=False)#

Parameters:

datetime_format (boolean) – Set True to get a datetime output.

Returns:

JSON string representation of the current time.

taskgraph.util.time.days(value)#

taskgraph.util.time.hours(value)#

taskgraph.util.time.json_time_from_now(input_str, now=None, datetime_format=False)#

Parameters:

input_str (str) – Input string (see value_of)
now (datetime) – Optionally set the definition of now
datetime_format (boolean) – Set True to get a datetime output

Returns:

JSON string representation of time in future.

taskgraph.util.time.minutes(value)#

taskgraph.util.time.months(value)#

taskgraph.util.time.seconds(value)#

taskgraph.util.time.value_of(input_str)#

Convert a string to a json date in the future.

Parameters:

input_str (str) – Input string (ex: 1d, 2d, 6years, 2 seconds).

Returns:

Unit given in seconds.
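Parsing strings such as 1d, 6years or 2 seconds can be sketched with a regular expression and a unit table. Everything specific in this sketch is an assumption: the accepted aliases, a month counted as 30 days, a year as 365 days, returning a timedelta, and raising ValueError where the library defines its own InvalidString and UnknownTimeMeasurement exceptions.

```python
import re
from datetime import timedelta

PATTERN = re.compile(r"^\s*(\d+)\s*([a-z]+)\s*$")

# Seconds per unit; the alias spellings are illustrative.
UNIT_SECONDS = {}
for aliases, seconds in (
    (("s", "second", "seconds"), 1),
    (("min", "minute", "minutes"), 60),
    (("h", "hour", "hours"), 3600),
    (("d", "day", "days"), 86400),
    (("mo", "month", "months"), 30 * 86400),
    (("y", "year", "years"), 365 * 86400),
):
    for alias in aliases:
        UNIT_SECONDS[alias] = seconds


def value_of_sketch(input_str):
    """Convert "<amount><unit>" into a timedelta."""
    match = PATTERN.match(input_str.lower())
    if match is None:
        raise ValueError(f"invalid time string: {input_str!r}")
    amount, unit = match.groups()
    if unit not in UNIT_SECONDS:
        raise ValueError(f"unknown unit: {unit!r}")
    return timedelta(seconds=int(amount) * UNIT_SECONDS[unit])


one_day = value_of_sketch("1d")
```

Calling .total_seconds() on the result gives the duration in seconds.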

taskgraph.util.time.years(value)#

taskgraph.util.treeherder module#

taskgraph.util.treeherder.add_suffix(treeherder_symbol, suffix)#: Add a suffix to a treeherder symbol that may contain a group.

taskgraph.util.treeherder.inherit_treeherder_from_dep(task, dep_task)#: Inherit treeherder defaults from dep_task

taskgraph.util.treeherder.join_symbol(group, symbol)#: Perform the reverse of split_symbol, combining the given group and symbol. If the group is ‘?’, then it is omitted.

taskgraph.util.treeherder.replace_group(treeherder_symbol, new_group)#: Replace the group of a treeherder symbol that may contain a group.

taskgraph.util.treeherder.split_symbol(treeherder_symbol)#: Split a symbol expressed as grp(sym) into its two parts. If no group is given, the returned group is ‘?’
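The grp(sym) convention used by split_symbol and join_symbol can be sketched as a pair of inverse helpers. The names below are hypothetical and the parsing is deliberately simple; malformed symbols are not handled.

```python
def split_symbol_sketch(treeherder_symbol):
    """Split "grp(sym)" into ("grp", "sym"); the group is "?" when absent."""
    group, symbol = "?", treeherder_symbol
    if "(" in symbol and symbol.endswith(")"):
        group, symbol = symbol[:-1].split("(", 1)
    return group, symbol


def join_symbol_sketch(group, symbol):
    """Inverse of split_symbol_sketch; the "?" group is omitted."""
    if group == "?":
        return symbol
    return f"{group}({symbol})"


grouped = split_symbol_sketch("B(x)")
ungrouped = split_symbol_sketch("x")
```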

taskgraph.util.treeherder.treeherder_defaults(kind, label)#

taskgraph.util.vcs module#

class taskgraph.util.vcs.GitRepository(path)#

Bases: Repository

property all_remote_names#: Name of all configured remote repositories.

property base_rev#: Hash of revision the current topic branch is based on.

property branch#: Current branch or bookmark the checkout has active.

property default_branch#: Name of the default branch.

property default_remote_name: str#: Name the VCS defines for the remote repository when cloning it for the first time. This name may not exist anymore if users changed the default configuration, for instance.

does_revision_exist_locally(revision)#

Check whether this revision exists in the local repository.

If this function returns an unexpected value, then make sure the revision was fetched from the remote repository.

find_latest_common_revision(base_ref_or_rev, head_rev)#

Find the latest revision that is common to both the given head_rev and base_ref_or_rev.

If no common revision exists, Repository.NULL_REVISION will be returned.

get_changed_files(diff_filter=None, mode=None, rev=None, base=None)#

Return a list of files that are changed in:

either this repository’s working copy,
or at a given revision (rev)
or between 2 revisions (base and rev)

diff_filter controls which kinds of modifications are returned. It is a string which may only contain the following characters:

A - Include files that were added
D - Include files that were deleted
M - Include files that were modified

By default, all three will be included.

mode can be one of ‘unstaged’, ‘staged’ or ‘all’. Only has an effect on git. Defaults to ‘unstaged’.

rev is a specifier for which changesets to consider for changes. The exact meaning depends on the vcs system being used.

base specifies the range of changesets. This parameter cannot be used without rev. The range includes rev but excludes base.

get_commit_message(revision=None)#: Commit message of specified revision or current commit.

get_outgoing_files(diff_filter='ADM', upstream=None)#

Return a list of changed files compared to upstream.

diff_filter works the same as get_changed_files. upstream is a remote ref to compare against. If unspecified, this will be determined automatically. If there is no remote ref, a MissingUpstreamRepo exception will be raised.

get_tracked_files(*paths, rev=None)#

Return list of tracked files.

*paths are path specifiers to limit results to. rev is a revision specifier at which to retrieve the files. Defaults to the parent of the working copy if unspecified.

get_url(remote=None)#: Get URL of the upstream repository.

property head_rev#: Hash of HEAD revision.

property is_shallow#: Whether this repo is a shallow clone.

property remote_name#: Name of the remote repository.

property tool: str#: Version control system being used, either ‘hg’ or ‘git’.

update(ref)#: Update the working directory to the specified reference.

working_directory_clean(untracked=False, ignored=False)#

Determine if the working directory is free of modifications.

Returns True if the working directory does not have any file modifications. False otherwise.

By default, untracked and ignored files are not considered. If untracked or ignored are set, they influence the clean check to factor these file classes into consideration.

class taskgraph.util.vcs.HgRepository(*args, **kwargs)#

Bases: Repository

property all_remote_names#: Name of all configured remote repositories.

property base_rev#: Hash of revision the current topic branch is based on.

property branch#: Current branch or bookmark the checkout has active.

property default_branch#: Name of the default branch.

property default_remote_name: str#: Name the VCS defines for the remote repository when cloning it for the first time. This name may not exist anymore if users changed the default configuration, for instance.

does_revision_exist_locally(revision)#

Check whether this revision exists in the local repository.

If this function returns an unexpected value, then make sure the revision was fetched from the remote repository.

find_latest_common_revision(base_ref_or_rev, head_rev)#

Find the latest revision that is common to both the given head_rev and base_ref_or_rev.

If no common revision exists, Repository.NULL_REVISION will be returned.

get_changed_files(diff_filter=None, mode=None, rev=None, base=None)#

Return a list of files that are changed in:

either this repository’s working copy,
or at a given revision (rev)
or between 2 revisions (base and rev)

diff_filter controls which kinds of modifications are returned. It is a string which may only contain the following characters:

A - Include files that were added
D - Include files that were deleted
M - Include files that were modified

By default, all three will be included.

mode can be one of ‘unstaged’, ‘staged’ or ‘all’. Only has an effect on git. Defaults to ‘unstaged’.

rev is a specifier for which changesets to consider for changes. The exact meaning depends on the VCS being used.

base specifies the range of changesets. This parameter cannot be used without rev. The range includes rev but excludes base.

get_commit_message(revision=None)#: Commit message of specified revision or current commit.

get_outgoing_files(diff_filter='ADM', upstream=None)#

Return a list of changed files compared to upstream.

get_tracked_files(*paths, rev=None)#

Return list of tracked files.

*paths are path specifiers to limit results to. rev is a revision specifier at which to retrieve the files. Defaults to the parent of the working copy if unspecified.

get_url(remote=None)#: Get URL of the upstream repository.

property head_rev#: Hash of HEAD revision.

property is_shallow#: Whether this repo is a shallow clone.

property remote_name#: Name of the remote repository.

property tool: str#: Version control system being used, either ‘hg’ or ‘git’.

update(ref)#: Update the working directory to the specified reference.

working_directory_clean(untracked=False, ignored=False)#

Determine if the working directory is free of modifications.

Returns True if the working directory has no file modifications, False otherwise.

By default, untracked and ignored files are not considered. Setting untracked or ignored makes the clean check take those file classes into account as well.

class taskgraph.util.vcs.Repository(path)#

Bases: ABC

NULL_REVISION = '0000000000000000000000000000000000000000'#

abstract property all_remote_names: list[str]#: Names of all configured remote repositories.

abstract property base_rev: str#: Hash of revision the current topic branch is based on.

abstract property branch: str | None#: Current branch or bookmark the checkout has active.

abstract property default_branch: str#: Name of the default branch.

abstract property default_remote_name: str#: Name the VCS defines for the remote repository when cloning it for the first time. This name may not exist anymore if users changed the default configuration, for instance.

abstractmethod does_revision_exist_locally(revision: str) → bool#

Check whether this revision exists in the local repository.

If this function returns an unexpected value, then make sure the revision was fetched from the remote repository.

abstractmethod find_latest_common_revision(base_ref_or_rev: str, head_rev: str) → str#

Find the latest revision that is common to both the given head_rev and base_ref_or_rev.

If no common revision exists, Repository.NULL_REVISION will be returned.
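The lookup described above can be illustrated with a small self-contained sketch. It is not how GitRepository or HgRepository compute the result (they delegate to the underlying VCS); the `parents` mapping and the `latest_common_revision` helper are hypothetical, and merge commits are deliberately left out.

```python
# Same value as Repository.NULL_REVISION.
NULL_REVISION = "0000000000000000000000000000000000000000"


def latest_common_revision(parents, base, head):
    """Return the nearest revision reachable from both base and head.

    `parents` is a hypothetical mapping of revision -> parent revision
    (None for a root commit); merges are not modelled in this sketch.
    """
    # Collect base and every ancestor of base.
    base_chain = set()
    rev = base
    while rev is not None:
        base_chain.add(rev)
        rev = parents.get(rev)

    # Walk back from head; the first hit is the latest shared revision.
    rev = head
    while rev is not None:
        if rev in base_chain:
            return rev
        rev = parents.get(rev)
    return NULL_REVISION


history = {"a": None, "b": "a", "c": "b", "d": "b", "x": None}
print(latest_common_revision(history, "c", "d"))                   # b
print(latest_common_revision(history, "c", "x") == NULL_REVISION)  # True
```

Here "c" and "d" both descend from "b", so "b" is returned, while "x" shares no history with "c" and yields the null revision.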

abstractmethod get_changed_files(diff_filter: str | None, mode: str | None, rev: str | None, base: str | None) → list[str]#

Return a list of files that are changed in:

either this repository’s working copy,
or at a given revision (rev)
or between 2 revisions (base and rev)

diff_filter controls which kinds of modifications are returned. It is a string which may only contain the following characters:

A - Include files that were added
D - Include files that were deleted
M - Include files that were modified

By default, all three will be included.

mode can be one of ‘unstaged’, ‘staged’ or ‘all’. Only has an effect on git. Defaults to ‘unstaged’.

rev is a specifier for which changesets to consider for changes. The exact meaning depends on the VCS being used.

base specifies the range of changesets. This parameter cannot be used without rev. The range includes rev but excludes base.
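As a rough illustration of the diff_filter semantics described above, the following sketch filters a list of changes by status letter. The `select_changes` helper and the (status, path) input shape are assumptions made for the example, not part of the taskgraph API.

```python
VALID_FILTER_CHARS = frozenset("ADM")


def select_changes(changes, diff_filter="ADM"):
    """Keep paths whose status letter appears in diff_filter.

    `changes` is a hypothetical list of (status, path) pairs, where
    status is "A" (added), "D" (deleted) or "M" (modified).
    """
    unknown = set(diff_filter) - VALID_FILTER_CHARS
    if unknown:
        raise ValueError(f"unsupported diff_filter characters: {sorted(unknown)}")
    return [path for status, path in changes if status in diff_filter]


changes = [("A", "new.py"), ("M", "setup.py"), ("D", "old.py")]
print(select_changes(changes))        # ['new.py', 'setup.py', 'old.py']
print(select_changes(changes, "AM"))  # ['new.py', 'setup.py']
```

Rejecting unknown characters up front mirrors the statement that the filter string may only contain A, D and M.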

abstractmethod get_commit_message(revision: str | None) → str#: Commit message of specified revision or current commit.

abstractmethod get_outgoing_files(diff_filter: str, upstream: str) → list[str]#

Return a list of changed files compared to upstream.

abstractmethod get_tracked_files(*paths: str, rev: str | None = None) → list[str]#

Return list of tracked files.

*paths are path specifiers to limit results to. rev is a revision specifier at which to retrieve the files. Defaults to the parent of the working copy if unspecified.

abstractmethod get_url(remote: str | None) → str#: Get URL of the upstream repository.

abstract property head_rev: str#: Hash of HEAD revision.

abstract property is_shallow: str#: Whether this repo is a shallow clone.

abstract property remote_name: str#: Name of the remote repository.

run(*args: str, **kwargs) → str#

abstract property tool: str#: Version control system being used, either ‘hg’ or ‘git’.

abstractmethod update(ref: str) → None#: Update the working directory to the specified reference.

abstractmethod working_directory_clean(untracked: bool | None = False, ignored: bool | None = False) → bool#

Determine if the working directory is free of modifications.

Returns True if the working directory has no file modifications, False otherwise.

By default, untracked and ignored files are not considered. Setting untracked or ignored makes the clean check take those file classes into account as well.
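One way to picture how the untracked and ignored flags affect the result is to classify lines in the style of `git status --porcelain` output, where "??" marks untracked files and "!!" marks ignored files. The `is_clean` helper below is a hypothetical sketch for illustration only; the concrete repository classes may build the check differently.

```python
def is_clean(status_lines, untracked=False, ignored=False):
    """Decide cleanliness from `git status --porcelain`-style lines.

    "??" marks untracked files and "!!" marks ignored files; every
    other non-empty line is a change to a tracked file.
    """
    for line in status_lines:
        if not line.strip():
            continue
        code = line[:2]
        if code == "??":
            if untracked:
                return False
        elif code == "!!":
            if ignored:
                return False
        else:
            return False  # a tracked-file modification always counts
    return True


lines = ["?? notes.txt", "!! build/"]
print(is_clean(lines))                  # True
print(is_clean(lines, untracked=True))  # False
print(is_clean([" M setup.py"]))        # False
```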

taskgraph.util.vcs.get_repository(path: str)#: Get a repository object for the repository at path. If path is not a known VCS repository, raise an exception.
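A minimal sketch of how a checkout type could be detected from a path, assuming detection is based on the presence of a `.hg` or `.git` entry in the path or one of its parents. The `detect_vcs` helper is hypothetical and only returns the tool name and root; it is not the implementation of get_repository.

```python
from pathlib import Path
import tempfile


def detect_vcs(path):
    """Return ("hg" or "git", root) for the checkout containing path.

    Hypothetical helper: it looks for a .hg or .git entry in path and
    each of its parents, and raises if none is found.
    """
    start = Path(path).resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".hg").is_dir():
            return "hg", candidate
        if (candidate / ".git").exists():  # can be a file in git worktrees
            return "git", candidate
    raise RuntimeError(f"{path} is not inside a known VCS repository")


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / ".git").mkdir()
    nested = Path(tmp) / "src" / "pkg"
    nested.mkdir(parents=True)
    tool, root = detect_vcs(nested)
    print(tool)  # git
```

Raising when nothing is found matches the documented behaviour for paths that are not known VCS repositories.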

taskgraph.util.verify module#

class taskgraph.util.verify.GraphConfigVerification(func: Callable)#

Bases: Verification

verify(graph_config: GraphConfig)#

class taskgraph.util.verify.GraphVerification(func: Callable, run_on_projects: list | None = None)#

Bases: Verification

Verification for a TaskGraph object.

run_on_projects: list | None = None#

verify(graph: TaskGraph, graph_config: GraphConfig, parameters: Parameters)#

class taskgraph.util.verify.InitialVerification(func: Callable)#

Bases: Verification

Verification that doesn’t depend on any generation state.

verify()#

class taskgraph.util.verify.KindsVerification(func: Callable)#

Bases: Verification

Verification for kinds.

verify(kinds: dict)#

class taskgraph.util.verify.ParametersVerification(func: Callable)#

Bases: Verification

Verification for a set of parameters.

verify(parameters: Parameters)#

class taskgraph.util.verify.Verification(func: Callable)#

Bases: ABC

func: Callable#

abstractmethod verify(**kwargs) → None#

class taskgraph.util.verify.VerificationSequence(_verifications: dict = <factory>)#

Bases: object

Container for a sequence of verifications over a TaskGraph. Each verification is represented as a callable taking (task, taskgraph, scratch_pad), called for each task in the taskgraph, and one more time with no task but with the taskgraph and the same scratch_pad that was passed for each task.
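The calling convention described above (one call per task, then a final call with no task and the same scratch_pad) can be sketched as follows. `run_verification` and `unique_labels` are hypothetical names, and tasks are modelled as plain dicts purely for illustration; the real verifications also receive further arguments such as graph_config and parameters.

```python
def run_verification(check, tasks, taskgraph):
    """Call check once per task, then once more with task=None."""
    scratch_pad = {}  # shared state carried across every call
    for task in tasks:
        check(task, taskgraph, scratch_pad)
    check(None, taskgraph, scratch_pad)  # final whole-graph pass
    return scratch_pad


def unique_labels(task, taskgraph, scratch_pad):
    """Toy check: count labels per task, fail on duplicates at the end."""
    if task is not None:
        scratch_pad[task["label"]] = scratch_pad.get(task["label"], 0) + 1
        return
    dupes = sorted(label for label, n in scratch_pad.items() if n > 1)
    if dupes:
        raise ValueError(f"duplicate labels: {dupes}")


tasks = [{"label": "build"}, {"label": "test"}]
pad = run_verification(unique_labels, tasks, taskgraph=None)
print(pad)  # {'build': 1, 'test': 1}
```

The scratch_pad lets a check accumulate per-task observations and defer its verdict to the final call, which is useful for graph-wide properties such as uniqueness.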

add(name, **kwargs)#

taskgraph.util.verify.verify_always_optimized(task, taskgraph, scratch_pad, graph_config, parameters)#: This function ensures that always-optimized tasks have been optimized.

taskgraph.util.verify.verify_caches_are_volumes(task, taskgraph, scratch_pad, graph_config, parameters)#

Ensures that all cache paths are defined as volumes.

Caches and volumes are the only filesystem locations whose content isn’t defined by the Docker image itself. Some caches are optional depending on the task environment. We want paths that are potentially caches to behave the same regardless of whether a cache is used. To help enforce this, we require all paths used as caches to be declared as Docker volumes. This check won’t catch all offenders, but it is better than nothing.

taskgraph.util.verify.verify_dependency_tiers(task, taskgraph, scratch_pad, graph_config, parameters)#

taskgraph.util.verify.verify_index_route(task, taskgraph, scratch_pad, graph_config, parameters)#: This function ensures that routes do not contain forward slashes.

taskgraph.util.verify.verify_routes_notification_filters(task, taskgraph, scratch_pad, graph_config, parameters)#

This function ensures that only understood filters for notifications are specified.

See: https://docs.taskcluster.net/reference/core/taskcluster-notify/docs/usage

taskgraph.util.verify.verify_run_task_caches(task, taskgraph, scratch_pad, graph_config, parameters)#

Audit for caches requiring run-task.

run-task manages caches in certain ways. If a cache managed by run-task is used by a non run-task task, it could cause problems. So we audit for that and make sure certain cache names are exclusive to run-task.

IF YOU ARE TEMPTED TO MAKE EXCLUSIONS TO THIS POLICY, YOU ARE LIKELY CONTRIBUTING TECHNICAL DEBT AND WILL HAVE TO SOLVE MANY OF THE PROBLEMS THAT RUN-TASK ALREADY SOLVES. THINK LONG AND HARD BEFORE DOING THAT.

taskgraph.util.verify.verify_task_dependencies(task, taskgraph, scratch_pad, graph_config, parameters)#: Ensures that tasks don’t have more than 100 dependencies.

taskgraph.util.verify.verify_task_graph_symbol(task, taskgraph, scratch_pad, graph_config, parameters)#: This function verifies that the tuple (collection.keys(), machine.platform, groupSymbol, symbol) is unique for a target task graph.

taskgraph.util.verify.verify_task_identifiers(task, taskgraph, scratch_pad, graph_config, parameters)#: Ensures that all tasks have well defined identifiers: ^[a-zA-Z0-9_-]{1,38}$
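The pattern quoted above can be tried out directly with Python's `re` module. The `is_valid_identifier` helper is a hypothetical name used only for this sketch; which task fields the real verification inspects is not shown here.

```python
import re

# Pattern copied from the description above.
IDENTIFIER_RE = re.compile(r"^[a-zA-Z0-9_-]{1,38}$")


def is_valid_identifier(value):
    """Return True if value is 1 to 38 characters of letters, digits, _ or -."""
    return IDENTIFIER_RE.match(value) is not None


print(is_valid_identifier("build-linux64_opt"))  # True
print(is_valid_identifier("has space"))          # False
print(is_valid_identifier("a" * 39))             # False
```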

taskgraph.util.verify.verify_toolchain_alias(task, taskgraph, scratch_pad, graph_config, parameters)#: This function verifies that toolchain aliases are not reused.

taskgraph.util.verify.verify_trust_domain_v2_routes(task, taskgraph, scratch_pad, graph_config, parameters)#: This function ensures that any two tasks have distinct index.{trust-domain}.v2 routes.

taskgraph.util.workertypes module#

taskgraph.util.workertypes.get_worker_type(graph_config, alias, level)#: Get the worker type, evaluating aliases from the graph config.

taskgraph.util.workertypes.worker_type_implementation(graph_config, worker_type)#: Get the worker implementation and OS for the given workerType, where the OS represents the host system, not the target OS, in the case of cross-compiles.

taskgraph.util.yaml module#

class taskgraph.util.yaml.UnicodeLoader(stream)#

Bases: CSafeLoader

construct_yaml_str(node)#

yaml_constructors = {'tag:yaml.org,2002:binary': <function SafeConstructor.construct_yaml_binary>, 'tag:yaml.org,2002:bool': <function SafeConstructor.construct_yaml_bool>, 'tag:yaml.org,2002:float': <function SafeConstructor.construct_yaml_float>, 'tag:yaml.org,2002:int': <function SafeConstructor.construct_yaml_int>, 'tag:yaml.org,2002:map': <function SafeConstructor.construct_yaml_map>, 'tag:yaml.org,2002:null': <function SafeConstructor.construct_yaml_null>, 'tag:yaml.org,2002:omap': <function SafeConstructor.construct_yaml_omap>, 'tag:yaml.org,2002:pairs': <function SafeConstructor.construct_yaml_pairs>, 'tag:yaml.org,2002:seq': <function SafeConstructor.construct_yaml_seq>, 'tag:yaml.org,2002:set': <function SafeConstructor.construct_yaml_set>, 'tag:yaml.org,2002:str': <function UnicodeLoader.construct_yaml_str>, 'tag:yaml.org,2002:timestamp': <function SafeConstructor.construct_yaml_timestamp>, None: <function SafeConstructor.construct_undefined>}#

taskgraph.util.yaml.load_stream(stream)#: Parse the first YAML document in a stream and produce the corresponding Python object.

taskgraph.util.yaml.load_yaml(*parts)#: Convenience function to load a YAML file in the given path. This is useful for loading kind configuration files from the kind path.

Module contents#