In this article we will pick up where we left off in the previous one, in which we deployed our Tekton Pipelines environment. We will explore in detail how to find, build and customize Tekton Tasks to create all the necessary building blocks for our pipelines. On top of that, we will also look at how to maintain and test our newly built Tasks, while applying best practices for creating reusable, testable, well-structured and simple Tasks.
If you haven't done so yet, go check out the previous article to get your Tekton development environment up and running, so you can follow along with the examples in this one.
Note: All the code and resources used in this article are available in the tekton-kickstarter repository.
What Are Tasks?
Tekton Tasks are the basic building blocks of Pipelines. A Task is a sequence of steps that performs some particular, well... task. Each step in a Task is a container inside the Task's Pod. Isolating a sequence of related steps into a single reusable Task gives Tekton a lot of versatility and flexibility. Tasks can be as simple as running a single echo command, or as complex as a Docker build, followed by a push to a registry, finished by outputting the image digest.
Apart from Tasks, ClusterTasks are also available. They're not much different from basic Tasks, as they're just cluster-scoped Tasks. These are useful for general-purpose Tasks that perform basic operations, such as cloning a repository or running kubectl commands. Using ClusterTasks helps avoid code duplication and helps with reusability. Be mindful when modifying ClusterTasks though, as any change to them might impact many other pipelines in all the other namespaces in your cluster.
When we want to execute a Task or ClusterTask, we create a TaskRun. In programming terms you could think of a Task as a class and a TaskRun as its instance.
If the above explanation isn't clear enough, then a little example might help. Here's the simplest possible Task we can create:
```yaml
- name: hello
  script: echo 'Hello world!'
```
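For reference, the snippet above is just the step definition; wrapped in the usual apiVersion/kind/metadata boilerplate, the complete manifest would look roughly like this (a sketch - the ubuntu image comes from the description that follows):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: echo
spec:
  steps:
    - name: hello
      image: ubuntu
      script: echo 'Hello world!'
```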
This simple Task called echo really does just that - it runs an ubuntu container and injects a script into it that executes echo 'Hello world!'. Now that we have a Task, we can also run it - or in other words, create a TaskRun. We can create a YAML file for that and apply it, or we can use the tkn CLI:
```shell
~ $ tkn task start --filename echo.yaml
TaskRun started: echo-run-jqn9l

In order to track the TaskRun progress run:
tkn taskrun logs echo-run-jqn9l -f -n default

~ $ tkn taskrun logs echo-run-jqn9l -f -n default
[hello] + echo Hello world!
[hello] Hello world!
```
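For completeness, the equivalent TaskRun YAML that you would kubectl apply instead of using the CLI could be sketched like this (generateName lets Kubernetes add a random suffix like the one seen above):

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: echo-run-
spec:
  taskRef:
    name: echo
```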
And it's as simple as that! We've run our first Task, so let's now move on to something a bit more useful and explore what Tasks are already out there...
Don't Reinvent the Wheel
This article is about creating and customizing Tekton Tasks, but let's not try to reinvent the wheel here. Instead, let's use what the Tekton community has already created. The main source of existing, ready-to-use Tasks is the Tekton Catalog. It's a repository of reliable, curated Tasks reviewed by Tekton maintainers. In addition to the Tekton Catalog repository, you can also use the Tekton Hub, which lists all the same Tasks as the catalog, but in a view that's a bit easier to navigate. It also shows a rating for each Task, which can be a helpful indicator of quality.
In this catalog, you should be able to find all the basic stuff, like Tasks for fetching a repository (git-clone), building and pushing Docker images (buildah), or sending Slack notifications (send-to-webhook-slack). So, before you decide to build custom Tasks, try checking the catalog for existing solutions to common problems.
If you browsed through the Tekton Catalog or Tekton Hub, you probably noticed that the installation of each Task is just a single kubectl apply -f .... That's easy enough, but if you rely on many of these Tasks and want to track them in version control without copy-pasting all of their YAMLs, then you can use a convenient script in the tekton-kickstarter repository, which takes a list of remote YAML URLs, such as:
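As an illustration, such a catalog.yaml could contain entries like the following (the URLs follow the Tekton Catalog raw-file layout; the version directories are illustrative):

```yaml
# Hypothetical catalog.yaml - one raw URL per Task to install
- https://raw.githubusercontent.com/tektoncd/catalog/main/task/git-clone/0.4/git-clone.yaml
- https://raw.githubusercontent.com/tektoncd/catalog/main/task/buildah/0.2/buildah.yaml
- https://raw.githubusercontent.com/tektoncd/catalog/main/task/send-to-webhook-slack/0.1/send-to-webhook-slack.yaml
```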
An invocation of make catalog then applies them to your cluster.
Before we start making custom Tasks, it's a good idea to decide on a layout that makes them easy to navigate, test and deploy. We can take a bit of inspiration from the Tekton Catalog repository and use the following directory structure:
```
├── tasks - Custom or remotely retrieved Tasks and ClusterTasks
│   ├── catalog.yaml - List of Tasks retrieved from remote registries (e.g. Tekton Catalog)
│   └── task-name - Some custom Task
│       ├── task-name.yaml - File containing the Task or ClusterTask
│       └── tests - Directory with files for testing
│           ├── resources.yaml - Resources required for testing, e.g. PVC, Deployment
│           └── run.yaml - TaskRun(s) that perform the test
```
We store all the Tasks in a single directory called tasks. In there we create one directory per Task, which contains one YAML file with the Task itself and one more directory (tests) with the resources required for testing. Those would be the TaskRun(s) in run.yaml and any additional resources needed to perform the test in resources.yaml. Those could be - for example - a PVC for a Task performing a DB backup, or a Deployment for a Task that performs application scaling.
One more file shown in the structure above, which we already mentioned in the previous section, is catalog.yaml, which holds the list of Tasks to be installed from remote sources.
For convenience (if using tekton-kickstarter), all of these can be installed with a single command - make deploy-tasks - which traverses the tasks directory and applies all the Tasks to your cluster, while omitting all the testing resources.
Building Custom Ones
If you can't find an appropriate Task for the job in the catalog, then it's time to write your own. At the beginning of the article I showed a very simple "Hello world." example, but Tekton Tasks can get much more complex, so let's go over all the configuration options and features that we can leverage.
Let's start simple and introduce the basics that we will need in pretty much any Task that we build - Task parameters and scripts for Task steps:
```yaml
params:
  - name: name
    description: Deployment name.
  - name: namespace
steps:
  - name: rollout
    script: |
      kubectl rollout restart deployment/$(params.name) -n $(params.namespace)
```
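Assembled into a full manifest, the deploy Task could be sketched like this (the kubectl image and the namespace default are assumptions for illustration):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: deploy
spec:
  params:
    - name: name
      description: Deployment name.
    - name: namespace
      description: Namespace the Deployment resides in.
      default: default  # assumption: makes the parameter optional
  steps:
    - name: rollout
      image: bitnami/kubectl  # assumption: any image that ships kubectl works
      script: |
        #!/usr/bin/env bash
        set -xe
        kubectl rollout restart deployment/$(params.name) -n $(params.namespace)
```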
With the above deploy Task we can perform a simple rollout of a Kubernetes Deployment, by supplying it with the name of the Deployment in the name parameter and, optionally, the namespace in which it resides. The parameters that we pass to the Task are then used in the script section, where they're expanded before the script is executed. To tell Tekton to expand a parameter we use the $(params.name) notation. This can also be used in other parts of the spec, not just in the script.
Now, let's take a closer look at the script section - we begin it with a shebang to make sure that we will be using bash. That doesn't mean you always have to use bash though; you could just as well use, for example, Python with #!/usr/bin/env python - it all depends on your preference and on what's available in the image being used. After the shebang we also use set -xe, where -x tells the script to echo each command being executed and -e makes it exit on the first error - you don't have to do this, but it can be very helpful during debugging.
Alternatively, if you don't need a whole script but just a single command, then you can replace the script section with command. This is how it would look for a simple Task that performs an application health check using kubectl wait (note: obvious/non-relevant parts of the Task body are omitted for the remaining examples):
```yaml
- name: wait
```
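A step using command instead of script could be sketched like this (the image, parameters and timeout are assumptions; kubectl wait --for=condition=available is a standard kubectl invocation):

```yaml
steps:
  - name: wait
    image: bitnami/kubectl  # assumption
    command:
      - kubectl
    args:
      - wait
      - --for=condition=available
      - deployment/$(params.name)
      - --namespace
      - $(params.namespace)
      - --timeout=120s  # hypothetical timeout
```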
This works the same way as the command directive in Pods, and so it can get verbose and hard to read if you have many arguments, as in the example above. For this reason I prefer to use script for almost everything, as it's more readable and easier to update or change.
Another common thing that you might need in your Tasks is some kind of storage where you can write data to be used by subsequent steps in the Task, or by other Tasks in the pipeline. The most common use case for this would be a place to fetch a git repo into. This kind of storage is called a workspace in Tekton, and the following example shows a Task that mounts and clears such storage using rm -rf:
```yaml
params:
  - name: path
    description: Path to directory being deleted.
workspaces:
  - name: source
steps:
  - image: ubuntu
    script: |
      rm -rf "$(workspaces.source.path)/$(params.path)"
```
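Put together into a full manifest, this clean Task could look roughly like this (a sketch based on the fragment above; the explicit mountPath is optional):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: clean
spec:
  params:
    - name: path
      description: Path to directory being deleted.
  workspaces:
    - name: source
      mountPath: /workspace/source  # optional; defaults to /workspace/<workspace-name>
  steps:
    - name: clean
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -xe
        rm -rf "$(workspaces.source.path)/$(params.path)"
```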
The clean Task includes a workspaces section that defines the name of the workspace and the path where it should be mounted. To make it easier to update the mountPath, Tekton provides a variable in the format $(workspaces.ws-name.path), which can be used in scripts to reference the path.
So, defining the workspace is pretty simple, but the disk backing the workspace won't appear out of thin air. Therefore, when we execute a Task that asks for a workspace, we also need to create a PVC for it. A TaskRun that creates the needed PVC for the above Task would look like this:
```yaml
params:
  - name: path
taskRef:
  name: clean  # Reference to the Task
  kind: ClusterTask  # In case of a ClusterTask we need to explicitly specify `kind`
workspaces:
  - name: source
    volumeClaimTemplate:  # A PVC created this way will be cleaned up after the TaskRun/PipelineRun completes!
```
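A complete version of such a TaskRun could be sketched as follows (the path value and the storage size are hypothetical):

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: clean-run-
spec:
  params:
    - name: path
      value: some-directory  # hypothetical value
  taskRef:
    name: clean
    kind: ClusterTask  # only needed if `clean` is a ClusterTask
  workspaces:
    - name: source
      volumeClaimTemplate:  # PVC created this way is cleaned up after the run completes
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi  # hypothetical size
```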
Workspaces are very generic and can therefore be used not just to store ephemeral data during a Task/Pipeline run, but also as long-term storage, like in the following example:
```yaml
params:
  - name: HOST
  - name: DATABASE
  - name: DEST
workspaces:
  - name: backup
steps:
  - name: pg-dump
    env:
      - name: USERNAME
      - name: PASSWORD
    script: |
      PGPASSWORD="$PASSWORD" pg_dump -h $(params.HOST) -Fc -U $USERNAME $(params.DATABASE) > $(workspaces.backup.path)/$(params.DEST)
```
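As a full manifest, this backup Task could be sketched like this (the Task name, the postgres image and the Secret name/keys are assumptions):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: database-backup  # hypothetical name
spec:
  params:
    - name: HOST
    - name: DATABASE
    - name: DEST
      description: Name of the dump file.
  workspaces:
    - name: backup
  steps:
    - name: pg-dump
      image: postgres  # assumption: any image that ships pg_dump
      env:
        - name: USERNAME
          valueFrom:
            secretKeyRef:
              name: postgres-secret  # hypothetical Secret
              key: username
        - name: PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret  # hypothetical Secret
              key: password
      script: |
        #!/usr/bin/env bash
        set -xe
        PGPASSWORD="$PASSWORD" pg_dump -h $(params.HOST) -Fc -U $USERNAME $(params.DATABASE) \
          > $(workspaces.backup.path)/$(params.DEST)
```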
This Task performs a PostgreSQL database backup using the pg_dump utility. In this example, the script grabs the database data from the host and DB specified in the parameters and streams it into the workspace, which is backed by a PVC.
Another feature that I sneaked into this example, and that you will encounter quite often, is the ability to inject environment variables from ConfigMaps or Secrets. This is done in the env section. It works exactly the same way as with Pods, so you can refer to that part of the API for this.
Back to the main topic of this example - the PVC for the database backup. Considering that we want this data to be persistent, we cannot use a PVC created with volumeClaimTemplate as shown previously, as that would get cleaned up after the Task finishes. Instead, we need to create the PVC separately and pass it to the Task this way:
```yaml
params:
  - name: HOST
  - name: DATABASE
  - name: DEST
workspaces:
  - name: backup
```
Here we use persistentVolumeClaim instead of volumeClaimTemplate and specify the name of an existing PVC, which is also defined in the above snippet. This example also assumes that there's a PostgreSQL database running at the specified host - for the full code, including the PostgreSQL deployment, check out the files here.
Similarly to the injection of environment variables, we can also use workspaces to inject whole ConfigMaps or Secrets (or some keys in them) as files. This can be useful, for example, when you want a whole .pem certificate from a Secret in the Task, or as a config file that maps a GitHub repository to an application name, as in the following example:
```yaml
params:
  - name: repository-url
  - name: mapping-file
steps:
  - name: get-application-name
    script: |
      yq e '."$(params.repository-url)"' /config/$(params.mapping-file) | tr -d '\012\015' > /tekton/results/application-name
results:
  - name: application-name  # Can be accessed by other Tasks with $(tasks.get-application-name.results.application-name)
workspaces:
  - name: config
```
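Assembled into a complete manifest, the Task could be sketched like this (the yq image and the mountPath are assumptions for illustration):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: get-application-name
spec:
  params:
    - name: repository-url
    - name: mapping-file
  workspaces:
    - name: config
      mountPath: /config  # matches the path used in the script
  results:
    - name: application-name  # Accessible to other Tasks as $(tasks.get-application-name.results.application-name)
  steps:
    - name: get-application-name
      image: mikefarah/yq  # assumption: any image that ships yq v4
      script: |
        yq e '."$(params.repository-url)"' /config/$(params.mapping-file) \
          | tr -d '\012\015' > /tekton/results/application-name
```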
This Task takes a repository URL and uses it to look up the matching application name in a file that is part of the mounted ConfigMap. This is done using the yq utility, followed by output to a file in the special directory /tekton/results/.... This directory stores the results of Tasks, which is a feature (and YAML section) we haven't mentioned yet.
Task results are small pieces of data that a Task can output, which can then be used by subsequent Tasks. To use them, one has to specify the name of the result variable in the results section and then write something to /tekton/results/result-var-name. Also, as you surely noticed in the above script, we strip the newline from the result using tr before writing it to the file. That's because a result should be simple output - ideally just a single word - and not a big chunk of text. If you decide to write something longer (with multiple lines) to a result, you might end up losing part of the value, or see only an empty string if you don't strip the newline character.
To then use a Task - such as this one - that takes a workspace from a ConfigMap, we have to specify the ConfigMap in the workspaces section in the following way:
```yaml
# Full example at https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/get-application-name/tests/run.yaml
workspaces:
  - name: config
```
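Spelled out, the workspace binding in the TaskRun could look like this (the ConfigMap name is hypothetical):

```yaml
workspaces:
  - name: config
    configMap:
      name: repository-mapping  # hypothetical ConfigMap holding the mapping file
```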
The final thing I want to show here is the usage of sidecar containers. These are not so common, but can be useful when you need to run some service (which your Task depends on) for the duration of the Task execution. One such service is a Docker daemon sidecar with an exposed socket. To demonstrate this, we can create a Task that performs Docker image efficiency analysis using a tool called Dive:
```yaml
params:
  - name: IMAGE
    description: Name (reference) of the image to analyze.
steps:
  - image: 'wagoodman/dive:latest'
    env:
      - name: CI
    volumeMounts:
      - mountPath: /var/run/
sidecars:
  - image: 'docker:18.05-dind'
    volumeMounts:
      - mountPath: /var/lib/docker
      - mountPath: /var/run/
volumes:
  - name: dind-storage
  - name: dind-socket
```
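Reassembled into a full manifest, the Task could be sketched like this (the args passed to Dive, the volume types and the privileged securityContext are assumptions):

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: dive
spec:
  params:
    - name: IMAGE
      description: Name (reference) of the image to analyze.
  steps:
    - name: dive
      image: 'wagoodman/dive:latest'
      args:
        - $(params.IMAGE)  # assumption: Dive takes the image reference as its argument
      env:
        - name: CI
          value: 'true'  # run Dive in non-interactive mode
      volumeMounts:
        - name: dind-socket
          mountPath: /var/run/
  sidecars:
    - name: server
      image: 'docker:18.05-dind'
      securityContext:
        privileged: true  # assumption: dind needs privileged mode
      volumeMounts:
        - name: dind-storage
          mountPath: /var/lib/docker
        - name: dind-socket
          mountPath: /var/run/
  volumes:
    - name: dind-storage
      emptyDir: {}
    - name: dind-socket
      emptyDir: {}
```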
As you can see here, the sidecars section is pretty much the same as the definition of any container in a Pod spec. In addition to the usual stuff we've seen in the previous examples, here we also specify volumes, which are shared between the sidecar and the containers in the Task's steps - in this case, one of them being the Docker socket in dind-socket, which the Dive container attaches to.
This covers most of the features of Tekton Tasks, but one thing I haven't mentioned so far, and that you will surely run into when reading the Tekton docs, is the PipelineResource object, which can be used as a Task's input or output - for example GitHub sources as input or a Docker image as output. So, why haven't I mentioned it yet? Well, PipelineResource is a part of Tekton that I prefer not to use, for a couple of reasons:
- It's still in alpha, unlike all the other resource types we used so far.
- There are very few PipelineResources - it's mostly just the Git, Pull Request and Image resources.
- They're hard to troubleshoot.
If you need more reasons not to use them (for now), then take a look at the docs section here.
In all the examples we went over, we've seen a lot of YAML sections, options and features, which shows how flexible Tekton is. This, however, makes its API spec - naturally - very complex. kubectl explain unfortunately doesn't help with exploring the API, and the API spec available in the docs is lacking at best. So, in case you have trouble finding out what you can put in which part of the YAML, your best bet is to rely on the fields listed at the beginning of the Tasks doc, or on the examples here - but make sure you're on the right branch for your version of Tekton, otherwise you might spend long hours debugging why your seemingly correct Task cannot be validated by the Tekton controller.
Running and Testing
So far, we mostly talked about Tasks, and not much was said about TaskRuns. That's because - in my opinion - individual TaskRuns are best suited for testing, and not really for running Tasks regularly. For that you should use Pipelines, which will be the topic of the next article.
Speaking of testing - when we're done implementing a Task, it's time to run some tests. For straightforward and easy testing, I recommend using the layout mentioned earlier in the article. Using it should help you encapsulate the Task in a way that allows you to test it independently of any resources outside of its directory.
To then perform the actual test, it's enough to apply the resources/dependencies in .../tests/resources.yaml (if any) and then apply the actual test(s) in .../tests/run.yaml. The tests really are just a set of TaskRuns that use your custom Task, so for this basic testing approach there's no need for any setup/teardown or additional scripts - just kubectl apply -f resources.yaml and kubectl apply -f run.yaml. Examples of simple tests can be found in each Task's directory in tekton-kickstarter, or in the Tekton Catalog repository.
In general though, for any particular Task in the repositories of both of these projects, you can run the following commands to perform the test:
```shell
~ $ kubectl apply -f tests/resources.yaml  # Deploy dependencies
~ $ kubectl apply -f tests/run.yaml        # Start tests
~ $ kubectl get tr                         # List the TaskRuns (tests)
NAME           SUCCEEDED   REASON      STARTTIME   COMPLETIONTIME
some-taskrun   True        Succeeded   106s        4s
~ $ tkn tr logs some-taskrun               # View logs from the TaskRun
```
For me personally - when it comes to testing Tasks - it's sufficient to use the above basic approach for validation and ad-hoc testing. If you however end up creating a large number of custom Tasks and want to go all out, then you can adopt the approach of the Tekton Catalog and leverage the testing scripts in its repository. If you decide to go that route and strictly follow its layout and testing conventions, you might also want to try contributing your Tasks to the Tekton Catalog, so that the whole community can benefit from more high-quality Tasks. 😉
As for the scripts you'll need for this - you'll have to include the test directory, as well as the Go dependencies (vendor directory), from the Tekton Catalog in your code and then follow the E2E testing guide in the docs here.
Regardless of whether you choose the basic or the "all-out" approach to testing, try to make sure that you test more than just the happy paths in your Tasks and Pipelines, otherwise you might end up seeing a lot of bugs when they get deployed "in the wild".
After you have implemented and tested your custom Tasks, it's a good idea to go back and make sure your Tasks follow the best development practices, which will make them more reusable and maintainable in the long run.
The simplest thing you can do - and the one that gives the most benefit - is to use yamllint, a linter for YAML files. This tip doesn't apply only to Tekton Tasks, but to all YAML files, as they can be tricky to get right with all the indentation. It's especially important for Task definitions though, which can get very long and complex, with many levels of indentation, so keeping them readable and validated can save you some unnecessary debugging, as well as help keep them maintainable. You can find the custom .yamllint config which I like to use in my repository, but you should customize it to suit your code style and formatting. Just make sure you run yamllint from time to time (ideally in CI/CD) to keep things linted and validated.
As for the actual Tekton best practices - I could give you a huge list, but it would mostly be just the things recommended by the Tekton maintainers, so instead of copy-pasting them here, I will just point you to the relevant resources:
In this article we saw how flexible and versatile Tekton is, and that with this flexibility also comes complexity in building and testing Tasks. Therefore, it's very much preferable to use existing Tasks created by the community, rather than trying to reinvent the wheel yourself. If there's no suitable Task available, however, and you have to build your own, make sure you write tests for your Tasks and follow the best practices mentioned above, to keep your Tasks maintainable and reliable.
After this article we should have enough experience, as well as a bunch of individual custom Tasks, which we can use to start composing our pipelines. And that's exactly what we're going to do in the next article in this series, where we will explore how to build fully-featured pipelines to build, deploy and test your applications, and much more.
Also, if you haven't done so yet, make sure you check out the tekton-kickstarter repository, where you can find all the Tasks and examples from this article, as well as the pipelines you will see in the next one. 😉