In this article we pick up where we left off in the previous one, in which we deployed our Tekton Pipelines environment, and explore in detail how to find, build and customize Tekton Tasks to create all the necessary building blocks for our pipelines. On top of that, we will look at how to maintain and test our newly built Tasks, while following best practices for creating reusable, testable, well-structured and simple Tasks.
If you haven't done so yet, go check out the previous article to get your Tekton development environment up and running, so you can follow along with the examples in this one.
Note: All the code and resources used in this article are available in the tekton-kickstarter repository.
What Are Tasks?
Tekton Tasks are the basic building blocks of Pipelines. A Task is a sequence of steps that performs some particular, well... task. Each step in a Task is a container inside the Task's Pod. Isolating a sequence of related steps into a single reusable Task gives Tekton a lot of versatility and flexibility. Tasks can be as simple as running a single echo command or as complex as a Docker build followed by a push to a registry and finished by outputting the image digest.
Apart from Tasks, ClusterTasks are also available. They're not much different from basic Tasks, as they're just cluster-scoped Tasks. These are useful for general-purpose Tasks that perform basic operations such as cloning a repository or running kubectl commands. Using ClusterTasks helps avoid duplication of code and helps with reusability. Be mindful of modifications to ClusterTasks though, as any change to them might impact many other pipelines in all the other namespaces in your cluster.
When we want to execute a Task or ClusterTask, we create a TaskRun. In programming terms, you could think of a Task as a class and a TaskRun as its instance.
If the above explanation isn't clear enough, then a small example might help. Here's the simplest possible Task we can create:
# echo.yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: echo
spec:
  steps:
    - name: hello
      image: ubuntu
      script: echo 'Hello world!'
This simple Task called echo really does just that: it runs an ubuntu container and injects a script into it that executes echo 'Hello world!'. Now that we have a Task, we can also run it, or in other words create a TaskRun. We can create a YAML file for that and apply it, or we can use the tkn CLI:
~ $ tkn task start --filename echo.yaml
TaskRun started: echo-run-jqn9l
In order to track the TaskRun progress run:
tkn taskrun logs echo-run-jqn9l -f -n default
~ $ tkn taskrun logs echo-run-jqn9l -f -n default
[hello] + echo Hello world!
[hello] Hello world!
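The YAML route mentioned above could look roughly like the following. This is only a minimal sketch: it assumes the echo Task has already been applied to the cluster with kubectl apply -f echo.yaml, and the echo-run- prefix is just an illustrative choice.

```yaml
# echo-run.yaml (hypothetical file name)
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  # generateName lets Kubernetes append a random suffix, so the file can be submitted repeatedly
  generateName: echo-run-
spec:
  taskRef:
    name: echo  # refers to the Task defined in echo.yaml
```

Because the manifest uses generateName instead of a fixed name, it has to be submitted with kubectl create -f echo-run.yaml rather than kubectl apply.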
It's as simple as that! We've run our first Task, so let's now move on to something a bit more useful and explore what Tasks are already out there...
Don't Reinvent the Wheel
This article is about creating and customizing Tekton Tasks, but let's not try to reinvent the wheel here. Instead, let's use what the Tekton community has already created. The main source of existing, ready-to-use Tasks is the Tekton Catalog. It's a repository of reliable, curated Tasks reviewed by Tekton maintainers. In addition to the Tekton Catalog repository, you can also use Tekton Hub, which lists all the same Tasks as the catalog in a view that's a bit easier to navigate. It also shows a rating for each Task, which might be a helpful indicator of quality.
In this catalog, you should be able to find all the basics, like Tasks for fetching a repository (git-clone), building and pushing Docker images (kaniko or buildah) or sending Slack notifications (send-to-webhook-slack). So, before you decide to build custom Tasks, check the catalog for existing solutions to common problems.
If you browsed through the Tekton Catalog or Tekton Hub, you probably noticed that installing each Task is just a single kubectl apply -f ... command. That's easy enough, but if you rely on many of these Tasks and want to track them in version control without copy-pasting all of their YAMLs, then you can use a convenient script in the tekton-kickstarter repository, which takes a list of remote YAML URLs, such as:
# catalog.yaml
git-clone: 'https://raw.githubusercontent.com/tektoncd/catalog/master/task/git-clone/0.2/git-clone.yaml'
send-to-webhook-slack: 'https://raw.githubusercontent.com/.../send-to-webhook-slack/0.1/send-to-webhook-slack.yaml'
skopeo-copy: 'https://raw.githubusercontent.com/tektoncd/catalog/master/task/skopeo-copy/0.1/skopeo-copy.yaml'
buildid: 'https://raw.githubusercontent.com/tektoncd/catalog/master/task/generate-build-id/0.1/generate-build-id.yaml'
Invoking make catalog then applies them to your cluster.
Layout
Before we start making custom Tasks, it's a good idea to decide on a layout that makes them easy to navigate, test and deploy. We can take a bit of inspiration from the Tekton Catalog repository and use the following directory structure:
...
├── tasks                        - Custom or remotely retrieved Tasks and ClusterTasks
│   ├── catalog.yaml             - List of Tasks retrieved from remote registries (e.g. Tekton catalog)
│   └── task-name                - Some custom Task
│       ├── task-name.yaml       - File containing Task or ClusterTask
│       └── tests                - Directory with files for testing
│           ├── resources.yaml   - Resources required for testing, e.g. PVC, Deployment
│           └── run.yaml         - TaskRun(s) that performs the test
We store all the Tasks in a single directory called tasks. In there we create one directory for each Task, which contains one YAML file with the Task itself and one more directory (tests) with the resources required for testing. Those are the TaskRun(s) in run.yaml and any additional resources needed to perform the test in resources.yaml. That could be, for example, a PVC for a Task performing a DB backup or a Deployment for a Task that performs application scaling.
One more file shown in the structure above, which we already mentioned in the previous section, is catalog.yaml, which holds the list of Tasks that are to be installed from remote sources.
For convenience (if using tekton-kickstarter), all of these can be installed with one command, make deploy-tasks, which traverses the tasks directory and applies all the Tasks to your cluster while omitting all the testing resources.
Building Custom Ones
If you can't find an appropriate Task for the job in the catalog, then it's time to write your own. At the beginning of the article I showed a very simple "Hello world" example, but Tekton Tasks can get much more complex, so let's go over all the configuration options and features that we can leverage.
Let's start simple and introduce the basics that we will need in pretty much any Task that we build. Those are Task parameters and scripts for Task steps:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/deploy/deploy.yaml
apiVersion: tekton.dev/v1beta1
kind: ClusterTask
metadata:
  name: deploy
spec:
  params:
    - name: name
      description: Deployment name.
    - name: namespace
      default: default
  steps:
    - name: rollout
      image: 'bitnami/kubectl:1.20.2'
      script: |
        #!/usr/bin/env bash
        set -xe
        kubectl rollout restart deployment/$(params.name) -n $(params.namespace)
With the above deploy Task we can perform a simple rollout of a Kubernetes Deployment by supplying it with the name of the Deployment in the name parameter and, optionally, the namespace in which it resides. The parameters that we pass to the Task are then used in the script section, where they're expanded before the script is executed. To tell Tekton to expand a parameter we use the $(params.name) notation. This can also be used in other parts of the spec, not just in the script.
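To illustrate parameter expansion outside of script, here is a small sketch based on the deploy Task. The Task name and the kubectl-image parameter are hypothetical and are not part of the tekton-kickstarter repository; the sketch only shows that the same notation can be used in the image and env.value fields of a step.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: deploy-sketch  # hypothetical name, used only for illustration
spec:
  params:
    - name: name
      type: string
      description: Deployment name.
    - name: namespace
      type: string
      default: default
    - name: kubectl-image  # hypothetical parameter, lets callers pin the kubectl version
      type: string
      default: 'bitnami/kubectl:1.20.2'
  steps:
    - name: rollout
      image: $(params.kubectl-image)  # expanded in the image field
      env:
        - name: TARGET_NAMESPACE
          value: $(params.namespace)  # expanded in an environment variable value
      script: |
        #!/usr/bin/env bash
        set -xe
        kubectl rollout restart "deployment/$(params.name)" -n "$TARGET_NAMESPACE"
```

Passing values through environment variables like this also keeps the script body free of substituted text, which can make quoting easier to reason about.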
Now, let's take a closer look at the script section. We begin it with a shebang to make sure that we will be using bash. However, that doesn't mean that you always have to use bash; in the same way you could use, for example, Python with #!/usr/bin/env python. It all depends on your preference and on what's available in the image being used. After the shebang we also use set -xe, which tells the script to echo each command being executed (-x) and to exit as soon as a command fails (-e). You don't have to do this, but it can be very helpful during debugging.
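As a sketch of the non-bash option, the following Task runs its step with a Python interpreter. The Task name, the greeting parameter and the python:3.9-slim image are assumptions made for this example; any image that ships a python3 binary should behave the same way.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: hello-python  # hypothetical Task name
spec:
  params:
    - name: greeting  # hypothetical parameter
      type: string
      default: 'Hello world!'
  steps:
    - name: hello
      image: 'python:3.9-slim'  # assumption: any image containing python3 works
      script: |
        #!/usr/bin/env python3
        # Tekton substitutes the parameter into the text before the interpreter starts
        greeting = "$(params.greeting)"
        print(greeting)
```

Since the substitution is purely textual, a parameter value containing double quotes would break this particular script, so more defensive Tasks tend to pass such values through env instead.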
Alternatively, if you don't need a whole script, but just a single command, then you can replace the script section with command. This is how it would look for a simple Task that performs an application health check using kubectl wait (note: the obvious/non-relevant parts of the Task body are omitted in the remaining examples):
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/healthcheck/healthcheck.yaml
...
  steps:
    - name: wait
      image: 'bitnami/kubectl:1.20.2'
      command:
        - kubectl
        - wait
        - --for=condition=available
        - --timeout=600s
        - deployment/$(params.name)
        - -n
        - $(params.namespace)
This works the same way as the command directive in Pods, so it can get verbose and hard to read if you have many arguments, as in the example above. For this reason I prefer to use script for almost everything, as it's more readable and easier to update.
Another common thing that you might need in your Tasks is some kind of storage where you can write data that can be used by subsequent steps in the Task or by other Tasks in the pipeline. The most common use case for this is a place to fetch a Git repository into. This kind of storage is called a workspace in Tekton, and the following example shows a Task that mounts the storage and clears it in its rmdir step:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/clean/clean.yaml
...
spec:
  params:
    - name: path
      description: Path to directory being deleted.
      default: '.'
  workspaces:
    - name: source
      mountPath: /workspace
  steps:
    - image: ubuntu
      name: rmdir
      script: |
        #!/usr/bin/env bash
        set -xe
        rm -rf "$(workspaces.source.path)/$(params.path)"
The above clean Task includes a workspaces section that defines the name of the workspace and the path where it should be mounted. So that scripts don't break when the mountPath changes, Tekton provides a variable in the format $(workspaces.ws-name.path), which can be used in scripts to reference the path.
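The path variable is what makes mountPath optional in practice. The sketch below uses a hypothetical list-sources Task that omits mountPath entirely, in which case Tekton mounts the workspace under /workspace/ followed by the workspace name, and marks the workspace as read-only because the step only reads from it.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: list-sources  # hypothetical Task name
spec:
  workspaces:
    - name: source
      description: Directory holding the checked-out sources.
      readOnly: true  # the step below only reads from the workspace
      # no mountPath is set, so the workspace is mounted at /workspace/source
  steps:
    - name: list
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -xe
        # the variable resolves to wherever the workspace ended up being mounted
        ls -la "$(workspaces.source.path)"
```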
So, defining the workspace is pretty simple, but the disk backing the workspace won't appear out of thin air. Therefore, when we execute a Task that asks for a workspace, we also need to create a PVC for it. The TaskRun that also creates the needed PVC for the above Task would look like so:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/clean/tests/run.yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: clean
  namespace: default
spec:
  params:
    - name: path
      value: 'somedir'
  taskRef:
    name: clean  # Reference to the task
    kind: ClusterTask  # In case of ClusterTask we need to explicitly specify `kind`
  workspaces:
    - name: source
      volumeClaimTemplate:  # PVC created this way will be cleaned-up after TaskRun/PipelineRun completes!
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
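When the data only needs to live for the duration of a single TaskRun, a PVC isn't strictly required. As a sketch, the same clean Task could be run with an emptyDir-backed workspace; the clean-emptydir- prefix is a made-up name for this example.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  generateName: clean-emptydir-  # hypothetical name prefix; submit with kubectl create -f
  namespace: default
spec:
  params:
    - name: path
      value: 'somedir'
  taskRef:
    name: clean
    kind: ClusterTask
  workspaces:
    - name: source
      emptyDir: {}  # scratch space that disappears together with the TaskRun's Pod
```

An emptyDir lives and dies with one Pod, so this works for sharing files between the steps of one Task but not for passing data between Tasks in a pipeline.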
Workspaces are very generic and can therefore be used not just to store ephemeral data during a Task/Pipeline run, but also as long-term storage, as in the following example:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/pg-dump/pg-dump.yaml
apiVersion: tekton.dev/v1beta1
kind: ClusterTask
metadata:
  name: pg-dump
spec:
  params:
    - name: HOST
      type: string
    - name: DATABASE
      type: string
    - name: DEST
      type: string
  workspaces:
    - name: backup
      mountPath: /backup
  steps:
    - name: pg-dump
      image: 'postgres:13.2-alpine'
      env:
        - name: USERNAME
          valueFrom:
            secretKeyRef:
              name: postgres-config
              key: POSTGRES_USER
        - name: PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-config
              key: POSTGRES_PASSWORD
      script: |
        #!/usr/bin/env sh
        set -xe
        PGPASSWORD="$PASSWORD" pg_dump -h $(params.HOST) -Fc -U $USERNAME $(params.DATABASE) > $(workspaces.backup.path)/$(params.DEST)
This Task can perform a PostgreSQL database backup using the pg_dump utility. In this example, the script grabs the data from the host and database specified in the parameters and streams it into the workspace, which is backed by a PVC.
Another feature that I sneaked into this example, and that you will encounter quite often, is the ability to inject environment variables from ConfigMaps or Secrets. This is done in the env section. It works exactly the same way as with Pods, so you can refer to that part of the Kubernetes API for details.
Back to the main topic of this example, the PVC for the database backup. Considering that we want this data to be persistent, we cannot use a PVC created using volumeClaimTemplate as shown previously, as that PVC is tied to the TaskRun and gets cleaned up with it. Instead we need to create the PVC separately and pass it to the Task this way:
# https://github.com/MartinHeinz/tekton-kickstarter/tree/master/tasks/pg-dump/tests
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-backup
spec:
  resources:
    requests:
      storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
---
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: pg-dump
spec:
  taskRef:
    name: pg-dump
    kind: ClusterTask
  params:
    - name: HOST
      value: 'postgres.default.svc.cluster.local'
    - name: DATABASE
      value: postgres
    - name: DEST
      value: 'test.bak'
  workspaces:
    - name: backup
      persistentVolumeClaim:
        claimName: postgres-backup
Here we use persistentVolumeClaim instead of volumeClaimTemplate and we specify the name of an existing PVC, which is also defined in the above snippet. This example also assumes that there's a PostgreSQL database running at the specified host. For the full code, including the PostgreSQL deployment, check out the files here.
Similarly to the injection of environment variables, we can also use workspaces to inject whole ConfigMaps or Secrets (or some keys in them) as files. This can be useful, for example, when you want a whole .pem certificate from a Secret in the Task, or a config file that maps GitHub repositories to application names, as in the following example:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/misc/config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: repo-app-mapping
data:
  repo-app-mapping.yaml: |
    'git@github.com:kelseyhightower/nocode.git': 'nocode'
    'git@github.com:MartinHeinz/blog-frontend.git': 'blog-backend'
    'git@github.com:MartinHeinz/game-server-operator.git': 'game-server-operator'
    'git@github.com:MartinHeinz/python-project-blueprint.git': 'sample-python-app'
---
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/get-application-name/get-application-name.yaml
apiVersion: tekton.dev/v1beta1
kind: ClusterTask
metadata:
  name: get-application-name
spec:
  params:
    - name: repository-url
      type: string
    - name: mapping-file
      type: string
  steps:
    - name: get-application-name
      image: mikefarah/yq
      script: |
        #!/usr/bin/env sh
        set -xe
        yq e '."$(params.repository-url)"' /config/$(params.mapping-file) | tr -d '\012\015' > /tekton/results/application-name
  results:
    - name: application-name  # Can be accessed by other Tasks with $(tasks.get-application-name.results.application-name)
  workspaces:
    - name: config
      mountPath: /config
This Task takes a repository URL and uses it to look up the matching application name in the file that is part of the ConfigMap shown above. This is done using the yq utility, followed by output to a file in a special directory called /tekton/results/. This directory stores the results of Tasks, which is a feature (and YAML section) we haven't mentioned yet.
Task results are small pieces of data that a Task can output and that can then be used by subsequent Tasks. To use these, one has to specify the name of the result variable in the results section and then write something to /tekton/results/result-var-name. Also, as you surely noticed in the above script, we strip newlines from the result using tr before writing it into the file. That's because the result should be simple output, ideally just a single word, and not a big chunk of text. If you decide to write something longer (with multiple lines) to the result, you might end up losing part of the value or seeing only an empty string if you don't strip the newline character.
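Instead of hardcoding the /tekton/results/ directory, a script can also refer to the result file through the $(results.result-var-name.path) variable. The following is only a sketch with a hypothetical short-commit Task, parameter and result name; it also uses printf so that no trailing newline ends up in the result in the first place.

```yaml
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: short-commit  # hypothetical Task name
spec:
  params:
    - name: commit  # hypothetical parameter
      type: string
      description: Full Git commit SHA.
  results:
    - name: short-sha  # hypothetical result name
      description: First seven characters of the commit SHA.
  steps:
    - name: shorten
      image: ubuntu
      script: |
        #!/usr/bin/env bash
        set -xe
        COMMIT="$(params.commit)"
        # printf writes the value without a trailing newline, so nothing needs to be stripped
        printf '%s' "${COMMIT:0:7}" > "$(results.short-sha.path)"
```

Using the variable keeps the script working even if the location of the results directory differs between Tekton versions.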
To then use a Task such as this one, which gets its workspace from a ConfigMap, we have to specify the ConfigMap in the workspaces section in the following way:
# Full example at https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/get-application-name/tests/run.yaml
workspaces:
  - name: config
    configmap:
      name: repo-app-mapping
The final thing I want to show here is the usage of sidecar containers. These are not so common, but can be useful when you need to run some service (which your Task depends on) for the duration of the Task's execution. One such service can be a Docker daemon sidecar with an exposed socket. To demonstrate this, we can create a Task that performs Docker image efficiency analysis using a tool called Dive:
# https://github.com/MartinHeinz/tekton-kickstarter/blob/master/tasks/dive/dive.yaml
apiVersion: tekton.dev/v1beta1
kind: ClusterTask
metadata:
  name: dive
spec:
  params:
    - name: IMAGE
      description: Name (reference) of the image to analyze.
  steps:
    - image: 'wagoodman/dive:latest'
      name: dive
      env:
        - name: CI
          value: 'true'
      command:
        - dive
        - $(params.IMAGE)
      volumeMounts:
        - mountPath: /var/run/
          name: dind-socket
  sidecars:
    - image: 'docker:18.05-dind'
      name: server
      securityContext:
        privileged: true
      volumeMounts:
        - mountPath: /var/lib/docker
          name: dind-storage
        - mountPath: /var/run/
          name: dind-socket
  volumes:
    - name: dind-storage
      emptyDir: {}
    - name: dind-socket
      emptyDir: {}
As you can see here, the sidecars section is pretty much the same as the definition of any container in a Pod spec. In addition to the usual stuff we've seen in previous examples, here we also specify volumes, which are shared between the sidecar and the containers in the Task's steps. In this case one of them, dind-socket, holds the Docker socket that the Dive container attaches to.
This covers most of the features of Tekton Tasks, but one thing I haven't mentioned so far, and which you will surely run into when reading the Tekton docs, is the PipelineResource object, which can be used as a Task's input or output, for example GitHub sources as input or a Docker image as output. So, why haven't I mentioned it yet? Well, PipelineResource is a part of Tekton that I prefer not to use, for a couple of reasons:
- It's still in alpha, unlike all the other resource types we have used so far.
- There are very few PipelineResources. It's mostly just Git, Pull request and Image resources.
- It's hard to troubleshoot them.
If you need more reasons not to use them (for now), then take a look at the docs section here.
In all the examples that we went over, we've seen a lot of YAML sections, options and features, which shows how flexible Tekton is. This, however, naturally makes its API spec very complex. Unfortunately, kubectl explain doesn't help with exploring the API, and the API spec available in the docs is lacking at best. So, in case you have trouble finding what you can put in which part of the YAML, your best bet is to rely on the fields listed at the beginning of the Tasks doc or on the examples here, but make sure you're on the right branch for your version of Tekton, otherwise you might spend long hours debugging why your seemingly correct Task cannot be validated by the Tekton controller.
Running and Testing
So far, we have mostly talked about Tasks and not much was said about TaskRuns. That's because, in my opinion, individual TaskRuns are best suited for testing and not really for running Tasks regularly. For that you should use pipelines, which will be the topic of the next article.
Speaking of testing: when we're done implementing a Task, it's time to run some tests. For straightforward and easy testing, I recommend using the layout mentioned earlier in the article. Using it should help you encapsulate the Task in a way that allows you to test it independently of any resources outside of its directory.
To then perform the actual test, it's enough to apply the resources/dependencies in .../tests/resources.yaml (if any) and then apply the actual test(s) in .../tests/run.yaml. The tests really are just a set of TaskRuns that use your custom Task, so for this basic testing approach there's no need for any setup/teardown or additional scripts, just kubectl apply -f resources.yaml and kubectl apply -f run.yaml. Examples of simple tests can be found in each Task's directory in tekton-kickstarter or in the Tekton Catalog repository.
In general though, for any particular Task in repositories of both of these projects, you can run the following commands to perform the test:
kubectl apply -f tests/resources.yaml # Deploy dependencies
kubectl apply -f tests/run.yaml # Start tests
kubectl get tr # List the TaskRuns (tests)
NAME SUCCEEDED REASON STARTTIME COMPLETIONTIME
some-taskrun True Succeeded 106s 4s
tkn tr logs some-taskrun # View logs from TaskRun
...
For me personally, when it comes to testing Tasks, it's sufficient to use the above basic testing approach for validation and ad-hoc testing. If, however, you end up creating a large number of custom Tasks and want to go all out, then you can adopt the approach used in the Tekton Catalog and leverage the testing scripts in its repository. If you decide to go that route and strictly follow the layout and testing, you might also want to try contributing the Tasks to the Tekton Catalog, so that the whole community can benefit from more high-quality Tasks. 😉
As for the scripts that you'll need for this: you'll need to include the test directory as well as the Go dependencies (vendor directory) from the Tekton Catalog in your code and then follow the E2E testing guide in the docs here.
Regardless of whether you choose basic or "all-out" approach for your testing though, try to make sure that you test more than just happy paths in the Tasks and Pipelines, otherwise you might end-up seeing a lot of bugs when they get deployed "in the wild".
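As a sketch of what a non-happy-path test could look like with the basic approach, the following TaskRun points the deploy ClusterTask at a Deployment that is assumed not to exist. The TaskRun name and the Deployment name are made up for this example. Since the script runs with set -e, the failing kubectl command should make the step, and therefore the TaskRun, fail.

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: deploy-missing-deployment  # hypothetical test name
  namespace: default
spec:
  timeout: 2m  # fail fast instead of waiting for the default timeout
  taskRef:
    name: deploy
    kind: ClusterTask
  params:
    - name: name
      value: 'this-deployment-does-not-exist'  # assumed to be absent from the namespace
```

With plain kubectl apply there is nothing that asserts the outcome automatically, so for a test like this the SUCCEEDED column of kubectl get tr is expected to show False. It's still worth checking the logs, because missing RBAC permissions for the service account would produce a failure as well.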
Best Practices
After you have implemented and tested your custom Tasks, it's a good idea to go back and make sure they follow development best practices, which will make them more reusable and maintainable in the long run.
The simplest thing that you can do, and one that will also give the most benefit, is to use yamllint, a linter for YAML files. This tip doesn't apply only to Tekton Tasks, but rather to all YAML files, as they can be tricky to get right with all the indentation. It's especially important with Task definitions, though, which can get very long and complex, with many levels of indentation, so keeping them readable and validated can save you some unnecessary debugging and helps keep them maintainable. You can find a custom .yamllint config in my repository which I like to use, but you should customize it to suit your code style and formatting. Just make sure you run yamllint from time to time (ideally in CI/CD) to keep things linted and validated.
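For orientation, a yamllint config could look something like the sketch below. This is not the config from the tekton-kickstarter repository, and the chosen values are only assumptions about what might suit long Task definitions.

```yaml
# .yamllint (illustrative values, adjust to your own code style)
extends: default
rules:
  line-length:
    max: 120  # Task scripts and catalog URLs tend to be long
    level: warning  # report long lines without failing the run
  indentation:
    spaces: 2
    indent-sequences: consistent  # either style is fine, as long as it's used consistently
  document-start: disable  # don't require a leading '---' in every file
```

With such a file in the repository root, running yamllint tasks/ picks the configuration up automatically.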
As for the actual Tekton best practices - I could give you a huge list, but it would be mostly just things recommended by Tekton maintainers, so instead of copy-pasting it here, I will just point you to the relevant resources:
Closing Thoughts
In this article we saw how flexible and versatile Tekton is, and that with this flexibility also comes complexity in building and testing Tasks. Therefore, it's very much preferable to use existing Tasks created by the community, rather than try to reinvent the wheel yourself. If, however, there's no suitable Task available and you have to build your own, make sure you write tests for your Tasks and follow the best practices mentioned above to keep your Tasks maintainable and reliable.
After this article we should have enough experience, as well as a bunch of individual custom Tasks, to start composing our pipelines. And that's exactly what we're going to do in the next article in this series, where we will explore how to build fully-featured pipelines to build, deploy and test your applications, and much more.
Also, if you haven't done so yet, make sure you check out the tekton-kickstarter repository, where you can find all the Tasks and examples from this article as well as the pipelines you will see in the next one. 😉