Build a Python Transformation

By the end of this tutorial you will have built a working Python Transformation end-to-end: loaded a small network-device schema, created a few sample devices, written a GraphQL query that filters by device name, implemented a DeviceConfigTransform Python class that returns a JSON object derived from the response, registered it in .infrahub.yml, tested it locally with infrahubctl transform, and called the Transformation via the REST API. You'll leave with both the raw-dictionary and SDK-converted patterns side by side so you can pick the one that fits the next Transformation you write.

For conceptual background, see Transformations. For a recipe-form how-to without the running example, see Write a Python transformation.

Within Infrahub a Transformation is defined in an external repository. However, during development and troubleshooting it is easiest to start from your local computer and run the Transformation using infrahubctl transform.

The tutorial follows these steps:

  1. Load a small schema into Infrahub
  2. Create sample data and identify the data you want to extract with a GraphQL query that takes an input parameter to filter the result
  3. Write a Python script that uses the GraphQL query to read information from the system and transform the data into a new format
  4. Create an entry for the Transformation within an .infrahub.yml file
  5. Test the Transformation with infrahubctl
  6. Create a Git repository
  7. Add the repository to Infrahub as an external repository
  8. Validate that the Transformation works using the Transformation API endpoint

1. Loading a schema

This tutorial uses a very simplistic network device model. The Transformation won't be very useful on its own — the goal is to show how Transformations are created. Once you've mastered the basics you'll be ready to create more advanced Transformations.

---
version: "1.0"
nodes:
  - name: Device
    namespace: Network
    display_label: "{{ name__value }}"
    attributes:
      - name: name
        kind: Text
        label: Name
        optional: false
        unique: true
      - name: description
        kind: Text
        label: Description
        optional: true

Store the schema as a YAML file on your local disk, and load it into Infrahub using the following command:

infrahubctl schema load /path/to/schema.yml

2. Creating a query to collect the desired data

As the first step we need some data in the database to actually query.

Create three devices, called "switch1", "switch2", "switch3", either using the frontend or by submitting three GraphQL mutations as per below (swapping out the name each time).

mutation CreateDevice {
  NetworkDeviceCreate(
    data: {name: {value: "switch1"}, description: {value: "This is device switch1"}}
  ) {
    ok
    object {
      id
    }
  }
}
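
Because the three mutations differ only in the device name, they can be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library: it builds one GraphQL request body per device from the mutation above. The helper name build_create_device_payload and the DEVICE_NAME placeholder are hypothetical, and sending the payloads to your instance is left out because the endpoint and authentication depend on your setup.

```python
import json

# Mutation text from the tutorial, with a placeholder where the name goes.
# DEVICE_NAME is a hypothetical marker used only by this sketch.
MUTATION_TEMPLATE = """
mutation CreateDevice {
  NetworkDeviceCreate(
    data: {name: {value: "DEVICE_NAME"}, description: {value: "This is device DEVICE_NAME"}}
  ) {
    ok
    object {
      id
    }
  }
}
"""


def build_create_device_payload(name: str) -> str:
    """Return a JSON request body containing the mutation for one device."""
    mutation = MUTATION_TEMPLATE.replace("DEVICE_NAME", name)
    return json.dumps({"query": mutation})


# One payload per sample device used in this tutorial.
payloads = [build_create_device_payload(n) for n in ("switch1", "switch2", "switch3")]
for payload in payloads:
    print(payload)
```

Each printed line is a request body in the conventional GraphQL-over-HTTP shape, which you could paste into a GraphQL client or post with a tool of your choice.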

The next step is to create a query that returns the data we created above. The rest of this tutorial assumes that the following query will return a response similar to the response below the query.

Single-target query requirement

For proper artifact detection, your query must target a unique node using a unique attribute or ID. This ensures Infrahub only regenerates the necessary artifacts instead of regenerating all artifacts unnecessarily.

Requirements for a valid single-target query:

  • Must filter on a unique identifier (ID or unique attribute like name)
  • Must use a required variable, for example, $name: String!
  • Must use exact match filters, for example, name__value: $name, not list filters, for example, name__values: $name

Valid example:

query BuiltinTag($name: String!) {
  BuiltinTag(name__value: $name) {
    edges { node { id } }
  }
}

Invalid examples (will cause excessive artifact generation):

No filter at all:

query BuiltinTag {
  BuiltinTag {
    edges { node { id } }
  }
}

Filtering on a non-unique attribute:

query BuiltinTag($description: String!) {
  BuiltinTag(description__value: $description) {
    edges { node { id } }
  }
}

To learn more about single-target queries and why they are important, see the GraphQL topic.
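
Two of the three requirements can be checked roughly from the query text alone. The sketch below is a text-based heuristic, not how Infrahub itself detects single-target queries: it looks for a required variable that is used in an exact-match __value filter. The function name has_required_exact_match_filter is hypothetical, and the check cannot tell whether the filtered attribute is unique, because that depends on the schema.

```python
import re

# Matches a required variable declaration such as "$name: String!".
REQUIRED_VAR = re.compile(r"\$(\w+)\s*:\s*\w+!")


def has_required_exact_match_filter(query: str) -> bool:
    """Heuristic: is some required variable used in a '<attr>__value: $var' filter?"""
    for var in REQUIRED_VAR.findall(query):
        # "name__value: $name" matches; the list filter "name__values: $name" does not.
        pattern = r"\b\w+__value\s*:\s*\$" + re.escape(var) + r"\b"
        if re.search(pattern, query):
            return True
    return False


valid = "query BuiltinTag($name: String!) { BuiltinTag(name__value: $name) { edges { node { id } } } }"
no_filter = "query BuiltinTag { BuiltinTag { edges { node { id } } } }"
list_filter = "query BuiltinTag($name: String!) { BuiltinTag(name__values: $name) { edges { node { id } } } }"

print(has_required_exact_match_filter(valid))
print(has_required_exact_match_filter(no_filter))
print(has_required_exact_match_filter(list_filter))
```

A query that filters on a non-unique attribute, such as description__value, also passes this check, so treat a positive result as necessary rather than sufficient.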

query DeviceQuery {
  NetworkDevice {
    edges {
      node {
        name {
          value
        }
        description {
          value
        }
      }
    }
  }
}

Response to the query:

{
  "data": {
    "NetworkDevice": {
      "edges": [
        {
          "node": {
            "name": {
              "value": "switch1"
            },
            "description": {
              "value": "This is device switch1"
            }
          }
        },
        {
          "node": {
            "name": {
              "value": "switch2"
            },
            "description": {
              "value": "This is device switch2"
            }
          }
        },
        {
          "node": {
            "name": {
              "value": "switch3"
            },
            "description": {
              "value": "This is device switch3"
            }
          }
        }
      ]
    }
  }
}

While it's possible to create a Transformation that targets all of these devices — for example to create a report — the goal here is to focus on one device at a time. Modify the query above to take an input parameter so that we can filter the result.

Create a local directory on your computer.

mkdir device_config_render

Then save the modified query as a text file named device_config_render/device_config.gql.

The query requires an input parameter called $name that will refer to the name of each device. When we want to query for device switch1, the input variables to the query would look like this:

{
  "name": "switch1"
}
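
The modified query itself is not spelled out above. One plausible form, consistent with the single-target requirements and with the required $name variable, is sketched below as a Python string together with the matching variables. The constant DEVICE_QUERY and the helper build_request_body are hypothetical names; the "query" and "variables" keys follow the conventional GraphQL-over-HTTP request shape.

```python
import json

# A possible parameterized version of DeviceQuery: the required $name variable
# is applied as an exact-match filter on the unique name attribute.
DEVICE_QUERY = """
query DeviceQuery($name: String!) {
  NetworkDevice(name__value: $name) {
    edges {
      node {
        name {
          value
        }
        description {
          value
        }
      }
    }
  }
}
"""


def build_request_body(device_name: str) -> str:
    """Combine the query text with the input variables for one device."""
    return json.dumps({"query": DEVICE_QUERY, "variables": {"name": device_name}})


body = build_request_body("switch1")
print(body)
```

The text stored in DEVICE_QUERY is what would go into device_config_render/device_config.gql under this assumption.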

3. Create the Python Transformation file

The next step is to create the actual Python Transformation. The Transformation is a Python class that inherits from InfrahubTransform from the Python SDK. Create a file called device_config_render/device_config.py:

from infrahub_sdk.transforms import InfrahubTransform


class DeviceConfigTransform(InfrahubTransform):
    query = "device_config_query"

    async def transform(self, data):
        device = data["NetworkDevice"]["edges"][0]["node"]
        device_name = device["name"]["value"]
        device_description = device["description"]["value"]
        return {
            "device_hostname": device_name,
            "device_description": f"*{device_description}*"
        }

The example is simplistic in terms of what we do with the data, but all of the important parts of a Transformation exist here.

  1. We import the InfrahubTransform class.

from infrahub_sdk.transforms import InfrahubTransform

  2. We define our own class based on InfrahubTransform.

class DeviceConfigTransform(InfrahubTransform):

Note the name of the class, because we need it later. Optionally we could call it Transform, which is the default name.

  3. We define where the data comes from.

    query = "device_config_query"

The query attribute refers to the query that we will define in the .infrahub.yml repository configuration file later in the tutorial.

Once the Transformation is registered under the name device_config_transform in .infrahub.yml, its endpoint will be http://localhost:8000/api/transform/python/device_config_transform.

  4. We implement the transform method.

    async def transform(self, data):
        device = data["NetworkDevice"]["edges"][0]["node"]
        device_name = device["name"]["value"]
        device_description = device["description"]["value"]
        return {
            "device_hostname": device_name,
            "device_description": f"*{device_description}*"
        }

When running the Transformation, the data input variable will contain the response to the query we created, or the webhook payload if you use the Transformation with a Custom Webhook.

In this case, the Transformation returns a JSON object with two keys, device_hostname and device_description. The description is wrapped in asterisks to show that the data has been modified. In a real Transformation you would return data in the format you need.
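
To see the data handling in isolation, the same logic can be run as a plain function against a hand-written response, without the SDK or a running Infrahub instance. This is a minimal sketch: render_device is a hypothetical helper, the sample dictionary mirrors the query response shown earlier, and the empty-result check is an addition that the tutorial class does not have.

```python
from typing import Any


def render_device(data: dict[str, Any]) -> dict[str, str]:
    """Apply the tutorial's transform logic to a raw query response."""
    edges = data["NetworkDevice"]["edges"]
    if not edges:
        # Added for illustration: the filter matched no device.
        raise ValueError("query returned no device")
    device = edges[0]["node"]
    return {
        "device_hostname": device["name"]["value"],
        "device_description": f"*{device['description']['value']}*",
    }


# Hand-written sample shaped like the response for name=switch2.
sample = {
    "NetworkDevice": {
        "edges": [
            {
                "node": {
                    "name": {"value": "switch2"},
                    "description": {"value": "This is device switch2"},
                }
            }
        ]
    }
}

result = render_device(sample)
print(result)
```

Keeping the logic in a function like this makes it easy to unit test before wiring it into the transform method.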

info

If you are unsure of the format of the data you can set a debug marker when testing the Transformation with infrahubctl:

async def transform(self, data):
    breakpoint()  # pauses here so you can inspect data interactively
    device = data["NetworkDevice"]["edges"][0]["node"]
    device_name = device["name"]["value"]
    device_description = device["description"]["value"]

4. Create an .infrahub.yml file

The .infrahub.yml file allows you to define both the Transformation and query.

Convert query to Infrahub SDK objects

The convert_query_response option lets you access the objects returned by the GraphQL query as Infrahub SDK objects rather than as the raw dictionary response.

This lets you work with the returned data through helper methods on the SDK objects, such as save and fetch, rather than building a payload to send back to Infrahub.

Read more on the Infrahub Python SDK.

See the infrahub.yml configuration page for a full explanation of everything that can be defined in the .infrahub.yml file.

.infrahub.yml
# yaml-language-server: $schema=https://schema.infrahub.app/python-sdk/repository-config/latest.json
---
python_transforms:
  - name: device_config_transform
    class_name: DeviceConfigTransform
    file_path: "device_config_render/device_config.py"
queries:
  - name: device_config_query
    file_path: "device_config_render/device_config.gql"

Two parts of the Transformation entry are required: the name, which should be unique across Infrahub, and the file_path, which should point to the Python file within the repository. In this example we have also defined class_name because we gave our class the name DeviceConfigTransform instead of the default Transform.
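
The pieces of the configuration have to agree with each other: the query name used in the Python class must match an entry under queries. The sketch below checks this on a plain dictionary that mirrors the YAML above. It is an illustration only, not how Infrahub validates the file; check_config is a hypothetical helper, and the uniqueness check covers only this one file rather than all of Infrahub.

```python
# Dictionary mirroring the .infrahub.yml content from this tutorial.
config = {
    "python_transforms": [
        {
            "name": "device_config_transform",
            "class_name": "DeviceConfigTransform",
            "file_path": "device_config_render/device_config.py",
        }
    ],
    "queries": [
        {
            "name": "device_config_query",
            "file_path": "device_config_render/device_config.gql",
        }
    ],
}


def check_config(config: dict, transform_query: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    transforms = config.get("python_transforms", [])
    for entry in transforms:
        # name and file_path are the two required keys named in the tutorial.
        for key in ("name", "file_path"):
            if key not in entry:
                problems.append(f"transform entry is missing '{key}': {entry}")
    names = [entry.get("name") for entry in transforms]
    if len(names) != len(set(names)):
        problems.append("transform names are not unique within this file")
    query_names = {query.get("name") for query in config.get("queries", [])}
    if transform_query not in query_names:
        problems.append(f"query '{transform_query}' is not defined under queries")
    return problems


# "device_config_query" is the value of the query attribute in the Python class.
problems = check_config(config, "device_config_query")
print(problems)
```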

5. Test the Transformation using infrahubctl

Using infrahubctl you can first verify that the .infrahub.yml file is formatted correctly by listing available Transformations.

❯ infrahubctl transform --list
Python Transformations defined in repository: 1
device_config_transform (device_config.py::DeviceConfigTransform)

info

Trying to run the Transformation with only the Transformation name will produce an error.

❯ infrahubctl transform device_config_transform
{'message': "Variable '$name' of required type 'String!' was not provided.", 'locations': [{'line': 1, 'column': 19}]}

Here we can see that our query is missing the required input for $name which is needed to filter the data.

Run the Transformation and specify the variable name along with the device we want to target.

❯ infrahubctl transform device_config_transform name=switch2
{
  "device_description": "*This is device switch2*",
  "device_hostname": "switch2"
}
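
The name=switch2 argument corresponds to the query variables {"name": "switch2"} shown earlier. As a rough illustration of that mapping, and not infrahubctl's actual implementation, the sketch below turns key=value arguments into a variables dictionary. The function parse_variables is a hypothetical helper.

```python
def parse_variables(args: list[str]) -> dict[str, str]:
    """Turn arguments such as 'name=switch2' into a variables dictionary."""
    variables = {}
    for arg in args:
        # Split on the first '=' only, so values may themselves contain '='.
        key, separator, value = arg.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        variables[key] = value
    return variables


variables = parse_variables(["name=switch2"])
print(variables)
```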

You've now successfully created a Transformation. Most Transformations you build will be more complex than this, but the main building blocks remain the same. The output could be in OpenConfig format, as Terraform input variables, or any other format you need.

6. Create a Git repository

You should now have three files:

  • device_config_render/device_config.gql: Contains the GraphQL query
  • device_config_render/device_config.py: Contains the Python code for the Transformation
  • .infrahub.yml: Contains the definition for the Transformation

Before we can add the Transformation to Infrahub, we must commit the files to a local Git repository.

git init --initial-branch=main
git add .
git commit -m "First commit"

7. Adding the repository to Infrahub

To avoid repeating the same instructions, see Connect a repository for syncing the repository you created and making it available within Infrahub.

8. Accessing the Transformation from the API

Once the repository is synced to Infrahub you can access the Transformation from the API:

❯ curl "http://localhost:8000/api/transform/python/device_config_transform?name=switch2"
{
  "device_hostname":"switch2",
  "device_description":"*This is device switch2*"
}
❯ curl "http://localhost:8000/api/transform/python/device_config_transform?name=switch3"
{
  "device_hostname":"switch3",
  "device_description":"*This is device switch3*"
}
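
The same endpoint can be called from Python. The following sketch uses only the standard library and assumes the instance is reachable at http://localhost:8000 without extra authentication headers, which may not hold for your deployment. The helpers transform_url and fetch_transform are hypothetical names; only the URL is built when the block runs, so no server is needed to try it.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: a local Infrahub instance, as in the curl examples above.
BASE_URL = "http://localhost:8000"


def transform_url(transform_name: str, **variables: str) -> str:
    """Build the Transformation endpoint URL with query-string variables."""
    query_string = urlencode(variables)
    return f"{BASE_URL}/api/transform/python/{quote(transform_name)}?{query_string}"


def fetch_transform(transform_name: str, **variables: str) -> dict:
    """Call the endpoint and decode the JSON body (requires a running instance)."""
    with urlopen(transform_url(transform_name, **variables)) as response:
        return json.load(response)


url = transform_url("device_config_transform", name="switch2")
print(url)
```

With the repository synced, fetch_transform("device_config_transform", name="switch2") would be expected to return the same object as the first curl call.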

What you learned

You now have a working Python Transformation:

  • A NetworkDevice schema loaded into Infrahub
  • Three sample devices (switch1, switch2, switch3) created via GraphQL mutations
  • A GraphQL query that selects a single device by name and reads its description
  • A DeviceConfigTransform class that returns a JSON object derived from the response, in both the raw-dictionary and SDK-converted forms
  • A .infrahub.yml definition that wires the query and the Transformation together
  • The Transformation responding both via infrahubctl transform locally and via the REST API endpoint

Next steps