Data Importer
The Data Importer turns CSV or TSV inputs into Infrahub object YAML and loads it onto a fresh branch. It introspects the live schema, maps columns to attributes by heuristic plus an up-front interview, splits denormalized inputs across the right kinds in the correct load order, and fails closed when a column has no schema home. It never proposes a schema edit.
When to use
- Importing infrastructure data from a CSV or TSV export (NetBox, vendor inventories, spreadsheets)
- Loading a denormalized one-big-sheet export and splitting it across the right Infrahub kinds
- Bulk-loading initial data into a new Infrahub instance from a folder of CSVs
- Re-importing edits to existing rows by HFID (upsert)
- Importing data with lineage tagging via source/owner/is_protected
What it produces
- Numbered object YAML files (01_manufacturers.yml, 02_sites.yml, 10_devices.yml, …) that conform to the Object Manager envelope (apiVersion: infrahub.app/v1, kind: Object, spec.kind, spec.data)
- A fresh import branch created via infrahubctl branch create before any validate or load
- A confirmed mapping plan from the up-front interview, locked before the first file is written
- Local self-check against Object Manager rules, followed by server-side infrahubctl object validate, followed by object load, all scoped to the import branch
- Optional source/owner/is_protected metadata stamping when the user opts in
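The envelope and file numbering listed above can be sketched in a few lines of Python. This is an illustration under assumptions, not the skill's own code: object_envelope and numbered_filenames are hypothetical helpers, DcimDevice is a placeholder kind name, and the strictly sequential 01, 02, 03 numbering is a simplification (the skill's numbering may leave gaps, as in 10_devices.yml).

```python
from graphlib import TopologicalSorter


def object_envelope(kind, rows):
    """Wrap rows in the Object Manager envelope fields named above."""
    return {
        "apiVersion": "infrahub.app/v1",
        "kind": "Object",
        "spec": {"kind": kind, "data": list(rows)},
    }


def numbered_filenames(depends_on):
    """Return NN_-prefixed file names with referents before referrers.

    depends_on maps a file label to the set of labels it references
    (hypothetical input; the skill derives this from the schema).
    """
    order = TopologicalSorter(depends_on).static_order()
    return [f"{index:02d}_{label}.yml" for index, label in enumerate(order, start=1)]


# Hypothetical example: devices reference manufacturers and sites.
envelope = object_envelope("DcimDevice", [{"name": "edge-01"}])
files = numbered_filenames(
    {"manufacturers": set(), "sites": set(), "devices": {"manufacturers", "sites"}}
)
```

Because devices depends on both other labels, it always receives the highest number, which is what lets a loader process the files in name order.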
Example prompts
- "Import devices.csv into Infrahub on a fresh branch"
- "Convert this denormalized CSV with manufacturer, location, and device columns into separate object files in the right load order"
- "Load these three CSVs (manufacturers, sites, devices) and resolve the references between them"
- "Import an interface CSV with eth0..eth47 ranges and collapse them with expand_range: true"
- "Import assets.csv and stamp every value with source: csv-import-20260622"
Key rules enforced
- Schema is read-only: the skill never proposes attribute additions, dropdown choice additions, or any other schema edit; an unmappable column triggers a fail-closed report and routes to the Schema Manager
- Up-front interview: every ambiguity (unmapped dropdown cells, denormalization splits, branch name, lineage opt-in) is batched into one question round before any file is written
- Branch-first: runs infrahubctl branch create <name> before object validate or object load; the import never lands on the default branch
- Local self-check before server validate: the emission is walked against the Object Manager rules locally first, so shape errors don't cost a branch that has to be discarded
- Dropdown label to choice name: UI-facing labels in the CSV (e.g., Active) are translated to the schema's choice name (e.g., active) using a label-to-name lookup
- Reference shape from HFID: relationship references emit as a scalar or a positional list based on the target node's human_friendly_id length, read directly from the schema
- Range collapse for interfaces: contiguous sequences like eth0..eth47 collapse to eth[0-47] with parameters.expand_range: true when all sibling columns match across the range
- Numbered load order: files use NN_ prefixes so referents load before referrers, and dependent kinds are emitted to higher-numbered files
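Two of the rules above, the dropdown label-to-name lookup and the HFID-driven reference shape, are small enough to sketch. The following is a minimal illustration that assumes the schema has already been fetched into plain dictionaries and lists; the function names are hypothetical, the case-insensitive matching is an assumption rather than documented behavior, and raising on an unknown cell stands in for deferring that cell to the interview.

```python
def label_to_name(choices):
    """Build a lookup from UI label (or name) to the schema's choice name.

    choices is assumed to be a list of dicts with "name" and an
    optional "label", mirroring a dropdown attribute's choices.
    """
    table = {}
    for choice in choices:
        name = choice["name"]
        table[name.casefold()] = name
        label = choice.get("label")
        if label:
            table[label.casefold()] = name
    return table


def translate_dropdown(cell, table):
    """Return the choice name for a CSV cell; raise if nothing matches."""
    key = cell.strip().casefold()
    if key not in table:
        raise KeyError(f"no dropdown choice matches {cell!r}")
    return table[key]


def reference_value(hfid_fields, parts):
    """Shape a reference from the target's human_friendly_id length.

    One HFID component yields a scalar; several yield a positional list.
    """
    if len(parts) != len(hfid_fields):
        raise ValueError(f"expected {len(hfid_fields)} HFID parts, got {len(parts)}")
    return parts[0] if len(hfid_fields) == 1 else list(parts)


# Hypothetical schema fragments for illustration only.
status = label_to_name([
    {"name": "active", "label": "Active"},
    {"name": "in-maintenance", "label": "In Maintenance"},
])
site_ref = reference_value(["name__value"], ["atl1"])
interface_ref = reference_value(["device__name__value", "name__value"], ["edge-01", "eth0"])
```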
Common mistakes it catches
| Mistake | What the skill does instead |
|---|---|
| Dropping unmapped columns silently | Fails closed with a structured report of unmappable columns and the kinds checked |
| Emitting dropdown labels instead of choice names | Builds a label-to-name table from the schema and emits the name |
| Wrong reference shape (scalar vs list) | Reads the target's human_friendly_id length and emits the matching shape |
| Repeated parent rows treated as duplicates | Detects the denormalization and asks whether to nest as component children or split into separate kinds |
| Emitting interface rows as 48 literal entries | Detects contiguous sequences with identical sibling columns and emits a single range row plus expand_range: true |
| Loading to the default branch | Creates a dedicated import branch and validates + loads there |
| Trying to resume a partial load | Discards the branch and re-runs with a fresh branch name (object load is not transactional across files) |
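The range detection in the table can be illustrated with a short sketch. It assumes rows arrive as dictionaries in sheet order and that the name column is called name; collapse_ranges is a hypothetical helper, not the skill's implementation. It returns the collapsed rows plus a flag indicating whether any range was emitted, and leaves the placement of parameters.expand_range: true to the caller.

```python
import re

# Trailing integer without leading zeros, e.g. "eth47" -> ("eth", "47").
_NUMBERED = re.compile(r"^(.*?)(0|[1-9]\d*)$")


def collapse_ranges(rows, name_key="name"):
    """Collapse consecutive numbered names whose sibling columns all match."""
    runs = []  # each entry: [prefix, start, end, siblings]; prefix None = passthrough
    for row in rows:
        siblings = {k: v for k, v in row.items() if k != name_key}
        match = _NUMBERED.match(row[name_key])
        if match is None:
            runs.append([None, row[name_key], None, siblings])
            continue
        prefix, number = match.group(1), int(match.group(2))
        last = runs[-1] if runs else None
        if last and last[0] == prefix and last[2] + 1 == number and last[3] == siblings:
            last[2] = number  # extend the current run
        else:
            runs.append([prefix, number, number, siblings])

    collapsed, expand_range = [], False
    for prefix, start, end, siblings in runs:
        if prefix is None:
            name = start
        elif start == end:
            name = f"{prefix}{start}"
        else:
            name = f"{prefix}[{start}-{end}]"
            expand_range = True
        collapsed.append({name_key: name, **siblings})
    return collapsed, expand_range


# 48 identical interfaces plus one management port with a different speed.
rows = [{"name": f"eth{i}", "speed": 1000} for i in range(48)]
rows.append({"name": "mgmt0", "speed": 100})
collapsed, expand_range = collapse_ranges(rows)
```

A run is broken as soon as the prefix changes, the number is not the next integer, or any sibling column differs, so rows that are not truly identical stay as separate entries.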
Validating and loading
The skill produces these commands as part of the workflow. Run them in order on the import branch:
# Create the dedicated import branch (after the local self-check passes)
infrahubctl branch create csv-import-20260622-1430
# Validate the emission against the branch
infrahubctl object validate ./output_dir/ --branch csv-import-20260622-1430
# Load on success
infrahubctl object load ./output_dir/ --branch csv-import-20260622-1430
# If anything fails partway, discard the branch and re-run with a fresh name
infrahubctl branch delete csv-import-20260622-1430
The skill never edits the schema. If a CSV column has no attribute or relationship to map to, the skill stops with a list of the unmappable columns and the kinds it checked. Resolve the gap with the Schema Manager, then re-run.
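The fail-closed behavior can be pictured with a small sketch. Everything here is hypothetical: map_columns, UnmappableColumnsError, the report keys, and the kind names are illustrative, and the normalized-name comparison is only a stand-in for the skill's real heuristic and interview. The point is the shape of the outcome: either every column has a schema home, or the run stops with the unmappable columns and the kinds that were checked.

```python
class UnmappableColumnsError(Exception):
    """Raised instead of silently dropping columns; carries the report."""

    def __init__(self, report):
        columns = ", ".join(report["unmappable_columns"])
        super().__init__(f"unmappable columns: {columns}")
        self.report = report


def map_columns(columns, schema_fields):
    """Map each CSV column to (kind, field), or fail closed.

    schema_fields maps a kind name to the set of its attribute and
    relationship names (assumed to be read from the live schema).
    """
    mapping, unmappable = {}, []
    for column in columns:
        key = column.strip().lower().replace(" ", "_")
        home = next((kind for kind, fields in schema_fields.items() if key in fields), None)
        if home is None:
            unmappable.append(column)
        else:
            mapping[column] = (home, key)
    if unmappable:
        raise UnmappableColumnsError(
            {"unmappable_columns": unmappable, "kinds_checked": sorted(schema_fields)}
        )
    return mapping


# Hypothetical schema summary: kind name -> field names.
schema_fields = {"DcimDevice": {"name", "status", "site"}, "LocationSite": {"name"}}
mapping = map_columns(["Name", "Status"], schema_fields)
```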
For Excel inputs, export each sheet to CSV first; the skill covers CSV and TSV only. For LDJSON dumps from infrahubctl export dump, use infrahubctl import load instead; that's a different format and a different tool.