Backup and restore

This guide shows you how to create complete backups of your Infrahub deployment and restore them when needed. You'll learn to back up the Neo4j graph database, the object storage, and the task management data so that a deployment can be fully recovered.

For Neo4j cluster deployments, see Cluster backup and restore.

Prerequisites

  • Running Infrahub deployment (Docker Compose or Kubernetes)
  • Administrative access to the Neo4j database
  • Access to the object storage location (S3 or local filesystem)
  • Sufficient storage space for backup files
  • For cluster deployments: Understanding of your cluster topology

Create a full backup

Step 1: Install the backup tool

Install the infrahub-backup CLI tool:

curl https://infrahub.opsmill.io/ops/$(uname -s)/$(uname -m)/infrahub-backup -o infrahub-backup
chmod +x infrahub-backup

Step 2: Back up the databases

Create a backup of your running Infrahub instance:

./infrahub-backup create

The tool automatically:

  • Checks for running tasks before starting (use --force to skip)
  • Creates a timestamped backup archive (for example, infrahub_backup_20250129_153045.tar.gz)
  • Backs up Neo4j database with metadata (configurable with --neo4jmetadata)
  • Backs up Prefect/PostgreSQL task management database
  • Calculates SHA256 checksums for integrity verification
Note: Object storage backup is planned for a future release of infrahub-backup. Until then, back up object storage separately, as described in Step 3.
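When the archive is copied to another host or to offline storage, an independent checksum of the archive file gives a quick way to detect a damaged copy before attempting a restore. The following is a minimal sketch, assuming GNU coreutils `sha256sum` is available; `record_checksum` and `verify_checksum` are hypothetical helper names and are not part of infrahub-backup.

```shell
# Sketch only: helper names are illustrative, not part of infrahub-backup.

# Write <archive>.sha256 next to the archive. The checksum file stores the
# bare file name, so it stays valid after both files are moved together.
record_checksum() {
  (
    cd "$(dirname "$1")" &&
      sha256sum "$(basename "$1")" > "$(basename "$1").sha256"
  )
}

# Return 0 when the archive still matches its recorded checksum.
verify_checksum() {
  (
    cd "$(dirname "$1")" &&
      sha256sum -c "$(basename "$1").sha256" > /dev/null
  )
}
```

For example, run `record_checksum infrahub_backup_20250129_153045.tar.gz` right after the backup and `verify_checksum` on the copy before restoring it.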

Step 3: Back up the object storage

The object storage layer holds all file content (file objects and artifacts) outside of the graph database. The graph database references this content through storage_id values, so both must be backed up together to maintain consistency.

If you use S3 for object storage, use the AWS CLI or your preferred S3 backup tool:

# Sync S3 bucket to local backup directory
aws s3 sync s3://your-infrahub-bucket /backup/object_store/
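Because the graph database and the object storage must be captured together, it can help to run both steps from one script and keep each object storage copy in its own timestamped directory. The following is a sketch under stated assumptions: `BUCKET` and `BACKUP_ROOT` are placeholders, the directory naming is only a suggestion that mirrors the archive timestamp format, and `object_store_dir` and `run_backup` are hypothetical helper names.

```shell
# Sketch only: BUCKET, BACKUP_ROOT and the helper names are placeholders.
set -eu

BUCKET="${BUCKET:-s3://your-infrahub-bucket}"
BACKUP_ROOT="${BACKUP_ROOT:-/backup}"

# Build a per-run directory path, e.g. /backup/object_store_20250129_153045.
# $1 = backup root (one trailing slash is tolerated), $2 = timestamp.
object_store_dir() {
  printf '%s/object_store_%s\n' "${1%/}" "$2"
}

# Run the database backup, then copy the bucket right after it.
run_backup() {
  stamp="$(date +%Y%m%d_%H%M%S)"
  dest="$(object_store_dir "$BACKUP_ROOT" "$stamp")"
  ./infrahub-backup create          # Neo4j and Prefect/PostgreSQL
  mkdir -p "$dest"
  aws s3 sync "$BUCKET" "$dest/"    # object storage content
  echo "$dest"
}
```

Calling `run_backup` prints the directory that received the object storage copy, which can then be stored alongside the archive produced by infrahub-backup.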

Restore from backup

Step 1: Prepare the environment

Ensure that the Infrahub services are running before you start the restore process. You can restore into a blank deployment.

Restore from a backup archive:

./infrahub-backup restore infrahub_backup_20250129_153045.tar.gz

The tool automatically:

  • Validates backup integrity using checksums
  • Wipes cache and message queue data
  • Stops application containers
  • Restores PostgreSQL database first
  • Restores Neo4j database with metadata
  • Restarts all services in correct order

Step 2: Restore the databases

The infrahub-backup restore command handles this step automatically.

Step 3: Restore the object storage

# Restore S3 bucket from backup
aws s3 sync /backup/object_store/ s3://your-infrahub-bucket
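Since this sync writes into the live bucket, previewing it first can be worthwhile. The sketch below wraps the command so that it runs with the AWS CLI `--dryrun` option unless told otherwise; `restore_object_store` and its `apply` argument are hypothetical names introduced for illustration.

```shell
# Sketch only: the function name and the "apply" switch are illustrative.
# $1 = local backup directory, $2 = target bucket, $3 = "apply" to write.
restore_object_store() {
  if [ "${3:-preview}" = "apply" ]; then
    aws s3 sync "$1" "$2"
  else
    # --dryrun lists the planned operations without performing them
    aws s3 sync "$1" "$2" --dryrun
  fi
}
```

Run `restore_object_store /backup/object_store/ s3://your-infrahub-bucket` to review the planned uploads, then repeat the call with `apply` as the third argument to perform them.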

Step 4: Restart Infrahub services

The infrahub-backup restore command handles this step automatically.

Validation

Verify your restoration was successful:

  1. Check database status:

    docker compose exec -T database cypher-shell -u neo4j \
    -c "SHOW DATABASES;"

    The Neo4j database should show as "online".

  2. Verify Infrahub API:

    curl http://localhost:8000/api/schema/summary

    You should receive a valid schema response.

  3. Check task manager:

    docker compose logs task-manager --tail 50

    Logs should show normal operation without errors.

  4. Test artifact retrieval: Access the Infrahub UI and verify that stored artifacts (Transformations, queries) are accessible.
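The API may not answer immediately after the services restart, so the check in item 2 can be wrapped in a short polling loop. This is a minimal sketch that assumes the endpoint from the example above; `wait_for_api` is a hypothetical helper, and the default attempt count and delay are arbitrary values.

```shell
# Sketch only: helper name and defaults are illustrative.
# $1 = URL to poll, $2 = max attempts (default 30), $3 = delay in seconds (default 5).
wait_for_api() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl fail on HTTP errors, -sS hides progress but keeps errors
    if curl -fsS "$url" > /dev/null 2>&1; then
      echo "API ready after $((i + 1)) attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "API not ready after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_api http://localhost:8000/api/schema/summary` returns 0 once the schema endpoint responds and 1 if it never does within the allowed attempts.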

Advanced usage

Using the Python-based backup utility

Legacy tool: The Python-based utility (utilities/db_backup) is still available in the main Infrahub repository but is being replaced by infrahub-backup. Use it only if infrahub-backup doesn't meet your specific requirements.

Use non-default ports

If your deployment uses custom ports, specify them during backup and restore operations:

# Backup with custom backup port
python -m utilities.db_backup neo4j backup \
--database-backup-port=12345 \
/infrahub_backups

# Restore with custom Cypher port
python -m utilities.db_backup neo4j restore \
/infrahub_backups \
--database-cypher-port=9876

Run backup tool via Docker

If you don't have the repository cloned locally, run the backup tool directly from the Infrahub Docker image:

docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
registry.opsmill.io/opsmill/infrahub \
python -m utilities.db_backup