Bringing NetBox to Elasticsearch: Turning your Source-of-truth into Search-at-Scale
- Steven vd Braak

When teams talk about “operational visibility,” they usually think about logs, SIEM data, metrics, or alerts. But there’s another dataset quietly powering everything beneath the surface: your infrastructure source-of-truth.
For many organizations, that’s NetBox — the authoritative registry for devices, racks, circuits, tenants, VLANs, VMs, and topology. But NetBox is not a search engine, not an analytics platform, and not designed for correlation across massive environments.
Recently, we built a complete NetBox → Elasticsearch integration, turning your static source-of-truth into a fast, cross-domain search and analytics layer. This post walks through the real architecture we deployed, the decisions involved, and the tooling that makes it all work.
Why Bring NetBox into Elasticsearch?
Once NetBox data becomes searchable in Elasticsearch, entirely new workflows appear:
Pivot from an EDR alert → asset → device → rack → site → tenant.
Correlate network events with topology and IPAM metadata.
Analyze utilization: racks, prefixes, circuits, wireless assets.
Detect infrastructure drift and outdated configuration.
Join operational logs with your infrastructure model.
NetBox contains authoritative truth. Elasticsearch gives you speed and correlation.
Together, they’re far more powerful than either system alone.
Architecture Overview
Below is the high-level architecture:
+-------------------+      +------------------------+
|    NetBox API     | ---> |    query_netbox.py     |
| (dcim/ipam/etc.)  |      | auto-discovers models  |
+-------------------+      +-----------+------------+
                                       |
                                       v
                         +--------------------------+
                         |  Elasticsearch Ingest    |
                         |  - ILM                   |
                         |  - Rollover indices      |
                         |  - Ingest pipelines      |
                         +--------------------------+
Everything flows through a single, schema-aware loader that discovers endpoints, paginates through the entire dataset, normalizes documents, and indexes them into rollover-managed indices inside Elasticsearch.
The Loader: Auto-Discovering and Indexing NetBox Data
The heart of the integration is the loader, query_netbox.py. It:
Discovers the real model endpoints under each category, e.g. /api/dcim/ ⇒ devices, racks, sites, interfaces.
Paginates through all list endpoints using ?limit=1000 + next pagination.
Indexes each endpoint into its own index:
netbox-<category>-<endpoint>
Example: netbox-dcim-devices, netbox-ipam-prefixes, netbox-plugins-netbox_topology_views-topology
Wraps each record with a metadata envelope describing its origin.
ASCII Diagram: Loader Workflow
For each category:
--------------------
dcim
ipam
virtualization
extras
plugins
...
|
v
[ Discover endpoints ]
|
v
[ Paginate through all pages ]
|
v
[ Normalize JSON structure ]
|
v
[ Bulk index into Elasticsearch ]
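The real query_netbox.py adds endpoint discovery, retries, and CLI handling, but the core loop is small. A minimal sketch of that loop, using hypothetical helper names and the per-endpoint index naming from above (not the production code):

import requests
from elasticsearch import Elasticsearch, helpers

def fetch_endpoint(base_url, token, path, page_size=1000):
    """Yield every record from a NetBox list endpoint, following DRF-style 'next' links."""
    url = f"{base_url}{path}?limit={page_size}"
    headers = {"Authorization": f"Token {token}"}
    while url:
        resp = requests.get(url, headers=headers, timeout=30)
        resp.raise_for_status()
        payload = resp.json()
        yield from payload["results"]
        url = payload.get("next")  # None on the last page, which ends the loop

def index_endpoint(es, category, endpoint, records):
    """Wrap each record in a metadata envelope and bulk-index it into its own index."""
    actions = (
        {
            "_index": f"netbox-{category}-{endpoint}",
            "_source": {
                "netbox": {"category": category, "endpoint": endpoint},
                "record": record,
            },
        }
        for record in records
    )
    helpers.bulk(es, actions)

# Example usage (values are placeholders):
# es = Elasticsearch("https://elastic-prod.cluster.local:9200", api_key="xxxxxxxx==")
# records = fetch_endpoint("https://netbox.cluster.local", "xxxxx", "/api/dcim/devices/")
# index_endpoint(es, "dcim", "devices", records)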
Configuring the Loader
All configuration lives in config-netbox.yaml:
elastic:
  host: https://elastic-prod.cluster.local:9200
  api_key: "xxxxxxxx=="
netbox:
  base_url: https://netbox.cluster.local
  token: "xxxxx"
  verify_certs: false   # optional
  page_size: 1000
  endpoints:
    dcim: "https://netbox.cluster.local/api/dcim/"
    ipam: "https://netbox.cluster.local/api/ipam/"
    virtualization: "https://netbox.cluster.local/api/virtualization/"
    # <name all endpoints>...
Running a full ingest
python3 query_netbox.py -c config-netbox.yaml --all
Running a category-specific ingest
python3 query_netbox.py -c config-netbox.yaml --categories dcim,ipam
Dry-run (no Elasticsearch)
python3 query_netbox.py -c config-netbox.yaml --endpoints dcim.devices \
--dry-run --out devices.ndjson --out-format ndjson
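A dry run emits the same envelope documents the loader would otherwise index. An illustrative record (pretty-printed here; the envelope fields and values are placeholders):

{
  "netbox": {
    "category": "dcim",
    "endpoint": "devices"
  },
  "record": {
    "id": 42,
    "name": "edge-fw-01",
    "status": { "value": "active", "label": "Active" },
    "last_updated": "2024-05-01T12:00:00Z",
    "tags": [ { "name": "core" } ]
  }
}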
Rollover Index Strategy (ILM-Friendly)
Each category gets its own rollover index following:
netbox-<category>-000001
netbox-<category>-write (alias)
The bootstrap script (bootstrap_netbox_rollover.sh) creates these initial indices for all categories:
core
dcim
ipam
virtualization
tenancy
circuits
wireless
extras
users
plugins
schema
status
vpn
➞ All of these are created automatically with write aliases ("netbox-*-write") by the script below.
Example output:
CREATE netbox-core-000001 with alias netbox-core-write
CREATE netbox-dcim-000001 with alias netbox-dcim-write
CREATE netbox-ipam-000001 with alias netbox-ipam-write
...
ASCII Diagram: Rollover Structure
netbox-dcim-write ---> netbox-dcim-000001
                               |
                               +--(will roll over to)--> netbox-dcim-000002
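Rollover itself is driven by the ILM policy attached to these indices (netbox-default, also referenced by the template generator below). A minimal illustrative policy, with placeholder thresholds rather than tuned production values:

PUT _ilm/policy/netbox-default
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_primary_shard_size": "25gb",
            "max_age": "30d"
          }
        }
      },
      "delete": {
        "min_age": "180d",
        "actions": { "delete": {} }
      }
    }
  }
}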
Bootstrap Script
#!/usr/bin/env bash
set -euo pipefail

# HOST must point at the cluster (plus whatever auth flags/headers your setup needs).
CATEGORIES="core dcim ipam virtualization tenancy circuits wireless extras users plugins schema status vpn"

for c in $CATEGORIES; do
  IDX="netbox-$c-000001"
  ALIAS="netbox-$c-write"
  echo "CREATE $IDX with alias $ALIAS"
  curl -s -X PUT "$HOST/$IDX" \
    -H 'Content-Type: application/json' \
    -d '{ "aliases": { "'"$ALIAS"'": { "is_write_index": true } } }'
done
Index Templates from Schema
The template generator reads netbox-schema.json, merges every model per category, and emits:
Component templates
Index templates
Optional ILM policy attachments
Output: 13 component templates + 13 index templates were generated.
Example invocation:
python3 generate_netbox_es_templates_by_category.py \
--schema netbox-schema.json \
--outdir templates \
--index-prefix netbox- \
--emit-ilm --ilm-policy netbox-default
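The generated templates are entirely schema-driven, so the exact mappings depend on your NetBox version. As a simplified, hand-written illustration (field subset and values are hypothetical), a per-category index template ends up looking roughly like this:

PUT _index_template/netbox-dcim
{
  "index_patterns": ["netbox-dcim-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "netbox-default",
      "index.lifecycle.rollover_alias": "netbox-dcim-write",
      "index.default_pipeline": "netbox-default"
    },
    "mappings": {
      "properties": {
        "@timestamp":   { "type": "date" },
        "tags_keyword": { "type": "keyword" },
        "netbox":       { "properties": { "category": { "type": "keyword" } } }
      }
    }
  }
}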
Ingest Pipeline: Normalizing NetBox Documents
NetBox timestamps and nested fields vary per model. The ingest pipeline solves this by:
Setting @timestamp from record.last_updated or record.created.
Preserving originals as netbox.last_updated / netbox.created.
Extracting tags[].name → tags_keyword.
Flattening label/value fields (e.g. status).
ASCII Diagram: Pipeline Logic
Incoming Document
|
v
[ Extract timestamps ] ---> @timestamp
|
v
[ Flatten nested objects ]
|
v
[ Extract tags[].name ]
|
v
[ Output normalized document ]
Pipeline Code Snippet
PUT _ingest/pipeline/netbox-default
{
  "processors": [
    {
      "script": {
        "ignore_failure": true,
        "source": """
          def ts = ctx.record.last_updated ?: ctx.record.created;
          if (ts != null) { ctx['@timestamp'] = ts; }
        """
      }
    },
    {
      "script": {
        "ignore_failure": true,
        "source": """
          if (ctx.record.tags != null) {
            ctx.tags_keyword = [];
            for (t in ctx.record.tags) { ctx.tags_keyword.add(t.name); }
          }
        """
      }
    }
  ]
}
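Before wiring the pipeline into the index templates, you can sanity-check it with the simulate API; the sample document below is just a placeholder:

POST _ingest/pipeline/netbox-default/_simulate
{
  "docs": [
    {
      "_source": {
        "record": {
          "last_updated": "2024-05-01T12:00:00Z",
          "tags": [ { "name": "core" }, { "name": "prod" } ]
        }
      }
    }
  ]
}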
What This Integration Unlocks
Once NetBox data lives inside Elasticsearch, entirely new capabilities appear:
Security & IR
Enrich EDR alerts with devices, racks, sites, tenants.
Correlate lateral movement with topology.
Networking
Search prefixes, circuits, wireless assets globally.
Build dashboards on utilization, rack density, peering, VRFs.
SRE / Operations
Detect drift between config and NetBox truth.
Join system logs with VM placement, cluster metadata, assignment groups.
Architecture & Planning
Visualize infrastructure across sites or tenants.
Track lifecycle of devices and virtual assets over time.
NetBox becomes a real-time searchable dataset, not just a documentation tool.
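For example, the EDR pivot described above starts as a single lookup against the device index; the hostname, IP, and field paths here are illustrative and depend on your mappings:

GET netbox-dcim-devices/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "record.name": "edge-fw-01" } },
        { "match": { "record.primary_ip4.address": "10.20.30.5" } }
      ],
      "minimum_should_match": 1
    }
  }
}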
Lessons Learned
1. Every NetBox category has inconsistent schemas
Some models expose timestamps, some don't. Some embed label/value objects. Tags differ per model. The ingest pipeline normalizes all of this consistently.
2. Pagination matters
NetBox uses ?limit=1000 and DRF-style next paging — miss this and you miss data.
3. Per-endpoint indices scale better
netbox-dcim-devices separate from netbox-dcim-interfaces avoids bloated mappings.
4. ILM is essential
Some categories (dcim, ipam) grow extremely fast.
Where This Goes Next
This integration turns NetBox into a first-class analytics source — enabling new workflows across security, network engineering, and operations.
If you want help designing:
A production-grade NetBox → Elasticsearch pipeline
A schema-driven index strategy
An ingest normalization pipeline
A dashboard suite / search layer
Correlation workflows with EDR/SIEM data
We’re happy to assist you.