For years, every monitoring vendor spoke its own dialect. Datadog had its agent. New Relic had its SDK. Dynatrace had its OneAgent. If you wanted to switch tools — or even add a second one — you rewrote your instrumentation from scratch. Vendor lock-in wasn’t a side effect of these platforms. It was the business model.
OTLP (OpenTelemetry Protocol) changes that. It’s the open, vendor-neutral wire format that carries your telemetry data — traces, metrics, and logs — in a standardised envelope that any compliant backend can receive. Instrument your application once using the OpenTelemetry SDK, and you decide where the data goes. Change your mind next year? Update one line of configuration.
This matters enormously for how OpsPilot works, and why we built on OpenTelemetry from the ground up.
How OTLP Organises Your Data
OTLP doesn’t just move raw numbers around. It wraps every piece of telemetry in a three-layer structure that preserves context:
Resource answers “who.” This is the identity of the service — its name, version, the environment it’s running in, which cloud node it lives on. Without this layer, a spike in error rates is just a number. With it, you know it’s your checkout service on a specific pod in production.
Scope answers “how.” This identifies which instrumentation library produced the data — the Express.js plugin, the database driver, the HTTP client. It’s the provenance layer that tells OpsPilot whether it’s looking at application-level behaviour or infrastructure-level behaviour.
Signal is the actual data — the trace spans, metric values, or log entries. Everything else is context that makes the signal meaningful.
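To make the three layers concrete, here is a minimal sketch of the envelope as it appears in the OTLP/HTTP JSON encoding. The service name, scope name, and span values are illustrative, not taken from a real export:

```python
# Sketch of the three-layer OTLP envelope (OTLP/HTTP JSON encoding).
# All names and IDs below are invented for illustration.
payload = {
    "resourceSpans": [{
        # Resource: "who" — the identity of the service.
        "resource": {
            "attributes": [
                {"key": "service.name", "value": {"stringValue": "checkout"}},
                {"key": "deployment.environment", "value": {"stringValue": "production"}},
            ]
        },
        "scopeSpans": [{
            # Scope: "how" — which instrumentation library produced the data.
            "scope": {"name": "instrumentation-express", "version": "0.41.0"},
            # Signal: the actual data — here, one trace span.
            "spans": [{
                "name": "GET /checkout",
                "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",
                "spanId": "051581bf3cb55c13",
            }],
        }],
    }]
}

def resource_attr(payload, key):
    """Look up a resource attribute — the 'who' layer — by key."""
    for kv in payload["resourceSpans"][0]["resource"]["attributes"]:
        if kv["key"] == key:
            return kv["value"]["stringValue"]

print(resource_attr(payload, "service.name"))  # checkout
```

Strip the resource and scope layers away and all that remains is an anonymous span; with them, the same span identifies a named service in a named environment.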
This structure is why OpsPilot can deliver prioritised recommendations rather than just surfacing raw numbers. When we analyse your stack, we’re not looking at anonymous metrics. We’re looking at named services, specific libraries, and correlated signals — and we can tell you that this service, instrumented this way, is exhibiting a pattern we’ve seen lead to problems.
The Three Signals OTLP Carries
Traces tell the story of a request as it travels through your system. OTLP uses spans — each representing a single operation like a database query or an API call — and links them with a shared trace_id. This is what allows OpsPilot to follow a slow user request from the frontend all the way through your microservices and identify exactly where time is being lost.
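A small sketch of that idea, using hypothetical spans from a single request (names and durations are invented): because every span carries the same trace_id and a pointer to its parent, finding where the time went is a straightforward traversal.

```python
# Hypothetical spans from one request, linked by a shared trace_id.
# Durations are in milliseconds; all values are illustrative.
spans = [
    {"trace_id": "abc123", "span_id": "s1", "parent_id": None,
     "name": "GET /checkout", "duration_ms": 540},
    {"trace_id": "abc123", "span_id": "s2", "parent_id": "s1",
     "name": "auth.verify", "duration_ms": 30},
    {"trace_id": "abc123", "span_id": "s3", "parent_id": "s1",
     "name": "db.query orders", "duration_ms": 480},
]

def slowest_child(spans, trace_id):
    """Among the child operations of one trace, return the name of
    the span that accounts for the most time."""
    children = [s for s in spans
                if s["trace_id"] == trace_id and s["parent_id"] is not None]
    return max(children, key=lambda s: s["duration_ms"])["name"]

print(slowest_child(spans, "abc123"))  # db.query orders
```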
Metrics are the vital signs: counters that accumulate over time, gauges that capture current state, and histograms that distribute values into buckets. OTLP supports all three natively. OpsPilot uses metric data for health scoring — tracking your stack’s performance, error rates, and cost efficiency over time, not just at the moment of an incident.
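The three metric shapes can be sketched in a few lines. The bucket boundaries and latency values below are illustrative, not OpsPilot defaults:

```python
# Minimal sketches of the three OTLP metric shapes.
counter = 0          # counter: accumulates monotonically over time
for _ in range(5):   # e.g. five requests served
    counter += 1

gauge = 0.72         # gauge: a current state, e.g. CPU utilisation

def histogram(values, bounds):
    """Histogram: distribute values into buckets. Returns one count
    per bucket; the final bucket catches values above every bound."""
    counts = [0] * (len(bounds) + 1)
    for v in values:
        for i, b in enumerate(bounds):
            if v <= b:
                counts[i] += 1
                break
        else:
            counts[-1] += 1
    return counts

latencies_ms = [3, 12, 48, 260, 900]
print(histogram(latencies_ms, bounds=[10, 100, 500]))  # [1, 2, 1, 1]
```

The histogram is what makes percentile-style questions cheap to answer long after the individual measurements are gone, which is why it matters for tracking behaviour over time rather than only at the moment of an incident.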
Logs are where OTLP does something particularly useful. Rather than replacing your existing logging framework, it bridges it. Log4j, Pino, Winston: your existing setup carries on. The OpenTelemetry bridge stamps each log entry with the trace_id of the request that produced it, which means OpsPilot can show you the exact log context for any trace it flags. You stop chasing anomalies across disconnected systems and start seeing the full picture in one view.
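The correlation step itself is simple once the trace_id is on every record. A sketch with invented log records, assuming the trace_id has already been injected by a log bridge:

```python
# Hypothetical log records carrying an injected trace_id.
# All IDs and messages are invented for illustration.
logs = [
    {"trace_id": "abc123", "level": "INFO",  "msg": "order received"},
    {"trace_id": "def456", "level": "INFO",  "msg": "healthcheck ok"},
    {"trace_id": "abc123", "level": "ERROR", "msg": "payment timeout"},
]

def logs_for_trace(logs, trace_id):
    """Pull the exact log context for a flagged trace — no grepping
    across disconnected systems."""
    return [rec["msg"] for rec in logs if rec["trace_id"] == trace_id]

print(logs_for_trace(logs, "abc123"))  # ['order received', 'payment timeout']
```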
The Collector: The Often-Underestimated Piece
Most conversations about OTLP focus on the SDK and the backend. The OpenTelemetry Collector sits in between, and it’s worth understanding what it actually does.
The Collector receives telemetry in many formats, Prometheus exposition data among them, and can translate, enrich, filter, and forward it. For teams migrating from a proprietary monitoring stack, this is the practical migration path. You don’t need a big-bang replacement. You can route data through the Collector in parallel, validate that OpsPilot is receiving what it needs, and shift workloads progressively.
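That parallel routing is a few lines of Collector configuration. The sketch below scrapes an existing Prometheus endpoint alongside native OTLP and forwards both to one OTLP backend; the scrape target and export endpoint are placeholders, not real OpsPilot addresses:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
  prometheus:
    config:
      scrape_configs:
        - job_name: legacy-metrics
          static_configs:
            - targets: ["app:9090"]   # placeholder target

processors:
  batch:

exporters:
  otlphttp:
    endpoint: https://ingest.example.com   # placeholder endpoint

service:
  pipelines:
    metrics:
      receivers: [otlp, prometheus]
      processors: [batch]
      exporters: [otlphttp]
```

Adding a second exporter to the same pipeline is how you run the old backend and the new one side by side during validation.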
This is also where the distinction between genuine OTLP support and performative OTLP support matters. Some platforms accept OTLP at the door and immediately convert it into their own internal format. Detail gets lost in translation. OpsPilot ingests OTLP natively, which means the full fidelity of your telemetry — resource attributes, scope metadata, all of it — is available for analysis.
What This Means for Teams Using OpsPilot
The shift to OTLP as a standard isn’t just a technical convenience. It changes what’s possible at the intelligence layer.
When your telemetry arrives with consistent structure and rich context, automated analysis becomes far more precise. OpsPilot can correlate a cost anomaly in your metrics with a specific service version in your resource attributes. It can identify an observability gap — a service that’s producing traces but no metrics — because OTLP makes the absence of expected signals visible. It can track health score changes over time because the data is comparable across runs, not shaped differently by each vendor’s collection quirks.
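The observability-gap check mentioned above reduces to a set difference once services are consistently named. A sketch with invented service names, not OpsPilot's actual detection logic:

```python
# Sketch of spotting an observability gap: a service that emits
# traces but no metrics. Service names are invented.
services_with_traces = {"checkout", "payments", "inventory"}
services_with_metrics = {"checkout", "payments"}

gap = services_with_traces - services_with_metrics
print(sorted(gap))  # ['inventory']
```

This is only possible because OTLP's resource layer gives every signal the same service identity; without it, the two sets aren't comparable in the first place.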
Twenty years of analysing production incidents taught us that the hardest problems to solve aren’t the ones where you have no data. They’re the ones where you have plenty of data but no way to connect it. OTLP, done properly, is what makes the connections reliable enough to act on.
The observability stack is maturing. Collection is largely solved. Visualisation is a commodity. The frontier is interpretation — and that work becomes dramatically more tractable when the data arriving at your intelligence layer speaks a consistent language.