How to Build Production-Ready OpenTelemetry Collectors Using OCB

Master the OpenTelemetry Collector Builder (OCB) to eliminate version conflicts and build failures. Learn to create lean, custom distributions for production Kubernetes environments.

David George Hope

Key Takeaways

  • Custom Distributions Enhance Security: Building custom OpenTelemetry distributions significantly reduces binary size and security attack surfaces by excluding unused components found in the default "contrib" image.
  • Containerized Builds are Mandatory: Multi-stage Docker builds are a widely used way to prevent local environment drift and to avoid Windows-specific build failures caused by CGO mismatches.
  • Automation Prevents Dependency Hell: Automated manifest generation is critical for managing complex Go module dependency trees, as manual version matching often leads to build failures.
  • RBAC is Critical for Kubernetes Metadata: A production-grade k8sattributes processor setup requires explicit RBAC permissions (get, list, watch) and optimized pod association logic for large-scale DaemonSets.
  • Version Pinning Ensures Reproducibility: Pinning otelcol_version and every individual component to an exact version is required for reproducible builds and keeps upstream breaking changes out of the build process.

What is the OpenTelemetry Collector Builder (OCB)?

The OpenTelemetry Collector Builder (OCB) is a command-line utility that generates a Go main package and compiles a custom OpenTelemetry Collector binary based on a declarative YAML manifest. Instead of deploying the monolithic "contrib" distribution—which contains hundreds of components and weighs several hundred megabytes—OCB allows engineers to compile a lean binary containing only the specific receivers, processors, and exporters required for their observability pipeline.

Definition: OpenTelemetry Collector Builder (OCB)
A CLI tool (ocb) that parses a build configuration file (builder-config.yaml) to generate Go source code and compile a custom Collector binary, enabling users to curate a specific set of components and third-party modules.

By excluding unnecessary components, OCB reduces memory overhead and simplifies vulnerability management. If a vulnerability is discovered in a receiver you do not use, a custom build excludes that code entirely, removing the CVE from your environment. It serves as the bridge between generic upstream releases and specialized internal observability requirements, allowing the integration of private modules alongside public community components.
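
As a rough illustration of what such a manifest can look like, here is a minimal builder-config.yaml sketch. The distribution name, the Go module name, and the component selection are assumptions made for this example. The field names follow the manifest layout used by OCB 0.98.0, the version installed in the Dockerfile later in this article, so check the builder README for the release you actually run.

```yaml
# builder-config.yaml (illustrative sketch, not a recommended component set)
dist:
  name: otelcol-custom                 # binary name (assumed; matches the Dockerfile's COPY path)
  description: Custom OpenTelemetry Collector distribution
  module: example.com/otelcol-custom   # hypothetical Go module name for the generated sources
  output_path: /build                  # directory where OCB writes sources and the binary
  otelcol_version: 0.98.0              # core Collector version, pinned

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.98.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.98.0
  - gomod: github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sattributesprocessor v0.98.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.98.0
  - gomod: go.opentelemetry.io/collector/exporter/debugexporter v0.98.0
```

Every component carries the same version as the core Collector here, which is the simplest way to keep the dependency tree consistent.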

Why do OCB builds frequently fail in local environments?

OCB builds frequently fail locally because local Go environments suffer from "version drift," where the host's toolchain, C library headers, or environment variables conflict with the strict requirements of the Collector components.

Developers attempting to build on Windows or macOS often encounter failures related to cgo, Go's mechanism for calling C code. Many Collector components rely on specific system calls or headers that are not present or behave differently across operating systems. Windows platforms face specific challenges with long file paths exceeding the 260-character limit and shell-specific character escaping issues within the generated YAML manifests.

Definition: Version Drift
The phenomenon where local development environments (Go version, OS libraries, build tools) diverge from the target production environment or the upstream project's requirements, leading to non-reproducible builds.

Furthermore, manual entry of component versions in the builder manifest is a primary source of failure. Incompatible module pairings (such as attempting to compile a v0.90 processor with a v0.85 core collector) result in compilation errors that are difficult to diagnose. The go mod tidy step executed by OCB is also highly sensitive to network configuration; enterprise proxies often block the retrieval of public Go modules, causing the build process to hang or time out.

How do you solve version dependency conflicts in OCB?

You solve version dependency conflicts in OCB by pinning every version strictly and by using automated manifest generation tools to map the desired OpenTelemetry version to a validated, matching set of components.

To avoid the common "dependency hell," you must pin the otelcol_version in the manifest to a specific semantic version rather than using latest. This ensures the core collector logic remains stable. Additionally, implement a "strict versioning" policy where all components (receivers, processors, exporters) are validated against the core collector version before build time.

Definition: Dependency Hell
A situation where software components have mutually incompatible dependencies on different versions of the same shared library, preventing the software from compiling or running correctly.

For complex environments, use the replaces block in the OCB manifest. This directive allows you to manually resolve upstream module conflicts or point to internal forks of a component. This is particularly useful when a transitive dependency in one component conflicts with the core collector's requirements.
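
Each replaces entry is written into the generated go.mod as a replace directive, so it uses the same "old => new version" syntax. The sketch below shows the two uses described above. All module paths and versions on the right-hand side are hypothetical placeholders, not real modules.

```yaml
# Fragment of builder-config.yaml (right-hand module paths and versions are placeholders)
replaces:
  # Force a single version of a transitive dependency that two components disagree on
  - example.com/shared/lib => example.com/shared/lib v1.4.2
  # Build a component from an internal fork instead of the upstream module
  - github.com/open-telemetry/opentelemetry-collector-contrib/processor/k8sattributesprocessor => git.example.internal/forks/opentelemetry-collector-contrib/processor/k8sattributesprocessor v0.98.0-internal.1
```

For the fork case, Go requires the module directive in the fork's go.mod to keep the original upstream module path, so a fork that renames its module will not resolve.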

Comparison: Manual vs. Automated Version Management

How do you build a custom collector using multi-stage Docker?

You build a custom collector using multi-stage Docker by executing the OCB compilation inside a controlled Golang container and then copying the resulting binary to a minimal runtime image. This approach ensures environment parity across the engineering team and eliminates OS-specific build errors.

The first stage of the Dockerfile uses a standard Golang image to download the OCB binary and run the compilation command. This stage handles all the messy dependencies, go mod caching, and compilation artifacts. The second stage copies only the compiled binary into a scratch or distroless image.

Definition: Multi-stage Build
A Docker optimization technique that uses intermediate images to compile code and install dependencies, while the final image contains only the necessary artifacts, resulting in a significantly smaller and more secure production image.

Example: Production-Grade Dockerfile

# Stage 1: Builder
FROM golang:1.22 AS builder

# Install the OpenTelemetry Collector Builder (OCB)
ARG OCB_VERSION=0.98.0
RUN curl --proto '=https' --tlsv1.2 -fL -o /usr/bin/ocb \
    https://github.com/open-telemetry/opentelemetry-collector/releases/download/cmd%2Fbuilder%2Fv${OCB_VERSION}/ocb_${OCB_VERSION}_linux_amd64

RUN chmod +x /usr/bin/ocb

WORKDIR /build

# Copy the builder manifest
COPY builder-config.yaml .

# Run the builder with CGO disabled for static linking
# This prevents "missing shared library" errors in the scratch image
RUN CGO_ENABLED=0 ocb --config builder-config.yaml

# Stage 2: Final Image
FROM scratch

# Copy CA certificates for HTTPS/gRPC security
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/

# Copy the compiled binary from the builder stage.
# This path assumes the manifest sets dist.name to otelcol-custom and dist.output_path to /build.
COPY --from=builder /build/otelcol-custom /otelcol-custom

# Expose standard OTLP ports
EXPOSE 4317 4318

ENTRYPOINT ["/otelcol-custom"]
CMD ["--config", "/etc/otelcol/config.yaml"]

Setting CGO_ENABLED=0 produces a statically linked binary with no external library dependencies. This is required on a scratch image, which contains no C library for a dynamically linked binary to load.
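
The image expects a runtime configuration at /etc/otelcol/config.yaml, which has to be mounted into the container (for example from a ConfigMap). A minimal sketch follows, assuming the binary was built with the otlp receiver, the batch processor, and the debug exporter; only components compiled into the binary can be referenced.

```yaml
# /etc/otelcol/config.yaml (illustrative; component choice is an assumption)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # matches the first EXPOSE port
      http:
        endpoint: 0.0.0.0:4318   # matches the second EXPOSE port

processors:
  batch: {}

exporters:
  debug:
    verbosity: basic

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

Referencing a component that was left out of the manifest makes the Collector fail at startup with an unknown-component error, which is a quick way to confirm that the custom build contains what you expect.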

How do you configure the k8sattributes processor for production RBAC?

To configure the k8sattributes processor for production, you must bind the Collector's ServiceAccount to a ClusterRole that grants get, watch, and list permissions for pods, namespaces, and nodes. Without these specific RBAC permissions, the processor cannot query the Kubernetes API server, resulting in 403 Forbidden errors and missing metadata.

Definition: k8sattributes Processor
A processor that automatically enriches telemetry data with Kubernetes metadata (Pod name, Namespace, Node IP, Container ID) by correlating the incoming data's IP address or Pod UID with the Kubernetes API state.

In large-scale DaemonSet deployments, inefficient configuration can overload the API server. You should configure the extract block to capture only essential metadata—such as k8s.pod.name, k8s.namespace.name, and specific custom labels—rather than pulling all available metadata.

RBAC Configuration (ClusterRole)

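apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-metadata-role
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces", "nodes"]
  verbs: ["get", "watch", "list"]
# Extracting k8s.deployment.name generally also needs access to ReplicaSets
- apiGroups: ["apps"]
  resources: ["replicasets"]
  verbs: ["get", "watch", "list"]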
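
A ClusterRole has no effect until it is bound to the identity the Collector runs as. The sketch below adds a ServiceAccount and a ClusterRoleBinding; the ServiceAccount name (otel-collector) and namespace (observability) are assumptions for illustration, while the role name matches the ClusterRole above.

```yaml
# ServiceAccount the Collector pods run as (name and namespace are hypothetical)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: observability
---
# Grants the ClusterRole's permissions to that ServiceAccount cluster-wide
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: otel-collector-metadata-binding
subjects:
  - kind: ServiceAccount
    name: otel-collector
    namespace: observability
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: otel-collector-metadata-role
```

The Collector's DaemonSet or Deployment then needs serviceAccountName set to the same ServiceAccount in its pod spec, otherwise the pods run as the namespace's default account and the 403 errors remain.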

Processor Configuration (otel-config.yaml)

processors:
  k8sattributes:
    auth_type: "serviceAccount"
    passthrough: false
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.namespace.name
        - k8s.node.name
        - k8s.pod.start_time
    # Rules are tried in order; the first source that matches a known pod is used
    pod_association:
      - sources:
        - from: resource_attribute
          name: k8s.pod.ip
      - sources:
        - from: resource_attribute
          name: k8s.pod.uid
      - sources:
        - from: connection

The pod_association rules are evaluated in order, so the processor first tries to match telemetry to a pod using attributes already present on the resource (the Pod IP, then the Pod UID) and falls back to the source IP of the connection only when neither attribute is set.
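
For DaemonSet deployments, one commonly used way to reduce API server load is to restrict each Collector to pods on its own node with the processor's node filter. The sketch below assumes the environment variable is named KUBE_NODE_NAME; the first document is a fragment of the Collector container spec and the second is a fragment of the Collector configuration.

```yaml
# Fragment of the DaemonSet's container spec: expose the node name to the Collector
env:
  - name: KUBE_NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
---
# Fragment of the Collector config: only watch pods scheduled on this node
processors:
  k8sattributes:
    filter:
      node_from_env_var: KUBE_NODE_NAME
```

With this filter each agent caches metadata only for its local pods instead of every pod in the cluster, which matters most on clusters with many nodes.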

Frequently Asked Questions

Can I use OCB to include private, internal components?

Yes, you can include private components by specifying the full Go module path in the manifest and ensuring the build environment has network access and authentication credentials (such as SSH keys or .netrc files) to reach the private repository.
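
In the manifest, a private component looks like any other entry. The module paths and versions below are hypothetical, and the sketch assumes the build environment can authenticate to the private host. The Go toolchain typically also needs GOPRIVATE set to a pattern matching that host so it bypasses the public module proxy and checksum database.

```yaml
# Fragment of builder-config.yaml (private module paths and versions are hypothetical)
processors:
  - gomod: git.example.internal/observability/tenantprocessor v1.3.0

exporters:
  - gomod: git.example.internal/observability/auditexporter v0.2.1
    path: ./components/auditexporter   # optional: build from a local checkout instead of fetching
```

The path field is useful during development, since it lets you compile an unpublished component without pushing a tag first.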

Why is my custom binary still large?

A multi-stage build shrinks the image, not the binary. If the image is large, make sure only the compiled binary is copied into a scratch or distroless final stage. If the binary itself is large, verify that your builder-config.yaml does not inadvertently pull in the entire contrib dependency tree via a transitive import.

How do I debug cryptic YAML errors in OCB?

Debug YAML errors by running a YAML linter to validate structure and indentation before invoking the builder. Additionally, check that the output path and module (Go module name) fields under dist in your manifest are correctly formatted, as OCB often fails silently or with generic errors if these values are invalid.

Is OCB compatible with Windows?

While the OCB binary runs on Windows, it is strongly recommended to perform the actual compilation inside a Linux-based Docker container. This avoids common Windows issues related to path length limits, line ending differences (CRLF vs LF), and CGO header mismatches.