Unlocking GenAI Observability with OpenTelemetry
GenAI observability exists to provide visibility into the operations of generative AI systems. As AI becomes integral to applications, understanding how these systems perform and interact is essential. OpenTelemetry offers a robust framework for capturing telemetry data, allowing you to standardize the recording of GenAI operations and improve your system's observability.
To implement this, you need to configure a few key settings. For instance, in VS Code Copilot, you enable OpenTelemetry emission by setting github.copilot.chat.otel.enabled to true. To record the full prompt and response content, also set github.copilot.chat.otel.captureContent to true; content capture is off by default because that data can be sensitive. Finally, point the exporter at your OTLP collector with github.copilot.chat.otel.otlpEndpoint, which defaults to http://localhost:4318. With these three settings in place, telemetry is exported to your collector, where you can inspect how your GenAI models are being called and how they respond.
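As a rough sketch of how these settings fit together, the Python snippet below prints a JSON fragment you could merge into your VS Code settings.json and then checks whether anything is listening at the configured endpoint. The three setting names and the default endpoint come from the text above; the chosen values, the helper names render_settings_fragment and endpoint_reachable, and the plain TCP connectivity check are illustrative assumptions, not part of any Copilot or OpenTelemetry API.

```python
import json
import socket
from urllib.parse import urlparse

# Setting names are taken from the article; the values are illustrative choices.
OTEL_SETTINGS = {
    "github.copilot.chat.otel.enabled": True,
    # Leave content capture off unless you really need prompt/response text.
    "github.copilot.chat.otel.captureContent": False,
    "github.copilot.chat.otel.otlpEndpoint": "http://localhost:4318",
}


def render_settings_fragment(settings: dict) -> str:
    """Return the settings as a JSON fragment suitable for settings.json."""
    return json.dumps(settings, indent=2)


def endpoint_reachable(url: str, timeout: float = 1.0) -> bool:
    """Hypothetical helper: True if a TCP connection to the URL's host/port succeeds.

    This only proves that something is listening; it does not verify that the
    listener actually speaks OTLP.
    """
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(render_settings_fragment(OTEL_SETTINGS))
    endpoint = OTEL_SETTINGS["github.copilot.chat.otel.otlpEndpoint"]
    print(f"{endpoint} reachable: {endpoint_reachable(endpoint)}")
```

If the reachability check reports False, start a local OTLP receiver before expecting any traces to appear.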
In production, be aware that by default no prompt content or tool arguments are captured in GenAI telemetry, because they may contain sensitive data. Your traces will lack that context unless you explicitly enable content capture, so decide deliberately whether the debugging value outweighs the exposure risk. These settings were last noted as modified on May 14, 2026, so check that your configuration still matches the current documentation. Once data is flowing, review it regularly so that it actually informs how you run your AI operations.
Key takeaways
- Enable OpenTelemetry emission by setting `github.copilot.chat.otel.enabled` to true.
- Capture full prompt and response content using `github.copilot.chat.otel.captureContent`.
- Specify the OTLP collector endpoint with `github.copilot.chat.otel.otlpEndpoint`.
- Run a local OTLP receiver, such as the Aspire Dashboard, using Docker for easy setup.
- Be cautious about sensitive data; default settings do not capture prompt content.
Why it matters
In production, effective observability of GenAI systems can lead to better performance tuning and troubleshooting. Understanding the interactions between prompts and responses allows for more informed decisions and optimizations.
Code examples
docker run --rm -p 18888:18888 -p 4317:18889 -p 4318:18890 -d --name aspire-dashboard \
-e ASPIRE_DASHBOARD_UNSECURED_ALLOW_ANONYMOUS=true \
mcr.microsoft.com/dotnet/aspire-dashboard:latest
When NOT to use this
The official docs don't call out specific anti-patterns here. Use your judgment based on your scale and requirements.