A more elegant way to propagate tracing context

In a previous post,[1] I mentioned that in Rust, tracing is often used alongside opentelemetry to build a local-cluster model. In this setup, tracing is responsible for generating and structuring local spans, while opentelemetry takes care of propagating context across service boundaries.

However, I’ve come to realize that this explanation alone doesn’t quite capture what a real-world implementation of tracing injection looks like.

In this post, I’ll walk you through a more elegant approach to integrating tracing with opentelemetry. This method goes beyond simply extracting metadata—by leveraging opentelemetry::propagation, we can achieve a much cleaner and more robust solution.

The problem

In complex applications, remote calls are common. In a microservices architecture, services communicate with each other through RPC. In message queue models, we often need to propagate context between the publish and consume stages. In these scenarios, it's crucial to pass the tracing context across process boundaries.

In the previous post, I used manual metadata extraction to propagate tracing context. This approach is verbose and error-prone. For example:

req.metadata_mut().insert(
    RPC_TRACE_ID,
    span.context()          // get the opentelemetry context
        .span()             // get the opentelemetry span
        .span_context()     // get the opentelemetry span context, which carries the information we need to transmit
        .trace_id()
        .to_string()
        .parse()
        .unwrap(),
);

This method makes us the metadata extractor, manually accessing and parsing data from the span context. Instead of doing this, the opentelemetry::propagation module offers a more elegant solution using the Injector and Extractor traits. Injector is used to insert context into a data structure, and Extractor retrieves it. With these traits, context propagation becomes cleaner and more maintainable.
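To illustrate why a trait-based carrier helps, here is a minimal std-only sketch. The `Injector` and `Extractor` traits below are local stand-ins that only mirror the shape of the real ones in opentelemetry::propagation, and `HeaderList`, `inject_trace_id`, `extract_trace_id`, and the `x-trace-id` key are hypothetical names invented for this example. The point is that the producing and consuming code never needs to know what the carrier looks like.

```rust
use std::collections::HashMap;

// Local stand-ins mirroring the shape of the traits in
// `opentelemetry::propagation`; they are NOT the real traits.
trait Injector {
    fn set(&mut self, key: &str, value: String);
}

trait Extractor {
    fn get(&self, key: &str) -> Option<&str>;
    fn keys(&self) -> Vec<&str>;
}

// Carrier 1: a plain string map.
impl Injector for HashMap<String, String> {
    fn set(&mut self, key: &str, value: String) {
        self.insert(key.to_string(), value);
    }
}

impl Extractor for HashMap<String, String> {
    fn get(&self, key: &str) -> Option<&str> {
        HashMap::get(self, key).map(|s| s.as_str())
    }

    fn keys(&self) -> Vec<&str> {
        HashMap::keys(self).map(|s| s.as_str()).collect()
    }
}

// Carrier 2 (hypothetical): an ordered list of header pairs.
struct HeaderList(Vec<(String, String)>);

impl Injector for HeaderList {
    fn set(&mut self, key: &str, value: String) {
        // Drop any previous entry so each key appears at most once.
        self.0.retain(|(k, _)| k.as_str() != key);
        self.0.push((key.to_string(), value));
    }
}

impl Extractor for HeaderList {
    fn get(&self, key: &str) -> Option<&str> {
        self.0
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }

    fn keys(&self) -> Vec<&str> {
        self.0.iter().map(|(k, _)| k.as_str()).collect()
    }
}

// Hypothetical helpers: written once, usable with any carrier.
fn inject_trace_id<I: Injector>(carrier: &mut I, trace_id: &str) {
    carrier.set("x-trace-id", trace_id.to_string());
}

fn extract_trace_id<E: Extractor>(carrier: &E) -> Option<String> {
    carrier.get("x-trace-id").map(str::to_owned)
}
```

Both carriers go through the same two helper functions, which is the property a real propagator relies on: it only ever sees `set`, `get`, and `keys`.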

The solution

To use opentelemetry::propagation, we need to implement the Injector and Extractor traits for our metadata type.

Implementation based on orphan rule

Under Rust's orphan rule, we can only implement a trait for a type if either the trait or the type is local to our crate, so sometimes we need to create a new type. I ran into exactly this problem while working on a tracing middleware for volo-grpc and a message wrapper for broccoli-queue.

use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq, serde::Serialize, serde::Deserialize)]
pub struct MessageWithMetadata<T> {
    metadata: HashMap<String, String>,
    payload: T,
}

impl<T> MessageWithMetadata<T> {
    pub fn new(payload: T) -> Self {
        Self {
            metadata: HashMap::new(),
            payload,
        }
    }
}

impl<T> opentelemetry::propagation::Injector for MessageWithMetadata<T> {
    fn set(
        &mut self,
        key: &str,
        value: String,
    ) {
        self.metadata.insert(key.to_string(), value);
    }
}

impl<T> opentelemetry::propagation::Extractor for MessageWithMetadata<T> {
    fn get(
        &self,
        key: &str,
    ) -> Option<&str> {
        self.metadata.get(key).map(|s| s.as_str())
    }

    fn keys(&self) -> Vec<&str> {
        self.metadata.keys().map(|s| s.as_str()).collect()
    }
}

It’s important to note that I implemented the metadata type myself because broccoli-queue doesn’t support complete metadata transmission out of the box. As a result, this wrapper isn’t just for enabling the propagator—it also fulfills the need to carry metadata alongside the message.
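For readers less familiar with the orphan rule itself, here is a small std-only sketch of the newtype workaround. It deliberately uses `std::fmt::Display` and `HashMap` as the "foreign trait" and "foreign type" so it compiles without any of the crates above; the `Metadata` name is hypothetical and not part of any library.

```rust
use std::collections::HashMap;
use std::fmt;

// `HashMap` and `fmt::Display` are both defined outside this crate, so
// `impl fmt::Display for HashMap<String, String>` would be rejected by
// the orphan rule. Wrapping the map in a local newtype makes the impl legal.
pub struct Metadata(pub HashMap<String, String>);

impl fmt::Display for Metadata {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Sort the entries so the rendered output is deterministic.
        let mut pairs: Vec<(&String, &String)> = self.0.iter().collect();
        pairs.sort();
        for (i, (k, v)) in pairs.iter().enumerate() {
            if i > 0 {
                write!(f, ";")?;
            }
            write!(f, "{}={}", k, v)?;
        }
        Ok(())
    }
}
```

The same pattern applies when the foreign trait is Injector or Extractor and the foreign type is a framework's metadata map: the wrapper is local, so the impl is allowed.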

Use the Injector and Extractor

With the Injector and Extractor traits implemented, we can now propagate context in a structured and type-safe way.

The TextMapPropagator trait provides the core methods inject and extract for context propagation. However, you might notice that using these methods directly often results in no context being injected or extracted. This is because they rely on opentelemetry’s internal context, which is separate from the context managed by tracing.

Since tracing keeps its span data in its own subscriber (reached through tracing::Dispatch) rather than in opentelemetry's Context, we must bridge this gap explicitly. On the sending side, we obtain an opentelemetry::Context from the current tracing::Span using the tracing_opentelemetry::OpenTelemetrySpanExt trait and pass it to inject_context. On the receiving side, we extract an opentelemetry::Context from the incoming message and attach it to the current tracing::Span with set_parent.

The updated code looks like this:

use tracing_opentelemetry::OpenTelemetrySpanExt; // provides `context()` and `set_parent()`

// Inject context into the outgoing message
let cx = tracing::Span::current().context();

let mut message_wrap = MessageWithMetadata::new(payload);
opentelemetry::global::get_text_map_propagator(|propagator| {
    propagator.inject_context(&cx, &mut message_wrap);
});

// Extract context from the incoming message. The extractor is the wrapper
// itself (it implements `Extractor`), not its payload.
let parent_cx = opentelemetry::global::get_text_map_propagator(|propagator| {
    propagator.extract(&message)
});
// Make the remote context the parent of the current tracing span
tracing::Span::current().set_parent(parent_cx);
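To make the result of the injection concrete: assuming the globally registered propagator follows the W3C Trace Context format (for example `TraceContextPropagator` from `opentelemetry_sdk`; the global default is a no-op), the metadata ends up with a `traceparent` entry of the form `00-<trace id>-<span id>-<flags>`. The std-only sketch below formats and parses such a value. `TraceParent` is a hypothetical type written for this post, not the library's implementation, and it is simplified: it only handles version `00` and ignores `tracestate`.

```rust
/// The fields carried by a W3C `traceparent` value (hypothetical helper type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: u128,
    pub span_id: u64,
    pub sampled: bool,
}

impl TraceParent {
    /// Render as `00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`.
    pub fn to_header(&self) -> String {
        format!(
            "00-{:032x}-{:016x}-{:02x}",
            self.trace_id, self.span_id, self.sampled as u8
        )
    }

    /// Parse a version `00` value; returns `None` for malformed input.
    pub fn parse(header: &str) -> Option<Self> {
        let parts: Vec<&str> = header.trim().split('-').collect();
        if parts.len() != 4 || parts[0] != "00" {
            return None;
        }
        // Each field has a fixed width and must be plain hex digits.
        if parts[1].len() != 32 || parts[2].len() != 16 || parts[3].len() != 2 {
            return None;
        }
        if !parts[1..]
            .iter()
            .all(|p| p.bytes().all(|b| b.is_ascii_hexdigit()))
        {
            return None;
        }
        let trace_id = u128::from_str_radix(parts[1], 16).ok()?;
        let span_id = u64::from_str_radix(parts[2], 16).ok()?;
        let flags = u8::from_str_radix(parts[3], 16).ok()?;
        // All-zero IDs are invalid in the W3C format.
        if trace_id == 0 || span_id == 0 {
            return None;
        }
        Some(Self {
            trace_id,
            span_id,
            sampled: flags & 0x01 == 0x01,
        })
    }
}
```

This is essentially the work the manual approach would have to reproduce by hand for every field, which is why delegating it to a propagator is the safer choice.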

Conclusion

In this post, I showed you how to connect tracing with opentelemetry in a more elegant way using the opentelemetry::propagation module. By implementing the Injector and Extractor traits for our metadata type, we can easily propagate context across process boundaries without manually extracting metadata.

Compared with the previous method, this approach is cleaner and more maintainable. A remote span context has several fields (trace ID, span ID, trace flags, and trace state), and we no longer need to access them manually; the opentelemetry::propagation module handles that for us. This is a real improvement for observability in Rust applications.


  1. https://www.ahdark.blog/observability-improvement-for-web-backend-applications ↩︎
