Observability
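This chapter demonstrates the two observability capabilities built into OnePath: end-to-end distributed tracing (three processes, no code changes, toggled purely through environment variables) and the network topology map (a per-node local view plus a distributed, aggregated global graph). For a full description of the tracing mechanism, see the Tracing Guide; for the topology mechanism, see Topology Scenarios.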
End-to-End Distributed Tracing
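A three-process tracing demo: sensor → router → storage. The sensor publishes messages, the router forwards them to storage with a single onepath_forward call, and the same trace_id is preserved across all three hops. This verifies both the automatic instrumentation and the pass-through of the TLV attachment.
Tracing is enabled with no code changes; every switch is an environment variable: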
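| Environment variable | Example value | Purpose |
|---|---|---|
| ONEPATH_TRACE_ENABLE | 1 | Master switch for tracing; zero overhead when unset |
| ONEPATH_TRACE_SAMPLE_RATIO | 1.0 | Head-based sampling ratio in [0.0, 1.0]; 1.0 samples every trace |
| ONEPATH_TRACE_NDJSON_PATH | /tmp/sensor.ndjson | NDJSON output path; one span JSON object per line |
| ONEPATH_TRACE_SERVICE_NAME | sensor | The service.name resource attribute, used to tell processes apart |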
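To make the role of the four switches concrete, here is a minimal sketch of how a process might read them at startup. It is not OnePath's actual implementation: the `trace_settings_t` struct, both function names, the rule that any non-empty value other than "0" enables tracing, and the default ratio of 1.0 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical settings struct; not part of the OnePath API. */
typedef struct {
    int enabled;               /* from ONEPATH_TRACE_ENABLE */
    double sample_ratio;       /* from ONEPATH_TRACE_SAMPLE_RATIO, clamped to [0.0, 1.0] */
    const char *ndjson_path;   /* from ONEPATH_TRACE_NDJSON_PATH, NULL if unset */
    const char *service_name;  /* from ONEPATH_TRACE_SERVICE_NAME, NULL if unset */
} trace_settings_t;

/* Pure parsing step, separated from getenv() so it can be tested directly. */
trace_settings_t trace_settings_parse(const char *enable, const char *ratio,
                                      const char *path, const char *service) {
    trace_settings_t cfg = {0, 1.0, NULL, NULL};

    /* Assumption: any non-empty value other than "0" turns tracing on. */
    cfg.enabled = (enable != NULL && enable[0] != '\0' && strcmp(enable, "0") != 0);
    if (!cfg.enabled)
        return cfg; /* master switch off: the other variables are ignored */

    if (ratio != NULL) {
        char *end = NULL;
        double r = strtod(ratio, &end);
        if (end != ratio) {          /* at least one character was parsed */
            if (!(r >= 0.0))         /* negative or NaN */
                r = 0.0;
            else if (r > 1.0)
                r = 1.0;
            cfg.sample_ratio = r;
        }
    }
    assert(cfg.sample_ratio >= 0.0 && cfg.sample_ratio <= 1.0);

    cfg.ndjson_path = path;
    cfg.service_name = service;
    return cfg;
}

/* Reads the four variables from the process environment. */
trace_settings_t trace_settings_from_env(void) {
    return trace_settings_parse(getenv("ONEPATH_TRACE_ENABLE"),
                                getenv("ONEPATH_TRACE_SAMPLE_RATIO"),
                                getenv("ONEPATH_TRACE_NDJSON_PATH"),
                                getenv("ONEPATH_TRACE_SERVICE_NAME"));
}
```

Returning early when the master switch is off mirrors the "zero overhead when unset" behavior in the table: nothing beyond a single environment lookup is evaluated.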
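Key OnePath APIs
- onepath_forward(sample, pub): the core tracing helper. It takes the parent span from the top of the thread's call stack, starts a PRODUCER child span, injects the new trace context into the TLV attachment, and then forwards the sample. trace_id stays the same across hops, while span_id is updated at every hop.
- onepath_declare_publisher / onepath_publisher_put: publish messages (automatic instrumentation produces the spans).
- onepath_subscribe / onepath_sample_release: subscribe and release samples.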
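The inject-and-forward step described for onepath_forward can be illustrated with a small standalone sketch. The wire layout here (a one-byte type tag, a one-byte length, then a 16-byte trace_id and an 8-byte span_id) is purely an assumption for illustration, as are all type and function names; OnePath's real TLV attachment format may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRACE_TLV_TYPE 0x01u /* hypothetical tag value */
#define TRACE_ID_LEN   16
#define SPAN_ID_LEN    8
#define TRACE_TLV_LEN  (2 + TRACE_ID_LEN + SPAN_ID_LEN)

/* Hypothetical in-memory trace context. */
typedef struct {
    uint8_t trace_id[TRACE_ID_LEN];
    uint8_t span_id[SPAN_ID_LEN];
} trace_ctx_t;

/* Derives the context for the next hop: same trace_id, fresh span_id. */
trace_ctx_t trace_ctx_child(const trace_ctx_t *parent,
                            const uint8_t new_span_id[SPAN_ID_LEN]) {
    trace_ctx_t child;
    memcpy(child.trace_id, parent->trace_id, TRACE_ID_LEN);
    memcpy(child.span_id, new_span_id, SPAN_ID_LEN);
    return child;
}

/* Writes type, length, value. Returns bytes written, or 0 if buf is too small. */
size_t trace_tlv_encode(const trace_ctx_t *ctx, uint8_t *buf, size_t cap) {
    assert(ctx != NULL && buf != NULL);
    if (cap < TRACE_TLV_LEN)
        return 0;
    buf[0] = TRACE_TLV_TYPE;
    buf[1] = TRACE_ID_LEN + SPAN_ID_LEN;
    memcpy(buf + 2, ctx->trace_id, TRACE_ID_LEN);
    memcpy(buf + 2 + TRACE_ID_LEN, ctx->span_id, SPAN_ID_LEN);
    return TRACE_TLV_LEN;
}

/* Returns 1 and fills *out on success, 0 if the record is malformed. */
int trace_tlv_decode(const uint8_t *buf, size_t len, trace_ctx_t *out) {
    assert(buf != NULL && out != NULL);
    if (len < TRACE_TLV_LEN || buf[0] != TRACE_TLV_TYPE ||
        buf[1] != TRACE_ID_LEN + SPAN_ID_LEN)
        return 0;
    memcpy(out->trace_id, buf + 2, TRACE_ID_LEN);
    memcpy(out->span_id, buf + 2 + TRACE_ID_LEN, SPAN_ID_LEN);
    return 1;
}
```

In this model a forwarding hop would decode the incoming record, call `trace_ctx_child` with a newly generated span_id, and encode the result into the outgoing attachment, so the receiver sees the unchanged trace_id together with the forwarder's span_id as its parent.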
static onepath_publisher_t g_router_pub;
static void router_handler(onepath_sample_t *sample, void *userdata) {
    /* Start a PRODUCER child span and inject it into the TLV; trace_id is unchanged. */
    onepath_forward(sample, g_router_pub);
    EP_INFO("[router] forward %zu bytes", sample->data_len);
    onepath_sample_release(sample);
}
/* Sensor side: */
onepath_declare_publisher(s, &pub, SENSOR_KEY, NULL);
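onepath_publisher_put(pub, buf, (size_t)n);

Start the three processes in the order storage → router → sensor, each with the ONEPATH_TRACE_* environment variables set. The default peer mode uses multicast P2P, so no routing node is needed: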
# Terminal 1: storage (subscribes to demo/storage)
ONEPATH_TRACE_ENABLE=1 ONEPATH_TRACE_SAMPLE_RATIO=1.0 \
ONEPATH_TRACE_NDJSON_PATH=/tmp/storage.ndjson ONEPATH_TRACE_SERVICE_NAME=storage \
./examples/build/release/full/onepath_tracing_demo storage
# Terminal 2: router (subscribes to demo/sensor and forwards to demo/storage with onepath_forward)
ONEPATH_TRACE_ENABLE=1 ONEPATH_TRACE_SAMPLE_RATIO=1.0 \
ONEPATH_TRACE_NDJSON_PATH=/tmp/router.ndjson ONEPATH_TRACE_SERVICE_NAME=router \
./examples/build/release/full/onepath_tracing_demo router
# Terminal 3: sensor (publishes to demo/sensor)
ONEPATH_TRACE_ENABLE=1 ONEPATH_TRACE_SAMPLE_RATIO=1.0 \
ONEPATH_TRACE_NDJSON_PATH=/tmp/sensor.ndjson ONEPATH_TRACE_SERVICE_NAME=sensor \
./examples/build/release/full/onepath_tracing_demo sensor

After the run, inspect the trace with jq (the same trace_id should appear in all three NDJSON files):
jq -r '.trace_id + " " + .name + " svc=" + .resource["service.name"]' \
  /tmp/{sensor,router,storage}.ndjson

Log output from the three processes:
[2026-06-21-17-20-56:466] [INFO] [sensor] publishing to demo/sensor (rounds=10)
[2026-06-21-17-20-56:466] [INFO] [sensor] -> sensor-msg-0
[2026-06-21-17-20-56:467] [INFO] [router] forward 12 bytes
[2026-06-21-17-20-56:467] [INFO] [storage] <- sensor-msg-0 (e2e via TLV)
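[2026-06-21-17-20-56:967] [INFO] [storage] <- sensor-msg-1 (e2e via TLV)

Each message produces 4 spans (sensor PRODUCER → router CONSUMER → router PRODUCER → storage CONSUMER): trace_id stays the same across hops, span_id changes at every hop, and the parent_id chain is complete. Running unset ONEPATH_TRACE_ENABLE returns the process to zero overhead.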
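The three properties stated above (one trace_id, distinct hops, an unbroken parent_id chain) can be checked mechanically once the spans have been collected. The following is a minimal sketch of such a check, not a OnePath tool: the `span_rec_t` struct and the function name are hypothetical, and it assumes the root span is marked by an empty parent_id, which the actual NDJSON output may represent differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record holding the three ID fields of one span. */
typedef struct {
    const char *trace_id;
    const char *span_id;
    const char *parent_id; /* assumption: "" marks the root span */
} span_rec_t;

/* Returns 1 when all n spans share one trace_id, exactly one span is a root,
 * and every other span's parent_id equals the span_id of another span in the
 * set. Returns 0 otherwise, including for an empty set. */
int span_chain_is_complete(const span_rec_t *spans, size_t n) {
    assert(spans != NULL || n == 0);
    if (n == 0)
        return 0;

    size_t roots = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(spans[i].trace_id, spans[0].trace_id) != 0)
            return 0; /* a span from a different trace */
        if (spans[i].parent_id[0] == '\0') {
            roots++;
            continue;
        }
        int found = 0;
        for (size_t j = 0; j < n; j++) {
            if (j != i && strcmp(spans[i].parent_id, spans[j].span_id) == 0) {
                found = 1;
                break;
            }
        }
        if (!found)
            return 0; /* parent is missing: the chain is broken */
    }
    return roots == 1;
}
```

For the demo, the four spans of one message would be loaded from the three NDJSON files into an array and passed in together; a missing router file, for example, would leave the storage span without a parent and make the check fail.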
Variant: dual backends. Both backends support tracing, and the output format is identical.
Network Topology Map
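Topology awareness has two layers: the local view of the current node (directly connected neighbors, links, and whether a link is zero-copy), and a global topology graph aggregated by distributed agents (nodes, edges, transport labels, and service grouping). Peer mode uses multicast P2P, so no routing node is needed.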
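The relationship between the two layers can be sketched as a merge of per-node local views into one edge list. This is only an illustration of the idea, not OnePath's aggregation algorithm: every type and function name below is hypothetical, links are treated as undirected, and transport labels and service grouping are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NBRS  8
#define MAX_EDGES 64

/* Hypothetical local view: one node and its directly connected neighbors. */
typedef struct {
    const char *self_id;
    const char *nbr_ids[MAX_NBRS];
    size_t nbr_count;
} local_view_t;

/* Undirected edge, stored with a <= b in strcmp order. */
typedef struct {
    const char *a;
    const char *b;
} edge_t;

typedef struct {
    edge_t edges[MAX_EDGES];
    size_t edge_count;
} graph_t;

static int graph_has_edge(const graph_t *g, const char *a, const char *b) {
    for (size_t i = 0; i < g->edge_count; i++) {
        if (strcmp(g->edges[i].a, a) == 0 && strcmp(g->edges[i].b, b) == 0)
            return 1;
    }
    return 0;
}

/* Merges n local views into g, storing each link once even when both
 * endpoints report it. Returns the edge count, or -1 if g overflows. */
int graph_aggregate(const local_view_t *views, size_t n, graph_t *g) {
    assert(g != NULL);
    g->edge_count = 0;
    for (size_t i = 0; i < n; i++) {
        assert(views[i].nbr_count <= MAX_NBRS);
        for (size_t k = 0; k < views[i].nbr_count; k++) {
            const char *a = views[i].self_id;
            const char *b = views[i].nbr_ids[k];
            int cmp = strcmp(a, b);
            if (cmp == 0)
                continue; /* ignore self-links */
            if (cmp > 0) { /* normalize the endpoint order */
                const char *tmp = a;
                a = b;
                b = tmp;
            }
            if (graph_has_edge(g, a, b))
                continue;
            if (g->edge_count == MAX_EDGES)
                return -1;
            g->edges[g->edge_count].a = a;
            g->edges[g->edge_count].b = b;
            g->edge_count++;
        }
    }
    return (int)g->edge_count;
}
```

Normalizing the endpoint order before the lookup is what lets a link reported by both of its endpoints collapse into a single edge in the global graph.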
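Key OnePath APIs
- onepath_topology_local(s, &local) / onepath_topology_local_free(&local): query and release the local view of the current node (self id, declared services, directly connected neighbors, and link flags).
- onepath_topology_agent_start(s) / onepath_topology_agent_stop(s): start and stop the topology agent (it announces that the node is online and exposes the node's local view for aggregation).
- onepath_topology_snapshot(s, &g, timeout_ms) / onepath_topology_graph_free(&g): aggregate and release the global topology graph (nodes + edges).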
onepath_open_peer(&s);
onepath_declare_publisher(s, &pub, key, NULL);
onepath_topology_agent_start(s);
while (g_running) {
    onepath_topo_local_t local;
    if (onepath_topology_local(s, &local) == ONEPATH_OK) {
        printf("self : %s (%s)\n", local.self_zid, whatami_str(local.self_whatami));
        onepath_topology_local_free(&local);
    }
    onepath_topo_graph_t g;
    if (onepath_topology_snapshot(s, &g, 2000) == ONEPATH_OK)
        onepath_topology_graph_free(&g);
    sleep(3);
}
onepath_topology_agent_stop(s);

./examples/build/release/full/onepath_topology_map # single node: show the local view
./examples/build/release/full/onepath_topology_map nodeA & # multiple terminals: aggregate the global graph
./examples/build/release/full/onepath_topology_map nodeB &
./examples/build/release/full/onepath_topology_map nodeC

[2026-06-21-17-19-33:379] [ OK ] [agent] started, role=node
self : fb56d32763f82fbcf104ee84855ea5b1 (Peer)
nbr : 3
[Peer] 6abbdc8ccf808671dcb924062b1ae132 (SHM)
link tcp dst=tcp/127.0.0.1:7804
[2026-06-21-17-19-33:379] [INFO] --- global topology graph (4 nodes, 3 edges) ---
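EDGE 6abbdc8ccf808671dcb924062b1ae132 <--tcp/shm--> fb56d32763f82fbcf104ee84855ea5b1

Variant: dual backends. Both backends support topology awareness and agent aggregation, and the output format is identical. The same-host zero-copy markers, (SHM) and <--tcp/shm-->, appear only in the full build. This is a boundary of the SHM capability: the lite build does not display these markers. See Topology Scenarios for details.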