ROS2_Moveit2_Ur5e_Grasp Project Walkthrough (Part 10): tracker.py Explained • Soup's Blog

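The main line of the project has now been mostly covered; the remaining posts walk through supporting code. The obj_detect.py code from an earlier post imports an object tracker, so this post looks at tracker.py.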

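What object tracking does in this project:

- In multi-object scenes, tracking gives every target a unique ID, so the robot can grasp targets one by one in a predefined order (for example, left to right) without confusing them. This matters for ordered grasping tasks.
- When a target is briefly occluded, or the detector misses it in some frames, the tracker predicts its position from the previous state and keeps the track alive until the target is detected again.
- obj_detect.py sorts detections by x coordinate (left to right) so that the robot grasps targets in order. The tracker keeps this ordering consistent even when target positions shift slightly between consecutive frames.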

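This is a classic Kalman filter based multi-object tracker. It takes object detections as input and combines motion prediction, data association, and track management.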

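The code consists of two main classes:

- Track: the trajectory of a single target
- Tracker: manages all Track objects and handles new detections, matching, and track creation/deletion

The two classes are covered separately below: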

## Track

```python
import numpy as np
# Assumption: KalmanFilter comes from filterpy (the dim_x/dim_z API matches it)
from filterpy.kalman import KalmanFilter


class Track:
    def __init__(self, track_id, bbox):
        self.id = track_id
        self.kf = self.create_kalman_filter(bbox)
        self.bbox = bbox  # used for output and visualization
        self.age = 0
        self.time_since_update = 0

    def create_kalman_filter(self, bbox):
        # Convert (x1, y1, x2, y2) to center/size form; velocities start at 0
        x1, y1, x2, y2 = bbox
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        w, h = x2 - x1, y2 - y1
        state = [cx, cy, w, h, 0, 0, 0, 0]

        kf = KalmanFilter(dim_x=8, dim_z=4)
        kf.F = np.eye(8)
        for i in range(4):
            kf.F[i, i+4] = 1  # couple each position/size term to its velocity

        kf.H = np.eye(4, 8)  # only center and width/height are measured
        kf.R *= 5      # measurement noise (trust observations conservatively)
        kf.P *= 1000    # initial covariance (low confidence in the initial state)
        kf.Q[4:, 4:] *= 5  # process noise on the velocity terms (reduced from 10 to 5)

        kf.x[:4] = np.array(state[:4]).reshape((4, 1))
        return kf

    def predict(self):
        self.kf.predict()
        self.age += 1
        self.time_since_update += 1
        self.bbox = self.get_bbox()
        return self.bbox

    def update(self, bbox):
        x1, y1, x2, y2 = bbox
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        w, h = x2 - x1, y2 - y1
        z = np.array([cx, cy, w, h])
        self.kf.update(z)
        self.time_since_update = 0
        self.bbox = self.get_bbox()

    def get_bbox(self):
        # Convert the filtered center/size state back to corner form
        cx, cy, w, h = self.kf.x[:4].flatten()
        x1 = cx - w / 2
        y1 = cy - h / 2
        x2 = cx + w / 2
        y2 = cy + h / 2
        return [x1, y1, x2, y2]
```
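To get a feel for what the filter inside Track does, here is a minimal NumPy-only sketch of the predict/update cycle. It uses the standard textbook Kalman equations rather than the `KalmanFilter` class, and it assumes that class starts R, P, and Q as identity matrices before the scaling shown above. The helper names `make_filter`, `kf_predict`, and `kf_update` are made up for illustration and are not part of tracker.py.

```python
import numpy as np

def make_filter(bbox):
    """Build (x, P, F, H, R, Q) mirroring Track.create_kalman_filter (assumed defaults)."""
    x1, y1, x2, y2 = bbox
    x = np.zeros((8, 1))
    x[:4, 0] = [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]
    F = np.eye(8)
    F[:4, 4:] = np.eye(4)      # position/size += velocity (dt = 1 frame)
    H = np.eye(4, 8)           # only cx, cy, w, h are measured
    R = np.eye(4) * 5          # measurement noise
    P = np.eye(8) * 1000       # large initial uncertainty
    Q = np.eye(8)
    Q[4:, 4:] *= 5             # process noise on the velocity terms
    return x, P, F, H, R, Q

def kf_predict(x, P, F, Q):
    # Propagate the state and its covariance one frame forward
    return F @ x, F @ P @ F.T + Q

def kf_update(x, P, z, H, R):
    y = z - H @ x                        # innovation (measurement residual)
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P

# A 40x20 box whose center moves right by 10 px per frame
x, P, F, H, R, Q = make_filter([0, 0, 40, 20])
for k in range(1, 6):
    x, P = kf_predict(x, P, F, Q)
    z = np.array([[20 + 10 * k], [10], [40], [20]])
    x, P = kf_update(x, P, z, H, R)

estimated_vx = float(x[4, 0])            # settles close to 10 px/frame
x_pred, _ = kf_predict(x, P, F, Q)       # prediction for a frame with no detection
print(estimated_vx, float(x_pred[0, 0]))
```

Although velocity is never measured, the filter infers it from successive positions, which is what lets a track keep moving through a few frames without detections.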

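1. State representation (State Vector)

```python
state = [cx, cy, w, h, 0, 0, 0, 0]
```

First 4 dimensions: position and size
- cx, cy: center coordinates of the bounding box
- w, h: width and height

Last 4 dimensions: velocity (derivatives)
- vx, vy, vw, vh: rates of change of the center and of the width/height (initially 0)

This gives an 8-dimensional state space: [cx, cy, w, h, vx, vy, vw, vh]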
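The conversion between a corner-format box and this state vector can be sketched as two small helpers. The names `bbox_to_state` and `state_to_bbox` are hypothetical; in tracker.py the same arithmetic lives inside `create_kalman_filter` and `get_bbox`.

```python
import numpy as np

def bbox_to_state(bbox):
    """(x1, y1, x2, y2) -> [cx, cy, w, h, vx, vy, vw, vh] with zero velocities."""
    x1, y1, x2, y2 = bbox
    return np.array([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1,
                     0.0, 0.0, 0.0, 0.0])

def state_to_bbox(state):
    """First four state entries -> (x1, y1, x2, y2)."""
    cx, cy, w, h = (float(v) for v in state[:4])
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]

state = bbox_to_state([100, 50, 140, 70])
print(state[:4])             # center (120, 60), size 40 x 20
print(state_to_bbox(state))  # round trip back to the original corners
```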

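2. State transition matrix F

```python
kf.F = np.eye(8)
for i in range(4):
    kf.F[i, i+4] = 1  # position += velocity * dt
```

This is a constant velocity model (Constant Velocity Model). The code sets the coupling entries to 1, which means Δt is one frame: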

$$
\begin{bmatrix} c_x \\ c_y \\ w \\ h \\ v_x \\ v_y \\ v_w \\ v_h \end{bmatrix}_{t+1}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & \Delta t & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & \Delta t & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & \Delta t & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & \Delta t \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} c_x \\ c_y \\ w \\ h \\ v_x \\ v_y \\ v_w \\ v_h \end{bmatrix}_t
$$
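As a quick numerical check of the constant velocity model, the sketch below builds F exactly as the code does and applies it to a made-up state vector (the values are illustrative only). Applying F several times shows how a track keeps extrapolating during frames without detections.

```python
import numpy as np

# Same construction as in create_kalman_filter (dt = 1 frame)
F = np.eye(8)
for i in range(4):
    F[i, i + 4] = 1.0

# Illustrative state: center (100, 50), size 40 x 20,
# moving +2 px/frame in x, -1 px/frame in y, width growing 0.5 px/frame
x_t = np.array([100.0, 50.0, 40.0, 20.0, 2.0, -1.0, 0.5, 0.0])

x_next = F @ x_t                                 # one frame ahead
x_plus3 = np.linalg.matrix_power(F, 3) @ x_t     # three frames ahead

print(x_next)    # position/size terms advance by one velocity step
print(x_plus3)   # position/size terms advance by three velocity steps
```

The velocity entries are copied unchanged, which is exactly what "constant velocity" means here.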

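3. Observation matrix H

```python
kf.H = np.eye(4, 8)  # only the first 4 quantities are measured
```

This means only the following can be observed:

$$ z = [c_x, c_y, w, h] $$

Velocity is a hidden variable and can only be estimated through filtering.

4. Noise covariance settings

![Figure](2.png)

Some additional notes:

- kf.H: observation matrix (Observation Matrix / Measurement Matrix)
  - Meaning: defines how the system state x maps to the observation z.
  - Formula: $$ z = H \cdot x + v $$
    - z: the actual observation (for example, a detected bbox)
    - x: the internal system state (such as [cx, cy, w, h, vx, vy, vw, vh])
    - v: observation noise
    - H: determines which parts of the state vector can be observed
- kf.R: measurement noise covariance matrix (Measurement Noise Covariance)
  - Meaning: describes the uncertainty, or noise level, of the observations. The larger the value, the less the observations are trusted.
  - Larger R → the filter trusts the prediction more (more smoothing)
  - Smaller R → the filter trusts the observation more (faster response, but possibly jittery)
- kf.P: state error covariance matrix (Error Covariance Matrix)
  - Meaning: the uncertainty of the current state estimate. It is central to computing the Kalman gain.
  - P[i,i]: variance (uncertainty) of the i-th state component
  - P[i,j]: covariance (correlation) between state components i and j

  `kf.P *= 1000` says the initial state is very uncertain, for example when only the rough position is known but not the exact value. The filter then accepts observations more quickly to correct itself.
- kf.Q: process noise covariance matrix (Process Noise Covariance)
  - Meaning: describes the uncertainty of the system model itself, that is, how far the real world deviates from the model.
  - Larger Q → the motion model is considered inaccurate (the object may accelerate suddenly)
  - Smaller Q → the object's motion is considered stable (such as constant velocity)

![Figure](3.png)

5. predict() method: predict the state at the next time step

```python
def predict(self):
    self.kf.predict()
    self.age += 1
    self.time_since_update += 1
    self.bbox = self.get_bbox()
    return self.bbox
```

- Calls the Kalman filter's prediction step (Predict)
- Increments the track age and the count of frames since the last update
- Recovers the bbox from the state vector

6. update(bbox) method: fuse a new observation

```python
z = np.array([cx, cy, w, h])  # center and size of the new detection
self.kf.update(z)
self.time_since_update = 0
```

- Runs the Kalman filter's update step (Update)
- Corrects the prediction with the observation
- Resets the "frames since update" counter

7. get_bbox(): build the bounding box from the state

```python
cx, cy, w, h = self.kf.x[:4].flatten()
x1 = cx - w / 2
y1 = cy - h / 2
x2 = cx + w / 2
y2 = cy + h / 2
```

- Converts the internal state into a standard (x1, y1, x2, y2) bounding box for visualization or downstream tasks.

## Tracker

```python
class Tracker:
    def __init__(self, iou_threshold=0.05, max_age=10):
        self.tracks = []
        self.next_id = 0
        self.iou_threshold = iou_threshold
        self.max_age = max_age

    def update(self, detections):
        # 1. Predict all tracks
        for track in self.tracks:
            track.predict()

        # 2. Match tracks to detections
        matches, unmatched_tracks, unmatched_dets = self.match(detections)

        # 3. Update matched tracks
        for t_idx, d_idx in matches:
            self.tracks[t_idx].update(detections[d_idx])

        # 4. Initialize new tracks
        for d_idx in unmatched_dets:
            self.tracks.append(Track(self.next_id, detections[d_idx]))
            self.next_id += 1

        # 5. Remove tracks that are too old
        self.tracks = [t for t in self.tracks if t.time_since_update < self.max_age]

        return [{'id': t.id, 'bbox': list(map(int, t.bbox))} for t in self.tracks]

    def match(self, detections):
        iou_matrix = np.zeros((len(self.tracks), len(detections)), dtype=np.float32)
        for t, track in enumerate(self.tracks):
            if track.time_since_update > self.max_age:  # skip tracks that are too old
                continue
            for d, det in enumerate(detections):
                iou_matrix[t, d] = self.iou(track.bbox, det)

        matched_indices = []
        unmatched_tracks = list(range(len(self.tracks)))
        unmatched_dets = list(range(len(detections)))
        used_dets = set()

        # With no detections, np.argmax on an empty row would raise ValueError
        if len(detections) == 0:
            return matched_indices, unmatched_tracks, unmatched_dets

        for t in range(len(self.tracks)):
            # Index the track explicitly; `track` from the loop above is stale here
            if self.tracks[t].time_since_update > self.max_age:
                continue
            best_match = int(np.argmax(iou_matrix[t]))
            if iou_matrix[t, best_match] > self.iou_threshold and best_match not in used_dets:
                matched_indices.append((t, best_match))
                unmatched_tracks.remove(t)
                unmatched_dets.remove(best_match)
                used_dets.add(best_match)

        return matched_indices, unmatched_tracks, unmatched_dets

    def iou(self, boxA, boxB):
        # Intersection rectangle
        xA = max(boxA[0], boxB[0])
        yA = max(boxA[1], boxB[1])
        xB = min(boxA[2], boxB[2])
        yB = min(boxA[3], boxB[3])

        interArea = max(0, xB - xA) * max(0, yB - yA)
        boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
        boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])

        # The small epsilon avoids division by zero for degenerate boxes
        return interArea / float(boxAArea + boxBArea - interArea + 1e-6)
```

1. Initialization

```python
def __init__(self, iou_threshold=0.05, max_age=10):
    self.tracks = []                    # all current tracks
    self.next_id = 0                    # next ID to assign
    self.iou_threshold = iou_threshold  # matching threshold
    self.max_age = max_age              # maximum number of frames a track survives without an update
```

2. update(detections) main flow

**Step 1**: predict all existing tracks

```python
for track in self.tracks:
    track.predict()
```

Every track takes one step forward and predicts its current position.

**Step 2**: match

- Build the IoU matrix: the intersection over union between every track and every detection
- Use a greedy matching strategy:
  - For each track, find the detection with the highest IoU
  - If it is > iou_threshold and not already taken → the match succeeds
  - If that detection is already taken, the track stays unmatched for this frame (it does not fall back to its second-best detection)

**Step 3**: update matched tracks

```python
for t_idx, d_idx in matches:
    self.tracks[t_idx].update(detections[d_idx])
```

The latest detection updates the Kalman filter of the corresponding track.

**Step 4**: create new tracks

```python
for d_idx in unmatched_dets:
    self.tracks.append(Track(self.next_id, detections[d_idx]))
    self.next_id += 1
```

Unmatched detection → possibly a newly appeared target → create a new Track

**Step 5**: delete expired tracks

```python
self.tracks = [t for t in self.tracks if t.time_since_update < self.max_age]
```

If a track has gone unmatched for a long time (time_since_update >= max_age), the target is considered gone → delete it

Workflow:

```text
Input for each frame:
    ↓
[detection list: detections]
    ↓
Tracker.update()
    ├── all tracks predict() → predict current position
    ├── compute the IoU matrix
    ├── match (track ↔ detection)
    ├── matched → update() (fuse the observation)
    ├── unmatched detection → create a new track
    └── delete tracks that have not been updated for too long
    ↓
[Output: stable list of tracks with IDs]
```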
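The IoU computation and the greedy association step can be tried on their own, without any Kalman filter. The sketch below re-implements them as free functions over plain box lists; `greedy_match` and the sample boxes are illustrative and not part of tracker.py.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb, yb = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, xb - xa) * max(0, yb - ya)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter + 1e-6)

def greedy_match(track_boxes, det_boxes, iou_threshold=0.05):
    """Each track, in order, claims its highest-IoU detection if still free."""
    n_t, n_d = len(track_boxes), len(det_boxes)
    iou_matrix = np.zeros((n_t, n_d), dtype=np.float32)
    for t in range(n_t):
        for d in range(n_d):
            iou_matrix[t, d] = iou(track_boxes[t], det_boxes[d])

    matches, used = [], set()
    if n_d > 0:  # argmax over an empty row would raise
        for t in range(n_t):
            best = int(np.argmax(iou_matrix[t]))
            if iou_matrix[t, best] > iou_threshold and best not in used:
                matches.append((t, best))
                used.add(best)

    matched_tracks = {t for t, _ in matches}
    unmatched_tracks = [t for t in range(n_t) if t not in matched_tracks]
    unmatched_dets = [d for d in range(n_d) if d not in used]
    return matches, unmatched_tracks, unmatched_dets

# Two tracks; detections arrive in a different order, plus one new object
track_boxes = [[0, 0, 10, 10], [20, 0, 30, 10]]
det_boxes = [[21, 0, 31, 10], [1, 0, 11, 10], [50, 50, 60, 60]]
matches, unmatched_tracks, unmatched_dets = greedy_match(track_boxes, det_boxes)
print(matches)           # track 0 pairs with detection 1, track 1 with detection 0
print(unmatched_dets)    # detection 2 would start a new track
```

Because tracks claim detections in list order, the result can depend on track order when objects overlap. A common alternative is optimal assignment, for example with `scipy.optimize.linear_sum_assignment` on a cost of 1 - IoU, though that is not what this tracker does.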