Module deep_ocsort

Deep OC-SORT: Observation-Centric SORT with appearance

This module implements the Deep OC-SORT algorithm.

Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification. Gerard Maggiolino, Adnan Ahmad, Jinkun Cao, Kris Kitani. arXiv:2302.11813.

Algorithm overview

Deep OC-SORT extends OC-SORT with an appearance term and camera motion compensation. The tracker combines four components:

  • OCM (observation-centric momentum) adds a velocity direction-consistency bonus to the IoU before matching.
  • ORU (observation-centric re-update) replays interpolated observations to correct the Kalman filter after a track is re-associated following a gap.
  • Appearance association blends a cosine distance to each track's feature gallery with the motion cost. The appearance weight scales with detector confidence (dynamic appearance) and is gated by max_cosine_distance. With appearance_weight = 0 the association reduces to plain OC-SORT.
  • Camera motion compensation (CMC) warps track predictions by a caller-supplied affine transform before association, for moving-camera footage. The transform is shared common::cmc infrastructure reused by other trackers.
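
As a rough illustration of the appearance term described in the list above, the sketch below adds a confidence-scaled, gated appearance bonus to a motion score. The exact blending formula used by this module is not stated here, so the combination rule and the names `cosine_distance` and `blended_score` are assumptions made for the example.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / norm


def blended_score(motion_score, det_feature, gallery, det_score,
                  appearance_weight=0.5, max_cosine_distance=0.2):
    """Hypothetical blend of a motion score with an appearance bonus.

    gallery is the list of stored features for one track; the nearest
    (smallest) cosine distance to the detection feature is used.
    """
    if appearance_weight == 0.0 or not gallery:
        return motion_score  # no appearance term: motion-only association
    dist = min(cosine_distance(det_feature, g) for g in gallery)
    if dist > max_cosine_distance:
        return motion_score  # gated out: appearance is too different to trust
    # Dynamic appearance: low-confidence detections contribute less.
    return motion_score + appearance_weight * det_score * (1.0 - dist)
```

With `appearance_weight=0.0` the function returns the motion score unchanged, which mirrors the statement that the association then reduces to plain OC-SORT.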

This is a clean-room implementation. The tracker applies a camera-motion transform but does not estimate it: the caller supplies the affine (for example from image registration), keeping the core free of heavy computer-vision dependencies.
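
To make the camera-motion step concrete, here is a minimal sketch of warping a TLWH box by a six-element affine. It assumes the `[a, b, tx, c, d, ty]` layout means `x' = a*x + b*y + tx` and `y' = c*x + d*y + ty`, and that the warped box is the axis-aligned bounding box of the transformed corners; `warp_point` and `warp_tlwh` are hypothetical helpers, not functions of this module.

```python
def warp_point(affine, x, y):
    """Apply a 2x3 affine given as [a, b, tx, c, d, ty] to one point."""
    a, b, tx, c, d, ty = affine
    return a * x + b * y + tx, c * x + d * y + ty


def warp_tlwh(affine, tlwh):
    """Warp a TLWH box and return the axis-aligned box around its corners."""
    x, y, w, h = tlwh
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    warped = [warp_point(affine, cx, cy) for cx, cy in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# A pure translation of (+12, -4) pixels shifts the box without resizing it.
shifted = warp_tlwh([1.0, 0.0, 12.0, 0.0, 1.0, -4.0], [100.0, 100.0, 50.0, 100.0])
print(shifted)  # [112.0, 96.0, 50.0, 100.0]
```

Taking the bounding box of the warped corners keeps the result axis-aligned even when the affine includes rotation, at the cost of slightly enlarging the box.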

Builds on

  • utils::kalman - the shared 8-dimensional Kalman filter
  • utils::geometry - iou_batch, tlwh_to_xyah, xyah_to_tlwh
  • utils::assignment - greedy_match
  • trackers::common - KalmanTrack, TrackState, and CameraMotion (CMC)
  • trackers::deepsort - NearestNeighborDistanceMetric for the cosine feature gallery
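
The building blocks listed above can be pictured with the following Python sketch. It is not the crate's code: it assumes the common convention that XYAH means (center x, center y, aspect ratio w/h, height), and it shows one plausible greedy matcher that repeatedly takes the highest remaining score above a threshold. The function names mirror the Rust ones only for readability.

```python
def tlwh_to_xyah(tlwh):
    """(top-left x, top-left y, w, h) -> (center x, center y, w/h, h)."""
    x, y, w, h = tlwh
    return [x + w / 2.0, y + h / 2.0, w / h, h]


def xyah_to_tlwh(xyah):
    """Inverse of tlwh_to_xyah."""
    cx, cy, aspect, h = xyah
    w = aspect * h
    return [cx - w / 2.0, cy - h / 2.0, w, h]


def iou(a, b):
    """Intersection over union of two TLWH boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0.0 else 0.0


def greedy_match(scores, threshold):
    """Match rows to columns, best score first; each index is used once."""
    candidates = sorted(
        ((s, i, j) for i, row in enumerate(scores)
         for j, s in enumerate(row) if s >= threshold),
        key=lambda c: c[0],
        reverse=True,
    )
    used_rows, used_cols, matches = set(), set(), []
    for _score, i, j in candidates:
        if i in used_rows or j in used_cols:
            continue
        used_rows.add(i)
        used_cols.add(j)
        matches.append((i, j))
    return matches
```

Greedy matching is not guaranteed to find the globally optimal assignment, but it is simple and fast, which is presumably why a tracker would choose it over an optimal solver.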

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| max_age | 30 | Frames a lost track survives before deletion |
| min_hits | 3 | Consecutive matches required to confirm a track |
| iou_threshold | 0.3 | Minimum IoU to associate a detection with a track |
| delta_t | 3 | Observation window (frames) used to compute velocity (OCM) |
| inertia | 0.2 | Weight of the direction-consistency cost bonus (OCM) |
| appearance_weight | 0.5 | Blend weight of the appearance cost, scaled by detection score |
| max_cosine_distance | 0.2 | Maximum cosine distance for the appearance term to apply |
| nn_budget | 100 | Maximum appearance features stored per track |
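
To show how delta_t and inertia might interact, the sketch below computes a direction-consistency bonus from a track's observation history. It loosely follows the formulation in the reference OC-SORT code, where the bonus is proportional to `(pi/2 - angle) / pi`; this module's exact normalisation may differ, and `direction` and `ocm_bonus` are hypothetical names.

```python
import math


def direction(p_from, p_to):
    """Unit vector from p_from to p_to, or None if the points coincide."""
    dx, dy = p_to[0] - p_from[0], p_to[1] - p_from[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return None
    return dx / norm, dy / norm


def ocm_bonus(history, det_center, delta_t=3, inertia=0.2):
    """Bonus in [-inertia/2, +inertia/2] for a detection center.

    history holds the track's past observation centers, oldest first.
    The track direction is measured across up to delta_t frames.
    """
    if len(history) < 2:
        return 0.0  # not enough observations to define a velocity
    old = history[max(0, len(history) - 1 - delta_t)]
    track_dir = direction(old, history[-1])
    det_dir = direction(old, det_center)
    if track_dir is None or det_dir is None:
        return 0.0
    cos_angle = track_dir[0] * det_dir[0] + track_dir[1] * det_dir[1]
    angle = math.acos(max(-1.0, min(1.0, cos_angle)))  # in [0, pi]
    return inertia * (math.pi / 2.0 - angle) / math.pi
```

A detection that continues the track's direction receives the maximum bonus, a perpendicular one receives none, and one that reverses direction is penalised.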

Rust API

```rust,ignore
use trackforge::trackers::deep_ocsort::DeepOcSort;

// `extractor` implements the AppearanceExtractor trait (plug in any Re-ID model).
// Arguments: max_age, min_hits, iou_threshold, delta_t, inertia,
// appearance_weight, max_cosine_distance, nn_budget.
let mut tracker = DeepOcSort::new(extractor, 30, 3, 0.3, 3, 0.2, 0.5, 0.2, 100);
let tracks = tracker.update(&image, detections)?;
for t in tracks {
    println!("ID: {}, Box: {:?}", t.track_id, t.tlwh);
}
```

Python API

```python
from trackforge import DEEPOCSORT

tracker = DEEPOCSORT(
    max_age=30,
    min_hits=3,
    iou_threshold=0.3,
    delta_t=3,
    inertia=0.2,
    appearance_weight=0.5,
    max_cosine_distance=0.2,
    nn_budget=100,
)

detections = [([100.0, 100.0, 50.0, 100.0], 0.9, 0)]
embeddings = [[0.1, 0.2, 0.3]]  # one appearance vector per detection
tracks = tracker.update(detections, embeddings)

# Moving camera: pass a [a, b, tx, c, d, ty] affine mapping the previous frame
# to the current one (estimate it however you like, e.g. with OpenCV).
camera_motion = [1.0, 0.0, 12.0, 0.0, 1.0, -4.0]
tracks = tracker.update(detections, embeddings, camera_motion)

for track_id, tlwh, score, class_id in tracks:
    print(f"ID: {track_id}, Box: {tlwh}")
```

Credit

Clean-room Rust implementation of the algorithm described in the paper above. Original reference implementation: GerardMaggiolino/Deep-OC-SORT.

Citation

@inproceedings{maggiolino2023deepocsort,
  title={Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification},
  author={Maggiolino, Gerard and Ahmad, Adnan and Cao, Jinkun and Kitani, Kris},
  booktitle={IEEE International Conference on Image Processing (ICIP)},
  year={2023}
}

Quick Reference

| Item | Kind | Description |
| --- | --- | --- |
| DeepOcSort | struct | Deep OC-SORT tracker. |

Types

DeepOcSortTrack

struct DeepOcSortTrack {
    pub tlwh: [f32; 4],
    pub score: f32,
    pub class_id: i64,
    pub track_id: u64,
    pub state: crate::trackers::common::TrackState,
    pub hits: usize,
    pub hit_streak: usize,
    pub time_since_update: usize,
    pub age: usize,
    // [REDACTED: Private Fields]
}

A single tracked object managed by Deep OC-SORT.

Carries the OC-SORT motion state (Kalman filter plus an observation history for OCM and ORU) and a small buffer of appearance embeddings that are flushed into the tracker's feature gallery after each matched update.
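
The buffer-then-flush behaviour described above can be pictured with the sketch below: a per-track gallery capped at nn_budget features, queried by nearest cosine distance. The class `FeatureGallery` and its methods are hypothetical stand-ins written for this example; they are not the API of NearestNeighborDistanceMetric.

```python
import math
from collections import deque


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0.0 else 1.0 - dot / norm


class FeatureGallery:
    """Keeps at most nn_budget recent features per track id."""

    def __init__(self, nn_budget=100):
        self.nn_budget = nn_budget
        self.samples = {}  # track_id -> deque of feature vectors

    def flush(self, track_id, buffered):
        """Move a track's buffered features into the gallery."""
        store = self.samples.setdefault(track_id, deque(maxlen=self.nn_budget))
        store.extend(buffered)  # the deque drops the oldest entries when full
        buffered.clear()

    def distance(self, track_id, feature):
        """Smallest cosine distance between feature and the stored samples."""
        store = self.samples.get(track_id)
        if not store:
            return 1.0  # no appearance history yet
        return min(cosine_distance(feature, s) for s in store)
```

A bounded deque keeps memory per track constant and biases the gallery toward recent appearance, which is one common way to honour a feature budget.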

Fields

| Name | Type | Description |
| --- | --- | --- |
| tlwh | [f32; 4] | Bounding box in TLWH (top-left x, top-left y, width, height) format. |
| score | f32 | Detection confidence of the most recent match. |
| class_id | i64 | Class label of the most recent match. |
| track_id | u64 | Unique monotonically increasing track identifier. |
| state | crate::trackers::common::TrackState | Current lifecycle state. |
| hits | usize | Total number of detection matches over the track lifetime. |
| hit_streak | usize | Consecutive detection matches without interruption (resets on a missed frame). |
| time_since_update | usize | Frames elapsed since the last detection match. |
| age | usize | Total frames since track creation. |
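
The counters above are easiest to follow by stepping through a few frames. The sketch below is an assumed model of how they evolve: the class `TrackCounters`, the rule that a track counts as confirmed once `hit_streak >= min_hits`, and the rule that it expires once `time_since_update > max_age` are illustrative guesses based on the field and parameter descriptions, not the crate's actual state machine.

```python
from dataclasses import dataclass


@dataclass
class TrackCounters:
    hits: int = 0
    hit_streak: int = 0
    time_since_update: int = 0
    age: int = 0

    def step(self, matched):
        """Advance one frame; matched says whether a detection was assigned."""
        self.age += 1
        if matched:
            self.hits += 1
            self.hit_streak += 1
            self.time_since_update = 0
        else:
            self.hit_streak = 0  # a missed frame breaks the streak
            self.time_since_update += 1

    def is_confirmed(self, min_hits=3):
        # Assumed rule: enough consecutive matches confirms the track.
        return self.hit_streak >= min_hits

    def is_expired(self, max_age=30):
        # Assumed rule: unmatched for more than max_age frames.
        return self.time_since_update > max_age


counters = TrackCounters()
for matched in [True, True, False, True]:
    counters.step(matched)
print(counters)
# TrackCounters(hits=3, hit_streak=1, time_since_update=0, age=4)
```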

Trait Implementations

impl Clone for DeepOcSortTrack

fn clone(&self) -> DeepOcSortTrack

impl Debug for DeepOcSortTrack

fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result

impl Read<Exclusive, BecauseExclusive> for DeepOcSortTrack

fn to_subset(&self) -> Option<SS>

fn is_in_subset(&self) -> bool

fn to_subset_unchecked(&self) -> SS

fn from_subset(element: &SS) -> SP

DeepOcSortTracker

struct DeepOcSortTracker {
    pub tracks: Vec<super::track::DeepOcSortTrack>,
    // [REDACTED: Private Fields]
}

Deep OC-SORT tracker core.

Runs the OC-SORT motion association (IoU with an OCM direction bonus, plus an ORU re-update on re-association) and blends in an appearance cost from a cosine feature gallery. The appearance weight scales with detector confidence (dynamic appearance) and is gated by max_cosine_distance. With appearance_weight = 0 the association reduces to plain OC-SORT.
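
The ORU step mentioned above relies on virtual observations for the frames a track went unmatched. As a hedged sketch, assuming plain linear interpolation between the last real observation and the new one, those observations could be generated and replayed like this; `interpolate_observations` and `replay` are hypothetical helpers, and the filter is represented only by two callables.

```python
def interpolate_observations(last_box, new_box, gap):
    """Linearly interpolate boxes for the frames missed during a gap.

    gap is the number of frames between last_box and new_box, so
    gap - 1 virtual boxes are produced (none when gap <= 1).
    """
    virtual = []
    for k in range(1, gap):
        t = k / gap
        virtual.append([l + t * (n - l) for l, n in zip(last_box, new_box)])
    return virtual


def replay(virtual_boxes, predict, update):
    """Feed virtual observations through a filter, one frame at a time."""
    for box in virtual_boxes:
        predict()
        update(box)


# A track last seen at x=0 reappears four frames later at x=40.
boxes = interpolate_observations([0.0, 0.0, 10.0, 10.0], [40.0, 0.0, 10.0, 10.0], 4)
print(boxes)
# [[10.0, 0.0, 10.0, 10.0], [20.0, 0.0, 10.0, 10.0], [30.0, 0.0, 10.0, 10.0]]
```

Replaying these boxes lets the filter's velocity estimate reflect the path the object most plausibly took, instead of the drift accumulated from prediction-only steps.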

Implementations

fn new(max_age: usize, min_hits: usize, iou_threshold: f32, delta_t: usize, inertia: f32, appearance_weight: f32, max_cosine_distance: f32, metric: NearestNeighborDistanceMetric) -> Self

fn update(&mut self, detections: &[([f32; 4], f32, i64)], embeddings: &[Vec<f32>]) -> Vec<DeepOcSortTrack>

Update the tracker with the current frame's detections and embeddings.

embeddings is parallel to detections. Pass an empty slice to run without appearance (pure OC-SORT). Returns confirmed tracks matched this frame.

fn update_with_camera_motion(&mut self, detections: &[([f32; 4], f32, i64)], embeddings: &[Vec<f32>], camera_motion: &CameraMotion) -> Vec<DeepOcSortTrack>

Update the tracker, first warping track predictions by camera_motion.

camera_motion maps the previous frame's coordinates into the current frame (see CameraMotion); pass CameraMotion::identity for a static camera.

Trait Implementations

impl Read<Exclusive, BecauseExclusive> for DeepOcSortTracker

fn to_subset(&self) -> Option<SS>

fn is_in_subset(&self) -> bool

fn to_subset_unchecked(&self) -> SS

fn from_subset(element: &SS) -> SP

DeepOcSort<E: AppearanceExtractor>

struct DeepOcSort<E: AppearanceExtractor> {
    // [REDACTED: Private Fields]
}

Deep OC-SORT tracker.

Wraps the association core with an AppearanceExtractor so the caller can pass a frame and detections and have the embeddings produced internally. To pass embeddings directly, drive DeepOcSortTracker instead.
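
The wrapping pattern described above amounts to "extract embeddings, then delegate". The sketch below shows that shape in Python with stand-ins throughout: `ExtractingTracker`, `RecordingCore`, and `mean_intensity_extractor` are invented for this example, and the image is a plain nested list instead of a real frame type.

```python
class ExtractingTracker:
    """Hypothetical wrapper: computes embeddings, then calls a core tracker."""

    def __init__(self, extractor, core):
        self.extractor = extractor
        self.core = core

    def update(self, image, detections):
        # One embedding per detection, in the same order as detections.
        embeddings = [self.extractor(image, box) for box, _score, _cls in detections]
        return self.core.update(detections, embeddings)


class RecordingCore:
    """Stand-in for the association core; it only records its inputs."""

    def __init__(self):
        self.calls = []

    def update(self, detections, embeddings):
        self.calls.append((detections, embeddings))
        return []


def mean_intensity_extractor(image, box):
    """Toy extractor: mean pixel value of the crop as a 1-D embedding."""
    x, y, w, h = (int(v) for v in box)
    pixels = [p for row in image[y:y + h] for p in row[x:x + w]]
    return [sum(pixels) / len(pixels)] if pixels else [0.0]


image = [[1, 1, 3, 3] for _ in range(4)]  # 4x4 grayscale frame
core = RecordingCore()
tracker = ExtractingTracker(mean_intensity_extractor, core)
tracker.update(image, [([2.0, 0.0, 2.0, 2.0], 0.9, 0)])
print(core.calls[0][1])  # [[3.0]]
```

Keeping the extractor behind a small interface is what lets any Re-ID model be plugged in without touching the association logic.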

Implementations

fn new(extractor: E, max_age: usize, min_hits: usize, iou_threshold: f32, delta_t: usize, inertia: f32, appearance_weight: f32, max_cosine_distance: f32, nn_budget: usize) -> Self

Create a new Deep OC-SORT tracker.

# Arguments

| Argument | Description |
| --- | --- |
| extractor | The appearance feature extractor. |
| max_age | Frames a lost track survives before deletion. Default: 30. |
| min_hits | Consecutive matches required to confirm a track. Default: 3. |
| iou_threshold | Minimum IoU to associate a detection with a track. Default: 0.3. |
| delta_t | Observation window for velocity computation. Default: 3. |
| inertia | Weight for the OCM direction-consistency bonus. Default: 0.2. |
| appearance_weight | Blend weight of the appearance cost. Default: 0.5. |
| max_cosine_distance | Gate for the appearance term. Default: 0.2. |
| nn_budget | Maximum appearance features stored per track. Default: 100. |

fn new_default(extractor: E) -> Self

Create a tracker with the default parameters.

fn update(&mut self, image: &DynamicImage, detections: Vec<(BoundingBox, f32, i64)>) -> Result<Vec<DeepOcSortTrack>, Box<dyn Error>>

Update the tracker with a frame and its detections.

Extracts an appearance embedding per detection, then runs the association.

Returns the confirmed tracks matched in this frame.

fn update_with_camera_motion(&mut self, image: &DynamicImage, detections: Vec<(BoundingBox, f32, i64)>, camera_motion: &CameraMotion) -> Result<Vec<DeepOcSortTrack>, Box<dyn Error>>

Update the tracker, first warping track predictions by camera_motion.

Use this on moving-camera footage; estimate the affine transform between the previous and current frame (for example with image registration) and pass it in. See CameraMotion.
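
How the transform is estimated is left to the caller. As a deliberately crude sketch, assuming matched keypoints between the two frames are already available and the camera motion is close to a pure shift, a translation-only affine in the `[a, b, tx, c, d, ty]` layout can be taken from the median displacement. `estimate_translation` is a hypothetical helper; a real pipeline would more likely fit a full affine with an image-registration library.

```python
from statistics import median


def estimate_translation(prev_points, curr_points):
    """Translation-only affine [a, b, tx, c, d, ty] from matched points.

    The median displacement is used so that a few bad matches or
    independently moving objects do not skew the estimate.
    """
    if not prev_points or len(prev_points) != len(curr_points):
        return [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]  # fall back to identity
    tx = median(c[0] - p[0] for p, c in zip(prev_points, curr_points))
    ty = median(c[1] - p[1] for p, c in zip(prev_points, curr_points))
    return [1.0, 0.0, float(tx), 0.0, 1.0, float(ty)]


prev_pts = [(0, 0), (10, 0), (0, 10), (5, 5)]
curr_pts = [(12, -4), (22, -4), (12, 6), (50, 50)]  # last pair is an outlier
print(estimate_translation(prev_pts, curr_pts))
# [1.0, 0.0, 12.0, 0.0, 1.0, -4.0]
```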

Trait Implementations

impl Read<Exclusive, BecauseExclusive> for DeepOcSort<E>

fn to_subset(&self) -> Option<SS>

fn is_in_subset(&self) -> bool

fn to_subset_unchecked(&self) -> SS

fn from_subset(element: &SS) -> SP