Module deep_ocsort
Deep OC-SORT: Observation-Centric SORT with appearance
This module implements the Deep OC-SORT algorithm.
Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification. Gerard Maggiolino, Adnan Ahmad, Jinkun Cao, Kris Kitani. arXiv:2302.11813.
Algorithm overview
Deep OC-SORT extends OC-SORT with an appearance term and camera motion compensation. Association combines four components:
- OCM (observation-centric momentum) adds a velocity direction-consistency bonus to the IoU before matching.
- ORU (observation-centric re-update) replays interpolated observations to correct the Kalman filter after a track is re-associated following a gap.
- Appearance association blends a cosine distance to each track's feature gallery
  with the motion cost. The appearance weight scales with detector confidence (dynamic
  appearance) and is gated by `max_cosine_distance`. With `appearance_weight = 0` the
  association reduces to plain OC-SORT.
- Camera motion compensation (CMC) warps track predictions by a caller-supplied
  affine transform before association, for moving-camera footage. The transform is
  shared `common::cmc` infrastructure reused by other trackers.
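The appearance blend described in the list above can be pictured with a small sketch. This is not the crate's code: the name `blended_cost` and the exact formula (a convex mix of motion cost and cosine distance, skipped when the distance exceeds the gate) are assumptions chosen to illustrate the behaviour.

```python
def blended_cost(motion_cost, cosine_distance, det_score,
                 appearance_weight=0.5, max_cosine_distance=0.2):
    """Hypothetical blend of motion and appearance cost for one track/detection pair."""
    # Gate: an appearance distance beyond the threshold contributes nothing,
    # so the pair is scored on motion alone.
    if cosine_distance > max_cosine_distance:
        return motion_cost
    # Dynamic appearance: scale the weight by detector confidence, so
    # low-confidence detections lean more heavily on motion.
    weight = appearance_weight * det_score
    return (1.0 - weight) * motion_cost + weight * cosine_distance
```

With `appearance_weight=0.0` the function returns `motion_cost` unchanged, which mirrors the statement that the association then reduces to plain OC-SORT.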
This is a clean-room implementation. The tracker applies a camera-motion transform but does not estimate it: the caller supplies the affine (for example from image registration), keeping the core free of heavy computer-vision dependencies.
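As an illustration of what applying a caller-supplied affine means, the sketch below warps a TLWH box using the `[a, b, tx, c, d, ty]` layout from the Python API. The helper `warp_tlwh` is hypothetical, and re-fitting an axis-aligned box around the warped corners is an assumption; how the crate warps its Kalman state is not documented here.

```python
def warp_tlwh(tlwh, affine):
    """Warp a TLWH box by a 2x3 affine given as [a, b, tx, c, d, ty]."""
    x, y, w, h = tlwh
    a, b, tx, c, d, ty = affine
    corners = [(x, y), (x + w, y), (x, y + h), (x + w, y + h)]
    # Apply x' = a*x + b*y + tx and y' = c*x + d*y + ty to each corner.
    warped = [(a * px + b * py + tx, c * px + d * py + ty) for px, py in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    # Re-fit an axis-aligned box around the warped corners.
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]
```

A pure translation such as `[1.0, 0.0, 12.0, 0.0, 1.0, -4.0]` shifts the box by (12, -4) and leaves its size unchanged; the identity `[1.0, 0.0, 0.0, 0.0, 1.0, 0.0]` returns the box as-is.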
Builds on
- `utils::kalman`: the shared 8-dimensional Kalman filter
- `utils::geometry`: `iou_batch`, `tlwh_to_xyah`, `xyah_to_tlwh`
- `utils::assignment`: `greedy_match`
- `trackers::common`: `KalmanTrack`, `TrackState`, and `CameraMotion` (CMC)
- `trackers::deepsort`: `NearestNeighborDistanceMetric` for the cosine feature gallery
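For readers unfamiliar with greedy assignment, the following is a minimal sketch of the general technique. The function `greedy_match_sketch` and its signature are illustrative assumptions, not the signature of the crate's `greedy_match`.

```python
def greedy_match_sketch(cost, max_cost):
    """Greedily pair rows (tracks) with columns (detections) by ascending cost."""
    # Collect every admissible pair, cheapest first.
    candidates = sorted(
        (c, r, col)
        for r, row in enumerate(cost)
        for col, c in enumerate(row)
        if c <= max_cost
    )
    used_rows, used_cols, matches = set(), set(), []
    for _, r, col in candidates:
        # Take a pair only if neither side has been claimed already.
        if r not in used_rows and col not in used_cols:
            matches.append((r, col))
            used_rows.add(r)
            used_cols.add(col)
    n_cols = len(cost[0]) if cost else 0
    unmatched_rows = [r for r in range(len(cost)) if r not in used_rows]
    unmatched_cols = [c for c in range(n_cols) if c not in used_cols]
    return matches, unmatched_rows, unmatched_cols
```

Greedy matching is not guaranteed to minimise the total cost the way the Hungarian algorithm does, but it is simple and cheap.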
Parameters
| Parameter | Default | Description |
|---|---|---|
| `max_age` | 30 | Frames a lost track survives before deletion |
| `min_hits` | 3 | Consecutive matches required to confirm a track |
| `iou_threshold` | 0.3 | Minimum IoU to associate a detection with a track |
| `delta_t` | 3 | Observation window (frames) used to compute velocity (OCM) |
| `inertia` | 0.2 | Weight of the direction-consistency cost bonus (OCM) |
| `appearance_weight` | 0.5 | Blend weight of the appearance cost, scaled by det. score |
| `max_cosine_distance` | 0.2 | Maximum cosine distance for the appearance term to apply |
| `nn_budget` | 100 | Maximum appearance features stored per track |
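To make `nn_budget` and `max_cosine_distance` concrete, here is a sketch of a budgeted feature gallery with nearest-neighbour cosine distance. The class `FeatureGallery` is hypothetical and only approximates the idea behind `NearestNeighborDistanceMetric`; taking the minimum distance over stored features is an assumption about how the gallery is queried.

```python
import math
from collections import deque


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


class FeatureGallery:
    """Keeps at most `budget` recent embeddings per track (hypothetical helper)."""

    def __init__(self, budget=100):
        self.budget = budget
        self.samples = {}

    def add(self, track_id, embedding):
        # deque(maxlen=...) silently drops the oldest entry once the budget is hit.
        self.samples.setdefault(track_id, deque(maxlen=self.budget)).append(embedding)

    def distance(self, track_id, embedding):
        """Smallest cosine distance between `embedding` and the stored features."""
        return min(cosine_distance(f, embedding) for f in self.samples[track_id])
```

A detection whose `distance` to a track exceeds `max_cosine_distance` would then fall outside the appearance gate for that track.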
Rust API
```rust,ignore
use trackforge::trackers::deep_ocsort::DeepOcSort;

// `extractor` implements the AppearanceExtractor trait (plug in any Re-ID model).
let mut tracker = DeepOcSort::new(extractor, 30, 3, 0.3, 3, 0.2, 0.5, 0.2, 100);
let tracks = tracker.update(&image, detections)?;

for t in tracks {
    println!("ID: {}, Box: {:?}", t.track_id, t.tlwh);
}
```
## Python API
```python
from trackforge import DEEPOCSORT

tracker = DEEPOCSORT(
    max_age=30,
    min_hits=3,
    iou_threshold=0.3,
    delta_t=3,
    inertia=0.2,
    appearance_weight=0.5,
    max_cosine_distance=0.2,
    nn_budget=100,
)

detections = [([100.0, 100.0, 50.0, 100.0], 0.9, 0)]
embeddings = [[0.1, 0.2, 0.3]]  # one appearance vector per detection
tracks = tracker.update(detections, embeddings)

# Moving camera: pass a [a, b, tx, c, d, ty] affine mapping the previous frame
# to the current one (estimate it however you like, e.g. with OpenCV).
camera_motion = [1.0, 0.0, 12.0, 0.0, 1.0, -4.0]
tracks = tracker.update(detections, embeddings, camera_motion)

for track_id, tlwh, score, class_id in tracks:
    print(f"ID: {track_id}, Box: {tlwh}")
```
Credit
Clean-room Rust implementation of the algorithm described in the paper above. Original reference implementation: GerardMaggiolino/Deep-OC-SORT.
Citation
@inproceedings{maggiolino2023deepocsort,
  title={Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification},
  author={Maggiolino, Gerard and Ahmad, Adnan and Cao, Jinkun and Kitani, Kris},
  booktitle={IEEE International Conference on Image Processing (ICIP)},
  year={2023}
}
Quick Reference
| Item | Kind | Description |
|---|---|---|
| `DeepOcSort` | struct | Deep OC-SORT tracker. |
Types
DeepOcSortTrack
struct DeepOcSortTrack {
    pub tlwh: [f32; 4],
    pub score: f32,
    pub class_id: i64,
    pub track_id: u64,
    pub state: crate::trackers::common::TrackState,
    pub hits: usize,
    pub hit_streak: usize,
    pub time_since_update: usize,
    pub age: usize,
    // [REDACTED: Private Fields]
}
A single tracked object managed by Deep OC-SORT.
Carries the OC-SORT motion state (Kalman filter plus an observation history for OCM and ORU) and a small buffer of appearance embeddings that are flushed into the tracker's feature gallery after each matched update.
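The observation history is what makes ORU possible: when a track is re-associated after a gap, the missed frames can be filled with interpolated boxes and replayed through the filter. The sketch below shows only the interpolation step; `virtual_observations` is a hypothetical helper, and linear interpolation between the two real observations is an assumption based on the "interpolated observations" wording.

```python
def virtual_observations(last_box, new_box, gap):
    """Linearly interpolate boxes for the frames strictly between two observations.

    `gap` is the number of frames from the last real observation to the new one,
    so `gap - 1` virtual boxes are produced.
    """
    boxes = []
    for step in range(1, gap):
        t = step / gap
        # Interpolate each box component independently.
        boxes.append([a + t * (b - a) for a, b in zip(last_box, new_box)])
    return boxes
```

Each virtual box would then be fed to the Kalman filter as if it had been observed, before the real re-associated detection is applied.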
Fields
| Name | Type | Description |
|---|---|---|
| `tlwh` | `[f32; 4]` | Bounding box in TLWH (top-left x, top-left y, width, height) format. |
| `score` | `f32` | Detection confidence of the most recent match. |
| `class_id` | `i64` | Class label of the most recent match. |
| `track_id` | `u64` | Unique monotonically increasing track identifier. |
| `state` | `crate::trackers::common::TrackState` | Current lifecycle state. |
| `hits` | `usize` | Total number of detection matches over the track lifetime. |
| `hit_streak` | `usize` | Consecutive detection matches without interruption (resets on a missed frame). |
| `time_since_update` | `usize` | Frames elapsed since the last detection match. |
| `age` | `usize` | Total frames since track creation. |
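The four counters are easiest to understand by stepping them through a few frames. The sketch below is an assumed reading of the field descriptions above, not the crate's update logic; in particular, whether `age` counts the creation frame is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    hits: int = 0
    hit_streak: int = 0
    time_since_update: int = 0
    age: int = 0


def step(c, matched):
    """Advance the counters by one frame (hypothetical bookkeeping)."""
    c.age += 1
    if matched:
        c.hits += 1
        c.hit_streak += 1
        c.time_since_update = 0
    else:
        # A missed frame breaks the streak but keeps the lifetime total.
        c.hit_streak = 0
        c.time_since_update += 1
    return c
```

For example, a track that is matched, matched, missed, then matched again ends with `hits == 3` and `hit_streak == 1`.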
DeepOcSortTracker
struct DeepOcSortTracker {
    pub tracks: Vec<super::track::DeepOcSortTrack>,
    // [REDACTED: Private Fields]
}
Deep OC-SORT tracker core.
Runs the OC-SORT motion association (IoU with an OCM direction bonus, plus an
ORU re-update on re-association) and blends in an appearance cost from a cosine
feature gallery. The appearance weight scales with detector confidence (dynamic
appearance) and is gated by max_cosine_distance. With appearance_weight = 0
the association reduces to plain OC-SORT.
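The OCM direction bonus can be sketched as follows. The helper names and the scaling (a bonus between `-inertia / 2` and `+inertia / 2` based on the angle between the track's heading and the heading towards the detection) are assumptions modelled on the OC-SORT idea, not taken from this crate. Here `old_obs` stands for the observation roughly `delta_t` frames before `last_obs`.

```python
import math


def center(tlwh):
    x, y, w, h = tlwh
    return (x + w / 2.0, y + h / 2.0)


def direction(src, dst):
    """Unit vector from the centre of box `src` to the centre of box `dst`."""
    (sx, sy), (dx, dy) = center(src), center(dst)
    vx, vy = dx - sx, dy - sy
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm) if norm > 0 else (0.0, 0.0)


def ocm_bonus(old_obs, last_obs, detection, inertia=0.2):
    """Reward detections that continue the track's heading (hypothetical)."""
    tvx, tvy = direction(old_obs, last_obs)
    dvx, dvy = direction(last_obs, detection)
    # Clamp to guard acos against rounding just outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, tvx * dvx + tvy * dvy))
    angle = math.acos(cos_angle)  # 0 when aligned, pi when opposite
    return inertia * (math.pi / 2.0 - angle) / math.pi
```

A stationary track yields a zero direction vector, so the angle is treated as a right angle and the bonus is zero.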
Implementations
fn new(max_age: usize, min_hits: usize, iou_threshold: f32, delta_t: usize, inertia: f32, appearance_weight: f32, max_cosine_distance: f32, metric: NearestNeighborDistanceMetric) -> Self
fn update(&mut self, detections: &[([f32; 4], f32, i64)], embeddings: &[Vec<f32>]) -> Vec<DeepOcSortTrack>
Update the tracker with the current frame's detections and embeddings.
embeddings is parallel to detections. Pass an empty slice to run without
appearance (pure OC-SORT). Returns confirmed tracks matched this frame.
fn update_with_camera_motion(&mut self, detections: &[([f32; 4], f32, i64)], embeddings: &[Vec<f32>], camera_motion: &CameraMotion) -> Vec<DeepOcSortTrack>
Update the tracker, first warping track predictions by camera_motion.
camera_motion maps the previous frame's coordinates into the current frame
(see CameraMotion); pass CameraMotion::identity for a static camera.
DeepOcSort<E: AppearanceExtractor>
Deep OC-SORT tracker.
Wraps the association core with an AppearanceExtractor so the caller can pass a
frame and detections and have the embeddings produced internally. To pass
embeddings directly, drive DeepOcSortTracker instead.
Implementations
fn new(extractor: E, max_age: usize, min_hits: usize, iou_threshold: f32, delta_t: usize, inertia: f32, appearance_weight: f32, max_cosine_distance: f32, nn_budget: usize) -> Self
Create a new Deep OC-SORT tracker.
# Arguments
| Argument | Description |
|---|---|
| `extractor` | The appearance feature extractor. |
| `max_age` | Frames a lost track survives before deletion. Default: 30. |
| `min_hits` | Consecutive matches required to confirm a track. Default: 3. |
| `iou_threshold` | Minimum IoU to associate a detection with a track. Default: 0.3. |
| `delta_t` | Observation window for velocity computation. Default: 3. |
| `inertia` | Weight for the OCM direction-consistency bonus. Default: 0.2. |
| `appearance_weight` | Blend weight of the appearance cost. Default: 0.5. |
| `max_cosine_distance` | Gate for the appearance term. Default: 0.2. |
| `nn_budget` | Maximum appearance features stored per track. Default: 100. |
Create a tracker with the default parameters.
fn update(&mut self, image: &DynamicImage, detections: Vec<(BoundingBox, f32, i64)>) -> Result<Vec<DeepOcSortTrack>, Box<dyn Error>>
Update the tracker with a frame and its detections.
Extracts an appearance embedding per detection, then runs the association.
Returns the confirmed tracks matched in this frame.
fn update_with_camera_motion(&mut self, image: &DynamicImage, detections: Vec<(BoundingBox, f32, i64)>, camera_motion: &CameraMotion) -> Result<Vec<DeepOcSortTrack>, Box<dyn Error>>
Update the tracker, first warping track predictions by camera_motion.
Use this on moving-camera footage; estimate the affine transform between the
previous and current frame (for example with image registration) and pass it
in. See CameraMotion.