Multiple Object Tracking

Status: emerging
Last updated: 2026-06-01
Sources: 156856888X00122.Pdf
Tags: [multiple-object-tracking, visual-attention, FINST, parallel-processing, preattentive, attention, cognitive-foundations]

Summary

Multiple object tracking (MOT) is the ability to keep track of several moving items at once among identical distractors. Pylyshyn and Storm (1988) showed that observers can track up to about five independently moving targets in a field of ten identical objects, far more accurately than a serial scanning strategy would allow. They argued that visual attention can therefore operate in parallel across several locations rather than at a single locus. They interpreted the finding through the FINST model, which posits a primitive, preattentive mechanism that assigns "sticky" reference tokens to a limited number of visual objects and keeps them bound to those objects as they move. The work bears on the limited-capacity, selective-attention account in Information Processing and on the visual basis of Situation Awareness.

Body

Context

Pylyshyn and Storm (1988), in 'Tracking multiple independent targets: Evidence for a parallel tracking mechanism', examine whether visual attention is confined to a single locus that must be moved serially, or whether some attentional operations can run in parallel across the visual field. They report two experiments using a tracking paradigm and relate the results to the FINST model of feature binding. Within this knowledge base the article sharpens the treatment of selective and focused attention in Information Processing: where that article describes attention as a gateway with limited parallel capacity, this one provides direct evidence that identity tracking can proceed in parallel for a small number of objects. It also connects to the perceptual coding of Sensation And Perception and to the element-perception (Level 1) demands of Situation Awareness. It is held at emerging status as a single-source article. Visual-attention content here is also relevant to eye-tracking-research-kb (link, do not duplicate).

Key Points

The central finding concerns the limits of parallel tracking. In Experiment I a field of ten identical objects was shown; a subset was briefly cued as targets, after which all ten moved randomly and independently. The observer had to keep track of which objects were targets in order to judge whether a later change occurred on a target or on a distractor (PDF pp. 3–4, orig. pp. 181–182). Observers tracked subsets of up to five targets among the ten objects with high accuracy (about 87%), and error rates rose only modestly with set size (PDF pp. 1, 7, orig. pp. 179, 185).
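The trial structure described above can be sketched in a few lines of Python. This is only an illustration of the paradigm's logic, not the authors' stimulus program: the unit-square display, the Gaussian random-walk motion, the step size sigma, the 100 frames, and the function names make_trial, move and probe are all assumptions introduced here.

```python
import random

def make_trial(n_objects=10, n_targets=5, seed=0):
    """Place identical objects at random and cue a random subset as targets."""
    rng = random.Random(seed)
    positions = [[rng.random(), rng.random()] for _ in range(n_objects)]
    targets = set(rng.sample(range(n_objects), n_targets))
    return positions, targets, rng

def move(positions, rng, sigma=0.02):
    """Displace every object independently (assumed Gaussian step),
    clamping each coordinate to the unit square."""
    for p in positions:
        for axis in (0, 1):
            p[axis] = min(1.0, max(0.0, p[axis] + rng.gauss(0.0, sigma)))

def probe(targets, n_objects, rng):
    """Pick one object to change; the correct response is whether it was a target."""
    probed = rng.randrange(n_objects)
    return probed, probed in targets

# One trial: cue, motion phase, then the probe event.
positions, targets, rng = make_trial()
for _ in range(100):  # assumed number of motion frames
    move(positions, rng)
probed, is_target = probe(targets, len(positions), rng)
```

Because the objects are identical, nothing in positions distinguishes a target from a distractor after the cue disappears; the only way to answer the probe is to have kept track of the targets set through the motion phase.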

The result rules out a purely serial account. Pylyshyn and Storm (1988) calculated that, under conservative assumptions about the speed of attention movement and encoding, a serial scanning-and-updating algorithm could not exceed about 40% accuracy on the task, yet observers reached 87% (PDF p. 1, orig. p. 179). The large gap implies that at least one attentional operation, namely indexing objects and maintaining their identity over motion, is carried out in parallel across several independent loci. This contradicts the common assumption that attention can be directed to only one region at a time.
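Why a serial scanner falls behind can be illustrated with a toy simulation. The sketch below is not the authors' serial model and does not attempt to reproduce their 40% bound, whose parameters are not given in this article. It assumes a tracker that re-locates a target by picking the object nearest to where that target was last seen, and compares re-locating every target on every frame with re-locating only one target per frame in round-robin order. The function name simulate and all parameter values are hypothetical.

```python
import math
import random

def simulate(n_objects=10, n_targets=5, n_frames=200, sigma=0.01,
             updates_per_frame=None, seed=0):
    """Return the fraction of targets still correctly identified at the end.

    updates_per_frame=None -> every target is re-located on every frame.
    updates_per_frame=1    -> one target per frame, in round-robin order.
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n_objects)]
    # Objects 0..n_targets-1 are the targets. belief[t] is the object the
    # tracker currently takes to be target t; last_seen[t] is where it was
    # when the tracker last looked at it.
    belief = {t: t for t in range(n_targets)}
    last_seen = {t: pos[t] for t in range(n_targets)}
    k = n_targets if updates_per_frame is None else updates_per_frame
    cursor = 0
    for _ in range(n_frames):
        # All objects take an independent random step (assumed motion rule).
        pos = [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
               for x, y in pos]
        for _ in range(k):
            t = cursor % n_targets
            cursor += 1
            # Nearest-neighbour re-location. Two beliefs may land on the same
            # object; this toy tracker does not try to prevent that.
            nearest = min(range(n_objects),
                          key=lambda i: math.dist(pos[i], last_seen[t]))
            belief[t] = nearest
            last_seen[t] = pos[nearest]
    return sum(belief[t] == t for t in range(n_targets)) / n_targets

every_frame = simulate()                       # all targets refreshed each frame
one_per_frame = simulate(updates_per_frame=1)  # serial, round-robin refresh
```

With one update per frame, each target's last-seen position is several frames old by the time it is revisited, so a nearby distractor is more likely to be mistaken for it. How large the resulting accuracy gap is depends entirely on the assumed speed, density and trial length.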

The authors explain the result with the FINST model. The model posits a primitive operation that assigns internal reference tokens, called FINSTs (fingers of instantiation), to a limited number of visual features or feature-clusters. A FINST is a "sticky" pointer: once bound to an object it continues to refer to that object as it moves, providing automatic tracking, and it does so preattentively and in parallel at several places at once rather than through serial focal scanning (PDF pp. 2–3, orig. pp. 180–181). The function of these indexes is to bind visual features to objects prior to later pattern recognition, so that higher-level processes can refer to a tracked object without re-searching for it. The General Discussion treats the convergence between the independently motivated FINST model and the tracking data as mutually supporting (PDF pp. 17–18, orig. pp. 195–196).
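The "sticky pointer" idea has a loose analogy in programming: a reference to a mutable object stays valid when the object's position changes, so no search is needed to find it again. The sketch below illustrates that analogy only. It is not an implementation of the FINST model; the class names VisualObject and FinstPool, the method names, and the capacity of five are assumptions made for the example.

```python
class VisualObject:
    """A display item whose position changes over time."""
    def __init__(self, x, y):
        self.position = (x, y)

    def move_to(self, x, y):
        self.position = (x, y)

class FinstPool:
    """A small, fixed pool of index tokens that stay bound to objects."""
    def __init__(self, capacity=5):  # capacity is an assumed value
        self.capacity = capacity
        self._bound = {}  # token name -> VisualObject

    def assign(self, token, obj):
        """Bind a token to an object; fail if the pool is exhausted."""
        if token not in self._bound and len(self._bound) >= self.capacity:
            raise RuntimeError("no free index token")
        self._bound[token] = obj

    def locate(self, token):
        """Follow the token to the object's current position, without search."""
        return self._bound[token].position

objects = [VisualObject(i, 0) for i in range(10)]
pool = FinstPool()
for i in range(5):
    pool.assign(f"finst{i}", objects[i])

objects[2].move_to(7.5, 3.0)   # the object moves...
where = pool.locate("finst2")  # ...and the token still refers to it
```

The pool holds references rather than coordinates, which is what makes the binding survive motion; a sixth call to assign with a new token raises an error, mirroring the limited number of indexes the model posits.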

Conclusion

Pylyshyn and Storm (1988) conclude that visual attention is not restricted to a single serially moved locus: observers can index and track about four or five independently moving objects in parallel, well beyond what serial scanning would permit. They locate this ability in a primitive, preattentive indexing mechanism, the FINST, that maintains object identity through motion and supplies the binding on which later recognition depends. The study thus establishes both an empirical capacity limit for parallel tracking and a mechanism to account for it.

References

Pylyshyn, Z.W. & Storm, R.W. (1988) 'Tracking multiple independent targets: Evidence for a parallel tracking mechanism', Spatial Vision, 3(3), pp. 179–197. doi: 10.1163/156856888X00122. pylyshyn1988tracking

Open Questions

  • How does the four-to-five-object tracking limit relate to the roughly four-chunk capacity limit of the focus of attention (see Working Memory Capacity)? Are they expressions of the same constraint?
  • To what extent does parallel object tracking support real-world monitoring of multiple moving elements (air traffic, vessel traffic, multi-target surveillance), and how does it interact with the vigilance decrement (see Vigilance And Sustained Attention)?