Potential Field Based Deep Metric Learning

Abstract

Deep metric learning (DML) involves training a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuple. We present Potential Field based Metric Learning (PFML), a novel compositional DML model, inspired by electrostatic fields in physics, that, instead of operating on tuples, represents the influence of each example (embedding) by a continuous potential field, and superposes the fields to obtain their combined global potential field. We use attractive/repulsive potential fields to represent interactions among embeddings from images of the same/different classes. Contrary to typical learning methods, where the mutual influence of samples grows with their distance, we enforce a reduction in such influence with distance, leading to a decaying field. We show that such decay helps improve performance on real-world datasets with large intra-class variations and label noise. Like other proxy-based methods, we also use proxies to succinctly represent sub-populations of examples. We evaluate our method on three standard DML benchmarks, Cars-196, CUB-200-2011, and SOP, where it outperforms state-of-the-art baselines.

TL;DR:

Use a continuous potential field to represent interactions between a set of example embeddings, instead of using subsets of examples (triplets/tuplets) or proxies.

Intuition

An example of PFML on a toy problem with embeddings from two classes. PFML creates a potential field (visualized) by superposing the attractive and repulsive fields generated by individual embeddings. Each embedding is drawn towards nearby embeddings of the same class while being driven away from embeddings of other classes, a mirror image of the behavior of an isolated system of electric charges. Such a potential field is defined for the embeddings of each class, with the field for the blue embeddings being visualized in the animation. The movement shown in the animation is the net effect of all potentials (blue and red).




Advantages of PFML vs Previous DML Approaches

  1. The potential field representation enables modeling interactions between all sample embeddings, as opposed to modeling those between small subsets (of sample or proxy points) as done in all previous methods, e.g., point-tuplet based losses (contrastive, triplet, N-tuplet) and proxy-based losses (Proxy-NCA, Proxy-Anchor).
  2. Modeling interactions of all points, made possible by the use of potentials, helps:
    • improve the quality of the learned features, while also
    • increasing robustness to noise, since interactions among a smaller number of samples are more strongly affected by noise (their effect has larger variance).
  3. A major difference of our potential field based approach from previous approaches lies in how the strength of interaction between two points varies as the distance between them increases: instead of remaining constant or even growing stronger, as in most existing methods, in our model it becomes weaker with distance. This decay in interaction strength is helpful in several ways:
    • It captures the intuitive expectation that two distant positive samples are too different to be considered variants of each other, helping treat them as different varieties (e.g., associated with different proxies).
    • The decay also significantly improves performance under label noise, e.g., due to annotation errors common in real-world datasets.
    • As a result of the decay, the learned proxies remain closer to (at smaller Wasserstein distance W2 from) the sample embeddings they represent than in methods (e.g., current proxy-based ones) where interactions strengthen with distance, thereby enhancing their intended role.
Further details can be found in our paper.
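The decay property above can be sketched with a toy comparison. The functional forms here are hypothetical (the paper defines its own potentials in Section 3); they only contrast a constant-within-margin interaction, as in triplet-style losses, with one that fades as distance grows:

```python
import numpy as np

def interaction_strength(d, decay="potential_field", margin=1.0):
    """Illustrative interaction strength between two embeddings at distance d.

    'triplet'         : constant influence wherever the margin is violated
                        (classic triplet-style behavior)
    'potential_field' : influence decays with distance, so nearby samples
                        dominate and distant ones fade out (hypothetical
                        1/r-style form, not the paper's exact definition)
    """
    if decay == "triplet":
        return np.where(d < margin, 1.0, 0.0)
    return 1.0 / (1.0 + d) ** 2
```

Under the decaying form, a far-away positive sample barely pulls on an embedding, which is what lets distant same-class samples settle around different proxies.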

Method

For each class, PFML defines a class potential field Ψ that acts only on embeddings of that class. This class potential field brings together embeddings of the class while pushing them away from embeddings of other classes. The class potential field is formed by superposing the potentials of individual embeddings from all classes. The potential exerted by an individual embedding is designed based on both principles from electrostatics and observations from the DML literature. More details and exact definitions of the potential field can be found in Section 3 of our paper.
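A minimal sketch of evaluating such a class field at a point is shown below. The 1/r-style potential and the function name `class_potential` are illustrative assumptions, not the paper's exact definition; the sketch only shows the superposition of signed, distance-decaying contributions:

```python
import numpy as np

def class_potential(x, embeddings, labels, cls, eps=1e-8):
    """Evaluate an illustrative class potential field Psi_cls at point x.

    Same-class embeddings contribute an attractive (negative) potential,
    other-class embeddings a repulsive (positive) one; both decay with
    distance, loosely like an electrostatic 1/r potential.
    """
    d = np.linalg.norm(embeddings - x, axis=1)  # distances from x to all embeddings
    sign = np.where(labels == cls, -1.0, 1.0)   # attract same class, repel others
    return np.sum(sign / (d + eps))             # superposition of individual fields
```

Points of class `cls` then move downhill on this field: towards regions dominated by nearby same-class embeddings and away from other classes.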

Our potential-field based DML pipeline includes: (1) computing the attraction and repulsion fields generated by each embedding and proxy; (2) computing the class potential fields by superposing the individual fields; (3) evaluating the total potential energy by summing the potentials of embeddings and proxies under the class potential fields; and (4) updating the locations of sample embeddings (through the network parameters) and proxies via backpropagation to minimize the total potential energy.
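Steps (1)-(3) of the pipeline can be sketched as a single energy computation. This is a hypothetical NumPy rendering with an assumed 1/r potential, not the paper's exact loss; step (4) would be performed by an autodiff framework minimizing this quantity:

```python
import numpy as np

def total_potential_energy(embeddings, proxies, labels, proxy_labels, eps=1e-8):
    """Illustrative PFML-style objective: total potential energy of all
    embeddings and proxies under signed, distance-decaying pairwise potentials.
    """
    pts = np.concatenate([embeddings, proxies], axis=0)
    lbl = np.concatenate([labels, proxy_labels])
    # (1) pairwise distances between every embedding/proxy pair
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # (2) superpose attractive (same-class) and repulsive (other-class) fields
    sign = np.where(lbl[:, None] == lbl[None, :], -1.0, 1.0)
    pot = sign / (d + eps)
    np.fill_diagonal(pot, 0.0)  # no self-interaction
    # (3) total potential energy, averaged over points
    return pot.sum() / len(pts)
```

Minimizing this energy pulls same-class points (and their proxies) together and pushes different-class points apart, with nearby pairs dominating the gradient.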

Performance on DML Benchmarks

As is common, we evaluate the metric learned by our method using its performance on zero-shot image retrieval on three standard benchmarks: (1) Cars-196, (2) CUB-200-2011, and (3) SOP. We also train four different backbone networks (ResNet-50, BN-Inception, ViT, and DINO) with our method to enable a fair comparison with other methods that use them. The table below summarizes the performance of our method and other state-of-the-art methods on these datasets.
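For reference, the Recall@1 metric used in these comparisons checks, for each test image, whether its nearest neighbor in embedding space (excluding itself) shares its class label. A standard sketch:

```python
import numpy as np

def recall_at_1(embeddings, labels):
    """Recall@1 for zero-shot retrieval: fraction of queries whose nearest
    neighbor (excluding the query itself) has the same class label."""
    d = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)   # a query cannot retrieve itself
    nn = d.argmin(axis=1)         # index of each query's nearest neighbor
    return float(np.mean(labels[nn] == labels))
```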



As seen in the table, our method outperforms all other methods in terms of Recall@1 on all three datasets.

Visual Results

t-SNE Visualization of the Embedding Space

The figure below shows a t-SNE visualization of the embedding space learned by our method on the CUB-200-2011 dataset. It can be seen that images closer together share more semantic characteristics than those that are far apart.

Zero-shot Retrieval Examples

Example images retrieved by our method for query images from the (a) Cars-196, (b) CUB-200-2011, and (c) SOP test datasets, in increasing order of distance from the query. Correct retrievals have a green border, while incorrect ones have a red one. Despite large intra-class variation (pose, color) in the datasets, our method is able to effectively retrieve similar images.

Acknowledgements

The website template was borrowed from Michaël Gharbi and Ref-NeRF.