Potential Field Based Deep Metric Learning
Abstract
Deep metric learning (DML) involves training a network to learn a semantically meaningful representation space. Many current approaches mine n-tuples of examples and model interactions within each tuple. We present Potential Field based Metric Learning (PFML), a novel compositional DML model, inspired by electrostatic fields in physics, that, instead of operating on tuples, represents the influence of each example (embedding) by a continuous potential field, and superposes the fields to obtain their combined global potential field. We use attractive/repulsive potential fields to represent interactions among embeddings from images of the same/different classes. Contrary to typical learning methods, where the mutual influence of samples is proportional to their distance, we enforce a reduction in such influence with distance, leading to a decaying field. We show that such decay helps improve performance on real-world datasets with large intra-class variations and label noise. Like other proxy-based methods, we also use proxies to succinctly represent sub-populations of examples. We evaluate our method on three standard DML benchmarks, Cars-196, CUB-200-2011, and SOP, where it outperforms state-of-the-art baselines.
TL;DR:
Use a continuous potential field to represent interactions among a set of example embeddings, instead of using subsets of examples (triplets/tuplets) or proxies.
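To make this concrete, below is a minimal PyTorch sketch of what a decaying potential-field loss could look like. The functional forms of the potentials (1/(1+d)-style decays), the use of proxies as field sources, and all names are illustrative assumptions on our part, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def potential_field_loss(embeddings, labels, proxies, proxy_labels, eps=1e-6):
    """Illustrative decaying potential-field loss (not the paper's exact form).

    Each proxy acts as a field source: attractive for same-class embeddings,
    repulsive for different-class ones. Both potentials are chosen so that
    their gradient magnitude (the "force") decays as distance grows.
    """
    embeddings = F.normalize(embeddings, dim=1)
    proxies = F.normalize(proxies, dim=1)

    d = torch.cdist(embeddings, proxies)  # (batch, num_proxies) pairwise distances

    same_class = labels.unsqueeze(1) == proxy_labels.unsqueeze(0)  # (batch, num_proxies)

    # Attractive potential d / (1 + d): its force |dV/dd| = 1/(1+d)^2
    # weakens with distance, unlike losses whose pull grows with d.
    attractive = d / (1.0 + d)

    # Repulsive potential 1 / (d + eps): large near a wrong-class proxy,
    # decaying toward zero for far-away points.
    repulsive = 1.0 / (d + eps)

    # Superposition: sum the per-source fields into one global potential.
    loss = attractive[same_class].sum() + repulsive[~same_class].sum()
    return loss / embeddings.size(0)

# Usage sketch: 8 samples in a 128-d space, 4 classes with one proxy each.
emb = torch.randn(8, 128, requires_grad=True)
labels = torch.randint(0, 4, (8,))
proxies = torch.randn(4, 128, requires_grad=True)
proxy_labels = torch.arange(4)
potential_field_loss(emb, labels, proxies, proxy_labels).backward()
```

Note that every embedding interacts with every field source in the batch; there is no tuple mining step, which is the structural difference the TL;DR points at.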
Intuition
Advantages of PFML vs. Previous DML Approaches
- The potential field representation enables modeling interactions among all sample embeddings, as opposed to modeling those within small subsets (of sample or proxy points) as done in all previous methods, e.g., point-tuplet-based losses (contrastive, triplet, N-tuplet) and proxy-based losses (Proxy-NCA, Proxy Anchor).
- Modeling the interactions of all points, made possible by the use of potentials, helps:
  - improve the quality of the learned features, while also
  - increasing robustness to noise, since the effect of noise on interactions among a smaller number of samples has a larger variance.
- A major difference of our potential field based approach from previous approaches lies in how the strength of interaction between two points varies as the distance between them increases: instead of remaining constant or even becoming stronger, as in most existing methods, in our model it becomes weaker with distance. This decay in interaction strength is helpful in several ways (see the worked comparison after this list):
- It encodes the intuitive expectation that two distant positive samples are too different to be considered variants of each other, helping treat them as different varieties (e.g., associated with different proxies).
- The decay property also significantly improves performance under label noise, e.g., due to annotation errors common in real-world datasets.
- As a result of the decay, the learned proxies remain closer (at a smaller Wasserstein distance W2) to the sample embeddings they represent than in methods where interactions strengthen with distance (e.g., current proxy-based methods), thereby enhancing their intended role.
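As a worked contrast, consider the pull exerted on a positive sample at distance d from its anchor or proxy. The notation and the specific decaying form below are ours, chosen only to illustrate the decay property; the paper's exact potentials may differ:

```latex
% Triplet loss: once the margin m is violated, the pull is constant in d
\ell_{\mathrm{triplet}} = \max(0,\; d_{ap} - d_{an} + m),
\qquad \Bigl|\tfrac{\partial \ell}{\partial d_{ap}}\Bigr| = 1 \quad \text{(when active)}

% Illustrative decaying attractive potential: the pull vanishes for distant points
V_{\mathrm{attr}}(d) = \frac{d}{1+d},
\qquad \bigl|V'_{\mathrm{attr}}(d)\bigr| = \frac{1}{(1+d)^2} \;\longrightarrow\; 0
\quad \text{as } d \to \infty
```

Because distant positives exert only a vanishing pull, a mislabeled sample far from its (wrongly assigned) class proxy contributes little gradient, which is consistent with the noise robustness and tighter proxy placement noted above.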