The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
arXiv
TensorFlow / PyTorch implementation
Demo
1. Abstract
1-1. A loss function for direct optimization of the Jaccard index
1-2. Optimizing the Jaccard index in a continuous optimization framework
1-3. Lovász hinge loss: binary image segmentation
1-4. Lovász-Softmax loss: multi-class image segmentation
2. Optimization surrogates for submodular loss functions
2-1. Optimizing the Jaccard loss (choosing a class for each pixel) is a discrete optimization problem over the 2^p possible pixel labelings, and is NP-hard in general.
2-2. The Jaccard set function in Equation (6) has been shown to be submodular (Yu & Blaschko, 2015, "The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses"), so its Lovász extension can be computed in polynomial time.
2-3. The Lovász extension of a set function (here, the Jaccard loss) is given by Equations (8) and (9), written out below.
2-4. By submodularity, the Lovász extension of the set function is its tight convex closure: it is piecewise linear and interpolates the values of the original set function on {0, 1}^p.
2-5. The vector g(m), whose components are defined in Equation (9), is exactly the derivative of the Lovász extension with respect to m.
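
For reference, here is a sketch of the Lovász extension as given by Equations (8) and (9), with notation reconstructed from the paper (Δ is the Jaccard loss, m ∈ R_+^p the vector of pixel errors):

```latex
% Lovász extension of the Jaccard loss \Delta, evaluated at the error vector m
\overline{\Delta}(\mathbf{m}) = \sum_{i=1}^{p} m_{\pi_i}\, g_i(\mathbf{m}),
\qquad
g_i(\mathbf{m}) = \Delta(\{\pi_1, \ldots, \pi_i\}) - \Delta(\{\pi_1, \ldots, \pi_{i-1}\}),
```

where π is a permutation sorting the components of m in decreasing order (m_{π_1} ≥ m_{π_2} ≥ … ≥ m_{π_p}). Each g_i(m) is the increase in Jaccard loss when the pixel with the i-th largest error is added to the set of mispredictions, which is why g(m) is the derivative noted in 2-5.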
3. Lovász hinge
3-1. The figure above shows the Lovász hinge applied to the Jaccard loss.
3-2. Substituting the hinge margin errors m_i = max(1 − F_i(x) y_i*, 0) into Equation (8) yields the Lovász hinge applied to the Jaccard loss.
3-3. In the case of a single prediction (only one pixel), the Lovász hinge reduces to the standard hinge loss.
3-4. In the figure above, the red points indicate values of the Jaccard index; the dots lie on the convex surface of the Lovász hinge.
3-5. During optimization, the network is trained so that the relative margin vector r moves toward the lowest red point (a minimal code sketch follows below).
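
To make 3-1 through 3-5 concrete, here is a minimal NumPy sketch of the binary Lovász hinge. The function names (jaccard_grad, lovasz_hinge) are mine, and details such as applying the ReLU after sorting follow the authors' public reference implementation rather than anything stated in this summary:

```python
import numpy as np

def jaccard_grad(gt_sorted: np.ndarray) -> np.ndarray:
    """g(m) from Eq. (9): increments of the Jaccard loss as pixels are added
    in order of decreasing error. gt_sorted holds the {0, 1} ground-truth
    labels, already sorted by decreasing error."""
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)   # foreground pixels not yet counted as errors
    union = gts + np.cumsum(1.0 - gt_sorted)    # foreground plus background errors so far
    jaccard = 1.0 - intersection / union        # Jaccard loss of the first i pixels
    jaccard[1:] = jaccard[1:] - jaccard[:-1]    # differences = g_i(m)
    return jaccard

def lovasz_hinge(logits: np.ndarray, labels: np.ndarray) -> float:
    """Binary Lovász hinge: hinge errors m_i, sorted decreasingly, dotted
    with the Jaccard-loss gradient (Eq. (8))."""
    signs = 2.0 * labels - 1.0                  # map {0, 1} -> {-1, +1}
    errors = 1.0 - logits * signs               # raw hinge margins
    order = np.argsort(errors)[::-1]            # indices of decreasing errors
    grad = jaccard_grad(labels[order])
    return float(np.dot(np.maximum(errors[order], 0.0), grad))
```

The single-pixel case of 3-3 falls out directly: with p = 1 the gradient is g_1 = 1, so the loss reduces to max(1 − F·y*, 0), the standard hinge loss.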
4. Lovász-Softmax
4-1. The figure above shows the Lovász-Softmax applied to the Jaccard loss.
4-2. Instead of the max-margin setting used for the Lovász hinge, the softmax probabilities are used to define the errors m_i(c) in Equation (11).
4-3. When targeting the class-averaged mIoU metric, the class-specific loss surrogates are averaged, as in Equation (12) (see the sketch below).
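
A matching sketch for the multi-class case, reusing jaccard_grad from the hinge sketch above. Per Equation (11) the class errors are m_i(c) = 1 − f_i(c) if y_i = c and f_i(c) otherwise, and Equation (12) averages the per-class surrogates (the function name lovasz_softmax is mine):

```python
import numpy as np

def lovasz_softmax(probs: np.ndarray, labels: np.ndarray) -> float:
    """Lovász-Softmax loss. probs: softmax outputs of shape (p, C);
    labels: integer class ids of shape (p,). Uses jaccard_grad() above."""
    p, C = probs.shape
    losses = []
    for c in range(C):
        fg = (labels == c).astype(np.float64)  # ground-truth mask of class c
        errors = np.abs(fg - probs[:, c])      # m_i(c) from Eq. (11)
        order = np.argsort(errors)[::-1]       # decreasing errors
        losses.append(np.dot(errors[order], jaccard_grad(fg[order])))
    return float(np.mean(losses))              # class average, Eq. (12)
```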
5. Optimization of intersection over union
5-1. The Lovász extension (Equation (8)) can be computed by sorting the elements of m in O(p log p) time, followed by O(p) calls to the set function, where p is the number of pixels.
5-2. The algorithm below (Algorithm 1 in the paper) computes the gradient of the Jaccard loss extension, [g_1(m), g_2(m), ..., g_p(m)], in O(p log p) time; a toy example follows.
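
As a quick sanity check on the complexity claim, a toy run of jaccard_grad from the sketches above; the sort of the errors dominates at O(p log p), while the cumulative sums inside jaccard_grad are single O(p) passes:

```python
import numpy as np

# Toy example: 4 pixels whose ground-truth labels have already been
# sorted by decreasing error (assumes jaccard_grad from the sketch above).
gt_sorted = np.array([1.0, 0.0, 1.0, 0.0])
print(jaccard_grad(gt_sorted))  # ~ [0.5, 0.1667, 0.3333, 0.0]
```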
6. References
- Berman, M., Triki, A. R., Blaschko, M. B. (2018). The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. CVPR 2018. arXiv:1705.08790
- Yu, J., Blaschko, M. B. (2015). The Lovász Hinge: A Novel Convex Surrogate for Submodular Losses.