4.18. L0 Norm

This example shows how to package the nonconvex sparsity penalty \(\|x\|_0\) as a UDF. The full model is

\[\begin{split}\begin{array}{ll} \min\limits_x & \tfrac{1}{2}\|x-y\|_2^2 + \lambda \|x\|_0 \\ \text{s.t.} & 0 \le x \le 1. \end{array}\end{split}\]

Here \(\|x\|_0\) counts the number of nonzero entries of \(x\), so the objective trades off fidelity to the observed vector \(y\) against sparsity. The \(L_0\) term is nonconvex, so the solver acts as a practical local method and the returned point should be interpreted as a locally optimal solution or stationary point.

The function value returned by UDFBase.eval() is exactly the \(L_0\) count:

\[f(x) = \|x\|_0 = \#\{i : x_i \neq 0\}.\]

So eval just counts how many entries are numerically nonzero, using a small tolerance (here \(10^{-12}\)) to decide what counts as zero:

def eval(self, arglist):
    x = np.asarray(arglist[0], dtype=float)
    return float(np.count_nonzero(np.abs(x) > 1e-12))
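As a quick sanity check, the same counting expression can be tried on a small vector with plain NumPy, outside of any UDF class. The entry `1e-13` sits below the \(10^{-12}\) tolerance and is treated as zero:

```python
import numpy as np

# Same counting logic as eval(): entries with |x_i| > 1e-12 count as nonzero.
x = np.array([0.0, 1e-13, 0.5, -2.0])
print(float(np.count_nonzero(np.abs(x) > 1e-12)))  # 2.0
```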

The proximal step returned by UDFBase.argmin() solves

\[\begin{split}\operatorname{prox}_{\lambda \|\cdot\|_0}(v) = \operatorname*{argmin}_x \; \lambda \|x\|_0 + \tfrac{1}{2}\|x-v\|_2^2, \qquad x_i^\star = \begin{cases} 0, & |v_i| \le \sqrt{2\lambda}, \\ v_i, & |v_i| > \sqrt{2\lambda}. \end{cases}\end{split}\]

The rule follows from a per-coordinate comparison: keeping \(x_i = v_i\) costs \(\lambda\) (one nonzero, no fitting error), while zeroing it costs \(v_i^2/2\) (no penalty, full fitting error), so zeroing wins exactly when \(|v_i| \le \sqrt{2\lambda}\). This is the classical hard-thresholding rule, so the implementation is short:

def argmin(self, lamb, arglist):
    v = np.asarray(arglist[0], dtype=float)
    threshold = np.sqrt(2.0 * lamb)
    prox = np.where(np.abs(v) <= threshold, 0.0, v)
    return [prox.tolist()]
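To build confidence in the rule, here is a small standalone sketch (plain NumPy, independent of the UDF class) that applies the same thresholding and checks that its output scores no worse than two obvious candidates on the prox objective:

```python
import numpy as np

def prox_l0(v, lamb):
    # Hard-thresholding: zero entries with |v_i| <= sqrt(2*lamb), keep the rest.
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) <= np.sqrt(2.0 * lamb), 0.0, v)

def prox_objective(x, v, lamb):
    # The prox subproblem objective: lamb*||x||_0 + 0.5*||x - v||^2.
    x = np.asarray(x, dtype=float)
    return lamb * np.count_nonzero(x) + 0.5 * np.sum((x - v) ** 2)

v = np.array([0.2, 1.7, 0.6, 1.9])
p = prox_l0(v, 1.0)
print(p.tolist())  # [0.0, 1.7, 0.0, 1.9]

# The prox output should never score worse than keeping v or zeroing everything.
print(prox_objective(p, v, 1.0) <= prox_objective(v, v, 1.0))               # True
print(prox_objective(p, v, 1.0) <= prox_objective(np.zeros(4), v, 1.0))     # True
```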

The UDFBase.arguments() method tells ADMM which symbolic object this UDF depends on. In this case the custom term is a function of one vector variable \(x\), so arguments returns a one-element list:

def arguments(self):
    return [self.arg]

Complete runnable example:

import numpy as np
import admm

class L0Norm(admm.UDFBase):
    def __init__(self, arg):
        self.arg = arg

    def arguments(self):
        return [self.arg]

    def eval(self, arglist):
        x = np.asarray(arglist[0], dtype=float)
        return float(np.count_nonzero(np.abs(x) > 1e-12))

    def argmin(self, lamb, arglist):
        v = np.asarray(arglist[0], dtype=float)
        threshold = np.sqrt(2.0 * lamb)
        prox = np.where(np.abs(v) <= threshold, 0.0, v)
        return [prox.tolist()]

y = np.array([0.2, 1.7, 0.6, 1.9])
lam = 1.0

model = admm.Model()
x = admm.Var("x", len(y))
model.setObjective(0.5 * admm.sum(admm.square(x - y)) + lam * L0Norm(x))
model.addConstr(x >= 0)
model.addConstr(x <= 1)
model.setOption(admm.Options.admm_max_iteration, 10000)  # Give the constrained nonconvex solve enough iterations
model.optimize()

print(" * x: ", np.asarray(x.X))  # Expected: ≈ [0, 1, 0, 1]
print(" * model.ObjVal: ", model.ObjVal)  # Expected: ≈ 2.85

This example is available as a standalone script in the examples/ folder of the ADMM repository:

python examples/udf_l0_norm.py

In this concrete example, \(y = [0.2, 1.7, 0.6, 1.9]\) and \(\lambda = 1\), so the threshold is

\[\sqrt{2\lambda} = \sqrt{2} \approx 1.414.\]

The entries \(0.2\) and \(0.6\) fall below that threshold and are driven to zero. The entries \(1.7\) and \(1.9\) survive the hard-thresholding step, but they do not yet satisfy the box constraint. The active upper bound \(x \le 1\) clips them, so the final feasible point is

\[[0,\; 1.7,\; 0,\; 1.9] \xrightarrow{\;x \le 1\;} x^\star \approx [0,\; 1,\; 0,\; 1].\]

Geometrically: small entries are hard-thresholded to zero, surviving large entries are clipped by the active box constraint \(x \le 1\).
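The hand computation above can be reproduced with plain NumPy, without invoking the solver. This is only an illustration of the geometry (threshold, then clip to the box), not of the solver's actual iteration:

```python
import numpy as np

y = np.array([0.2, 1.7, 0.6, 1.9])
lam = 1.0

# Hard-threshold at sqrt(2*lam) = sqrt(2), then clip to the box [0, 1].
thresholded = np.where(np.abs(y) <= np.sqrt(2.0 * lam), 0.0, y)
x_star = np.clip(thresholded, 0.0, 1.0)

# Objective: 0.5*||x - y||^2 + lam*||x||_0.
obj = 0.5 * np.sum((x_star - y) ** 2) + lam * np.count_nonzero(x_star)
print(x_star.tolist())   # [0.0, 1.0, 0.0, 1.0]
print(round(obj, 6))     # 2.85
```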