4.13. Entropy Maximization
Entropy maximization starts by treating \(x\) as a probability vector. Each entry \(x_i\) is the probability assigned to outcome \(i\), so the conditions \(x \ge 0\) and \(\mathbf{1}^\top x = 1\) force \(x\) to live on the probability simplex.
Among all probability vectors that satisfy the side information, we want the one with the largest Shannon entropy, meaning the least concentrated and most diffuse distribution.
Here \(A x = b\) and \(F x \le g\) encode additional information we want the distribution to respect.
The API used in these examples is a minimization API, so the code does not write the objective with a leading minus sign. Instead it minimizes \(\sum_i x_i \log x_i\), which is exactly the negative of the Shannon entropy; minimizing negative entropy is equivalent to maximizing entropy.
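Putting the pieces together, the problem the code below solves can be written as:

```latex
\begin{aligned}
\min_{x} \quad & \textstyle\sum_i x_i \log x_i \\
\text{s.t.} \quad & A x = b, \quad F x \le g, \quad \mathbf{1}^\top x = 1, \quad x \ge 0
\end{aligned}
```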
Step 1: Build a feasible probability model
We first create a random probability vector x0 and then use it to generate
the right-hand sides b and g. That guarantees the optimization problem is
feasible: x0 itself satisfies every constraint, and the nonnegative random
slack added to F @ x0 keeps the inequality comfortably feasible.
import numpy as np
import admm
np.random.seed(1)
n = 12
m = 3
p = 2
x0 = np.random.rand(n)
x0 = x0 / np.sum(x0)
A = np.random.randn(m, n)
b = A @ x0
F = np.random.randn(p, n)
g = F @ x0 + np.random.rand(p)
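Before handing the data to a solver, it can be reassuring to verify that x0 really is feasible. The following sanity check uses only NumPy, not the solver API:

```python
import numpy as np

np.random.seed(1)
n, m, p = 12, 3, 2
x0 = np.random.rand(n)
x0 = x0 / np.sum(x0)
A = np.random.randn(m, n)
b = A @ x0
F = np.random.randn(p, n)
g = F @ x0 + np.random.rand(p)

# x0 lies on the probability simplex: nonnegative entries summing to 1
assert np.all(x0 >= 0) and np.isclose(np.sum(x0), 1.0)
# ... and satisfies the side information by construction
assert np.allclose(A @ x0, b)
assert np.all(F @ x0 <= g)
print("x0 is feasible")
```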
Step 2: Create the model and the probability variable
The decision variable is another length-n vector called x. It will be
the probability distribution chosen by the solver.
model = admm.Model()
x = admm.Var("x", n)
Step 3: Write the entropy objective in minimization form
The Shannon entropy is
\(H(x) = -\sum_i x_i \log x_i\). The symbolic function
admm.entropy(x) gives the entrywise term \(x_i \log x_i\), so
admm.sum(admm.entropy(x)) is \(\sum_i x_i \log x_i = -H(x)\).
Minimizing that quantity is therefore the same as maximizing Shannon entropy.
model.setObjective(admm.sum(admm.entropy(x)))
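Because x0 is feasible, its entropy already gives a lower bound on the achievable maximum; the uniform distribution gives the general upper bound \(\log n\). A small NumPy sketch of both bounds (independent of the solver):

```python
import numpy as np

np.random.seed(1)
n = 12
x0 = np.random.rand(n)
x0 = x0 / np.sum(x0)

# Shannon entropy H(x0) = -sum_i x_i log x_i; the solver's objective is -H
H0 = -np.sum(x0 * np.log(x0))
print("H(x0)  =", H0)        # any feasible point lower-bounds the max entropy
print("log(n) =", np.log(n)) # entropy on the simplex never exceeds log(n)
```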
Step 4: Add the affine side information and simplex constraints
The equality and inequality constraints encode the external information we want
the distribution to satisfy. The last two constraints,
admm.sum(x) == 1 and x >= 0, are what make x a probability vector.
model.addConstr(A @ x == b)
model.addConstr(F @ x <= g)
model.addConstr(admm.sum(x) == 1)
model.addConstr(x >= 0)
Step 5: Solve and inspect the result
After solving, model.ObjVal is the optimal value of the minimization form
\(\sum_i x_i \log x_i\), while model.StatusString tells us whether the
solver succeeded.
model.optimize()
print(" * model.ObjVal: ", model.ObjVal) # Expected: -2.4722786823012264
print(" * model.StatusString: ", model.StatusString) # Expected: SOLVE_OPT_SUCCESS
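A quick plausibility check on the reported value: entropy on the simplex never exceeds \(\log n\), so the minimized objective \(\sum_i x_i \log x_i\) can never fall below \(-\log n\). This check needs only NumPy:

```python
import numpy as np

n = 12
reported = -2.4722786823012264  # ObjVal printed above
# H(x) <= log(n) implies sum_i x_i log x_i >= -log(n)
assert -np.log(n) <= reported < 0.0
print("-log(n) =", -np.log(n))  # about -2.4849, consistent with the reported value
```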
Complete runnable example:
import numpy as np
import admm
np.random.seed(1)
n = 12
m = 3
p = 2
x0 = np.random.rand(n)
x0 = x0 / np.sum(x0)
A = np.random.randn(m, n)
b = A @ x0
F = np.random.randn(p, n)
g = F @ x0 + np.random.rand(p)
model = admm.Model()
x = admm.Var("x", n)
model.setObjective(admm.sum(admm.entropy(x))) # minimize sum(x*log(x)) = maximize Shannon entropy
model.addConstr(A @ x == b)
model.addConstr(F @ x <= g)
model.addConstr(admm.sum(x) == 1)
model.addConstr(x >= 0)
model.optimize()
print(" * model.ObjVal: ", model.ObjVal) # Expected: -2.4722786823012264
print(" * model.StatusString: ", model.StatusString) # Expected: SOLVE_OPT_SUCCESS
This example is available as a standalone script in the examples/ folder of the ADMM repository:
python examples/entropy_maximization.py
The solution is the maximum-entropy distribution on the simplex
\(\{x \ge 0,\; \mathbf{1}^\top x = 1\}\) that satisfies the moment constraints
\(Ax = b\) and \(Fx \le g\). ADMM handles admm.entropy() and the
simplex projection natively.
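The result can also be cross-checked with a general-purpose solver. The sketch below uses scipy.optimize.minimize with SLSQP as an independent reference; SciPy is an assumption here, not part of the ADMM API, and a tiny lower bound on x keeps the logarithm well-defined:

```python
import numpy as np
from scipy.optimize import minimize  # assumption: SciPy is available

np.random.seed(1)
n, m, p = 12, 3, 2
x0 = np.random.rand(n)
x0 = x0 / np.sum(x0)
A = np.random.randn(m, n)
b = A @ x0
F = np.random.randn(p, n)
g = F @ x0 + np.random.rand(p)

obj = lambda x: np.sum(x * np.log(x))  # negative Shannon entropy
cons = [
    {"type": "eq",   "fun": lambda x: A @ x - b},
    {"type": "eq",   "fun": lambda x: np.sum(x) - 1.0},
    {"type": "ineq", "fun": lambda x: g - F @ x},  # SLSQP expects c(x) >= 0
]
res = minimize(obj, x0, method="SLSQP",
               bounds=[(1e-9, None)] * n, constraints=cons)
print(res.fun)  # should be close to the ObjVal reported by admm above
```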