Reflecting to optimise

2 days ago

The author admits to lacking in-depth optimization knowledge despite familiarity with algorithms like Adam and AdaGrad.
Discusses optimization over the probability simplex of categorical distributions, inspired by a protein binder design problem.
Three methods presented: softmax reparameterization, projected gradient descent (PGD), and mirror descent with negative entropy.
PGD tends to produce sparse solutions in high dimensions as optimization progresses.
Mirror descent uses Bregman divergences (e.g., KL divergence) to handle constraints naturally.
Softmax reparameterization can suffer from vanishing gradients near simplex vertices.
Choice of method depends on problem specifics; experiments show mirror descent often outperforms softmax reparameterization.
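
The three update rules summarized above can be sketched in NumPy roughly as follows. This is a minimal illustration, not the original post's code: the function names, the linear toy objective `f(p) = c . p`, the cost vector, the step size, and the iteration count are all assumptions chosen for the example. Mirror descent with negative entropy is written in its usual closed form (the exponentiated-gradient update), and the projection uses the standard sort-based algorithm.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax mapping logits to the simplex."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0                  # cumulative sums minus target mass
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1] # last index kept in the support
    tau = css[rho] / (rho + 1)                # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)


def softmax_step(z, grad_p, lr):
    """Gradient step on logits z; grad_p is the gradient with respect to p."""
    p = softmax(z)
    # Chain rule through softmax: J^T g with J = diag(p) - p p^T.
    grad_z = p * (grad_p - p @ grad_p)
    return z - lr * grad_z


def pgd_step(p, grad_p, lr):
    """Projected gradient descent: Euclidean step, then project back."""
    return project_to_simplex(p - lr * grad_p)


def mirror_step(p, grad_p, lr):
    """Mirror descent with negative entropy (exponentiated gradient)."""
    # Subtracting the minimum only rescales w and avoids overflow.
    w = p * np.exp(-lr * (grad_p - grad_p.min()))
    return w / w.sum()


def run_demo(steps=200, lr=0.5):
    """Minimize the toy objective f(p) = c . p with each method (assumed setup)."""
    c = np.array([0.3, 0.1, 0.7, 0.5])        # hypothetical cost vector
    n = c.size
    z = np.zeros(n)                           # softmax logits, start at uniform
    p_pgd = np.full(n, 1.0 / n)
    p_md = np.full(n, 1.0 / n)
    for _ in range(steps):
        z = softmax_step(z, c, lr)            # gradient of c . p is c
        p_pgd = pgd_step(p_pgd, c, lr)
        p_md = mirror_step(p_md, c, lr)
    return {"softmax": softmax(z), "pgd": p_pgd, "mirror": p_md}


if __name__ == "__main__":
    for name, p in run_demo().items():
        print(name, np.round(p, 4))
```

On this toy problem the PGD iterate picks up exact zeros after a few steps, which matches the sparsity point above, while the softmax and mirror descent iterates stay strictly positive by construction. The softmax gradient carries an extra factor of `p`, which is one way to see why it slows down as the iterate approaches a vertex.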
