The Training Example Lie Bracket

8 hours ago

An ideal machine learning model's training shouldn't depend on the order of training examples, but neural nets trained with gradient descent do show order effects.
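Each training example defines a vector field on parameter space (its gradient update). To leading order in the learning rate, the Lie bracket of two such vector fields quantifies how much the resulting parameters differ when the order of the two examples is swapped.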
In experiments with a convnet on CelebA, Lie bracket magnitudes correlate tightly with gradient magnitudes, suggesting consistent non-commutativity across parameters.
Predictions for features like Black_Hair and Brown_Hair are particularly sensitive to example order, possibly due to loss function inadequacies in handling mutual exclusivity.
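
To make the Lie bracket idea concrete, here is a minimal sketch. It does not use the convnet or CelebA setup from the experiments. It assumes a toy linear model with per-example squared-error loss, plain gradient descent with learning rate eta, and the sign convention [v_a, v_b] = J_{v_b} v_a - J_{v_a} v_b with v_i = -grad_i. All function names are hypothetical. Because these losses are quadratic, the gap between the two orderings equals eta^2 times the bracket exactly; for a general network the relation only holds to leading order in eta.

```python
import numpy as np

def grad(theta, x, y):
    """Gradient of the toy per-example loss 0.5 * (x @ theta - y)**2."""
    return (x @ theta - y) * x

def hessian(x):
    """Hessian of the same loss; it is constant in theta for this toy model."""
    return np.outer(x, x)

def sgd_step(theta, x, y, eta):
    """One plain gradient-descent step on a single example."""
    return theta - eta * grad(theta, x, y)

def lie_bracket(theta, a, b):
    """[v_a, v_b] = J_{v_b} v_a - J_{v_a} v_b with v_i = -grad_i.

    Since J_{v_i} = -H_i, this simplifies to H_b g_a - H_a g_b.
    """
    (xa, ya), (xb, yb) = a, b
    return hessian(xb) @ grad(theta, xa, ya) - hessian(xa) @ grad(theta, xb, yb)

# Made-up parameters and two made-up training examples (x, y).
rng = np.random.default_rng(0)
theta = rng.normal(size=3)
a = (rng.normal(size=3), 1.0)
b = (rng.normal(size=3), -0.5)
eta = 0.1

# Apply the two updates in both orders.
theta_ab = sgd_step(sgd_step(theta, *a, eta), *b, eta)  # a first, then b
theta_ba = sgd_step(sgd_step(theta, *b, eta), *a, eta)  # b first, then a

order_gap = theta_ab - theta_ba
bracket = lie_bracket(theta, a, b)

# Holds up to floating-point error for quadratic losses.
print(np.allclose(order_gap, eta**2 * bracket))
```

In this toy model the bracket is zero only when the two inputs are orthogonal or one of the residuals is zero, so the two updates generally do not commute.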
