Created: 2022-05-30T21:37:59-05:00
This card pertains to a resource available on the internet.
Attempts to optimize "random search" methods because of their conceptual simplicity.
Successive halving: run a set of candidates constrained by some budget and throw out the worst performing halves.
Works by rationing the amount of a resource used for each candidate.
Rationing can be training epochs for neural networks, amount of RAM, amount of data used for training.
for s in S_{max}: n = ceil((B/R) (nu^s)/(s+1)) r = R * (nu ^ -s) T = get_hyperparameter_configuration(n) for i in 0..s: n_i = floor(n * (nu ^ -i)) r_i = r * (nu^i) L = {run_then_return_val_loss(t, r_i): t in T} T = top_k(T, L, floor(n_i / nu))
top_k: returns the top k winners given algorithms and their loss scores
Standard deviations of a model's score during training create the envelope.
If the envelops do not overlap on termination the models can be seen as meaningfully different.