Duplicate sentence

#8
by nohurry - opened

Thank you very much for this fantastic guide! I've been reading it multiple times, and every time I learn something new.

One thing I noticed in the Ablation - LR sweeps section, underneath the graphs, is that this sentence occurs twice, very close together:

but running sweeps for every model size gets expensive quickly, and more importantly, it doesn’t account for the planned number of training tokens as we previously stated. This is where scaling laws become invaluable.

You might want to correct this.

Once again, thank you very much for releasing this writeup.

Because it’s all AI-generated text. Obviously it’s low quality.
