Thank you for your great work.
I am trying to reproduce the memory results in Table 2 (comparison with low-rank algorithms on pre-training various sizes of LLaMA models on the C4 dataset).
In your paper, the estimated memory for LLaMA 350M is 1.22 GB. I ran the 350M experiment and got the following:
Total Params: 367.97M
GaLore enabled: 302.38M
Is there an exact equation for estimating GaLore's memory usage on different models?
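For reference, a rough back-of-the-envelope sketch (not the authors' exact accounting) of how GaLore's optimizer-state memory is usually estimated: for a weight matrix W of shape (m, n) projected to rank r, GaLore keeps a projection matrix along the smaller dimension plus Adam's two moment buffers in the low-rank space, instead of two full-size moment buffers. The function names and the choice to project along the smaller dimension are assumptions for illustration:

```python
# Hedged sketch: estimate optimizer-state entry counts for one (m, n)
# weight matrix. Not the paper's exact formula -- just the commonly cited
# accounting: projection P (min(m, n) x r) plus two low-rank Adam moments
# (r x max(m, n) each), versus plain Adam's two full-size moments.

def galore_state_numel(m: int, n: int, r: int) -> int:
    """Optimizer-state entries GaLore keeps for one (m, n) matrix (assumed)."""
    small, large = sorted((m, n))
    # projection matrix + two Adam moment buffers in the low-rank space
    return small * r + 2 * r * large

def full_adam_state_numel(m: int, n: int) -> int:
    """Plain Adam keeps two full-size moment buffers."""
    return 2 * m * n

# Example: a hypothetical 1024 x 4096 layer with rank 256
m, n, r = 1024, 4096, 256
print(galore_state_numel(m, n, r))   # 2_359_296
print(full_adam_state_numel(m, n))   # 8_388_608
```

Multiplying these counts by the bytes per element (e.g. 2 for bf16) and summing over the projected layers, plus the weights and any non-projected parameters, would give a total to compare against the paper's estimate.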
zqOuO changed the title from "Question on reproducing the estimated memory of GaLore" to "Question on the estimated memory of GaLore" on Dec 14, 2024.