Question: What is NOT used by ZeRO to boost memory efficiency for training large deep learning models?

A. It partitions model states and replicates them across data parallel processes.
B. It reduces activation memory.
C. It reduces the residual memory consumed by temporary buffers and memory fragmentation.
D. It eliminates memory redundancies and makes the full aggregate memory capacity of a cluster available.
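
The choices turn on the difference between partitioning model states and replicating them. As a rough illustration, the sketch below estimates per-device memory for model states under the accounting used in the ZeRO paper for mixed-precision Adam: 2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and K = 12 bytes for fp32 optimizer states. The function name and its arguments are hypothetical, and activations, temporary buffers, and fragmentation are deliberately left out.

```python
def zero_model_state_bytes(num_params, num_devices, stage, k=12):
    """Estimate per-device bytes for model states (illustrative only).

    stage 0: plain data parallelism, everything replicated
    stage 1: optimizer states partitioned
    stage 2: optimizer states and gradients partitioned
    stage 3: optimizer states, gradients, and parameters partitioned
    Assumes mixed-precision Adam with k = 12 bytes of optimizer state
    per parameter.
    """
    params = 2 * num_params   # fp16 parameters
    grads = 2 * num_params    # fp16 gradients
    optim = k * num_params    # fp32 weights, momentum, variance
    if stage >= 1:
        optim /= num_devices
    if stage >= 2:
        grads /= num_devices
    if stage >= 3:
        params /= num_devices
    return params + grads + optim


if __name__ == "__main__":
    # Assumed example: 7.5B parameters on 64 data-parallel devices.
    for stage in range(4):
        gb = zero_model_state_bytes(7_500_000_000, 64, stage) / 1e9
        print(f"stage {stage}: {gb:.1f} GB per device")
```

Under these assumptions, a 7.5B-parameter model on 64 devices needs roughly 120 GB per device when model states are replicated, and about 31.4 GB, 16.6 GB, and 1.9 GB as more of them are partitioned. The saving comes entirely from partitioning the states across data parallel processes; replicating them, as choice A describes, is the baseline behavior that ZeRO is designed to avoid.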
