Out of Memory Error #31

Closed
opened 2023-09-08 22:56:10 +08:00 by mshahbazi72 · 5 comments
mshahbazi72 commented 2023-09-08 22:56:10 +08:00 (Migrated from github.com)

Hi

I am trying to run the training on an A100 GPU with 40 GB of memory, but I receive an OOM error in the first stage (coarse training). Here are the commands and the config I'm using to run the two stages:

```shell
# Run stage 1
!python main.py -O \
--text "A high-resolution DSLR image of a soccer ball"  \
--sd_version 1.5 \
--image "/content/input/rgba.png" \
--workspace out/magic123-${RUN_ID}-coarse/$dataset/magic123_${FILENAME}_${RUN_ID}_coarse \
--optim adam \
--iters 5000 \
--guidance SD zero123 \
--lambda_guidance 1.0 40 \
--guidance_scale 100 5 \
--latent_iter_ratio 0 \
--normal_iter_ratio 0.2 \
--t_range 0.2 0.6 \
--bg_radius -1 \
--save_mesh

# Run stage 2
!python main.py -O \
--text "A high-resolution DSLR image of a soccer ball"  \
--sd_version 1.5 \
--image "/content/input/rgba.png" \
--workspace out/magic123-${RUN_ID}-${RUN_ID2}/$dataset/magic123_${FILENAME}_${RUN_ID}_${RUN_ID2} \
--dmtet --init_ckpt out/magic123-${RUN_ID}-coarse/$dataset/magic123_${FILENAME}_${RUN_ID}_coarse/checkpoints/magic123_${FILENAME}_${RUN_ID}_coarse.pth \
--iters 5000 \
--optim adam \
--known_view_interval 4 \
--latent_iter_ratio 0 \
--guidance SD zero123 \
--lambda_guidance 1e-3 0.01 \
--guidance_scale 100 5 \
--rm_edge \
--bg_radius -1 \
--save_mesh
```

I would appreciate your help. In the README file, it is mentioned that the code should be able to run on a V100 32G GPU, which has less memory than what I am using.
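(One thing worth ruling out before debugging further: another process holding memory on the card can trigger OOM even on a 40 GB A100. This is a generic PyTorch check, not part of the Magic123 code itself.)

```python
import torch

# Report how much GPU memory is actually free before training starts.
# On a machine without CUDA this falls through to the else branch.
if torch.cuda.is_available():
    free_b, total_b = torch.cuda.mem_get_info()
    print(f"free: {free_b / 1e9:.1f} GB / total: {total_b / 1e9:.1f} GB")
else:
    print("no CUDA device visible")
```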
guochengqian commented 2023-09-09 06:35:19 +08:00 (Migrated from github.com)

You did not change anything, right?
Yes, the code should run in less than 32 GB.
hdacnw commented 2023-10-12 14:41:32 +08:00 (Migrated from github.com)

Same here. Were you able to solve the issue? @mshahbazi72
guochengqian commented 2023-10-13 16:37:26 +08:00 (Migrated from github.com)

@mshahbazi72 @hdacnw Could you share detailed errors? I have run all examples in 32G V100, which works fine.
J3AAA commented 2023-10-22 21:57:44 +08:00 (Migrated from github.com)

> @mshahbazi72 @hdacnw Could you share detailed errors? I have run all examples in 32G V100, which works fine.

I have four GPUs, each 16GB, but I don't know how to use them. Can you help me? Thank you very much!
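(Note that the training process itself does not pool memory across cards, so four 16 GB GPUs do not add up to one 64 GB device by default. What you can do with a generic PyTorch launch, independent of Magic123's own options, is pin the job to a single GPU via `CUDA_VISIBLE_DEVICES`. A minimal sketch:)

```shell
# Make only GPU 0 visible to the process; the other three devices stay free
# for other jobs. This does NOT combine their memory.
export CUDA_VISIBLE_DEVICES=0
# The stage-1 command from above would then be launched unchanged, e.g.:
#   python main.py -O --image /content/input/rgba.png ... (rest of the flags as before)
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```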
guochengqian commented 2024-04-23 04:36:49 +08:00 (Migrated from github.com)

We run on GPUs with at least 32 GB of memory. For your setup, you can use mixed precision to reduce memory usage.
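(For reference, the standard PyTorch automatic mixed precision pattern looks like the sketch below. This is the generic `torch.cuda.amp` API, not Magic123's own training loop; the tiny linear model and data are placeholders for illustration only.)

```python
import torch
from torch.cuda.amp import GradScaler, autocast

# Placeholder model and data; in practice this would be the real training step.
model = torch.nn.Linear(8, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

use_amp = torch.cuda.is_available()  # autocast/GradScaler become no-ops on CPU
scaler = GradScaler(enabled=use_amp)

x, y = torch.randn(4, 8), torch.randn(4, 1)
for _ in range(2):
    opt.zero_grad()
    with autocast(enabled=use_amp):
        # Forward pass runs in fp16 where safe, halving activation memory.
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()  # scale loss to avoid fp16 gradient underflow
    scaler.step(opt)
    scaler.update()
```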