Issues: hpcaitech/ColossalAI
Issues list
[FEATURE]: Is it Possible to integrate Liger-Kernel?
enhancement
New feature or request
#6047
opened Sep 6, 2024 by
ericxsun
[BUG]: remove .github/workflows/submodule.yml
bug
Something isn't working
#6039
opened Aug 28, 2024 by
BoxiangW
1 task done
[FEATURE]: Support Zerobubble pipeline
enhancement
New feature or request
#6037
opened Aug 28, 2024 by
duanjunwen
[BUG]: error Colossalai 0.4.0/0.4.2 /usr/bin/supervisord
bug
Something isn't working
#6032
opened Aug 23, 2024 by
Storm0921
1 task done
[BUG]: AttributeError: 'GeminiDDP' object has no attribute 'module'
bug
Something isn't working
#6021
opened Aug 20, 2024 by
dheerj188
1 task done
[BUG]: Torch compile causes multi-process to hang with python 3.9
bug
Something isn't working
#5987
opened Aug 10, 2024 by
Edenzzzz
1 task done
[FEATURE]: How to skip a custom node from generating strategies in colossal-auto?
enhancement
New feature or request
#5983
opened Aug 8, 2024 by
robotsp
[BUG]: Pytest with a specific config failed after PR #5868
bug
Something isn't working
shardformer
#5949
opened Jul 29, 2024 by
GuangyaoZhang
1 task done
[FEATURE]: Request updates for pretraining roberta
enhancement
New feature or request
#5948
opened Jul 29, 2024 by
jiahuanluo
[BUG]: pip install . error: identifier "__hsub" is undefined
bug
Something isn't working
#5929
opened Jul 19, 2024 by
jtmer
1 task done
[BUG]: Shardformer FP8 communication training accuracy degradation
bug
Something isn't working
#5920
opened Jul 18, 2024 by
GuangyaoZhang
1 task done
[BUG]: Low_Level_Zero plugin crashes with LoRA
bug
Something isn't working
#5909
opened Jul 15, 2024 by
Fallqs
1 task done
[PROPOSAL]: Does the LowLevelZero Plugin Support Lora, This Code Is Confusing
enhancement
New feature or request
#5908
opened Jul 15, 2024 by
YeAnbang
1 task
[BUG]: run opt inference but failed with No module named 'energonai'
bug
Something isn't working
#5906
opened Jul 13, 2024 by
munger1985
1 task done
Whether to support the training acceleration of the StableDiffusion3 algorithm model?
enhancement
New feature or request
#5900
opened Jul 10, 2024 by
tensorflowt
[BUG]: Colossal AI failed to load ChatGLM2
bug
Something isn't working
#5861
opened Jun 26, 2024 by
hiprince
1 task done
We used the Gemini plugin and LowLevelZero to run llama2_7b. In the Gemini plugin, we set the policy to static and set shard_param_frac, offload_optim_frac, and offload_param_frac to 0.0, making Gemini equivalent to ZeRO-2; in LowLevelZero, we set stage to 2. Training with bf16 and comparing the two plugins, we found that Gemini uses more GPU memory than LowLevelZero. Why is this? In principle, Gemini should save more GPU memory.
#5830
opened Jun 18, 2024 by
JJGSBGQ
[FEATURE]: LoRA with sharded model
enhancement
New feature or request
#5826
opened Jun 17, 2024 by
KaiLv69
[BUG]: Shardformer failure with torch 2.3
bug
Something isn't working
#5757
opened May 27, 2024 by
Edenzzzz
1 task done
[BUG]: docker build cuda extension error
bug
Something isn't working
#5732
opened May 20, 2024 by
apachemycat
1 task done
[BUG]: TypeError: LlamaInferenceForwards.llama_causal_lm_forward() got an unexpected keyword argument 'shard_config'
bug
Something isn't working
#5729
opened May 17, 2024 by
hiprince
1 task done
[BUG]: No module named 'dropout_layer_norm'
bug
Something isn't working
#5726
opened May 17, 2024 by
apachemycat
1 task done