This issue acts as a tracker for Intel customer-support-related PRs. The purpose is to understand what each PR does and how important it is relative to the other customer-support PRs. It also helps us stay aware of merged PRs and of each PR's progress.
Under review
o Model Pretrain Enable
- Support Llama2Tokenizer #375
o Finetune
- [Finetune] enable converting checkpoints without optimizer state generation #424
o Ulysses
- ds-sequence-parallel(ulysses) for rope. #392
- support split qkv linear and sp overlap comm #415
o Others
- collect grad_norm for non pipeline path #370
Merged
o Finetune
- Supervised Fine-tuning for HugginFace pretrained weight. #318
- [LLaMa] Adding support converting checkpoint from mds to hf #432
o Others
- fix reshape for split qga #307
- add RMSnorm torch fallback path #312
- [Wandb] Refine wandb logging function #416
- [Bug] Fix crash when logging optimizer state to tensorboard #417
- [wandb] disable wandb more gracefully #422