Hi,
In stage2_sft.py (line 292), all targets are set to -100 (IGNORE_TOKEN_ID) because cur_len isn't updated to match expected_len, so the mismatch check below is triggered. This looks like a bug. Could you please help verify it? Thanks!
    if cur_len < tokenizer.model_max_length:
        if cur_len != expected_len:
            for k in range(total_len):
                target[k] = IGNORE_TOKEN_ID
            rank0_print(
                f"WARNING: tokenization mismatch: {cur_len} vs. {total_len}."
                f" #turn = {len(turns) - 1}. (ignored)"
            )
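
To illustrate the effect I'm describing, here is a minimal standalone sketch. It is not the code from stage2_sft.py: the helper name `mask_on_mismatch`, the toy token ids, and the value 2048 for the max length are made up for illustration, and I'm assuming IGNORE_TOKEN_ID is -100. It only shows that once cur_len and expected_len disagree, every target position is masked, so the sample would contribute nothing to the loss.

```python
IGNORE_TOKEN_ID = -100  # assumed value, matching the -100 mentioned above


def mask_on_mismatch(target, cur_len, expected_len, model_max_length):
    """Hypothetical helper mimicking the check: mask all targets on a length mismatch."""
    masked = list(target)  # copy so the caller's list is left untouched
    if cur_len < model_max_length and cur_len != expected_len:
        for k in range(len(masked)):
            masked[k] = IGNORE_TOKEN_ID
    return masked


target = [11, 12, 13, 14, 15]  # toy token ids

# cur_len left stale (3) while expected_len is 5: everything gets masked.
stale = mask_on_mismatch(target, cur_len=3, expected_len=5, model_max_length=2048)

# cur_len in sync with expected_len: targets are kept.
synced = mask_on_mismatch(target, cur_len=5, expected_len=5, model_max_length=2048)

print(stale)
print(synced)
```

If this matches what happens in the real script, the fix would presumably be to keep cur_len in sync with the value it is compared against before the check runs, but I haven't confirmed that against the full file.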