finetune_lora on gemma bug #2020

Open
@qqandy0120

Bug description

I am trying to use finetune_lora to do PEFT on Gemma models. I have tried:

  • litgpt0.5.8.dev1: gemma-3-12b-it, gemma-3-27b-it
  • litgpt0.5.7: gemma-2-27b-it

Both setups raise an IndexError. I have also tried other model families such as QwQ and Llama, and they all work fine.
Other people seem to have hit a similar bug (but on gemma-7b); I am not sure whether it is the same problem.
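
A possible reading of the two numbers in the error message in the log below (6272 indices vs. a source width of 6144), written as plain arithmetic. This is only a sketch under assumptions that are not confirmed anywhere in this report: the gemma-3-12b dimensions (`n_embd=3840`, `n_head=16`, `n_query_groups=8`, `head_size=256`), LoRA enabled on query and value only, a fused qkv layout of q, then k, then v, and the hypothetical idea that the zero-pad indices are derived from `n_embd` instead of `n_head * head_size`. The function names are made up for illustration and are not LitGPT APIs.

```python
# Assumed gemma-3-12b attention dimensions (NOT taken from this report).
N_EMBD = 3840          # hidden size, input width of the fused qkv projection
N_HEAD = 16
N_QUERY_GROUPS = 8
HEAD_SIZE = 256        # note: not equal to N_EMBD // N_HEAD (= 240)

Q_SIZE = N_HEAD * HEAD_SIZE               # 4096
KV_SIZE = N_QUERY_GROUPS * HEAD_SIZE      # 2048
OUT_FEATURES = Q_SIZE + 2 * KV_SIZE       # 8192, assumed q|k|v layout

# Assumed LoRA targets: query and value, not key.
ENABLE_Q, ENABLE_K, ENABLE_V = True, False, True


def lora_source_width() -> int:
    """Width of the LoRA update: sum of the enabled projection sizes."""
    return ENABLE_Q * Q_SIZE + ENABLE_K * KV_SIZE + ENABLE_V * KV_SIZE


def index_count_from_n_embd() -> int:
    """Hypothetical index count if the q block is taken to be N_EMBD wide."""
    kv_embd = N_EMBD // (N_HEAD // N_QUERY_GROUPS)  # 1920
    ind = []
    if ENABLE_Q:
        ind.extend(range(0, N_EMBD))
    if ENABLE_K:
        ind.extend(range(N_EMBD, N_EMBD + kv_embd))
    if ENABLE_V:
        ind.extend(range(N_EMBD + kv_embd, OUT_FEATURES))
    return len(ind)


def index_count_from_head_size() -> int:
    """Index count if the block boundaries come from N_HEAD * HEAD_SIZE."""
    ind = []
    if ENABLE_Q:
        ind.extend(range(0, Q_SIZE))
    if ENABLE_K:
        ind.extend(range(Q_SIZE, Q_SIZE + KV_SIZE))
    if ENABLE_V:
        ind.extend(range(Q_SIZE + KV_SIZE, OUT_FEATURES))
    return len(ind)


print(lora_source_width())            # width of the tensor being copied
print(index_count_from_n_embd())      # index count under the n_embd assumption
print(index_count_from_head_size())   # index count under the head_size assumption
```

Under these assumptions the `n_embd`-based count matches the 6272 in the error and the `head_size`-based count matches the 6144 source width, which would also fit with models where `n_head * head_size == n_embd` not hitting the error. I have not checked this against `litgpt/lora.py`, so treat it as a hypothesis to verify rather than a diagnosis.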

What operating system are you using?

Linux

LitGPT Version

litgpt0.5.7 & litgpt0.5.8.dev1


Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/4
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/4
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/4
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 4 processes
----------------------------------------------------------------------------------------------------

[rank: 0] Seed set to 1337
[rank: 1] Seed set to 1337
[rank: 2] Seed set to 1337
[rank: 3] Seed set to 1337
Number of trainable parameters: 10,616,832
Number of non-trainable parameters: 12,772,421,376
The longest sequence length in the train data is 460, the model's maximum sequence length is 460 and context length is 131072
Verifying settings ...
[rank1]: Traceback (most recent call last):
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank1]:     sys.exit(main())
[rank1]:              ^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank1]:     CLI(parser_data)
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank1]:     return auto_cli(*args, _stacklevel=3, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank1]:     return _run_component(component, init.get(subcommand))
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank1]:     return component(**cfg)
[rank1]:            ^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank1]:     fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank1]:     return self._wrap_and_launch(function, self, *args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank1]:     return to_run(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank1]:     return to_run(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank1]:     token_counts = fit(
[rank1]:                    ^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank1]:     validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False)  # sanity check
[rank1]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank1]:     return func(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank1]:     logits = model(input_ids)
[rank1]:              ^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank1]:     output = self._forward_module(*args, **kwargs)
[rank1]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank1]:     output = self._fsdp_wrapped_module(*args, **kwargs)
[rank1]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank1]:     x = block(
[rank1]:         ^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank1]:     return self.checkpoint_fn(  # type: ignore[misc]
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank1]:     return disable_fn(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank1]:     return fn(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank1]:     ret = function(*args, **kwargs)
[rank1]:           ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank1]:     attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank1]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank1]:     qkv = self.qkv(x)  # (B, T, 3xC*)
[rank1]:           ^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]:     return self._call_impl(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]:     return forward_call(*args, **kwargs)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank1]:     lora = self.zero_pad(after_B) * self.scaling  # (64, 64, 256) after zero_pad (64, 64, 384)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^
[rank1]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank1]:     return result.index_copy_(dim=-1, index=self.lora_ind, source=x)  # (64, 64, 384)
[rank1]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank0]:     sys.exit(main())
[rank0]:              ^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank0]:     CLI(parser_data)
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank0]:     return auto_cli(*args, _stacklevel=3, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank0]:     return _run_component(component, init.get(subcommand))
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank0]:     return component(**cfg)
[rank0]:            ^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank0]:     fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank0]:     return self._wrap_and_launch(function, self, *args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 922, in _wrap_and_launch
[rank0]:     return launcher.launch(to_run, *args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/strategies/launchers/subprocess_script.py", line 108, in launch
[rank0]:     return function(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank0]:     return to_run(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank0]:     token_counts = fit(
[rank0]:                    ^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank0]:     validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False)  # sanity check
[rank0]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]:     return func(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank0]:     logits = model(input_ids)
[rank0]:              ^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank0]:     output = self._forward_module(*args, **kwargs)
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank0]:     output = self._fsdp_wrapped_module(*args, **kwargs)
[rank0]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank0]:     x = block(
[rank0]:         ^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank0]:     return self.checkpoint_fn(  # type: ignore[misc]
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank0]:     return disable_fn(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank0]:     return fn(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank0]:     ret = function(*args, **kwargs)
[rank0]:           ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank0]:     attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank0]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank0]:     qkv = self.qkv(x)  # (B, T, 3xC*)
[rank0]:           ^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]:     return self._call_impl(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]:     return forward_call(*args, **kwargs)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank0]:     lora = self.zero_pad(after_B) * self.scaling  # (64, 64, 256) after zero_pad (64, 64, 384)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^
[rank0]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank0]:     return result.index_copy_(dim=-1, index=self.lora_ind, source=x)  # (64, 64, 384)
[rank0]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank3]: Traceback (most recent call last):
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank3]:     sys.exit(main())
[rank3]:              ^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank3]:     CLI(parser_data)
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank3]:     return auto_cli(*args, _stacklevel=3, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank3]:     return _run_component(component, init.get(subcommand))
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank3]:     return component(**cfg)
[rank3]:            ^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank3]:     fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank3]:     return self._wrap_and_launch(function, self, *args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank3]:     return to_run(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank3]:     return to_run(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank3]:     token_counts = fit(
[rank3]:                    ^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank3]:     validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False)  # sanity check
[rank3]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank3]:     return func(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank3]:     logits = model(input_ids)
[rank3]:              ^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank3]:     output = self._forward_module(*args, **kwargs)
[rank3]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank3]:     output = self._fsdp_wrapped_module(*args, **kwargs)
[rank3]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank3]:     x = block(
[rank3]:         ^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank3]:     return self.checkpoint_fn(  # type: ignore[misc]
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank3]:     return disable_fn(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank3]:     return fn(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank3]:     ret = function(*args, **kwargs)
[rank3]:           ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank3]:     attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank3]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank3]:     qkv = self.qkv(x)  # (B, T, 3xC*)
[rank3]:           ^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]:     return self._call_impl(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]:     return forward_call(*args, **kwargs)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank3]:     lora = self.zero_pad(after_B) * self.scaling  # (64, 64, 256) after zero_pad (64, 64, 384)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^
[rank3]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank3]:     return result.index_copy_(dim=-1, index=self.lora_ind, source=x)  # (64, 64, 384)
[rank3]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank2]: Traceback (most recent call last):
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank2]:     sys.exit(main())
[rank2]:              ^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank2]:     CLI(parser_data)
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank2]:     return auto_cli(*args, _stacklevel=3, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank2]:     return _run_component(component, init.get(subcommand))
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank2]:     return component(**cfg)
[rank2]:            ^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank2]:     fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank2]:     return self._wrap_and_launch(function, self, *args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank2]:     return to_run(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank2]:     return to_run(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank2]:     token_counts = fit(
[rank2]:                    ^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank2]:     validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False)  # sanity check
[rank2]:     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank2]:     return func(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank2]:     logits = model(input_ids)
[rank2]:              ^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank2]:     output = self._forward_module(*args, **kwargs)
[rank2]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank2]:     output = self._fsdp_wrapped_module(*args, **kwargs)
[rank2]:              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank2]:     x = block(
[rank2]:         ^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank2]:     return self.checkpoint_fn(  # type: ignore[misc]
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank2]:     return disable_fn(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank2]:     return fn(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank2]:     ret = function(*args, **kwargs)
[rank2]:           ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank2]:     attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank2]:                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank2]:     qkv = self.qkv(x)  # (B, T, 3xC*)
[rank2]:           ^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]:     return self._call_impl(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]:     return forward_call(*args, **kwargs)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank2]:     lora = self.zero_pad(after_B) * self.scaling  # (64, 64, 256) after zero_pad (64, 64, 384)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^
[rank2]:   File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank2]:     return result.index_copy_(dim=-1, index=self.lora_ind, source=x)  # (64, 64, 384)
[rank2]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
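
For anyone reading the final line of the traceback: `zero_pad` scatters the LoRA output into a wider zero tensor at the positions listed in `self.lora_ind`, and `index_copy_` requires the index to have exactly as many entries as the source has along that dimension. Below is a minimal pure-Python sketch of that check. It is not LitGPT or PyTorch code; the helper name `index_copy_last_dim` and the toy sizes are hypothetical, and only the two numbers 6272 and 6144 come from the error message.

```python
def index_copy_last_dim(result, index, source):
    """Pure-Python stand-in for the length check index_copy_ applies along one dimension.

    result: list to write into (plays the role of the zero-padded row)
    index:  target positions (plays the role of lora_ind)
    source: values to scatter (plays the role of one row of the LoRA output)
    """
    if len(index) != len(source):
        raise IndexError(
            f"Number of indices ({len(index)}) should be equal to "
            f"source size ({len(source)})"
        )
    for position, value in zip(index, source):
        result[position] = value
    return result


# Toy case: 6 output slots, LoRA enabled only for slots 0-1 and 4-5.
padded = index_copy_last_dim([0] * 6, [0, 1, 4, 5], [10, 20, 30, 40])
print(padded)  # [10, 20, 0, 0, 30, 40]

# Numbers reported in the traceback: the index has 128 more entries
# than the LoRA output is wide, so the copy is rejected.
num_indices, source_width = 6272, 6144
print(num_indices - source_width)  # 128
```

The sketch only shows which two sizes disagree. It does not establish why `lora_ind` ends up longer than the LoRA output for this model.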


Labels: bug, help wanted
