Bug description
I am trying to use finetune_lora to do PEFT on Gemma models, and I have tried:
- litgpt0.5.8.dev1: gemma-3-12b-it, gemma-3-27b-it
- litgpt0.5.7: gemma-2-27b-it
All of them encounter an IndexError. I have also tried models from other series, such as QwQ and Llama, and they all work fine.
It seems some people have hit a similar bug (but on gemma-7b); I am not sure whether it is the same problem.
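For reference, the traceback ends in `zero_pad` calling `index_copy_` with 6272 indices against a source of width 6144. Below is a minimal sketch of that kind of failure, assuming `zero_pad` does what the traceback lines suggest (allocate a zero tensor, then `index_copy_` into it along the last dimension). `zero_pad_sketch` and the toy shapes are hypothetical and are not litgpt code; the sketch does not explain why the two sizes differ for Gemma.

```python
import torch


def zero_pad_sketch(x: torch.Tensor, lora_ind: torch.Tensor, out_features: int) -> torch.Tensor:
    """Scatter the last dimension of `x` into a zero tensor of width `out_features`.

    Hypothetical stand-in for the call shown in the traceback, not litgpt's code.
    """
    result = x.new_zeros(*x.shape[:-1], out_features)
    # index_copy_ requires len(lora_ind) == x.size(-1)
    return result.index_copy_(-1, lora_ind, x)


x = torch.ones(2, 3, 4)  # toy stand-in for the LoRA output (width 4)

# Matching case: 4 indices for a source of width 4.
ok_ind = torch.tensor([0, 1, 4, 5])
padded = zero_pad_sketch(x, ok_ind, out_features=6)

# Mismatched case: 5 indices for a source of width 4, analogous to 6272 vs 6144.
bad_ind = torch.tensor([0, 1, 2, 4, 5])
try:
    zero_pad_sketch(x, bad_ind, out_features=6)
    error_message = None
except IndexError as err:
    error_message = str(err)

print(padded.shape)
print(error_message)
```

The second call fails with the same kind of message as the log, which suggests the precomputed index list and the LoRA output width disagree for these Gemma configurations.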
What operating system are you using?
Linux
LitGPT Version
litgpt0.5.7 & litgpt0.5.8.dev1
Initializing distributed: GLOBAL_RANK: 1, MEMBER: 2/4
Initializing distributed: GLOBAL_RANK: 3, MEMBER: 4/4
Initializing distributed: GLOBAL_RANK: 2, MEMBER: 3/4
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 4 processes
----------------------------------------------------------------------------------------------------
[rank: 0] Seed set to 1337
[rank: 1] Seed set to 1337
[rank: 2] Seed set to 1337
[rank: 3] Seed set to 1337
Number of trainable parameters: 10,616,832
Number of non-trainable parameters: 12,772,421,376
The longest sequence length in the train data is 460, the model's maximum sequence length is 460 and context length is 131072
Verifying settings ...
[rank1]: Traceback (most recent call last):
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank1]: sys.exit(main())
[rank1]: ^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank1]: CLI(parser_data)
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank1]: return auto_cli(*args, _stacklevel=3, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank1]: return _run_component(component, init.get(subcommand))
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank1]: return component(**cfg)
[rank1]: ^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank1]: fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank1]: return self._wrap_and_launch(function, self, *args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank1]: return to_run(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank1]: return to_run(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank1]: token_counts = fit(
[rank1]: ^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank1]: validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False) # sanity check
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank1]: return func(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank1]: logits = model(input_ids)
[rank1]: ^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank1]: output = self._forward_module(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank1]: output = self._fsdp_wrapped_module(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank1]: x = block(
[rank1]: ^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank1]: return self.checkpoint_fn( # type: ignore[misc]
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank1]: return disable_fn(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank1]: return fn(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank1]: ret = function(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank1]: attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank1]: qkv = self.qkv(x) # (B, T, 3xC*)
[rank1]: ^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank1]: lora = self.zero_pad(after_B) * self.scaling # (64, 64, 256) after zero_pad (64, 64, 384)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank1]: return result.index_copy_(dim=-1, index=self.lora_ind, source=x) # (64, 64, 384)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank0]: sys.exit(main())
[rank0]: ^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank0]: CLI(parser_data)
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank0]: return auto_cli(*args, _stacklevel=3, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank0]: return _run_component(component, init.get(subcommand))
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank0]: return component(**cfg)
[rank0]: ^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank0]: fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank0]: return self._wrap_and_launch(function, self, *args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 922, in _wrap_and_launch
[rank0]: return launcher.launch(to_run, *args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/strategies/launchers/subprocess_script.py", line 108, in launch
[rank0]: return function(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank0]: return to_run(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank0]: token_counts = fit(
[rank0]: ^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank0]: validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False) # sanity check
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank0]: return func(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank0]: logits = model(input_ids)
[rank0]: ^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank0]: output = self._forward_module(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank0]: output = self._fsdp_wrapped_module(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank0]: x = block(
[rank0]: ^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank0]: return self.checkpoint_fn( # type: ignore[misc]
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank0]: return disable_fn(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank0]: return fn(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank0]: ret = function(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank0]: attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank0]: qkv = self.qkv(x) # (B, T, 3xC*)
[rank0]: ^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank0]: lora = self.zero_pad(after_B) * self.scaling # (64, 64, 256) after zero_pad (64, 64, 384)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank0]: return result.index_copy_(dim=-1, index=self.lora_ind, source=x) # (64, 64, 384)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank3]: Traceback (most recent call last):
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank3]: sys.exit(main())
[rank3]: ^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank3]: CLI(parser_data)
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank3]: return auto_cli(*args, _stacklevel=3, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank3]: return _run_component(component, init.get(subcommand))
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank3]: return component(**cfg)
[rank3]: ^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank3]: fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank3]: return self._wrap_and_launch(function, self, *args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank3]: return to_run(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank3]: return to_run(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank3]: token_counts = fit(
[rank3]: ^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank3]: validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False) # sanity check
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank3]: return func(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank3]: logits = model(input_ids)
[rank3]: ^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank3]: output = self._forward_module(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank3]: output = self._fsdp_wrapped_module(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank3]: x = block(
[rank3]: ^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank3]: return self.checkpoint_fn( # type: ignore[misc]
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank3]: return disable_fn(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank3]: return fn(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank3]: ret = function(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank3]: attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank3]: qkv = self.qkv(x) # (B, T, 3xC*)
[rank3]: ^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank3]: lora = self.zero_pad(after_B) * self.scaling # (64, 64, 256) after zero_pad (64, 64, 384)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank3]: return result.index_copy_(dim=-1, index=self.lora_ind, source=x) # (64, 64, 384)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
[rank2]: Traceback (most recent call last):
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/bin/litgpt", line 8, in <module>
[rank2]: sys.exit(main())
[rank2]: ^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/__main__.py", line 69, in main
[rank2]: CLI(parser_data)
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 23, in CLI
[rank2]: return auto_cli(*args, _stacklevel=3, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 125, in auto_cli
[rank2]: return _run_component(component, init.get(subcommand))
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/jsonargparse/_cli.py", line 210, in _run_component
[rank2]: return component(**cfg)
[rank2]: ^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 170, in setup
[rank2]: fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval, optimizer, num_nodes)
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 837, in launch
[rank2]: return self._wrap_and_launch(function, self, *args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 923, in _wrap_and_launch
[rank2]: return to_run(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/fabric.py", line 928, in _wrap_with_setup
[rank2]: return to_run(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 229, in main
[rank2]: token_counts = fit(
[rank2]: ^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 298, in fit
[rank2]: validate(fabric, model, val_dataloader, dataclasses.replace(eval, max_iters=2), verbose=False) # sanity check
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[rank2]: return func(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/finetune/lora.py", line 426, in validate
[rank2]: logits = model(input_ids)
[rank2]: ^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/lightning/fabric/wrappers.py", line 136, in forward
[rank2]: output = self._forward_module(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/fsdp/fully_sharded_data_parallel.py", line 864, in forward
[rank2]: output = self._fsdp_wrapped_module(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 159, in forward
[rank2]: x = block(
[rank2]: ^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/distributed/algorithms/_checkpoint/checkpoint_wrapper.py", line 170, in forward
[rank2]: return self.checkpoint_fn( # type: ignore[misc]
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_compile.py", line 32, in inner
[rank2]: return disable_fn(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py", line 745, in _fn
[rank2]: return fn(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/utils/checkpoint.py", line 496, in checkpoint
[rank2]: ret = function(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 318, in forward
[rank2]: attention_output = self.attn(x_normed, cos, sin, mask, input_pos, input_pos_maxp1)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/model.py", line 381, in forward
[rank2]: qkv = self.qkv(x) # (B, T, 3xC*)
[rank2]: ^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 409, in forward
[rank2]: lora = self.zero_pad(after_B) * self.scaling # (64, 64, 256) after zero_pad (64, 64, 384)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/demo/miniconda3/envs/dobby_dev/lib/python3.11/site-packages/litgpt/lora.py", line 322, in zero_pad
[rank2]: return result.index_copy_(dim=-1, index=self.lora_ind, source=x) # (64, 64, 384)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: IndexError: index_copy_(): Number of indices (6272) should be equal to source.size(dim) (6144)
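
For anyone reading the traceback: the last frame says that `self.lora_ind` holds 6272 indices, while the last dimension of the tensor passed to `zero_pad` has only 6144 entries. `index_copy_` requires these two numbers to match. Below is a minimal sketch of that rule in plain Python, so it runs without PyTorch. It is not litgpt's code: the helper names `index_copy_last_dim` and `zero_pad_row` are hypothetical, the output width 8192 is an arbitrary value chosen for the illustration, and only the two sizes 6272 and 6144 come from the log above.

```python
def index_copy_last_dim(result, index, source):
    """Plain-Python stand-in for result.index_copy_(dim=-1, index=index, source=source)
    applied to a single row. Hypothetical helper, not part of litgpt or PyTorch."""
    # The rule behind the error in the log: one index per source element.
    if len(index) != len(source):
        raise IndexError(
            f"index_copy_(): Number of indices ({len(index)}) "
            f"should be equal to source.size(dim) ({len(source)})"
        )
    # Copy source[i] into result[index[i]].
    for src_pos, dst_pos in enumerate(index):
        result[dst_pos] = source[src_pos]
    return result


def zero_pad_row(source, index, out_features):
    """Scatter `source` into a zero-filled row of width `out_features`."""
    return index_copy_last_dim([0.0] * out_features, index, source)


# Matching sizes: 3 indices for 3 source values, so the copy succeeds.
padded = zero_pad_row([1.0, 2.0, 3.0], [0, 1, 5], out_features=6)

# Mismatching sizes taken from the log: 6272 indices vs. 6144 source values.
# out_features=8192 is an assumption made only for this illustration.
try:
    zero_pad_row([0.0] * 6144, list(range(6272)), out_features=8192)
except IndexError as err:
    message = str(err)
    print(message)
```

The sketch only shows why the call fails. It does not explain why the index buffer and the LoRA output end up with different sizes for the gemma configs.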