Skip to content

full sft设置了val_dataset后,在eval时报错 #4159

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Dr-Corgi opened this issue May 10, 2025 · 3 comments
Open

full sft设置了val_dataset后,在eval时报错 #4159

Dr-Corgi opened this issue May 10, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@Dr-Corgi
Copy link

Describe the bug
full sft设置了val_dataset后,在eval时报错。不指定val_dataset则没有这个问题。

2025-05-10 13:53:19 [rank4]: Traceback (most recent call last):
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/cli/sft.py", line 7, in <module>
2025-05-10 13:53:19 [rank4]:     sft_main()
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/llm/train/sft.py", line 284, in sft_main
2025-05-10 13:53:19 [rank4]:     return SwiftSft(args).main()
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/llm/base.py", line 47, in main
2025-05-10 13:53:19 [rank4]:     result = self.run()
2025-05-10 13:53:19 [rank4]:              ^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/llm/train/sft.py", line 150, in run
2025-05-10 13:53:19 [rank4]:     return self.train(trainer)
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/llm/train/sft.py", line 210, in train
2025-05-10 13:53:19 [rank4]:     trainer.train(trainer.args.resume_from_checkpoint)
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/trainers/mixin.py", line 318, in train
2025-05-10 13:53:19 [rank4]:     res = super().train(*args, **kwargs)
2025-05-10 13:53:19 [rank4]:           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 2245, in train
2025-05-10 13:53:19 [rank4]:     return inner_training_loop(
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 2627, in _inner_training_loop
2025-05-10 13:53:19 [rank4]:     self._maybe_log_save_evaluate(
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/trainers/mixin.py", line 376, in _maybe_log_save_evaluate
2025-05-10 13:53:19 [rank4]:     super()._maybe_log_save_evaluate(tr_loss, *args, **kwargs)
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 3096, in _maybe_log_save_evaluate
2025-05-10 13:53:19 [rank4]:     metrics = self._evaluate(trial, ignore_keys_for_eval)
2025-05-10 13:53:19 [rank4]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 3045, in _evaluate
2025-05-10 13:53:19 [rank4]:     metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
2025-05-10 13:53:19 [rank4]:               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/trainers/trainers.py", line 170, in evaluate
2025-05-10 13:53:19 [rank4]:     res = super().evaluate(*args, **kwargs)
2025-05-10 13:53:19 [rank4]:           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer_seq2seq.py", line 197, in evaluate
2025-05-10 13:53:19 [rank4]:     return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 4154, in evaluate
2025-05-10 13:53:19 [rank4]:     output = eval_loop(
2025-05-10 13:53:19 [rank4]:              ^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/transformers/trainer.py", line 4338, in evaluation_loop
2025-05-10 13:53:19 [rank4]:     for step, inputs in enumerate(dataloader):
2025-05-10 13:53:19 [rank4]:                         ^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/accelerate/data_loader.py", line 566, in __iter__
2025-05-10 13:53:19 [rank4]:     current_batch = next(dataloader_iter)
2025-05-10 13:53:19 [rank4]:                     ^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/utils/data/dataloader.py", line 733, in __next__
2025-05-10 13:53:19 [rank4]:     data = self._next_data()
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/utils/data/dataloader.py", line 1515, in _next_data
2025-05-10 13:53:19 [rank4]:     return self._process_data(data, worker_id)
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/utils/data/dataloader.py", line 1550, in _process_data
2025-05-10 13:53:19 [rank4]:     data.reraise()
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/_utils.py", line 750, in reraise
2025-05-10 13:53:19 [rank4]:     raise exception
2025-05-10 13:53:19 [rank4]: AssertionError: Caught AssertionError in DataLoader worker process 0.
2025-05-10 13:53:19 [rank4]: Original Traceback (most recent call last):
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/utils/data/_utils/worker.py", line 349, in _worker_loop
2025-05-10 13:53:19 [rank4]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/torch/utils/data/_utils/fetch.py", line 33, in fetch
2025-05-10 13:53:19 [rank4]:     data.append(next(self.dataset_iter))
2025-05-10 13:53:19 [rank4]:                 ^^^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/accelerate/data_loader.py", line 344, in __iter__
2025-05-10 13:53:19 [rank4]:     for element in self.dataset:
2025-05-10 13:53:19 [rank4]:                    ^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]:   File "/usr/local/lib/python3.12/dist-packages/swift/llm/dataset/utils.py", line 283, in __iter__
2025-05-10 13:53:19 [rank4]:     assert len(self.workers) > 0, f'self.workers: {self.workers}'
2025-05-10 13:53:19 [rank4]:            ^^^^^^^^^^^^^^^^^^^^^
2025-05-10 13:53:19 [rank4]: AssertionError: self.workers: []
@Dr-Corgi
Copy link
Author

补充:数据集比较大,所以使用了streaming和packing参数,不指定val_dataset的话好像就不做eval了

@Jintao-Huang
Copy link
Collaborator

assert len(self.workers) > 0, f'self.workers: {self.workers}'

如何复现这个呢 有shell吗

@Dr-Corgi
Copy link
Author

assert len(self.workers) > 0, f'self.workers: {self.workers}'

如何复现这个呢 有shell吗

@Jintao-Huang 您好,复现的shell如下,会在step=50时报错:

NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
swift sft \
    --model /workspace/archived_base_models/Qwen2.5-1.5B \
    --train_type full \
    --template default \
    --dataset './data/all_in_one-train.json' \
    --val_dataset './data/all_in_one-test.json' \
    --torch_dtype bfloat16 \
    --num_train_epochs 2 \
    --per_device_train_batch_size 16 \
    --per_device_eval_batch_size 8 \
    --learning_rate 1e-5 \
    --gradient_accumulation_steps 2 \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 8192 \
    --output_dir output \
    --system "You are a helpful assistant." \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 8 \
    --dataset_num_proc 8 \
    --save_total_limit 2 \
    --deepspeed zero3 \
    --use_liger_kernel true \
    --report_to wandb \
    --run_name find-bugs \
    --attn_impl flash_attn \
    --packing true \
    --streaming true \
    --max_steps 5000

(应该跟数据集没关系,可以使用其他数据集复现,但还是列一下)
数据集是我处理后的thinking数据集,格式类似如下:

[
    {
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Write a python function that determines if a given string 'str1' is a scrambled string of another string 'str2'. A string is considered a scrambled string if it can be obtained by applying a series of rearrangements on 'str2'. Each rearrangement involves dividing the string into two non-empty substrings, swapping their positions, and repeating this process arbitrarily many times on any of the substrings."
            },
            {
                "role": "assistant",
                "content": "<think>\nOkay, I need to write a Python function to determine if str1 is a scrambled version of str2 based on the given definition of scrambling. Let's think through what this means.\n\nFirst, the problem states that a scrambled string is formed by repeatedly swapping two non-empty substrings. Each swap is a rearrangement where you split a string into two parts and swap them. This can be done at any level, not just the top level. So for example, if the original string is \"great\", one possible rearrangement is splitting into \"gr\" and \"eat\", swapping to get \"eatgr\". Then, maybe split \"eat\" into \"e\" and \"at\", swap to get \"ate\", so the string becomes \"atgr\" + \"e\" → \"atg re\"? Wait, maybe not. The process can be recursive. \n\nThe key point is that scrambling can happen at any step, recursively. Wait, but the problem here might be different. Wait, the definition says that each rearrangement involves dividing the string into two non-empty substrings, swapping them, and repeating this on any of the substrings. So each step is a split into two parts, swap, and then you can apply the same operation to either of the resulting substrings. So this is similar to building a binary tree of splits and swaps. \n\nWait, but the end result is a string that can be formed by any number of such operations. So how can we model this? For example, the scrambled string must have the same character counts as the original. Because swapping parts doesn't add or remove any characters. So the first check is that str1 and str2 have the same length and the same characters. If they don't, return False immediately.\n\nSo step one: check if the two strings are of equal length. If not, return False. Then check if they are anagrams. If not, return False. Otherwise, proceed.\n\nBut anagram check is necessary but not sufficient. For example, \"abc\" and \"acb\" are anagrams. But can they be scrambled via the allowed operations? Let's see. Let's take \"abc\". Split into \"a\" and \"bc\", swap to get \"bca\". Then split \"bc\" into \"b\" and \"c\", swap to get \"cba\". Wait, but maybe not. Or split \"abc\" into \"ab\" and \"c\", swap to get \"cab\". But how to get \"acb\"?\n\nWait, maybe there's a way. Let's see. Original is \"abc\". Split into \"a\" and \"bc\" → \"bca\". Then split \"bca\" into \"bc\" and \"a\" → swap to get \"abc\" again. Hmm. Not helpful. Alternatively, split \"abc\" into \"ab\" and \"c\", swap to get \"cab\". Then split \"cab\" into \"c\" and \"ab\", then split \"ab\" into \"a\" and \"b\", swap to get \"ba\" → so the string becomes \"c\" + \"ba\" → \"cba\". But how do we get \"acb\"?\n\nAlternatively, maybe \"abc\" can't be transformed into \"acb\" via these operations. So the anagram check is necessary but not sufficient. Therefore, even if two strings are anagrams, they might not be scrambled versions according to the problem's definition.\n\nSo the question is, what is the condition for two strings to be scrambled under these operations?\n\nThis problem seems similar to the \"scramble string\" problem that is sometimes asked in coding interviews, where the definition is similar. Let me recall. In that problem, a scramble is defined as splitting the string into two non-empty parts, swapping them, and doing the same recursively for each part. The solution for that problem uses dynamic programming with a recursive approach, checking all possible splits and seeing if any split in either string (swapped or not) leads to a valid scramble.\n\nWait, but according to the problem here, the rearrangement can be applied \"arbitrarily many times on any of the substrings\". So each rearrangement is a split and swap. But after each swap, the process can be applied again to any of the substrings. So the scrambled string is built by performing multiple split and swap steps at any level.\n\nSo the approach would involve checking all possible splits and swaps recursively. But that's computationally expensive. However, given that the problem might be expecting a solution similar to the scramble string problem, perhaps a memoized recursive approach with some optimizations is needed.\n\nSo here's the plan:\n\nThe function is_scramble(str1, str2) returns True if str1 is a scrambled version of str2.\n\nBase cases:\n1. If the two strings are equal, return True.\n2. If they are not anagrams (different character counts), return False.\n\nRecursive case:\nWe need to check all possible splits of the string into two parts (left and right) for both possible orders (original and swapped), and see if there's a split where either:\na) The left part of str1 is a scramble of the left part of str2, and the right part of str1 is a scramble of the right part of str2. (No swap)\nOR\nb) The left part of str1 is a scramble of the right part of str2, and the right part of str1 is a scramble of the left part of str2. (After swapping the split parts)\n\nBut since each split can be at any position, we need to check all possible split points (from 1 to len-1) for each possible split.\n\nHowever, this leads to O(n^2) possible splits for each recursive call, and with memoization, the time complexity can be reduced, but it's still high. But for practical purposes and given that the problem might not have strict constraints on input size, this approach might be acceptable.\n\nSo implementing this recursively with memoization.\n\nBut how to handle memoization? Since the function parameters are two strings, memoizing can be done using their concatenation as a key, but that could be memory intensive. Alternatively, we can memoize based on the start and end indices of the substrings in each original string. Wait, but the function is called with two full strings. Maybe we can memoize based on the pair (str1, str2), but since strings can be long, it's better to use indices. Alternatively, perhaps we can use the sorted version of the strings as part of the key, but that's not sufficient. Wait, but if the sorted strings are not equal, we can return False immediately.\n\nSo here's the outline:\n\ndef is_scramble(str1, str2):\n    # Check if the two strings are of equal length\n    if len(str1) != len(str2):\n        return False\n    # Check if they are the same\n    if str1 == str2:\n        return True\n    # Check if they are anagrams\n    if sorted(str1) != sorted(str2):\n        return False\n    # Now, check all possible splits\n    n = len(str1)\n    for i in range(1, n):\n        # Case 1: No swap. Split str1 into left1 and right1, and str2 into left2 and right2 of the same lengths.\n        # Check if left1 is scramble of left2 and right1 is scramble of right2\n        if is_scramble(str1[:i], str2[:i]) and is_scramble(str1[i:], str2[i:]):\n            return True\n        # Case 2: Swap. Check if left1 is scramble of right2's part and right1 is scramble of left2's part.\n        if is_scramble(str1[:i], str2[n-i:]) and is_scramble(str1[i:], str2[:n-i]):\n            return True\n    # If none of the splits work, return False\n    return False\n\nBut wait, this approach is correct, but it's very inefficient because it's making a lot of recursive calls. For example, for each split, it makes two recursive calls. So the time complexity is O(n^something), which for even small n (like 10) would be slow.\n\nSo the code above, while logically correct, would not be efficient for larger strings. However, in practice, the problem may require this approach but with memoization to cache the results.\n\nSo adding memoization. How?\n\nWe can memoize the function based on the two input strings. Since the same pair of strings can be checked multiple times in different splits, memoizing the results can save a lot of time.\n\nBut using strings as keys in a memoization dictionary can be expensive. Alternatively, we can use a dictionary where the key is a tuple of the two strings. But for long strings, this can take up a lot of memory.\n\nAnother approach is to use lru_cache with the parameters. But in Python, for memoization, using functools.lru_cache can be helpful. However, since the strings can be long, and the number of possible pairs is huge, this might not be feasible. But for the purposes of the problem, perhaps it's acceptable.\n\nAlternatively, the problem might have constraints on the input size. For example, if the input strings are up to 20 characters, the code would work. But for longer strings, this approach is not feasible.\n\nBut given that the problem is presented as a code writing question, perhaps the intended solution is this recursive approach with memoization.\n\nSo modifying the code to include memoization:\n\nWe can use the functools.lru_cache decorator. But since the function is called with strings, which are hashable, it's possible. However, for very long strings, this could be a problem, but perhaps the problem doesn't consider that.\n\nAlternatively, we can memoize by converting the strings into tuples, but that's not necessary. Python can handle strings as keys in the memoization.\n\nSo here's the code with memoization:\n\nfrom functools import lru_cache\n\ndef is_scramble(str1, str2):\n    @lru_cache(maxsize=None)\n    def scramble(s1, s2):\n        if s1 == s2:\n            return True\n        if len(s1) != len(s2):\n            return False\n        if sorted(s1) != sorted(s2):\n            return False\n        n = len(s1)\n        for i in range(1, n):\n            if (scramble(s1[:i], s2[:i]) and scramble(s1[i:], s2[i:])) or \\\n               (scramble(s1[:i], s2[n-i:]) and scramble(s1[i:], s2[:n-i])):\n                return True\n        return False\n    return scramble(str1, str2)\n\nWait, but in the original code, when the split is i, for the swap case, the left part of str1 (s1[:i]) is compared to the right part of s2 (s2[n-i:]), and the right part of s1 (s1[i:]) is compared to the left part of s2 (s2[:n-i]).\n\nThis is correct. For example, if s2 is split into left and right parts, and swapped, then the left part of the swapped s2 is the original right part of s2. So for the swap case, s1's left part (after swapping) would correspond to s2's right part.\n\nBut what about split points? For example, when i is the split point in s1, then the split point in s2 for the swap case is n-i. Because if s2 is split into two parts, left and right, which are then swapped. So the length of the right part is i (since s1's left part is i length). Therefore, the right part of s2 would be i length, which implies that the split in s2 is at (n-i) position, because s2's right part after split is s2[n-i:], which has length i. So the split in s2 is at position n-i, which is the same as split s1 at i, and swap.\n\nYes, that's correct.\n\nBut this code uses memoization via the lru_cache decorator. However, using the strings directly as parameters to the helper function 'scramble' can be memory-intensive. But for the purposes of this problem, perhaps it's acceptable.\n\nTesting this code with some examples:\n\nTest case 1: str1 = \"great\", str2 = \"rgeat\"\n\nExpected output: True.\n\nLet's see:\n\nThey are anagrams.\n\nPossible splits: For i=2.\n\ns1[:2] = \"gr\", s2[:2] = \"rg\". Are they scrambles? \"gr\" vs \"rg\" are anagrams. Then, split i=1 in each. For \"gr\" and \"rg\":\n\nCheck if either split at 1 for \"gr\" and \"rg\". For \"gr\", split into \"g\" and \"r\". For \"rg\", check either no swap: \"r\" and \"g\" → \"rg\" split into \"r\" and \"g\" → scramble is \"rg\" vs \"rg\" no. Or swap case: \"g\" vs \"g\", \"r\" vs \"r\". So yes. So \"gr\" and \"rg\" are scrambles. Then the right parts: \"eat\" and \"eat\" are same. So overall, returns True.\n\nAnother test case: str1 = \"abcde\", str2 = \"caebd\" → should return False.\n\nThe code would check all possible splits. For each split, the recursive checks would fail.\n\nAnother test case: str1 = \"a\", str2 = \"a\" → returns True.\n\nTest case: str1 = \"ab\", str2 = \"ba\" → returns True. Split at i=1. Check scramble of \"a\" and \"b\" (swap case). So \"a\" is scramble of \"b\" → no. Wait, in the code, for split i=1:\n\nCase 1: compare \"a\" and \"b\". Since they are not equal and not anagrams (sorted(a) is a, sorted(b) is b), returns False.\n\nCase 2: compare \"a\" (s1[:1]) with s2[2-1:] → s2[1:] → \"a\". s1[1:] is empty. Wait, no. Wait, when i=1, n=2. For the swap case:\n\ns1[:i] is \"a\", s2[n-i:] is s2[2-1=1:] → s2[1:] is \"a\". Then s1[i:] is \"b\", which is compared to s2[:n-i] → s2[:1] is \"b\".\n\nSo \"a\" and \"a\" are equal → True. \"b\" and \"b\" are equal → True. So the condition (scramble(\"a\", \"a\") and scramble(\"b\", \"b\")) → True. So returns True.\n\nWhich is correct because swapping the split \"ab\" into \"a\" and \"b\" and swapping gives \"ba\".\n\nSo the code works.\n\nBut what about when the strings are longer and require multiple splits?\n\nHowever, the code is correct but inefficient. To optimize, we can pre-check the sorted characters and return early if they are not anagrams.\n\nBut the code already includes that check.\n\nSo the final code would look like this. But the code uses the helper function with memoization. However, the original problem statement says to write a Python function. So the code would need to handle these steps.\n\nBut I need to make sure that the code is written correctly. Also, in Python, the strings are passed as parameters to the helper function, and the lru_cache decorator works for them.\n\nSo the final code would be:\n\nImplementing the code with memoization using lru_cache.\n\nAnother note: The helper function 'scramble' is nested inside 'is_scramble' and decorated with lru_cache. This should work.\n\nTesting this code with the sample cases should work.\n\nTherefore, the Python function is as written above.\n</think>\nTo solve this problem, we need to determine if a given string `str1` can be transformed into another string `str2` through a series of allowed rearrangements. A rearrangement involves splitting the string into two non-empty substrings, swapping them, and recursively applying the same process to any of the resulting substrings.\n\n### Approach\n1. **Check Length and Equality**: First, check if the two strings are of the same length. If not, return `False`. If they are identical, return `True`.\n2. **Anagram Check**: Verify if the two strings are anagrams. If they are not, return `False` since they cannot be scrambled versions of each other.\n3. **Recursive Check with Memoization**: Use a recursive approach to explore all possible splits of the strings. For each split, check both the non-swapped and swapped scenarios:\n   - **Non-swapped Case**: Check if the left part of `str1` can be scrambled into the left part of `str2` and the right part of `str1` can be scrambled into the right part of `str2`.\n   - **Swapped Case**: Check if the left part of `str1` can be scrambled into the right part of `str2` and the right part of `str1` can be scrambled into the left part of `str2`.\n\nTo optimize the recursive solution, we use memoization to cache results of previously computed subproblems, reducing redundant calculations.\n\n### Solution Code\n```python\nfrom functools import lru_cache\n\ndef is_scramble(str1: str, str2: str) -> bool:\n    @lru_cache(maxsize=None)\n    def helper(s1: str, s2: str) -> bool:\n        if s1 == s2:\n            return True\n        if len(s1) != len(s2):\n            return False\n        if sorted(s1) != sorted(s2):\n            return False\n        n = len(s1)\n        for i in range(1, n):\n            if (helper(s1[:i], s2[:i]) and helper(s1[i:], s2[i:])) or \\\n               (helper(s1[:i], s2[n-i:]) and helper(s1[i:], s2[:n-i])):\n                return True\n        return False\n    \n    return helper(str1, str2)\n```\n\n### Explanation\n- **Memoization**: The `helper` function is decorated with `lru_cache` to store results of subproblems, ensuring each pair of substrings is only checked once.\n- **Base Cases**: Directly return `True` if the input strings are identical, and `False` if they have different lengths or are not anagrams.\n- **Recursive Splits**: For each possible split point, recursively check both non-swapped and swapped scenarios. If any split point leads to a valid scramble, return `True`.\n\nThis approach efficiently checks all possible ways to split and swap the strings, leveraging memoization to handle the exponential number of potential subproblems in a feasible manner."
            }
        ]
    }
]

@Jintao-Huang Jintao-Huang added the bug Something isn't working label May 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants