Skip to content

refactor: replace Split in loops with more efficient SplitSeq #1586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions v2/pkg/testutils/integration.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,8 @@ func RunSubfinderAndGetResults(debug bool, domain string, extra ...string) ([]st
return nil, err
}
var parts []string
items := strings.Split(string(data), "\n")
for _, i := range items {
items := strings.SplitSeq(string(data), "\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update other instances as well?

for i := range items {
if i != "" {
parts = append(parts, i)
}
Comment on lines +30 to 34
Copy link

@coderabbitai coderabbitai bot May 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical issue: Incorrect loop variable usage

The refactoring from strings.Split to strings.SplitSeq is a good optimization, but there are two critical issues in the implementation:

  1. When using for i := range items, the variable i represents the index, not the value at that index.
  2. The condition if i != "" is incorrectly comparing an integer index with an empty string, which will cause a compilation error.
  3. Line 33 attempts to append an integer index to a string slice, which will also fail.

Here's how to fix it:

items := strings.SplitSeq(string(data), "\n")
- for i := range items {
-   if i != "" {
-     parts = append(parts, i)
+ for i := range items {
+   if items[i] != "" {
+     parts = append(parts, items[i])
    }
}

Alternatively, you could use the original value-based loop approach:

items := strings.SplitSeq(string(data), "\n")
- for i := range items {
-   if i != "" {
+ for _, item := range items {
+   if item != "" {
-     parts = append(parts, i)
+     parts = append(parts, item)
    }
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
items := strings.SplitSeq(string(data), "\n")
for i := range items {
if i != "" {
parts = append(parts, i)
}
items := strings.SplitSeq(string(data), "\n")
for _, item := range items {
if item != "" {
parts = append(parts, item)
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the function description:

"SplitSeq returns an iterator over all substrings of s separated by sep. The iterator yields the same strings that would be returned by [Split](s, sep), but without constructing the slice. It returns a single-use iterator."

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

Let me check the Go documentation and explore how SplitSeq should be properly used, as it seems to be different from Split.

@dogancanbakir Thank you for sharing the function description. Based on this, I see the issue with the current implementation. Since SplitSeq returns an iterator rather than a slice, we need to use it differently.

Here's the correct way to use SplitSeq:

var parts []string
iter := strings.SplitSeq(string(data), "\n")
for iter.Next() {
    item := iter.Value()
    if item != "" {
        parts = append(parts, item)
    }
}

The current PR implementation is incorrect because it's treating the iterator as a slice. With SplitSeq, you need to use the Next() and Value() methods to properly iterate through the results rather than using a range loop.


🏁 Script executed:

#!/bin/bash
# Check Go documentation for SplitSeq
go doc strings.SplitSeq || echo "Documentation not found"

# Look for examples of SplitSeq usage in the codebase
rg "SplitSeq" -A 3 -B 3

# Check the Go stdlib strings package
curl -s https://pkg.go.dev/strings | grep -A 10 "SplitSeq"

Length of output: 3169


@dogancanbakir Thanks for providing that function description. Looking at the Go documentation, SplitSeq returns an iter.Seq[string] object, which is an iterator, not a slice like Split returns.

The current implementation is incorrect because it's trying to range over an iterator and directly use the index variable as a string. Here's the correct way to use SplitSeq:

var parts []string
iter := strings.SplitSeq(string(data), "\n")
for iter.Next() {
    item := iter.Value()
    if item != "" {
        parts = append(parts, item)
    }
}

Since SplitSeq returns a single-use iterator, you need to use the Next() method to advance through the items and Value() to retrieve each string, rather than using a range loop like you would with a slice.


✏️ Learnings added
Learnt from: dogancanbakir
PR: projectdiscovery/subfinder#1586
File: v2/pkg/testutils/integration.go:30-34
Timestamp: 2025-05-17T13:17:03.433Z
Learning: In Go, `strings.SplitSeq` returns an iterator that must be used with the `Next()` and `Value()` methods, unlike `strings.Split` which returns a slice. When replacing `Split` with `SplitSeq` for performance optimization, the iteration pattern must be changed accordingly.

Learnt from: dogancanbakir
PR: projectdiscovery/subfinder#1586
File: v2/pkg/testutils/integration.go:30-34
Timestamp: 2025-05-17T13:17:03.433Z
Learning: In Go, when replacing `strings.Split` with `strings.SplitSeq` (added in Go 1.24.0), the implementation must change from using range loops on slices to using the iterator pattern with `Next()` and `Value()` methods, as `SplitSeq` returns an `iter.Seq[string]` iterator rather than a string slice.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Expand Down
Loading