Skip to content

Add HashSet.duplicate() #107377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

aaronfranke
Copy link
Member

Follow-up to #106537 (comment)

HashSet has a copy constructor, but it's not obvious whether the constructor is truly copying the underlying data just from reading the usage. This makes the intent clearer, at the cost of adding a very small function to HashSet. There is already a similar API in Vector<> for duplicating those.

@aaronfranke aaronfranke added this to the 4.x milestone Jun 10, 2025
@aaronfranke aaronfranke requested review from a team as code owners June 10, 2025 19:32
@Ivorforce Ivorforce removed the request for review from a team June 10, 2025 19:53
Copy link
Member

@Ivorforce Ivorforce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some thoughts about this.
First off, I think this is generally a great idea.

What we have currently is the implicit copy constructor, which is very disingenuous about its real cost:

// Whoops, actually extremely expensive, because all strings are copied!
HashSet<String> set = p_hash_set;

It would already be much better to make copy constructors explicit only:

// Now it's somewhat clear we're making a new hash set.
HashSet<String> set = HashSet(p_hash_set);

However, I do prefer the .duplicate you suggest, even if it is a little non-C++ style.

Still, our priority for things like this needs to be consistency. If we're adding .duplicate, we need to remove the copy constructor. Otherwise, both APIs will be used, and it won't be clear which function is preferred.

Likewise, we can't just add .duplicate to one or two isolated container types. Otherwise, people will always have to guess if it's "one of those containers with .duplicate" or "one of those other containers with a copy constructor". We'd need to add them to all containers, and remove the copy constructors, in one fell swoop.

All of this makes this a bit bigger than a few innocuous lines, and it would need to be agreed on by the core team, in my opinion.

There is already a similar API in Vector<> for duplicating those.

That's a good point. I think the reason it exists for Vector is that it's exposed to GDScript (and it's probably in the codebase for consistency). It acts the same as the copy constructor from C++.
However, in GDScript it acts different from the copy constructor, because regular assignment references the original array, rather than making a CoW copy (like .duplicate). This creates a need for .duplicate, which likely prompted this function to be added in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants