Skip to content

Clean up PrimitiveColumnContainer #6656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

JakeRadMSFT
Copy link
Contributor

@JakeRadMSFT JakeRadMSFT commented May 5, 2023

I was inspired by a community members PR (#6642) to try and clean up PrimitiveColumnContainer.

I'd say this is Part 1 of a few part series for improving DataFrame.

  • Updates the code to be more consistent with the Buffer and nullBufferBitMap usage
  • Updated all the PrimitiveColumn related things that used to be T : struct and move to T : unmanaged.
  • Hopefully improved readability

Not Fixed:

  • It's still not thread safe

Try to clean up PrimitiveColumnContainer
@ghost ghost assigned JakeRadMSFT May 5, 2023
}
_modifyNullCountWhileIndexing = true;
count -= allocatable;

}
}

public void ApplyElementwise(Func<T?, long, T?> func)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, what is the contract for this method? It's going to swap out the buffer in the array with a new one -- what if someone's holding on to the old one? How do you handle reference invalidation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just inherited this but my understanding is the following:

  • This method is going to apply func for each element and update the value in buffer
    • It will make the buffer mutable if it needs to
    • It will update the nullBitMapBuffer

If something is holding on to the old buffer - it shouldn't be/it's outdated.

JakeRadMSFT and others added 9 commits May 5, 2023 19:59
Fix helper
Remove duplicate
More clean up
Fix slice
Update name to GetOrCreateMutable
Use null instead of default for nullable values.
Clean up Apply methods.
…cleanup' into u/jakerad/column-container-cleanup
@codecov
Copy link

codecov bot commented May 6, 2023

Codecov Report

Merging #6656 (e33608d) into main (3986fcf) will decrease coverage by 0.01%.
The diff coverage is 90.74%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6656      +/-   ##
==========================================
- Coverage   68.59%   68.59%   -0.01%     
==========================================
  Files        1200     1201       +1     
  Lines      250326   250296      -30     
  Branches    26096    26094       -2     
==========================================
- Hits       171701   171679      -22     
+ Misses      71809    71803       -6     
+ Partials     6816     6814       -2     
Flag Coverage Δ
Debug 68.59% <90.74%> (-0.01%) ⬇️
production 63.07% <90.74%> (-0.01%) ⬇️
test 88.84% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/Microsoft.Data.Analysis/DataFrameBuffer.cs 82.60% <ø> (ø)
...lysis/PrimitiveColumnContainer.BinaryOperations.cs 84.21% <ø> (ø)
...ata.Analysis/PrimitiveDataFrameColumnArithmetic.cs 48.98% <ø> (ø)
...a.Analysis/PrimitiveDataFrameColumnComputations.cs 45.66% <ø> (ø)
...Microsoft.Data.Analysis/ReadOnlyDataFrameBuffer.cs 47.36% <ø> (ø)
...icrosoft.Data.Analysis/PrimitiveColumnContainer.cs 83.04% <88.63%> (+0.03%) ⬆️
...t.Data.Analysis/PrimitiveColumnContainerHelpers.cs 100.00% <100.00%> (ø)

... and 6 files with indirect coverage changes

@JakeRadMSFT JakeRadMSFT changed the title (WIP) Try to clean up PrimitiveColumnContainer Clean up PrimitiveColumnContainer May 6, 2023
@michaelgsharp michaelgsharp merged commit 247f3a0 into dotnet:main May 8, 2023
@ghost ghost locked as resolved and limited conversation to collaborators Jun 8, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants