Insights: rapidsai/cudf
Overview
33 Pull requests merged by 12 people
-
Improve high-multiplicity joins benchmark
#19287 merged
Jul 9, 2025 -
Enable chunked reading of PQ sources with `>2B` rows
#19245 merged
Jul 9, 2025 -
Clean up cudf._lib.strings_udf.pyx
#19335 merged
Jul 9, 2025 -
Add support for `pandas-2.3.1`
#19334 merged
Jul 9, 2025 -
Add single-file streaming `Sink` support
#19317 merged
Jul 9, 2025 -
Refactor hash join with multiset
#18021 merged
Jul 9, 2025 -
Fix null mask assignment in aggregators and cleanup with C++20
#19302 merged
Jul 9, 2025 -
Allow setting `StreamingExecutor.target_partition_size` with an environment variable
#19316 merged
Jul 9, 2025 -
Optimize object listing in pandas-tests diff CI
#19328 merged
Jul 9, 2025 -
Raised `MixedTypeErrors` for conditions that lead to mixed types
#19232 merged
Jul 9, 2025 -
Serialize `ConfigOptions` in pdsh benchmark output
#19272 merged
Jul 9, 2025 -
Support Expr.str.find & Expr.str.join for non string data in cudf_polars
#19275 merged
Jul 9, 2025 -
Fix job filters for `pandas-tests`
#19322 merged
Jul 9, 2025 -
[pre-commit.ci] pre-commit autoupdate
#19301 merged
Jul 9, 2025 -
Autodoc DateOffset
#19297 merged
Jul 8, 2025 -
Add support for StructFunction expressions in cudf_polars
#19052 merged
Jul 8, 2025 -
Always represent datetime aware data as UTC in strftime
#19304 merged
Jul 8, 2025 -
Support `Expr.str.extract/extract_groups` in cudf_polars
#19271 merged
Jul 8, 2025 -
Remove pytest pin
#19127 merged
Jul 8, 2025 -
Move ast expression function definitions to .cpp files
#19250 merged
Jul 8, 2025 -
Update cudf.pandas test skips for pandas==2.3.1
#19313 merged
Jul 8, 2025 -
Fix cudf::column_device_view::element() doxygen
#19296 merged
Jul 8, 2025 -
Leverage new pylibcudf grouped_range_rolling_window for cuDF classic rolling(window: timedelta)
#19230 merged
Jul 8, 2025 -
Fixed type annotation for 'state' in make_recursive
#19294 merged
Jul 7, 2025 -
Implement row group pruning with dictionaries in experimental PQ reader
#18836 merged
Jul 7, 2025 -
Add some basic streaming engine documentation
#19088 merged
Jul 7, 2025 -
Raise `NotImplementedError` for `LazyFrame.profile` with the streaming executor
#19257 merged
Jul 7, 2025 -
Move shuffle method defaulting to config options creation
#19274 merged
Jul 7, 2025 -
Do not pass cupy objects to numba kernels directly
#19283 merged
Jul 3, 2025 -
Support Expr.str.json_path_match/len_bytes/len_chars in cudf_polars
#19277 merged
Jul 3, 2025 -
Document aggregations for cudf::reduce in doxygen
#19264 merged
Jul 3, 2025 -
Rename "cardinality_factor" configuration to "unique_fraction"
#19273 merged
Jul 3, 2025 -
Move get_mask_offset_word utility to null_mask.cuh
#19226 merged
Jul 3, 2025
25 Pull requests opened by 10 people
-
Latest cuco commit 541bdc0
#19288 opened
Jul 3, 2025 -
Add cudf::strings::split_part API
#19289 opened
Jul 3, 2025 -
Support Expr.str.splitn/split_exact in cudf_polars
#19290 opened
Jul 3, 2025 -
Deprecate cudf::round for float types
#19298 opened
Jul 7, 2025 -
Refactor Source Type for UDFs
#19300 opened
Jul 7, 2025 -
Add new cudf::top_k API
#19303 opened
Jul 7, 2025 -
Move the Parquet `reader_impl` class declaration out of the `parquet::detail::reader`
#19305 opened
Jul 7, 2025 -
Support Expr.str.json_decode in cudf_polars
#19307 opened
Jul 8, 2025 -
🚧 Materialize tables in the experimental Parquet reader
#19308 opened
Jul 8, 2025 -
Support the JNI build with nvCOMP 5.0
#19309 opened
Jul 8, 2025 -
Fix a use-after-free issue in TDigest aggregation code.
#19311 opened
Jul 8, 2025 -
Support null_count expression
#19314 opened
Jul 8, 2025 -
Remove unnecessary compute for integer windows
#19315 opened
Jul 8, 2025 -
Support fill_null with fill strategy in cudf-polars
#19318 opened
Jul 8, 2025 -
[DO NOT MERGE] Implement JIT Conditional Join
#19319 opened
Jul 8, 2025 -
Fix compile warning in interop_stringview.cpp
#19320 opened
Jul 8, 2025 -
[TEST-ONLY] test cuda/std/span in cuco
#19321 opened
Jul 8, 2025 -
[TEST-ONLY] test just including span
#19325 opened
Jul 8, 2025 -
Add cudf::strings::find_instance API
#19326 opened
Jul 8, 2025 -
Fix bit shift overflow in segmented_offset_bitmask_binop utility
#19329 opened
Jul 9, 2025 -
Re-enable std/var reductions for libcudf debug builds
#19331 opened
Jul 9, 2025 -
Allow comparison binop to datetime.date
#19333 opened
Jul 9, 2025 -
Fix Union-Slice bug
#19336 opened
Jul 9, 2025 -
Update rapids_config files to support new branching strategy
#19337 opened
Jul 9, 2025 -
Support rank expression in cudf-polars
#19340 opened
Jul 10, 2025
20 Issues closed by 3 people
-
[BUG] Parquet reader fails to read a chunk for parquet file with more than 2^32 rows
#19238 closed
Jul 9, 2025 -
[BUG] assignment through loc[] breaks DataFrame
#12505 closed
Jul 9, 2025 -
[FEA] Support `polars.Expr.str.find` in `cudf-polars`
#18993 closed
Jul 9, 2025 -
[FEA] Support `polars.Expr.str.join` in `cudf-polars`
#18996 closed
Jul 9, 2025 -
DOCS: document `DateOffset`
#19292 closed
Jul 8, 2025 -
[FEA] Support struct data types in Polars
#16725 closed
Jul 8, 2025 -
[BUG] `dt.strftime` ignores time zone
#19295 closed
Jul 8, 2025 -
[FEA] Support `polars.Expr.str.extract_groups` in `cudf-polars`
#18991 closed
Jul 8, 2025 -
[FEA] Support `polars.Expr.str.extract` in `cudf-polars`
#18989 closed
Jul 8, 2025 -
[BUG] Element access in column_device_view leads to copy
#19291 closed
Jul 8, 2025 -
[BUG] `RollingGroupBy` objects cannot be pickled
#14845 closed
Jul 8, 2025 -
[Story] Rewrite cudf.pandas rolling functionality using the new C++ rolling functionality
#18709 closed
Jul 8, 2025 -
[FEA] Add a custom decoder to convert Parquet dictionary pages into `cuco::static_set` objects
#18046 closed
Jul 7, 2025 -
[BUG]: `TypeError`: in `LazyFrame.profile()` with streaming executor
#19253 closed
Jul 7, 2025 -
[QST]cuda12.8
#19260 closed
Jul 7, 2025 -
RUN TPCH 10K on DGXA100/DGXH100/DGXB100
#18812 closed
Jul 3, 2025 -
[FEA] Support `polars.Expr.str.len_chars` in `cudf-polars`
#19000 closed
Jul 3, 2025 -
[FEA] Support `polars.Expr.str.len_bytes` in `cudf-polars`
#18999 closed
Jul 3, 2025 -
[FEA] Support `polars.Expr.str.json_path_match` in `cudf-polars`
#18998 closed
Jul 3, 2025
12 Issues opened by 6 people
-
[FEA] Add a `pylibcudf.Column.as_struct(columns: Iterable[Column])` to create a struct column
#19339 opened
Jul 9, 2025 -
[FEA] Support string column to struct column (and vice versa) conversions in `read_json`/`write_json`
#19338 opened
Jul 9, 2025 -
[BUG] Incorrect result from `tail(1)` after `concat` with streaming GPU executor
#19332 opened
Jul 9, 2025 -
[FEA] Allow setting all cudf-polars configuration options with environment variables
#19330 opened
Jul 9, 2025 -
[FEA] support comparison between DatetimeColumn and native python date object
#19327 opened
Jul 9, 2025 -
[FEA] Allow controlling rounding behavior for `quantile(interp="nearest")` to match Polars
#19324 opened
Jul 8, 2025 -
[BUG] Difference in `groupby().all()` with `cudf.pandas` and string dtype
#19312 opened
Jul 8, 2025 -
[FEA] Single-file `Sink` support for streaming executor
#19310 opened
Jul 8, 2025 -
[FEA] Create post op CPU callback for cudf_polars output types that are not supported by libcudf(?)
#19306 opened
Jul 8, 2025 -
[FEA] `contiguous_split` should accept span of split points, rather than vector
#19293 opened
Jul 7, 2025 -
[FEA] Support `Expr.name.map_fields` in cudf_polars
#19285 opened
Jul 3, 2025 -
[FEA] Support `Expr.struct.with_fields` in cudf_polars
#19284 opened
Jul 3, 2025
34 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Implement UDF Filters
#19070 commented on
Jul 9, 2025 • 40 new comments -
Introduce classes for collecting source statistics
#19276 commented on
Jul 9, 2025 • 23 new comments -
Change default cudf-polars executor to "streaming"
#19263 commented on
Jul 9, 2025 • 6 new comments -
Added Concurrent Polynomial Transform Benchmark
#19199 commented on
Jul 8, 2025 • 4 new comments -
Manage strings with NRT
#18453 commented on
Jul 7, 2025 • 4 new comments -
Support output_dtype in cudf::reduce for nunique aggregation
#19265 commented on
Jul 9, 2025 • 3 new comments -
Support ternary expression inside groupby context
#19242 commented on
Jul 8, 2025 • 2 new comments -
Experimental API to read a parquet table, build a custom index column, and apply roaring bitmap deletion vector
#19237 commented on
Jul 7, 2025 • 2 new comments -
Add SUM_ANSI aggregation for INT64
#19282 commented on
Jul 3, 2025 • 1 new comment -
Support Expr.str.strip_prefix/suffix in cudf_polars
#19278 commented on
Jul 8, 2025 • 1 new comment -
Require `numba-cuda>=0.16.0`
#19213 commented on
Jul 9, 2025 • 1 new comment -
[BUG] Assertion error when running JOIN_NVBENCH with debug build
#19255 commented on
Jul 3, 2025 • 0 new comments -
Add data types axis to joins benchmarks
#19281 commented on
Jul 7, 2025 • 0 new comments -
[FEA] overflow detection for SUM aggregations
#19243 commented on
Jul 3, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.split_exact` in `cudf-polars`
#19027 commented on
Jul 3, 2025 • 0 new comments -
Fix includes for segmented-reduce source files
#19266 commented on
Jul 8, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.zfill` in `cudf-polars`
#19035 commented on
Jul 3, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.pad_start` in `cudf-polars`
#19024 commented on
Jul 3, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.pad_end` in `cudf-polars`
#19023 commented on
Jul 3, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.strip_prefix` in `cudf-polars`
#19029 commented on
Jul 3, 2025 • 0 new comments -
Remove cudautils.py
#19233 commented on
Jul 9, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.splitn` in `cudf-polars`
#19028 commented on
Jul 3, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.strip_suffix` in `cudf-polars`
#19030 commented on
Jul 4, 2025 • 0 new comments -
Use GCC 14 in conda builds.
#19192 commented on
Jul 7, 2025 • 0 new comments -
Use KvikIO's implementation of file-backed memory mapping
#19164 commented on
Jul 7, 2025 • 0 new comments -
[FEA] Support NaN-aware operators in libcudf that conform to ETL engine expectations
#18930 commented on
Jul 7, 2025 • 0 new comments -
Add multi-column support for primitive row operator dispatch
#18940 commented on
Jul 3, 2025 • 0 new comments -
[BUG] Uninitialized read detected in parquet chunked reader
#19247 commented on
Jul 7, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.json_decode` in `cudf-polars`
#18997 commented on
Jul 9, 2025 • 0 new comments -
[BUG] Rolling window aggregations are very slow with large windows
#15119 commented on
Jul 8, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.to_decimal` in `cudf-polars`
#19032 commented on
Jul 8, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.decode` in `cudf-polars`
#18986 commented on
Jul 8, 2025 • 0 new comments -
[FEA] Support `polars.Expr.str.to_time` in `cudf-polars`
#19033 commented on
Jul 8, 2025 • 0 new comments -
[FEA] Add a TopK API to cudf
#19096 commented on
Jul 8, 2025 • 0 new comments