Fix concat #673
Comments
Maybe it's useful to add |
I think this is the intended behavior. The key of the group is something temporary and usually consists of columns already in the DF. The original DataFrame can be retrieved using |
Ok, anyway new concat is needed for the purpose I described. |
maybe a |
I think it won't hurt to do it by default. One might say that |
if we do it by default, then we would get duplicate columns, because the key columns are often in the groups as well |
Andrey's implementation only adds "new" columns (or so I understood) |
But then, what qualifies as "new"?
I think we should be careful here |
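To make the duplicate-column concern concrete, here is a minimal sketch in plain Kotlin, with no dependency on the DataFrame library. Rows are modelled as maps, and `Row`, `Group`, and `concatAddingMissingKeys` are hypothetical names invented for this illustration, not library API. It treats a key column as "new" only when its name is absent from the group's rows, which is just one possible answer to the question above.

```kotlin
// Hypothetical model: a row is a map from column name to value.
typealias Row = Map<String, Any?>

// Hypothetical stand-in for one GroupBy entry: the key values plus the group's rows.
data class Group(val key: Row, val rows: List<Row>)

// Concatenates all groups, prepending only those key columns whose names
// do not already occur in the row. Name equality is the assumed meaning of "new".
fun concatAddingMissingKeys(groups: List<Group>): List<Row> =
    groups.flatMap { group ->
        group.rows.map { row ->
            val missingKeys = group.key.filterKeys { it !in row }
            missingKeys + row // insertion order is kept, so key columns come first
        }
    }

fun main() {
    val groups = listOf(
        // "city" is already a column of the group, "bucket" is not.
        Group(
            key = mapOf("city" to "Oslo", "bucket" to 1),
            rows = listOf(mapOf("city" to "Oslo", "age" to 30)),
        ),
        Group(
            key = mapOf("city" to "Rome", "bucket" to 2),
            rows = listOf(mapOf("city" to "Rome", "age" to 41)),
        ),
    )
    concatAddingMissingKeys(groups).forEach(::println)
}
```

With this rule only `bucket` is added, while `city` is left alone. A key column that shares a name with a group column but holds different values would be silently dropped, which is the kind of edge case that calls for care.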
There's also the case where a user creates a new Something like:

```kotlin
internal fun GroupBy<*, *>.concatWithKeys(): DataFrame<*> =
    mapToFrames {
        DynamicDataFrameBuilder()
            .apply {
                // Start with the group's own columns.
                for (column in group.columns()) {
                    add(column)
                }
                // Append each key value as a constant column spanning the group's rows.
                val rowsCount = group.rowsCount()
                for ((name, value) in key.toMap()) {
                    add(List(rowsCount) { value }.toColumn(name))
                }
            }
            .toDataFrame()
            // Move the appended key columns to the front.
            .moveToLeft { takeLast(key.count()) }
    }.concat()
```
Alternatively, what's arguably a lot simpler, we could just explode the groups column. Like:

```kotlin
internal fun GroupBy<*, *>.concatWithKeys(): DataFrame<*> =
    toDataFrame().explode { groups }
```

This will generate extra key values where necessary and keep the grouped columns in a column group, avoiding any potential name clashes :).
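As a rough illustration of why exploding avoids name clashes, the sketch below models the idea in plain Kotlin rather than with the DataFrame API. `GroupedRow`, `ExplodedRow`, and `explodeGroups` are hypothetical names for this example only. The key is repeated once per row of its group, and the original columns stay nested under a separate field, so a group column named like a key never collides with it.

```kotlin
// Hypothetical model of one row of a grouped frame: key values plus a nested group.
data class GroupedRow(val key: Map<String, Any?>, val group: List<Map<String, Any?>>)

// Hypothetical model of one row after exploding: the key is repeated,
// and the original row stays nested instead of being merged into the key columns.
data class ExplodedRow(val key: Map<String, Any?>, val groupRow: Map<String, Any?>)

// Emits one output row per row of each group. An empty group contributes
// no rows here, which is a simplification of this sketch.
fun explodeGroups(grouped: List<GroupedRow>): List<ExplodedRow> =
    grouped.flatMap { g -> g.group.map { row -> ExplodedRow(g.key, row) } }

fun main() {
    val grouped = listOf(
        GroupedRow(
            key = mapOf("city" to "Oslo"),
            group = listOf(
                mapOf("city" to "Oslo", "age" to 30),
                mapOf("city" to "Oslo", "age" to 25),
            ),
        ),
    )
    // "city" appears both as a key and inside the nested row without clashing.
    explodeGroups(grouped).forEach(::println)
}
```

The trade-off is that the result is nested rather than flat, so callers who want a flat frame would still need an explicit rule for clashing names.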
@AndreiKingsley could you please finish the story |
concat removes the key column entirely (name and values).