Commit 631ee67
[SPARK-43055][CONNECT][PYTHON] Support duplicated nested field names
### What changes were proposed in this pull request?
Supports duplicated nested field names in `spark.createDataFrame` and `df.collect`.
### Why are the changes needed?
If there are duplicated nested field names, the following error is raised:
```py
>>> from pyspark.sql.types import *
>>>
>>> data = [Row(Row("a", 1), Row(2, 3, "b", 4, "c")), Row(Row("x", 6), Row(7, 8, "y", 9, "z"))]
>>> schema = (
...     StructType()
...     .add("struct", StructType().add("x", StringType()).add("x", IntegerType()))
...     .add(
...         "struct",
...         StructType()
...         .add("a", IntegerType())
...         .add("x", IntegerType())
...         .add("x", StringType())
...         .add("y", IntegerType())
...         .add("y", StringType()),
...     )
... )
>>> df = spark.createDataFrame(data, schema=schema)
Traceback (most recent call last):
...
pyarrow.lib.ArrowTypeError: Expected bytes, got a 'int' object
```
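As a rough illustration of how duplicated names can lead to a type mismatch like the one above, the sketch below compares name-keyed and positional conversion of a single nested row. It is plain Python and is not the code from this patch; `convert_by_name`, `convert_by_position`, and `type_errors` are hypothetical helpers, and the assumption that the failure comes from name-keyed lookup is only one plausible explanation.

```py
from typing import Any, List, Tuple

# Miniature of the first nested struct in the example above:
# two fields both named "x", one string and one integer.
fields: List[Tuple[str, type]] = [("x", str), ("x", int)]
row: Tuple[Any, ...] = ("a", 1)


def convert_by_name(fields, row):
    """Key values by field name; duplicated names overwrite each other."""
    by_name = {name: value for (name, _), value in zip(fields, row)}
    return [by_name[name] for name, _ in fields]


def convert_by_position(fields, row):
    """Pair each value with its field by index; names are never used as keys."""
    return [value for _, value in zip(fields, row)]


def type_errors(fields, values):
    """Describe every value whose Python type does not match its field."""
    return [
        f"field {i} ({name}): expected {tp.__name__}, got {type(v).__name__}"
        for i, ((name, tp), v) in enumerate(zip(fields, values))
        if not isinstance(v, tp)
    ]


by_name = convert_by_name(fields, row)      # the second "x" wins for both fields
by_pos = convert_by_position(fields, row)   # values stay aligned with fields

print(type_errors(fields, by_name))
print(type_errors(fields, by_pos))
```

With name-keyed lookup the string field receives the integer value, which mirrors the shape of the `Expected bytes, got a 'int' object` error; positional pairing keeps each value with its own field.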
### Does this PR introduce _any_ user-facing change?
Yes. Schemas with duplicated nested field names can now be used with `spark.createDataFrame` and `df.collect`.
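One generic way to make duplicated names safe for name-sensitive intermediate formats is to rename every field positionally before conversion and restore the original names afterwards. The sketch below shows only the renaming step on a schema modelled as nested lists of `(name, type)` pairs. The `positional_names` helper and the `col_<index>` naming scheme are assumptions made for illustration and are not taken from the patch.

```py
from typing import Any, List, Tuple

# A schema is a list of (name, type) pairs; a nested struct is itself such a list.
Schema = List[Tuple[str, Any]]


def positional_names(schema: Schema) -> Schema:
    """Return a copy of `schema` whose field names are unique within each struct.

    Every field is renamed to `col_<index>` (a hypothetical naming scheme),
    and nested structs are renamed recursively.
    """
    renamed = []
    for index, (_, field_type) in enumerate(schema):
        if isinstance(field_type, list):
            field_type = positional_names(field_type)
        renamed.append((f"col_{index}", field_type))
    return renamed


# The schema from the example above, with duplicated names at both levels.
schema: Schema = [
    ("struct", [("x", "string"), ("x", "int")]),
    ("struct", [("a", "int"), ("x", "int"), ("x", "string"), ("y", "int"), ("y", "string")]),
]

internal = positional_names(schema)
print(internal)
```

Because the renamed schema keeps field order and types, the original names can be reattached by position once the data has been converted.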
### How was this patch tested?
Added a test.
Closes apache#40692 from ueshin/issues/SPARK-43055/duplicate_fields.
Authored-by: Takuya UESHIN <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
1 parent a31ac04 commit 631ee67
File tree
5 files changed, +135 -49 lines changed
- connector/connect
  - client/jvm/src/main/scala/org/apache/spark/sql/connect/client
  - server/src/main/scala/org/apache/spark/sql/connect/service
- python/pyspark/sql
  - connect
  - tests
Lines changed: 10 additions & 2 deletions
Lines changed: 34 additions & 2 deletions