[feat](catalog) Support reading Hive table with MultiDelimitSerDe #51936

Merged
merged 5 commits into from
Jul 4, 2025

Conversation

felixwluo
Contributor

@felixwluo felixwluo commented Jun 19, 2025

What problem does this PR solve?

Issue Number: close #51846

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34300 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

------ Round 1 ----------------------------------
q1	17619	5230	5097	5097
q2	1918	287	184	184
q3	10352	1335	748	748
q4	10226	1001	516	516
q5	7548	2375	2323	2323
q6	181	163	133	133
q7	904	739	626	626
q8	9321	1250	1117	1117
q9	6830	5129	5128	5128
q10	6948	2370	2007	2007
q11	504	286	271	271
q12	350	351	208	208
q13	17792	3690	3156	3156
q14	249	237	214	214
q15	587	497	485	485
q16	425	439	380	380
q17	603	866	369	369
q18	7568	7269	7218	7218
q19	2007	965	565	565
q20	330	341	221	221
q21	3761	2631	2369	2369
q22	1061	1039	965	965
Total cold run time: 107084 ms
Total hot run time: 34300 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5257	5075	5093	5075
q2	254	316	225	225
q3	2135	2655	2300	2300
q4	1339	1782	1368	1368
q5	4242	4113	4290	4113
q6	212	175	134	134
q7	2151	1932	1754	1754
q8	2603	2507	2527	2507
q9	7109	7085	7099	7085
q10	3079	3271	2804	2804
q11	577	510	485	485
q12	675	767	642	642
q13	3518	3816	3198	3198
q14	289	288	278	278
q15	514	485	491	485
q16	436	487	435	435
q17	1142	1544	1364	1364
q18	7485	7182	6997	6997
q19	764	750	905	750
q20	1945	1968	1794	1794
q21	4770	4330	4364	4330
q22	1094	1064	1005	1005
Total cold run time: 51590 ms
Total hot run time: 49128 ms

@doris-robot

TPC-DS: Total hot run time: 185362 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

query1	985	402	382	382
query2	6548	1840	1794	1794
query3	6745	221	221	221
query4	26773	23297	23422	23297
query5	4374	628	459	459
query6	308	208	200	200
query7	4635	515	293	293
query8	279	235	211	211
query9	8638	2619	2609	2609
query10	447	359	290	290
query11	15845	15019	14830	14830
query12	173	104	107	104
query13	1660	521	427	427
query14	9582	6198	6116	6116
query15	208	189	173	173
query16	7376	665	484	484
query17	1198	727	585	585
query18	1996	439	299	299
query19	185	183	152	152
query20	128	122	117	117
query21	212	127	109	109
query22	4274	4119	3958	3958
query23	34005	33044	33074	33044
query24	8456	2372	2353	2353
query25	526	440	395	395
query26	723	264	156	156
query27	2715	508	343	343
query28	4198	2121	2099	2099
query29	686	553	459	459
query30	287	224	197	197
query31	906	850	802	802
query32	74	66	61	61
query33	563	368	306	306
query34	784	839	543	543
query35	774	808	716	716
query36	934	981	889	889
query37	112	101	75	75
query38	4191	4172	4096	4096
query39	1501	1434	1437	1434
query40	208	118	106	106
query41	64	60	56	56
query42	128	104	111	104
query43	515	499	478	478
query44	1355	824	813	813
query45	180	171	164	164
query46	861	1010	635	635
query47	1743	1790	1700	1700
query48	392	446	319	319
query49	708	489	407	407
query50	648	669	398	398
query51	4099	4124	4110	4110
query52	111	110	103	103
query53	227	256	190	190
query54	567	572	493	493
query55	87	82	81	81
query56	311	302	297	297
query57	1169	1211	1141	1141
query58	271	261	257	257
query59	2581	2634	2576	2576
query60	332	320	294	294
query61	118	134	123	123
query62	811	742	659	659
query63	219	192	189	189
query64	3063	1008	654	654
query65	4272	4221	4248	4221
query66	961	450	307	307
query67	15782	15605	15240	15240
query68	7814	885	519	519
query69	473	306	278	278
query70	1150	1137	1097	1097
query71	481	338	305	305
query72	5305	4653	4603	4603
query73	642	581	357	357
query74	9074	9056	8683	8683
query75	3919	3205	2702	2702
query76	3464	1200	757	757
query77	790	456	296	296
query78	10064	9879	9351	9351
query79	2728	791	588	588
query80	675	507	434	434
query81	503	272	224	224
query82	479	123	103	103
query83	254	246	234	234
query84	253	112	90	90
query85	788	409	312	312
query86	382	303	282	282
query87	4500	4432	4309	4309
query88	3857	2290	2300	2290
query89	382	325	283	283
query90	1893	212	212	212
query91	145	140	111	111
query92	88	64	59	59
query93	2425	947	579	579
query94	682	419	319	319
query95	379	299	287	287
query96	497	585	279	279
query97	2746	2744	2707	2707
query98	239	213	204	204
query99	1693	1382	1315	1315
Total cold run time: 274551 ms
Total hot run time: 185362 ms

@doris-robot

ClickBench: Total hot run time: 29.37 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

query1	0.04	0.04	0.02
query2	0.08	0.03	0.03
query3	0.24	0.06	0.07
query4	1.61	0.11	0.10
query5	0.42	0.41	0.40
query6	1.19	0.68	0.66
query7	0.02	0.02	0.02
query8	0.05	0.03	0.04
query9	0.58	0.51	0.52
query10	0.56	0.58	0.57
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.63	0.62	0.60
query14	0.80	0.82	0.82
query15	0.88	0.86	0.89
query16	0.39	0.39	0.40
query17	1.06	1.07	1.06
query18	0.22	0.21	0.21
query19	2.02	1.85	1.84
query20	0.01	0.01	0.01
query21	15.42	0.91	0.55
query22	0.75	1.14	0.92
query23	14.69	1.40	0.65
query24	7.12	1.18	0.32
query25	0.28	0.22	0.08
query26	0.63	0.18	0.15
query27	0.07	0.05	0.05
query28	9.20	0.93	0.46
query29	12.55	3.97	3.34
query30	0.25	0.09	0.06
query31	2.83	0.59	0.39
query32	3.23	0.55	0.47
query33	3.16	3.13	3.10
query34	16.11	5.42	4.76
query35	4.83	4.84	4.86
query36	0.67	0.52	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.4 s
Total hot run time: 29.37 s

@@ -88,6 +88,13 @@ public static String getFieldDelimiter(Table table) {
DEFAULT_FIELD_DELIMITER, fieldDelim, serFormat));
}

public static String getMultiDelimitFieldDelimiter(Table table) {
Contributor

Is this method the same as getFieldDelimiter()?

CREATE DATABASE IF NOT EXISTS regression;
USE regression;

CREATE TABLE `multi_delimit_test`(
Contributor

How about adding array and map types to test 'mapkey.delim' and 'collection.delim' too?

}
process_value_func(data, value_start, size - value_start, _trimming_char, splitted_values);
} else {
size_t start = 0;
Contributor

Need to add unit test for this algorithm
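
A table-driven test is one way to cover the edge cases of the multi-character split. The sketch below is only an illustration: `split_on`, `SplitCase` and `run_split_cases` are hypothetical names, not the Doris BE implementation, and the expected values encode the leftmost-match behavior of this sketch rather than verified Hive behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split `line` on every occurrence of `sep`,
// scanning left to right. An empty `sep` yields the whole line as one field.
std::vector<std::string> split_on(const std::string& line, const std::string& sep) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    std::size_t pos = 0;
    while (!sep.empty() && (pos = line.find(sep, start)) != std::string::npos) {
        fields.push_back(line.substr(start, pos - start));
        start = pos + sep.size();
    }
    fields.push_back(line.substr(start)); // the last field, possibly empty
    return fields;
}

struct SplitCase {
    std::string line;
    std::string sep;
    std::vector<std::string> expected;
};

// Runs every case and returns the number of failures.
int run_split_cases() {
    const std::vector<SplitCase> cases = {
            {"a|+|b|+|c", "|+|", {"a", "b", "c"}}, // plain multi-char delimiter
            {"", "|+|", {""}},                     // empty line: one empty field
            {"a|+|", "|+|", {"a", ""}},            // trailing delimiter
            {"|+|a", "|+|", {"", "a"}},            // leading delimiter
            {"a|+||+|b", "|+|", {"a", "", "b"}},   // consecutive delimiters
            {"a|+b", "|+|", {"a|+b"}},             // partial delimiter is data
            {"a|||b", "||", {"a", "|b"}},          // overlapping chars: leftmost match
    };
    int failures = 0;
    for (const SplitCase& c : cases) {
        if (split_on(c.line, c.sep) != c.expected) {
            ++failures;
        }
    }
    return failures;
}
```

Calling `run_split_cases()` from a test body and asserting that it returns 0 keeps each edge case on one line, so new delimiters can be added without new test functions.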


try {
// Test 1: MultiDelimitSerDe with |+| delimiter
hive_docker """
Contributor

I see you already added the HQL in docker/thirdparties/docker-compose/hive/scripts/data/regression/multi_delimit_serde/create_table.hql. Why does the table need to be created here again?


hive_docker """INSERT OVERWRITE TABLE multi_delimit_test3 VALUES ('field1', 'field2', 'field3'), ('a', 'b', 'c')"""

qt_03 """SELECT * FROM multi_delimit_test3 ORDER BY col1"""
Contributor

Please also test the insert command (Use Doris to write to Hive) for each table type

Comment on lines 475 to 476
} else if (serDeLib.equals(HiveMetaStoreClientHelper.HIVE_MULTI_DELIMIT_SERDE)) {
    TFileTextScanRangeParams textParams = new TFileTextScanRangeParams();
    // set properties of MultiDelimitSerDe
    // 1. set column separator (support multi-character delimiters)
    textParams.setColumnSeparator(HiveProperties.getMultiDelimitFieldDelimiter(table));
    // 2. set line delimiter
    textParams.setLineDelimiter(HiveProperties.getLineDelimiter(table));
    // 3. set mapkv delimiter
    textParams.setMapkvDelimiter(HiveProperties.getMapKvDelimiter(table));
    // 4. set collection delimiter
    textParams.setCollectionDelimiter(HiveProperties.getCollectionDelimiter(table));
    // 5. set escape delimiter
    HiveProperties.getEscapeDelimiter(table).ifPresent(d -> textParams.setEscape(d.getBytes()[0]));
    // 6. set null format
    textParams.setNullFormat(HiveProperties.getNullFormat(table));
    fileAttributes.setTextParams(textParams);
    fileAttributes.setHeaderType("");
    fileAttributes.setEnableTextValidateUtf8(
            sessionVariable.enableTextValidateUtf8);
Contributor

This code is similar to the org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe branch and should be merged with it.

// hive will escape the field separator in string
if (_escape_char != 0 && i > 0 && data[i - 1] == _escape_char) {
continue;
if (_value_sep_len == 1) {
Contributor

It would be better to split the branch code into two functions; see PlainCsvTextFieldSplitter for reference:

void PlainCsvTextFieldSplitter::do_split(const Slice& line, std::vector<Slice>* splitted_values) {
    if (is_single_char_delim) {
        _split_field_single_char(line, splitted_values);
    } else {
        _split_field_multi_char(line, splitted_values);
    }
}
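
Applied to the Hive text splitter, the same shape could look roughly like the sketch below. It is an assumption-laden illustration, not the code in this PR: `FieldSplitterSketch` and its members are hypothetical, it works on `std::string` instead of `Slice`, and the escape check mirrors the simple "previous byte is the escape char" test from the diff, so an escaped escape character is not handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; class and method names are illustrative, not Doris code.
class FieldSplitterSketch {
public:
    FieldSplitterSketch(std::string sep, char escape_char)
            : _sep(std::move(sep)), _escape_char(escape_char) {}

    // Dispatch once on the delimiter length, then delegate.
    std::vector<std::string> split(const std::string& line) const {
        std::vector<std::string> fields;
        if (_sep.empty()) {
            fields.push_back(line); // no delimiter configured: one field
        } else if (_sep.size() == 1) {
            _split_single_char(line, &fields);
        } else {
            _split_multi_char(line, &fields);
        }
        return fields;
    }

private:
    // A delimiter is skipped when the byte right before it is the escape char.
    // An escape char of 0 disables escaping.
    bool _is_escaped(const std::string& line, std::size_t pos) const {
        return _escape_char != 0 && pos > 0 && line[pos - 1] == _escape_char;
    }

    void _split_single_char(const std::string& line, std::vector<std::string>* fields) const {
        std::size_t start = 0;
        for (std::size_t i = 0; i < line.size(); ++i) {
            if (line[i] == _sep[0] && !_is_escaped(line, i)) {
                fields->push_back(line.substr(start, i - start));
                start = i + 1;
            }
        }
        fields->push_back(line.substr(start));
    }

    void _split_multi_char(const std::string& line, std::vector<std::string>* fields) const {
        std::size_t start = 0;
        std::size_t pos = 0;
        while ((pos = line.find(_sep, pos)) != std::string::npos) {
            if (_is_escaped(line, pos)) {
                ++pos; // escaped delimiter stays inside the current field
                continue;
            }
            fields->push_back(line.substr(start, pos - start));
            pos += _sep.size();
            start = pos;
        }
        fields->push_back(line.substr(start));
    }

    std::string _sep;
    char _escape_char;
};
```

For example, `FieldSplitterSketch("|+|", '\0').split("a|+|b|+|c")` yields the three fields `a`, `b` and `c`. The escape character itself is left in the field value here; unescaping would be a separate step. Keeping the two paths in separate functions lets the single-character case stay a plain byte scan while the multi-character case uses a substring search.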

@felixwluo felixwluo force-pushed the feat-hive-catalog branch from 5d3c224 to 1e01919 Compare June 23, 2025 08:59
@felixwluo
Contributor Author

run buildall

@felixwluo felixwluo force-pushed the feat-hive-catalog branch from 7d03769 to 1fe5222 Compare June 23, 2025 09:19
@doris-robot

TPC-H: Total hot run time: 33690 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7d037698b232388e99690cdaacbe7a83608d5ecb, data reload: false

------ Round 1 ----------------------------------
q1	17577	5199	5052	5052
q2	1933	289	194	194
q3	10359	1273	711	711
q4	10224	1020	532	532
q5	7533	2351	2269	2269
q6	190	168	142	142
q7	911	746	612	612
q8	9352	1252	1045	1045
q9	6675	5068	5097	5068
q10	6954	2388	1955	1955
q11	482	288	268	268
q12	358	349	221	221
q13	17785	3628	3052	3052
q14	234	237	224	224
q15	555	491	495	491
q16	426	421	368	368
q17	591	848	358	358
q18	7455	7065	7199	7065
q19	2050	1084	528	528
q20	332	334	217	217
q21	3559	3249	2350	2350
q22	1025	1017	968	968
Total cold run time: 106560 ms
Total hot run time: 33690 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5260	5057	5100	5057
q2	245	319	228	228
q3	2179	2594	2297	2297
q4	1324	1815	1373	1373
q5	4188	4111	4351	4111
q6	207	173	128	128
q7	2000	1902	1755	1755
q8	2593	2537	2409	2409
q9	7194	7075	7081	7075
q10	3049	3273	2845	2845
q11	572	504	492	492
q12	663	764	611	611
q13	3504	3915	3250	3250
q14	283	290	285	285
q15	529	482	474	474
q16	429	490	437	437
q17	1118	1526	1393	1393
q18	7487	7159	6942	6942
q19	784	801	926	801
q20	1890	1956	1841	1841
q21	4631	4263	4271	4263
q22	1097	995	973	973
Total cold run time: 51226 ms
Total hot run time: 49040 ms

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34109 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

------ Round 1 ----------------------------------
q1	17451	5230	5050	5050
q2	1955	293	189	189
q3	7504	1332	748	748
q4	9775	994	540	540
q5	8635	2398	2371	2371
q6	200	165	137	137
q7	923	759	601	601
q8	9206	1294	1128	1128
q9	6807	5067	5077	5067
q10	6955	2362	1967	1967
q11	477	291	278	278
q12	345	349	221	221
q13	4584	3696	3117	3117
q14	241	243	227	227
q15	564	479	482	479
q16	437	426	376	376
q17	594	867	362	362
q18	7444	7134	7036	7036
q19	1188	945	557	557
q20	353	360	227	227
q21	3904	3296	2463	2463
q22	1062	1025	968	968
Total cold run time: 90604 ms
Total hot run time: 34109 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5120	5147	5078	5078
q2	263	320	222	222
q3	2173	2631	2295	2295
q4	1408	1791	1352	1352
q5	4222	4137	4131	4131
q6	203	164	125	125
q7	1922	1835	1711	1711
q8	2484	2484	2390	2390
q9	6751	6678	6717	6678
q10	2924	3149	2732	2732
q11	582	514	507	507
q12	640	737	567	567
q13	3389	3740	3166	3166
q14	268	281	263	263
q15	517	467	473	467
q16	437	477	421	421
q17	1122	1545	1307	1307
q18	7384	7046	6960	6960
q19	795	945	1072	945
q20	1927	1963	1870	1870
q21	4826	4306	4330	4306
q22	1078	996	984	984
Total cold run time: 50435 ms
Total hot run time: 48477 ms

@doris-robot

TPC-DS: Total hot run time: 186356 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

query1	985	395	399	395
query2	6546	1841	1853	1841
query3	6737	233	218	218
query4	26215	23367	23308	23308
query5	4885	648	481	481
query6	324	212	217	212
query7	4626	509	292	292
query8	279	243	222	222
query9	8662	2649	2667	2649
query10	500	327	270	270
query11	15950	15080	14819	14819
query12	181	119	111	111
query13	1668	545	438	438
query14	10069	6408	6375	6375
query15	206	199	188	188
query16	7647	619	473	473
query17	1295	737	602	602
query18	2043	428	317	317
query19	224	199	171	171
query20	125	116	117	116
query21	216	128	107	107
query22	4014	4066	4055	4055
query23	34049	33067	33297	33067
query24	8193	2381	2371	2371
query25	527	473	398	398
query26	1235	269	156	156
query27	2686	508	352	352
query28	4313	2133	2107	2107
query29	735	564	448	448
query30	295	217	193	193
query31	954	851	729	729
query32	75	71	64	64
query33	567	386	313	313
query34	821	848	550	550
query35	824	826	718	718
query36	968	974	896	896
query37	113	104	76	76
query38	4177	4098	4041	4041
query39	1504	1422	1412	1412
query40	207	125	112	112
query41	62	59	56	56
query42	129	111	110	110
query43	491	506	465	465
query44	1361	826	822	822
query45	187	177	167	167
query46	890	1020	633	633
query47	1710	1781	1716	1716
query48	379	444	303	303
query49	748	492	407	407
query50	647	676	411	411
query51	4119	4250	4129	4129
query52	113	110	103	103
query53	224	261	188	188
query54	579	577	511	511
query55	86	89	90	89
query56	295	324	305	305
query57	1160	1178	1127	1127
query58	276	268	264	264
query59	2678	2749	2552	2552
query60	345	328	318	318
query61	131	130	128	128
query62	818	717	658	658
query63	259	195	192	192
query64	4247	993	669	669
query65	4267	4209	4192	4192
query66	1066	431	340	340
query67	15932	15931	15474	15474
query68	8225	901	539	539
query69	491	312	280	280
query70	1224	1087	1062	1062
query71	494	338	305	305
query72	5516	4642	4608	4608
query73	692	574	360	360
query74	8813	8964	8982	8964
query75	3893	3194	2687	2687
query76	3771	1200	776	776
query77	821	454	306	306
query78	10174	10188	9385	9385
query79	2896	790	572	572
query80	644	531	463	463
query81	480	257	229	229
query82	499	125	100	100
query83	279	259	234	234
query84	298	191	105	105
query85	777	358	314	314
query86	387	325	291	291
query87	4450	4399	4294	4294
query88	3497	2331	2282	2282
query89	375	322	285	285
query90	1939	216	214	214
query91	155	136	113	113
query92	75	62	62	62
query93	1820	962	589	589
query94	663	416	291	291
query95	378	289	286	286
query96	488	573	283	283
query97	2725	2733	2626	2626
query98	244	212	204	204
query99	1441	1386	1255	1255
Total cold run time: 277463 ms
Total hot run time: 186356 ms

Contributor

@suxiaogang223 suxiaogang223 left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

@doris-robot

ClickBench: Total hot run time: 29.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.03	0.04
query3	0.24	0.07	0.07
query4	1.61	0.10	0.11
query5	0.44	0.42	0.41
query6	1.18	0.65	0.66
query7	0.03	0.01	0.02
query8	0.04	0.03	0.04
query9	0.59	0.52	0.52
query10	0.57	0.59	0.57
query11	0.15	0.11	0.10
query12	0.14	0.11	0.11
query13	0.63	0.60	0.60
query14	0.80	0.80	0.82
query15	0.89	0.87	0.88
query16	0.39	0.38	0.40
query17	1.05	1.05	1.06
query18	0.23	0.21	0.22
query19	1.98	1.83	1.78
query20	0.01	0.01	0.01
query21	15.40	0.90	0.54
query22	0.76	1.20	0.64
query23	14.92	1.42	0.67
query24	7.42	1.29	0.53
query25	0.51	0.25	0.05
query26	0.49	0.17	0.14
query27	0.06	0.05	0.06
query28	9.17	0.91	0.45
query29	12.54	4.05	3.42
query30	0.26	0.09	0.06
query31	2.82	0.60	0.39
query32	3.23	0.55	0.48
query33	3.11	3.20	3.05
query34	16.02	5.36	4.73
query35	4.86	4.86	4.81
query36	0.70	0.51	0.50
query37	0.09	0.07	0.07
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.15	0.15
query41	0.08	0.03	0.03
query42	0.03	0.02	0.03
query43	0.05	0.04	0.03
Total cold run time: 103.87 s
Total hot run time: 29.2 s

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e91eff9ba77a616544ebd142146650a6953fba17, data reload: false

------ Round 1 ----------------------------------
q1	17587	5170	5050	5050
q2	1936	298	191	191
q3	10442	1315	727	727
q4	10267	1015	515	515
q5	7846	2234	2343	2234
q6	177	165	136	136
q7	900	729	601	601
q8	9293	1226	1081	1081
q9	6756	5073	5089	5073
q10	6939	2372	1946	1946
q11	479	287	281	281
q12	337	344	217	217
q13	17772	3601	3068	3068
q14	217	232	212	212
q15	573	484	479	479
q16	433	421	368	368
q17	584	876	365	365
q18	7428	7272	7055	7055
q19	2186	963	555	555
q20	340	333	214	214
q21	3912	2529	2337	2337
q22	1063	1016	974	974
Total cold run time: 107467 ms
Total hot run time: 33679 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5220	5128	5205	5128
q2	246	324	216	216
q3	2170	2640	2268	2268
q4	1375	1814	1344	1344
q5	4169	4108	4380	4108
q6	207	168	131	131
q7	2011	1932	1738	1738
q8	2584	2438	2523	2438
q9	7109	7059	7028	7028
q10	3077	3268	2815	2815
q11	585	535	493	493
q12	668	722	631	631
q13	3491	3925	3199	3199
q14	282	322	275	275
q15	507	473	467	467
q16	447	493	437	437
q17	1153	1507	1370	1370
q18	7590	7289	6933	6933
q19	746	739	760	739
q20	1936	1949	1811	1811
q21	4855	4345	4287	4287
q22	1064	1039	999	999
Total cold run time: 51492 ms
Total hot run time: 48855 ms

@doris-robot

TPC-DS: Total hot run time: 185403 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e91eff9ba77a616544ebd142146650a6953fba17, data reload: false

query1	969	374	395	374
query2	6543	1875	1877	1875
query3	6745	221	217	217
query4	25742	23246	23008	23008
query5	4336	604	468	468
query6	313	209	201	201
query7	4620	494	289	289
query8	282	227	213	213
query9	8646	2643	2629	2629
query10	493	320	287	287
query11	15521	15461	14794	14794
query12	175	114	107	107
query13	1673	537	422	422
query14	9713	6162	6292	6162
query15	200	193	171	171
query16	7635	593	467	467
query17	1208	706	579	579
query18	2034	414	310	310
query19	196	196	161	161
query20	121	116	117	116
query21	219	129	108	108
query22	4120	4138	4060	4060
query23	33869	33016	33006	33006
query24	8414	2329	2365	2329
query25	523	460	381	381
query26	1231	264	151	151
query27	2704	513	330	330
query28	4303	2111	2087	2087
query29	721	542	435	435
query30	294	219	190	190
query31	933	833	742	742
query32	75	64	60	60
query33	565	390	306	306
query34	797	862	545	545
query35	779	816	732	732
query36	917	974	885	885
query37	111	101	79	79
query38	4071	4040	4001	4001
query39	1451	1421	1409	1409
query40	206	119	110	110
query41	64	57	60	57
query42	127	109	112	109
query43	495	510	492	492
query44	1297	819	822	819
query45	177	177	169	169
query46	843	1009	624	624
query47	1727	1749	1723	1723
query48	379	417	307	307
query49	723	486	396	396
query50	630	683	383	383
query51	4197	4176	4142	4142
query52	117	110	103	103
query53	223	254	181	181
query54	570	557	494	494
query55	91	91	91	91
query56	298	300	288	288
query57	1188	1172	1109	1109
query58	279	281	256	256
query59	2658	2674	2641	2641
query60	321	328	299	299
query61	122	124	123	123
query62	805	702	685	685
query63	227	186	187	186
query64	4260	998	730	730
query65	4255	4223	4200	4200
query66	1063	413	312	312
query67	15642	15761	15363	15363
query68	9124	931	531	531
query69	463	306	284	284
query70	1210	1123	1047	1047
query71	466	330	295	295
query72	5288	4685	4825	4685
query73	731	635	354	354
query74	8793	8909	8969	8909
query75	4178	3200	2699	2699
query76	3603	1179	748	748
query77	829	382	301	301
query78	9898	10194	9290	9290
query79	1409	785	640	640
query80	619	520	473	473
query81	477	255	223	223
query82	420	128	97	97
query83	289	258	240	240
query84	297	105	83	83
query85	797	358	312	312
query86	333	299	276	276
query87	4430	4423	4298	4298
query88	3016	2295	2281	2281
query89	381	321	282	282
query90	1955	201	202	201
query91	140	138	112	112
query92	83	61	58	58
query93	1105	943	584	584
query94	672	411	307	307
query95	368	291	284	284
query96	493	570	279	279
query97	2757	2727	2690	2690
query98	223	206	199	199
query99	1430	1429	1263	1263
Total cold run time: 272523 ms
Total hot run time: 185403 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@felixwluo felixwluo force-pushed the feat-hive-catalog branch from e91eff9 to f1b29b0 Compare June 23, 2025 16:36
@felixwluo
Contributor Author

run buildall

@felixwluo
Contributor Author

run buildall

@doris-robot

BE UT Coverage Report

Increment line coverage 95.45% (42/44) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.07% (15372/26934)
Line Coverage 46.16% (139504/302207)
Region Coverage 45.49% (70696/155398)
Branch Coverage 40.26% (37354/92788)

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33846 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2bbaf343eb182333925e8c03a5676177ea2e2418, data reload: false

------ Round 1 ----------------------------------
q1	17575	5249	5092	5092
q2	1934	279	182	182
q3	10471	1289	713	713
q4	10307	1038	519	519
q5	8815	2260	2379	2260
q6	205	160	130	130
q7	872	746	583	583
q8	9308	1488	1107	1107
q9	7122	5158	5083	5083
q10	6960	2372	1984	1984
q11	494	294	278	278
q12	342	361	223	223
q13	17805	3715	3115	3115
q14	229	245	217	217
q15	564	485	479	479
q16	420	430	379	379
q17	609	881	358	358
q18	7456	7225	7067	7067
q19	1129	944	553	553
q20	331	336	215	215
q21	3636	2581	2335	2335
q22	1052	1033	974	974
Total cold run time: 107636 ms
Total hot run time: 33846 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5205	5116	5145	5116
q2	250	335	227	227
q3	2197	2675	2301	2301
q4	1346	1796	1301	1301
q5	4221	4488	4486	4486
q6	212	164	123	123
q7	1973	1893	1824	1824
q8	2629	2605	2557	2557
q9	7230	7154	7116	7116
q10	3069	3253	2828	2828
q11	568	513	503	503
q12	719	771	591	591
q13	3480	3897	3232	3232
q14	275	316	275	275
q15	516	471	469	469
q16	452	471	426	426
q17	1172	1544	1381	1381
q18	7712	7709	7387	7387
q19	805	797	938	797
q20	1964	2041	1883	1883
q21	4877	4432	4339	4339
q22	1079	1009	990	990
Total cold run time: 51951 ms
Total hot run time: 50152 ms

@doris-robot

TPC-DS: Total hot run time: 184266 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2bbaf343eb182333925e8c03a5676177ea2e2418, data reload: false

query1	1002	392	377	377
query2	6524	1710	1697	1697
query3	6751	215	212	212
query4	26271	23225	23397	23225
query5	4346	586	426	426
query6	299	222	234	222
query7	4610	479	282	282
query8	263	203	210	203
query9	8616	2590	2613	2590
query10	454	315	260	260
query11	15541	15050	14739	14739
query12	155	108	99	99
query13	1650	524	407	407
query14	8550	5686	5702	5686
query15	217	199	182	182
query16	7176	637	458	458
query17	1212	711	593	593
query18	1995	416	304	304
query19	194	196	170	170
query20	126	121	111	111
query21	211	125	107	107
query22	3986	4340	4029	4029
query23	33843	33032	32826	32826
query24	8517	2321	2351	2321
query25	538	470	396	396
query26	1238	263	146	146
query27	2762	507	333	333
query28	4276	2116	2108	2108
query29	771	584	438	438
query30	282	218	186	186
query31	926	831	767	767
query32	76	63	57	57
query33	549	355	307	307
query34	789	839	535	535
query35	796	864	715	715
query36	953	991	897	897
query37	101	98	75	75
query38	4110	4108	4010	4010
query39	1489	1399	1431	1399
query40	212	112	104	104
query41	56	53	51	51
query42	120	105	102	102
query43	473	509	457	457
query44	1309	831	833	831
query45	176	167	164	164
query46	837	1004	625	625
query47	1802	1840	1719	1719
query48	371	410	309	309
query49	749	493	380	380
query50	642	699	400	400
query51	4105	4198	4085	4085
query52	110	103	97	97
query53	225	248	185	185
query54	571	559	492	492
query55	83	83	79	79
query56	294	318	274	274
query57	1189	1214	1105	1105
query58	260	259	269	259
query59	2571	2650	2482	2482
query60	339	305	316	305
query61	123	122	125	122
query62	818	731	657	657
query63	219	191	193	191
query64	4333	1028	655	655
query65	4352	4185	4182	4182
query66	1137	407	318	318
query67	15898	15463	15251	15251
query68	7768	877	525	525
query69	475	312	261	261
query70	1195	1108	1153	1108
query71	441	303	281	281
query72	5629	4775	4872	4775
query73	664	637	347	347
query74	9049	9141	8924	8924
query75	3676	3186	2652	2652
query76	3482	1135	719	719
query77	763	368	278	278
query78	9987	9964	9301	9301
query79	2000	824	598	598
query80	614	538	471	471
query81	466	265	224	224
query82	430	132	95	95
query83	275	246	241	241
query84	262	116	158	116
query85	799	361	320	320
query86	322	294	301	294
query87	4486	4497	4421	4421
query88	3543	2293	2277	2277
query89	384	315	286	286
query90	1968	200	204	200
query91	139	138	115	115
query92	73	59	54	54
query93	1588	938	593	593
query94	703	405	311	311
query95	370	293	280	280
query96	490	567	281	281
query97	2730	2711	2665	2665
query98	237	201	205	201
query99	1441	1405	1276	1276
Total cold run time: 271988 ms
Total hot run time: 184266 ms

@doris-robot

ClickBench: Total hot run time: 29.65 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2bbaf343eb182333925e8c03a5676177ea2e2418, data reload: false

query1	0.04	0.03	0.04
query2	0.07	0.03	0.04
query3	0.24	0.07	0.07
query4	1.64	0.10	0.10
query5	0.43	0.43	0.41
query6	1.15	0.66	0.66
query7	0.03	0.02	0.02
query8	0.04	0.04	0.03
query9	0.60	0.53	0.50
query10	0.57	0.56	0.56
query11	0.16	0.11	0.11
query12	0.14	0.12	0.12
query13	0.63	0.60	0.62
query14	0.86	0.79	0.80
query15	0.90	0.85	0.86
query16	0.38	0.38	0.39
query17	1.03	1.09	1.05
query18	0.23	0.21	0.21
query19	1.99	1.85	1.83
query20	0.01	0.02	0.02
query21	15.41	0.90	0.54
query22	0.77	1.12	0.65
query23	15.04	1.39	0.62
query24	6.45	1.52	1.17
query25	0.52	0.24	0.08
query26	0.57	0.16	0.15
query27	0.08	0.05	0.04
query28	9.67	0.92	0.43
query29	12.53	3.90	3.26
query30	0.25	0.09	0.06
query31	2.84	0.61	0.41
query32	3.24	0.56	0.48
query33	3.04	3.10	3.06
query34	16.10	5.38	4.73
query35	4.81	4.80	4.82
query36	0.68	0.50	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.03
query40	0.17	0.15	0.13
query41	0.07	0.03	0.02
query42	0.03	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 103.62 s
Total hot run time: 29.65 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@doris-robot

BE UT Coverage Report

Increment line coverage 95.45% (42/44) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.05% (15371/26945)
Line Coverage 46.14% (139491/302336)
Region Coverage 45.46% (70677/155464)
Branch Coverage 40.23% (37345/92830)

@felixwluo
Contributor Author

run cloud_p0

@felixwluo felixwluo requested a review from morningman June 30, 2025 03:07
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jul 4, 2025
Contributor

github-actions bot commented Jul 4, 2025

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit b465326 into apache:master Jul 4, 2025
24 of 26 checks passed
morningman pushed a commit to morningman/doris that referenced this pull request Jul 4, 2025
github-actions bot pushed a commit that referenced this pull request Jul 4, 2025
seawinde pushed a commit to seawinde/doris that referenced this pull request Jul 4, 2025
morrySnow pushed a commit that referenced this pull request Jul 5, 2025
…mitSerDe #51936 (#52772)

bp #51936

Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
etah000 pushed a commit to etah000/doris that referenced this pull request Jul 7, 2025
…mitSerDe apache#51936 (apache#52772)

bp apache#51936

Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
morrySnow pushed a commit that referenced this pull request Jul 9, 2025
…mitSerDe #51936 (#52773)

Cherry-picked from #51936

Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[Feature] Support reading Hive table with MultiDelimitSerDe
6 participants