[feat](catalog) Support reading Hive table with MultiDelimitSerDe #51936

Open: wants to merge 4 commits into master

Conversation

felixwluo
Contributor

What problem does this PR solve?

Issue Number: close #51846

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34300 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

------ Round 1 ----------------------------------
q1	17619	5230	5097	5097
q2	1918	287	184	184
q3	10352	1335	748	748
q4	10226	1001	516	516
q5	7548	2375	2323	2323
q6	181	163	133	133
q7	904	739	626	626
q8	9321	1250	1117	1117
q9	6830	5129	5128	5128
q10	6948	2370	2007	2007
q11	504	286	271	271
q12	350	351	208	208
q13	17792	3690	3156	3156
q14	249	237	214	214
q15	587	497	485	485
q16	425	439	380	380
q17	603	866	369	369
q18	7568	7269	7218	7218
q19	2007	965	565	565
q20	330	341	221	221
q21	3761	2631	2369	2369
q22	1061	1039	965	965
Total cold run time: 107084 ms
Total hot run time: 34300 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5257	5075	5093	5075
q2	254	316	225	225
q3	2135	2655	2300	2300
q4	1339	1782	1368	1368
q5	4242	4113	4290	4113
q6	212	175	134	134
q7	2151	1932	1754	1754
q8	2603	2507	2527	2507
q9	7109	7085	7099	7085
q10	3079	3271	2804	2804
q11	577	510	485	485
q12	675	767	642	642
q13	3518	3816	3198	3198
q14	289	288	278	278
q15	514	485	491	485
q16	436	487	435	435
q17	1142	1544	1364	1364
q18	7485	7182	6997	6997
q19	764	750	905	750
q20	1945	1968	1794	1794
q21	4770	4330	4364	4330
q22	1094	1064	1005	1005
Total cold run time: 51590 ms
Total hot run time: 49128 ms

@doris-robot

TPC-DS: Total hot run time: 185362 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

query1	985	402	382	382
query2	6548	1840	1794	1794
query3	6745	221	221	221
query4	26773	23297	23422	23297
query5	4374	628	459	459
query6	308	208	200	200
query7	4635	515	293	293
query8	279	235	211	211
query9	8638	2619	2609	2609
query10	447	359	290	290
query11	15845	15019	14830	14830
query12	173	104	107	104
query13	1660	521	427	427
query14	9582	6198	6116	6116
query15	208	189	173	173
query16	7376	665	484	484
query17	1198	727	585	585
query18	1996	439	299	299
query19	185	183	152	152
query20	128	122	117	117
query21	212	127	109	109
query22	4274	4119	3958	3958
query23	34005	33044	33074	33044
query24	8456	2372	2353	2353
query25	526	440	395	395
query26	723	264	156	156
query27	2715	508	343	343
query28	4198	2121	2099	2099
query29	686	553	459	459
query30	287	224	197	197
query31	906	850	802	802
query32	74	66	61	61
query33	563	368	306	306
query34	784	839	543	543
query35	774	808	716	716
query36	934	981	889	889
query37	112	101	75	75
query38	4191	4172	4096	4096
query39	1501	1434	1437	1434
query40	208	118	106	106
query41	64	60	56	56
query42	128	104	111	104
query43	515	499	478	478
query44	1355	824	813	813
query45	180	171	164	164
query46	861	1010	635	635
query47	1743	1790	1700	1700
query48	392	446	319	319
query49	708	489	407	407
query50	648	669	398	398
query51	4099	4124	4110	4110
query52	111	110	103	103
query53	227	256	190	190
query54	567	572	493	493
query55	87	82	81	81
query56	311	302	297	297
query57	1169	1211	1141	1141
query58	271	261	257	257
query59	2581	2634	2576	2576
query60	332	320	294	294
query61	118	134	123	123
query62	811	742	659	659
query63	219	192	189	189
query64	3063	1008	654	654
query65	4272	4221	4248	4221
query66	961	450	307	307
query67	15782	15605	15240	15240
query68	7814	885	519	519
query69	473	306	278	278
query70	1150	1137	1097	1097
query71	481	338	305	305
query72	5305	4653	4603	4603
query73	642	581	357	357
query74	9074	9056	8683	8683
query75	3919	3205	2702	2702
query76	3464	1200	757	757
query77	790	456	296	296
query78	10064	9879	9351	9351
query79	2728	791	588	588
query80	675	507	434	434
query81	503	272	224	224
query82	479	123	103	103
query83	254	246	234	234
query84	253	112	90	90
query85	788	409	312	312
query86	382	303	282	282
query87	4500	4432	4309	4309
query88	3857	2290	2300	2290
query89	382	325	283	283
query90	1893	212	212	212
query91	145	140	111	111
query92	88	64	59	59
query93	2425	947	579	579
query94	682	419	319	319
query95	379	299	287	287
query96	497	585	279	279
query97	2746	2744	2707	2707
query98	239	213	204	204
query99	1693	1382	1315	1315
Total cold run time: 274551 ms
Total hot run time: 185362 ms

@doris-robot

ClickBench: Total hot run time: 29.37 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5d3c2242bb02b76bc07d666d2bbf98cf1e552855, data reload: false

query1	0.04	0.04	0.02
query2	0.08	0.03	0.03
query3	0.24	0.06	0.07
query4	1.61	0.11	0.10
query5	0.42	0.41	0.40
query6	1.19	0.68	0.66
query7	0.02	0.02	0.02
query8	0.05	0.03	0.04
query9	0.58	0.51	0.52
query10	0.56	0.58	0.57
query11	0.15	0.11	0.11
query12	0.15	0.11	0.12
query13	0.63	0.62	0.60
query14	0.80	0.82	0.82
query15	0.88	0.86	0.89
query16	0.39	0.39	0.40
query17	1.06	1.07	1.06
query18	0.22	0.21	0.21
query19	2.02	1.85	1.84
query20	0.01	0.01	0.01
query21	15.42	0.91	0.55
query22	0.75	1.14	0.92
query23	14.69	1.40	0.65
query24	7.12	1.18	0.32
query25	0.28	0.22	0.08
query26	0.63	0.18	0.15
query27	0.07	0.05	0.05
query28	9.20	0.93	0.46
query29	12.55	3.97	3.34
query30	0.25	0.09	0.06
query31	2.83	0.59	0.39
query32	3.23	0.55	0.47
query33	3.16	3.13	3.10
query34	16.11	5.42	4.76
query35	4.83	4.84	4.86
query36	0.67	0.52	0.48
query37	0.09	0.07	0.06
query38	0.05	0.04	0.04
query39	0.03	0.03	0.03
query40	0.18	0.14	0.14
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 103.4 s
Total hot run time: 29.37 s

@@ -88,6 +88,13 @@ public static String getFieldDelimiter(Table table) {
DEFAULT_FIELD_DELIMITER, fieldDelim, serFormat));
}

public static String getMultiDelimitFieldDelimiter(Table table) {
Contributor

Is this method the same as getFieldDelimiter()?

CREATE DATABASE IF NOT EXISTS regression;
USE regression;

CREATE TABLE `multi_delimit_test`(
Contributor

How about adding array and map types to test 'mapkey.delim' and 'collection.delim' too?
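
To make the suggestion concrete, here is a minimal C++ sketch of how the two properties partition a map-typed field: 'collection.delim' separates entries and 'mapkey.delim' separates each key from its value. The helper names `split_on` and `parse_map_field`, and the use of ',' and ':' as delimiters, are assumptions for illustration only; this is not the Doris or Hive implementation.

```cpp
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Illustrative helper (not a Doris API): split `text` on a single-character delimiter.
std::vector<std::string_view> split_on(std::string_view text, char delim) {
    std::vector<std::string_view> parts;
    size_t start = 0;
    for (size_t i = 0; i < text.size(); ++i) {
        if (text[i] == delim) {
            parts.push_back(text.substr(start, i - start));
            start = i + 1;
        }
    }
    parts.push_back(text.substr(start));
    return parts;
}

// Illustrative helper (not a Doris API): parse one map-typed field.
// Entries are separated by `collection_delim`; inside an entry the key and
// value are separated by the first occurrence of `mapkey_delim`.
std::map<std::string, std::string> parse_map_field(std::string_view field,
                                                   char collection_delim,
                                                   char mapkey_delim) {
    std::map<std::string, std::string> result;
    if (field.empty()) {
        return result;
    }
    for (std::string_view entry : split_on(field, collection_delim)) {
        size_t kv = entry.find(mapkey_delim);
        if (kv == std::string_view::npos) {
            // No key/value separator: keep the key with an empty value.
            result.emplace(std::string(entry), std::string());
        } else {
            result.emplace(std::string(entry.substr(0, kv)),
                           std::string(entry.substr(kv + 1)));
        }
    }
    return result;
}
```

An array column would use only the first level: `split_on("a,b,c", ',')` yields three elements, which is the case a test table with an array column would exercise.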

}
process_value_func(data, value_start, size - value_start, _trimming_char, splitted_values);
} else {
size_t start = 0;
Contributor

A unit test needs to be added for this algorithm.
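
As a rough idea of what such a test could cover, the sketch below pairs a stand-alone multi-character splitter with a handful of assertions. `split_fields` and `check_split_fields` are hypothetical names written for this example, not the functions in the PR. The escape rule is a simplification modeled on the `data[i - 1] == _escape_char` check visible in the diff, and the sketch does not unescape values.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the splitter under review; not the Doris code.
// Splits `line` on the (possibly multi-character) separator `sep`. When
// `escape` is non-zero, a separator immediately preceded by `escape` is
// treated as data. Escaped escape characters are not handled here.
std::vector<std::string> split_fields(std::string_view line, std::string_view sep,
                                      char escape = 0) {
    std::vector<std::string> out;
    if (sep.empty()) {
        out.emplace_back(line);
        return out;
    }
    size_t start = 0;
    size_t pos = 0;
    while (pos + sep.size() <= line.size()) {
        bool match = line.compare(pos, sep.size(), sep) == 0;
        bool escaped = escape != 0 && pos > 0 && line[pos - 1] == escape;
        if (match && !escaped) {
            out.emplace_back(line.substr(start, pos - start));
            pos += sep.size();
            start = pos;
        } else {
            ++pos;
        }
    }
    out.emplace_back(line.substr(start));
    return out;
}

// Cases a unit test for the algorithm might include.
void check_split_fields() {
    using V = std::vector<std::string>;
    // Plain multi-character delimiter.
    assert((split_fields("a|+|b|+|c", "|+|") == V{"a", "b", "c"}));
    // Adjacent delimiters produce an empty field.
    assert((split_fields("a|+||+|c", "|+|") == V{"a", "", "c"}));
    // Trailing delimiter produces a trailing empty field.
    assert((split_fields("a|+|", "|+|") == V{"a", ""}));
    // No delimiter at all: the whole line is one field.
    assert((split_fields("abc", "|+|") == V{"abc"}));
    // Escaped delimiter stays inside the field (escape char is kept).
    assert((split_fields("a\\|+|b|+|c", "|+|", '\\') == V{"a\\|+|b", "c"}));
}
```

Edge cases worth adding on top of these include delimiters that overlap themselves (for example `||` inside `||||`) and lines shorter than the delimiter.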


try {
// Test 1: MultiDelimitSerDe with |+| delimiter
hive_docker """
Contributor

I see you already added the HQL in docker/thirdparties/docker-compose/hive/scripts/data/regression/multi_delimit_serde/create_table.hql. Why does the table need to be created here again?


hive_docker """INSERT OVERWRITE TABLE multi_delimit_test3 VALUES ('field1', 'field2', 'field3'), ('a', 'b', 'c')"""

qt_03 """SELECT * FROM multi_delimit_test3 ORDER BY col1"""
Contributor

Please also test the insert command (using Doris to write to Hive) for each table type.

Comment on lines 475 to 476
} else if (serDeLib.equals(HiveMetaStoreClientHelper.HIVE_MULTI_DELIMIT_SERDE)) {
    TFileTextScanRangeParams textParams = new TFileTextScanRangeParams();
    // set properties of MultiDelimitSerDe
    // 1. set column separator (support multi-character delimiters)
    textParams.setColumnSeparator(HiveProperties.getMultiDelimitFieldDelimiter(table));
    // 2. set line delimiter
    textParams.setLineDelimiter(HiveProperties.getLineDelimiter(table));
    // 3. set mapkv delimiter
    textParams.setMapkvDelimiter(HiveProperties.getMapKvDelimiter(table));
    // 4. set collection delimiter
    textParams.setCollectionDelimiter(HiveProperties.getCollectionDelimiter(table));
    // 5. set escape delimiter
    HiveProperties.getEscapeDelimiter(table).ifPresent(d -> textParams.setEscape(d.getBytes()[0]));
    // 6. set null format
    textParams.setNullFormat(HiveProperties.getNullFormat(table));
    fileAttributes.setTextParams(textParams);
    fileAttributes.setHeaderType("");
    fileAttributes.setEnableTextValidateUtf8(
            sessionVariable.enableTextValidateUtf8);
Contributor

This code is similar to the handling of org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe; the two should be merged.

// hive will escape the field separator in string
if (_escape_char != 0 && i > 0 && data[i - 1] == _escape_char) {
continue;
if (_value_sep_len == 1) {
Contributor

It is better to abstract the branch code into two functions; refer to PlainCsvTextFieldSplitter:

void PlainCsvTextFieldSplitter::do_split(const Slice& line, std::vector<Slice>* splitted_values) {
    if (is_single_char_delim) {
        _split_field_single_char(line, splitted_values);
    } else {
        _split_field_multi_char(line, splitted_values);
    }
}
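
A self-contained sketch of the same shape is shown here, with both strategies filled in. The class `SketchFieldSplitter` and its members are invented for this example and only mirror the structure of the snippet above; escape and trimming handling are omitted.

```cpp
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical splitter illustrating the suggested structure. The names are
// assumptions for this sketch and are not Doris APIs.
class SketchFieldSplitter {
public:
    explicit SketchFieldSplitter(std::string sep) : _sep(std::move(sep)) {}

    // Decide once per line which strategy applies, then delegate.
    void do_split(std::string_view line, std::vector<std::string_view>* out) const {
        if (_sep.size() == 1) {
            _split_single_char(line, out);
        } else {
            _split_multi_char(line, out);
        }
    }

private:
    // Single-character path: one byte comparison per input character.
    void _split_single_char(std::string_view line,
                            std::vector<std::string_view>* out) const {
        const char sep = _sep[0];
        size_t start = 0;
        for (size_t i = 0; i < line.size(); ++i) {
            if (line[i] == sep) {
                out->push_back(line.substr(start, i - start));
                start = i + 1;
            }
        }
        out->push_back(line.substr(start));
    }

    // Multi-character path: substring search for the whole separator.
    void _split_multi_char(std::string_view line,
                           std::vector<std::string_view>* out) const {
        const std::string_view sep(_sep);
        size_t start = 0;
        if (!sep.empty()) {
            size_t hit = line.find(sep, start);
            while (hit != std::string_view::npos) {
                out->push_back(line.substr(start, hit - start));
                start = hit + sep.size();
                hit = line.find(sep, start);
            }
        }
        out->push_back(line.substr(start));
    }

    std::string _sep;
};
```

Keeping the two paths in separate functions lets the single-character case stay a tight loop while the multi-character case can later switch to a different search strategy without touching the dispatch.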

@felixwluo force-pushed the feat-hive-catalog branch from 5d3c224 to 1e01919 on June 23, 2025 08:59
@felixwluo
Contributor Author

run buildall

@felixwluo force-pushed the feat-hive-catalog branch from 7d03769 to 1fe5222 on June 23, 2025 09:19
@doris-robot

TPC-H: Total hot run time: 33690 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7d037698b232388e99690cdaacbe7a83608d5ecb, data reload: false

------ Round 1 ----------------------------------
q1	17577	5199	5052	5052
q2	1933	289	194	194
q3	10359	1273	711	711
q4	10224	1020	532	532
q5	7533	2351	2269	2269
q6	190	168	142	142
q7	911	746	612	612
q8	9352	1252	1045	1045
q9	6675	5068	5097	5068
q10	6954	2388	1955	1955
q11	482	288	268	268
q12	358	349	221	221
q13	17785	3628	3052	3052
q14	234	237	224	224
q15	555	491	495	491
q16	426	421	368	368
q17	591	848	358	358
q18	7455	7065	7199	7065
q19	2050	1084	528	528
q20	332	334	217	217
q21	3559	3249	2350	2350
q22	1025	1017	968	968
Total cold run time: 106560 ms
Total hot run time: 33690 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5260	5057	5100	5057
q2	245	319	228	228
q3	2179	2594	2297	2297
q4	1324	1815	1373	1373
q5	4188	4111	4351	4111
q6	207	173	128	128
q7	2000	1902	1755	1755
q8	2593	2537	2409	2409
q9	7194	7075	7081	7075
q10	3049	3273	2845	2845
q11	572	504	492	492
q12	663	764	611	611
q13	3504	3915	3250	3250
q14	283	290	285	285
q15	529	482	474	474
q16	429	490	437	437
q17	1118	1526	1393	1393
q18	7487	7159	6942	6942
q19	784	801	926	801
q20	1890	1956	1841	1841
q21	4631	4263	4271	4263
q22	1097	995	973	973
Total cold run time: 51226 ms
Total hot run time: 49040 ms

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34109 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

------ Round 1 ----------------------------------
q1	17451	5230	5050	5050
q2	1955	293	189	189
q3	7504	1332	748	748
q4	9775	994	540	540
q5	8635	2398	2371	2371
q6	200	165	137	137
q7	923	759	601	601
q8	9206	1294	1128	1128
q9	6807	5067	5077	5067
q10	6955	2362	1967	1967
q11	477	291	278	278
q12	345	349	221	221
q13	4584	3696	3117	3117
q14	241	243	227	227
q15	564	479	482	479
q16	437	426	376	376
q17	594	867	362	362
q18	7444	7134	7036	7036
q19	1188	945	557	557
q20	353	360	227	227
q21	3904	3296	2463	2463
q22	1062	1025	968	968
Total cold run time: 90604 ms
Total hot run time: 34109 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5120	5147	5078	5078
q2	263	320	222	222
q3	2173	2631	2295	2295
q4	1408	1791	1352	1352
q5	4222	4137	4131	4131
q6	203	164	125	125
q7	1922	1835	1711	1711
q8	2484	2484	2390	2390
q9	6751	6678	6717	6678
q10	2924	3149	2732	2732
q11	582	514	507	507
q12	640	737	567	567
q13	3389	3740	3166	3166
q14	268	281	263	263
q15	517	467	473	467
q16	437	477	421	421
q17	1122	1545	1307	1307
q18	7384	7046	6960	6960
q19	795	945	1072	945
q20	1927	1963	1870	1870
q21	4826	4306	4330	4306
q22	1078	996	984	984
Total cold run time: 50435 ms
Total hot run time: 48477 ms

@doris-robot

TPC-DS: Total hot run time: 186356 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

query1	985	395	399	395
query2	6546	1841	1853	1841
query3	6737	233	218	218
query4	26215	23367	23308	23308
query5	4885	648	481	481
query6	324	212	217	212
query7	4626	509	292	292
query8	279	243	222	222
query9	8662	2649	2667	2649
query10	500	327	270	270
query11	15950	15080	14819	14819
query12	181	119	111	111
query13	1668	545	438	438
query14	10069	6408	6375	6375
query15	206	199	188	188
query16	7647	619	473	473
query17	1295	737	602	602
query18	2043	428	317	317
query19	224	199	171	171
query20	125	116	117	116
query21	216	128	107	107
query22	4014	4066	4055	4055
query23	34049	33067	33297	33067
query24	8193	2381	2371	2371
query25	527	473	398	398
query26	1235	269	156	156
query27	2686	508	352	352
query28	4313	2133	2107	2107
query29	735	564	448	448
query30	295	217	193	193
query31	954	851	729	729
query32	75	71	64	64
query33	567	386	313	313
query34	821	848	550	550
query35	824	826	718	718
query36	968	974	896	896
query37	113	104	76	76
query38	4177	4098	4041	4041
query39	1504	1422	1412	1412
query40	207	125	112	112
query41	62	59	56	56
query42	129	111	110	110
query43	491	506	465	465
query44	1361	826	822	822
query45	187	177	167	167
query46	890	1020	633	633
query47	1710	1781	1716	1716
query48	379	444	303	303
query49	748	492	407	407
query50	647	676	411	411
query51	4119	4250	4129	4129
query52	113	110	103	103
query53	224	261	188	188
query54	579	577	511	511
query55	86	89	90	89
query56	295	324	305	305
query57	1160	1178	1127	1127
query58	276	268	264	264
query59	2678	2749	2552	2552
query60	345	328	318	318
query61	131	130	128	128
query62	818	717	658	658
query63	259	195	192	192
query64	4247	993	669	669
query65	4267	4209	4192	4192
query66	1066	431	340	340
query67	15932	15931	15474	15474
query68	8225	901	539	539
query69	491	312	280	280
query70	1224	1087	1062	1062
query71	494	338	305	305
query72	5516	4642	4608	4608
query73	692	574	360	360
query74	8813	8964	8982	8964
query75	3893	3194	2687	2687
query76	3771	1200	776	776
query77	821	454	306	306
query78	10174	10188	9385	9385
query79	2896	790	572	572
query80	644	531	463	463
query81	480	257	229	229
query82	499	125	100	100
query83	279	259	234	234
query84	298	191	105	105
query85	777	358	314	314
query86	387	325	291	291
query87	4450	4399	4294	4294
query88	3497	2331	2282	2282
query89	375	322	285	285
query90	1939	216	214	214
query91	155	136	113	113
query92	75	62	62	62
query93	1820	962	589	589
query94	663	416	291	291
query95	378	289	286	286
query96	488	573	283	283
query97	2725	2733	2626	2626
query98	244	212	204	204
query99	1441	1386	1255	1255
Total cold run time: 277463 ms
Total hot run time: 186356 ms

Contributor

@suxiaogang223 left a comment

LGTM

Contributor

PR approved by anyone and no changes requested.

@doris-robot

ClickBench: Total hot run time: 29.2 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1fe52225732fca0eecfbfc7edcbcabd0c81e0605, data reload: false

query1	0.04	0.04	0.03
query2	0.08	0.03	0.04
query3	0.24	0.07	0.07
query4	1.61	0.10	0.11
query5	0.44	0.42	0.41
query6	1.18	0.65	0.66
query7	0.03	0.01	0.02
query8	0.04	0.03	0.04
query9	0.59	0.52	0.52
query10	0.57	0.59	0.57
query11	0.15	0.11	0.10
query12	0.14	0.11	0.11
query13	0.63	0.60	0.60
query14	0.80	0.80	0.82
query15	0.89	0.87	0.88
query16	0.39	0.38	0.40
query17	1.05	1.05	1.06
query18	0.23	0.21	0.22
query19	1.98	1.83	1.78
query20	0.01	0.01	0.01
query21	15.40	0.90	0.54
query22	0.76	1.20	0.64
query23	14.92	1.42	0.67
query24	7.42	1.29	0.53
query25	0.51	0.25	0.05
query26	0.49	0.17	0.14
query27	0.06	0.05	0.06
query28	9.17	0.91	0.45
query29	12.54	4.05	3.42
query30	0.26	0.09	0.06
query31	2.82	0.60	0.39
query32	3.23	0.55	0.48
query33	3.11	3.20	3.05
query34	16.02	5.36	4.73
query35	4.86	4.86	4.81
query36	0.70	0.51	0.50
query37	0.09	0.07	0.07
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.18	0.15	0.15
query41	0.08	0.03	0.03
query42	0.03	0.02	0.03
query43	0.05	0.04	0.03
Total cold run time: 103.87 s
Total hot run time: 29.2 s

@felixwluo
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33679 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e91eff9ba77a616544ebd142146650a6953fba17, data reload: false

------ Round 1 ----------------------------------
q1	17587	5170	5050	5050
q2	1936	298	191	191
q3	10442	1315	727	727
q4	10267	1015	515	515
q5	7846	2234	2343	2234
q6	177	165	136	136
q7	900	729	601	601
q8	9293	1226	1081	1081
q9	6756	5073	5089	5073
q10	6939	2372	1946	1946
q11	479	287	281	281
q12	337	344	217	217
q13	17772	3601	3068	3068
q14	217	232	212	212
q15	573	484	479	479
q16	433	421	368	368
q17	584	876	365	365
q18	7428	7272	7055	7055
q19	2186	963	555	555
q20	340	333	214	214
q21	3912	2529	2337	2337
q22	1063	1016	974	974
Total cold run time: 107467 ms
Total hot run time: 33679 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5220	5128	5205	5128
q2	246	324	216	216
q3	2170	2640	2268	2268
q4	1375	1814	1344	1344
q5	4169	4108	4380	4108
q6	207	168	131	131
q7	2011	1932	1738	1738
q8	2584	2438	2523	2438
q9	7109	7059	7028	7028
q10	3077	3268	2815	2815
q11	585	535	493	493
q12	668	722	631	631
q13	3491	3925	3199	3199
q14	282	322	275	275
q15	507	473	467	467
q16	447	493	437	437
q17	1153	1507	1370	1370
q18	7590	7289	6933	6933
q19	746	739	760	739
q20	1936	1949	1811	1811
q21	4855	4345	4287	4287
q22	1064	1039	999	999
Total cold run time: 51492 ms
Total hot run time: 48855 ms

@doris-robot

TPC-DS: Total hot run time: 185403 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e91eff9ba77a616544ebd142146650a6953fba17, data reload: false

query1	969	374	395	374
query2	6543	1875	1877	1875
query3	6745	221	217	217
query4	25742	23246	23008	23008
query5	4336	604	468	468
query6	313	209	201	201
query7	4620	494	289	289
query8	282	227	213	213
query9	8646	2643	2629	2629
query10	493	320	287	287
query11	15521	15461	14794	14794
query12	175	114	107	107
query13	1673	537	422	422
query14	9713	6162	6292	6162
query15	200	193	171	171
query16	7635	593	467	467
query17	1208	706	579	579
query18	2034	414	310	310
query19	196	196	161	161
query20	121	116	117	116
query21	219	129	108	108
query22	4120	4138	4060	4060
query23	33869	33016	33006	33006
query24	8414	2329	2365	2329
query25	523	460	381	381
query26	1231	264	151	151
query27	2704	513	330	330
query28	4303	2111	2087	2087
query29	721	542	435	435
query30	294	219	190	190
query31	933	833	742	742
query32	75	64	60	60
query33	565	390	306	306
query34	797	862	545	545
query35	779	816	732	732
query36	917	974	885	885
query37	111	101	79	79
query38	4071	4040	4001	4001
query39	1451	1421	1409	1409
query40	206	119	110	110
query41	64	57	60	57
query42	127	109	112	109
query43	495	510	492	492
query44	1297	819	822	819
query45	177	177	169	169
query46	843	1009	624	624
query47	1727	1749	1723	1723
query48	379	417	307	307
query49	723	486	396	396
query50	630	683	383	383
query51	4197	4176	4142	4142
query52	117	110	103	103
query53	223	254	181	181
query54	570	557	494	494
query55	91	91	91	91
query56	298	300	288	288
query57	1188	1172	1109	1109
query58	279	281	256	256
query59	2658	2674	2641	2641
query60	321	328	299	299
query61	122	124	123	123
query62	805	702	685	685
query63	227	186	187	186
query64	4260	998	730	730
query65	4255	4223	4200	4200
query66	1063	413	312	312
query67	15642	15761	15363	15363
query68	9124	931	531	531
query69	463	306	284	284
query70	1210	1123	1047	1047
query71	466	330	295	295
query72	5288	4685	4825	4685
query73	731	635	354	354
query74	8793	8909	8969	8909
query75	4178	3200	2699	2699
query76	3603	1179	748	748
query77	829	382	301	301
query78	9898	10194	9290	9290
query79	1409	785	640	640
query80	619	520	473	473
query81	477	255	223	223
query82	420	128	97	97
query83	289	258	240	240
query84	297	105	83	83
query85	797	358	312	312
query86	333	299	276	276
query87	4430	4423	4298	4298
query88	3016	2295	2281	2281
query89	381	321	282	282
query90	1955	201	202	201
query91	140	138	112	112
query92	83	61	58	58
query93	1105	943	584	584
query94	672	411	307	307
query95	368	291	284	284
query96	493	570	279	279
query97	2757	2727	2690	2690
query98	223	206	199	199
query99	1430	1429	1263	1263
Total cold run time: 272523 ms
Total hot run time: 185403 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@felixwluo force-pushed the feat-hive-catalog branch from e91eff9 to f1b29b0 on June 23, 2025 16:36
@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@felixwluo force-pushed the feat-hive-catalog branch from d437c2a to 8752d5c on June 24, 2025 08:39
@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@doris-robot

TPC-H: Total hot run time: 33928 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b6182b84d3850b2342c58c809b8f9161a7c44576, data reload: false

------ Round 1 ----------------------------------
q1	17612	5123	5040	5040
q2	1957	287	200	200
q3	10274	1369	733	733
q4	10226	1020	521	521
q5	7568	2356	2370	2356
q6	177	163	130	130
q7	899	748	589	589
q8	9313	1356	1056	1056
q9	6863	5150	5125	5125
q10	6914	2382	1960	1960
q11	477	297	277	277
q12	352	356	217	217
q13	17760	3703	3091	3091
q14	239	222	212	212
q15	566	485	482	482
q16	428	423	376	376
q17	606	895	363	363
q18	7599	7230	7071	7071
q19	1753	946	572	572
q20	335	339	216	216
q21	4000	3228	2371	2371
q22	1071	1037	970	970
Total cold run time: 106989 ms
Total hot run time: 33928 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5204	5150	5126	5126
q2	249	331	224	224
q3	2181	2662	2295	2295
q4	1410	1809	1375	1375
q5	4222	4395	4373	4373
q6	213	170	127	127
q7	2034	1933	1780	1780
q8	2587	2517	2552	2517
q9	7209	7133	7165	7133
q10	3084	3269	2821	2821
q11	586	523	493	493
q12	700	819	648	648
q13	3600	3946	3318	3318
q14	284	305	286	286
q15	528	473	474	473
q16	416	475	420	420
q17	1093	1511	1312	1312
q18	7415	7188	6996	6996
q19	767	767	876	767
q20	1922	2013	1780	1780
q21	4731	4363	4338	4338
q22	1111	1051	998	998
Total cold run time: 51546 ms
Total hot run time: 49600 ms

@doris-robot

TPC-DS: Total hot run time: 184973 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b6182b84d3850b2342c58c809b8f9161a7c44576, data reload: false

query1	997	391	388	388
query2	6549	1862	1780	1780
query3	6740	215	214	214
query4	26576	23780	22563	22563
query5	4336	637	457	457
query6	306	210	205	205
query7	4634	510	292	292
query8	285	236	215	215
query9	8637	2607	2619	2607
query10	486	348	278	278
query11	15489	15032	14746	14746
query12	164	112	110	110
query13	1665	527	422	422
query14	9005	6097	6081	6081
query15	214	204	174	174
query16	7456	647	509	509
query17	1207	727	582	582
query18	2061	412	318	318
query19	212	198	180	180
query20	118	116	115	115
query21	215	136	108	108
query22	4141	4020	3954	3954
query23	34078	33177	33166	33166
query24	8501	2334	2382	2334
query25	541	459	394	394
query26	935	266	167	167
query27	2717	500	338	338
query28	4338	2121	2100	2100
query29	717	548	430	430
query30	280	223	188	188
query31	946	826	743	743
query32	73	64	64	64
query33	563	358	297	297
query34	795	839	524	524
query35	761	816	734	734
query36	945	986	873	873
query37	113	99	75	75
query38	4069	4135	3987	3987
query39	1483	1434	1440	1434
query40	215	117	105	105
query41	76	61	57	57
query42	123	108	108	108
query43	480	507	489	489
query44	1350	822	813	813
query45	180	177	163	163
query46	843	1020	631	631
query47	1766	1817	1712	1712
query48	408	408	303	303
query49	739	487	396	396
query50	633	689	400	400
query51	4129	4193	4099	4099
query52	107	105	101	101
query53	232	251	188	188
query54	568	579	499	499
query55	86	82	80	80
query56	300	293	276	276
query57	1179	1204	1121	1121
query58	261	252	248	248
query59	2642	2691	2587	2587
query60	330	336	309	309
query61	121	120	122	120
query62	815	739	658	658
query63	215	189	185	185
query64	3607	989	680	680
query65	4281	4188	4167	4167
query66	970	401	322	322
query67	15795	15572	15501	15501
query68	8515	892	541	541
query69	470	293	265	265
query70	1202	1091	1101	1091
query71	473	330	305	305
query72	5752	4767	4788	4767
query73	734	722	358	358
query74	9143	8815	8839	8815
query75	3892	3199	2707	2707
query76	3603	1203	742	742
query77	782	437	289	289
query78	10229	10165	9371	9371
query79	2375	828	590	590
query80	610	512	454	454
query81	491	267	216	216
query82	477	136	103	103
query83	283	241	236	236
query84	301	101	83	83
query85	786	352	305	305
query86	383	305	284	284
query87	4386	4429	4305	4305
query88	3626	2328	2325	2325
query89	389	317	292	292
query90	1832	213	214	213
query91	140	144	113	113
query92	89	65	55	55
query93	1852	947	593	593
query94	663	396	326	326
query95	370	298	282	282
query96	495	576	284	284
query97	2728	2785	2702	2702
query98	234	210	210	210
query99	1443	1417	1287	1287
Total cold run time: 274547 ms
Total hot run time: 184973 ms

@doris-robot

ClickBench: Total hot run time: 29.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit b6182b84d3850b2342c58c809b8f9161a7c44576, data reload: false

query1	0.03	0.03	0.03
query2	0.07	0.03	0.04
query3	0.23	0.07	0.07
query4	1.61	0.11	0.10
query5	0.43	0.41	0.41
query6	1.16	0.66	0.66
query7	0.03	0.02	0.01
query8	0.04	0.04	0.04
query9	0.58	0.50	0.51
query10	0.57	0.58	0.57
query11	0.16	0.12	0.10
query12	0.15	0.12	0.12
query13	0.63	0.61	0.61
query14	0.80	0.81	0.80
query15	0.89	0.86	0.89
query16	0.37	0.39	0.40
query17	1.07	1.05	1.04
query18	0.23	0.21	0.21
query19	1.95	1.86	1.93
query20	0.02	0.02	0.01
query21	15.39	0.93	0.56
query22	0.77	1.22	0.63
query23	14.89	1.37	0.61
query24	6.63	1.77	0.94
query25	0.51	0.21	0.10
query26	0.64	0.16	0.13
query27	0.06	0.05	0.05
query28	9.79	0.91	0.47
query29	12.55	3.93	3.34
query30	0.25	0.09	0.07
query31	2.83	0.61	0.40
query32	3.23	0.54	0.47
query33	3.01	3.10	3.14
query34	16.12	5.40	4.79
query35	4.87	4.82	4.80
query36	0.70	0.50	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.02
query40	0.18	0.13	0.14
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 103.77 s
Total hot run time: 29.69 s

@felixwluo force-pushed the feat-hive-catalog branch from b6182b8 to 3ebdfe2 on June 24, 2025 13:12
@felixwluo
Contributor Author

run buildall

@felixwluo force-pushed the feat-hive-catalog branch from 3ebdfe2 to ece7b05 on June 25, 2025 14:02
@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 95.45% (42/44) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 57.06% (15366/26929)
Line Coverage 46.15% (139471/302219)
Region Coverage 45.48% (70681/155399)
Branch Coverage 40.25% (37348/92790)

@felixwluo
Contributor Author

run p0

@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@felixwluo
Contributor Author

run buildall

1 similar comment
@felixwluo
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 35.71% (5/14) 🎉
Increment coverage report
Complete coverage report

@felixwluo
Contributor Author

run buildall

Development

Successfully merging this pull request may close these issues.

[Feature] Support reading Hive table with MultiDelimitSerDe
5 participants