Code and Dataset release for "Handwritten Style Recognition for Chinese Characters on HCL2020 dataset".
Abstract: Structural features of Chinese characters provide abundant style information for handwritten style recognition, yet prior work on this task has made little use of structural information. Moreover, with current handwritten Chinese character datasets, it is hard to obtain a model that generalizes well from character category and writer information alone. Therefore, we add structural information in the form of morphemes, the smallest unique structures in Chinese characters, to the large handwritten dataset HCL2000 and update it to HCL2020. We also present a deep fusion network (Morpheme-based Handwritten Style Recognition Network, M-HSRNet) that captures both the overall layout characteristics and the detailed structural features of characters to recognize handwritten style. Evaluation of the proposed model on HCL2020 demonstrates the effectiveness of morphemes. Together with the proposed Morpheme Encoder module, our approach achieves an accuracy of 78.06% in handwritten style recognition, which is 3 points higher than the result without morpheme information.

The criteria for splitting Chinese characters into morphemes based on different character structures, with some examples. The first split follows these criteria: (1) No split for single-component characters of the form 'A'. (2) Split into 'A+B' for multiple-component characters of the form 'A,B'. (3) Split into 'A+B+C' for multiple-component characters of the form 'A,B,C'. If a morpheme obtained from the first split is still complex, a second split is performed using the same criteria as the first, as illustrated with an example character. Finally, each morpheme is represented by a morpheme category index.
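The two-stage split and the final index mapping can be sketched roughly as follows. This is a minimal illustration, not the annotation tool used to build HCL2020: the `DECOMPOSITION` table, the three example characters, and the helper names (`split_once`, `split_into_morphemes`, `build_morpheme_index`, `encode`) are all assumptions made for the example, and the actual morpheme inventory and index values in the dataset may differ.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical decomposition table (NOT taken from the HCL2020 annotations):
# maps a character or component to the parts produced by a single split.
DECOMPOSITION: Dict[str, Tuple[str, ...]] = {
    "好": ("女", "子"),   # form 'A,B'  -> 'A+B'
    "树": ("木", "对"),   # form 'A,B'  -> 'A+B', where 'B' is still complex
    "对": ("又", "寸"),   # second split of the complex part
}


def split_once(part: str) -> Tuple[str, ...]:
    """Apply one split; single-component parts are returned unchanged."""
    return DECOMPOSITION.get(part, (part,))


def split_into_morphemes(char: str, max_splits: int = 2) -> List[str]:
    """Split a character at most `max_splits` times (first and second split)."""
    parts = [char]
    for _ in range(max_splits):
        next_parts: List[str] = []
        for part in parts:
            next_parts.extend(split_once(part))
        if next_parts == parts:  # nothing left to split
            break
        parts = next_parts
    return parts


def build_morpheme_index(chars: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct morpheme a category index in order of first use."""
    index: Dict[str, int] = {}
    for char in chars:
        for morpheme in split_into_morphemes(char):
            index.setdefault(morpheme, len(index))
    return index


def encode(char: str, index: Dict[str, int]) -> List[int]:
    """Represent a character as a list of morpheme category indices."""
    return [index[m] for m in split_into_morphemes(char)]


if __name__ == "__main__":
    morpheme_index = build_morpheme_index(["好", "树", "木"])
    for ch in ["好", "树", "木"]:
        print(ch, split_into_morphemes(ch), encode(ch, morpheme_index))
```

In this toy table, "树" is first split into "木" and "对", and the second split breaks "对" into "又" and "寸", so the character ends up as three morpheme indices. A single-component character such as "木" is left unsplit.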

The difference between the HCL2000 and HCL2020 datasets. Compared with HCL2000, HCL2020 adds annotations of character structure information.

@inproceedings{hu2020handwritten,
title={Handwritten Style Recognition for Chinese Characters on HCL2020 Dataset},
author={Hu, Peiyi and Xu, Mengqiu and Wu, Ming and Chen, Guang and Zhang, Chuang},
booktitle={Pattern Recognition and Computer Vision: Third Chinese Conference, PRCV 2020, Nanjing, China, October 16--18, 2020, Proceedings, Part II 3},
pages={138--150},
year={2020},
organization={Springer}}
Thanks for your attention! If you have any suggestions or questions, you can leave a message here or contact us directly.