HCL2020: Handwritten Character and Style Recognition

Code and dataset release for "Handwritten Style Recognition for Chinese Characters on HCL2020 Dataset".

Abstract: Structural features of Chinese characters provide abundant style information for handwritten style recognition, yet prior work on this task has made little use of structural information. Meanwhile, with current handwritten Chinese character datasets, it is hard to obtain a model that generalizes well from character category and writer information alone. We therefore add structural information in the form of morphemes, the smallest unique structures in Chinese characters, to the large handwritten dataset HCL2000 and update it to HCL2020. We also present a deep fusion network (Morpheme-based Handwritten Style Recognition Network, M-HSRNet) that captures both the overall layout characteristics and the detailed structural features of characters to recognize handwritten style. Evaluation results of the proposed model on HCL2020 demonstrate the effectiveness of morphemes. Together with the proposed Morpheme Encoder module, our approach achieves an accuracy of 78.06% in handwritten style recognition, which is 3 points higher than the result without morpheme information.

Reconstruction of HCL2020

Criteria for splitting Chinese characters into morphemes according to different character structures, with some examples. The first split follows three criteria: (1) No split for single-component characters of the form 'A'. (2) Split into 'A+B' for multiple-component characters of the form 'A,B'. (3) Split into 'A+B+C' for multiple-component characters of the form 'A,B,C'. If the morphemes obtained from the first split are still complex, a second split is performed using the same criteria as the first, as illustrated with an example Chinese character. Finally, each morpheme is represented by a morpheme category index.
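The two-round split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `COMPONENTS` table, the function names, and the way indices are assigned are all hypothetical and are not taken from the HCL2020 annotation files or the released code. The example characters are decomposed in the usual way, which may differ from the morpheme set used in the dataset.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical split table (character -> components), for illustration only.
# Characters absent from the table are treated as single-component ('A').
COMPONENTS: Dict[str, Tuple[str, ...]] = {
    "好": ("女", "子"),        # form 'A,B'   -> 'A+B'
    "树": ("木", "又", "寸"),  # form 'A,B,C' -> 'A+B+C'
    "想": ("相", "心"),        # form 'A,B', where '相' is still complex
    "相": ("木", "目"),        # used by the second split of '想'
}


def split_morphemes(char: str, max_rounds: int = 2) -> List[str]:
    """Split a character into morphemes using at most `max_rounds` splits."""
    parts = COMPONENTS.get(char)
    if parts is None or max_rounds == 0:
        return [char]  # single-component character, or no rounds left
    morphemes: List[str] = []
    for part in parts:
        # Each part may be split once more with the same criteria.
        morphemes.extend(split_morphemes(part, max_rounds - 1))
    return morphemes


def build_morpheme_index(chars: Iterable[str]) -> Dict[str, int]:
    """Assign each distinct morpheme an index in order of first appearance."""
    index: Dict[str, int] = {}
    for char in chars:
        for morpheme in split_morphemes(char):
            index.setdefault(morpheme, len(index))
    return index


def encode(char: str, index: Dict[str, int]) -> List[int]:
    """Represent a character as a list of morpheme category indices."""
    return [index[m] for m in split_morphemes(char)]


index = build_morpheme_index(["人", "好", "树", "想"])
print(split_morphemes("想"), encode("想", index))
```

Limiting the recursion to two rounds mirrors the "first split, then second split" wording; a real annotation pipeline would take its split table and index assignment from the dataset itself.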

Difference between the HCL2000 and HCL2020 datasets: compared with HCL2000, HCL2020 adds annotations of character structure information.
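One way to picture the difference is as a per-sample record. The sketch below assumes, based on the abstract, that an HCL2000 sample carries character category and writer information, and that HCL2020 adds morpheme indices on top. The class and field names are hypothetical and do not reflect the actual file format of either dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HCL2000Sample:
    """Hypothetical record: labels assumed available in HCL2000."""
    character_id: int  # character category
    writer_id: int     # writer information


@dataclass
class HCL2020Sample(HCL2000Sample):
    """Hypothetical record: HCL2000 labels plus structure annotations."""
    # Morpheme category indices of the character (empty if not annotated).
    morpheme_ids: List[int] = field(default_factory=list)


sample = HCL2020Sample(character_id=5, writer_id=12, morpheme_ids=[3, 6, 7])
print(sample)
```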

Citation

@inproceedings{hu2020handwritten,
  title={Handwritten Style Recognition for Chinese Characters on HCL2020 Dataset},
  author={Hu, Peiyi and Xu, Mengqiu and Wu, Ming and Chen, Guang and Zhang, Chuang},
  booktitle={Pattern Recognition and Computer Vision: Third Chinese Conference, PRCV 2020, Nanjing, China, October 16--18, 2020, Proceedings, Part II 3},
  pages={138--150},
  year={2020},
  organization={Springer}
}

Contact

Thanks for your attention! If you have any suggestions or questions, you can leave a message here or contact us directly.
