姗姗来迟 (2022-12-31 17:48):
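#paper https://link.springer.com/article/10.1007/s11263-022-01654-0?utm_source=xmol&utm_content=meta PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese Text Recognition. This work addresses page-level handwritten Chinese text recognition and proposes PageNet, an end-to-end weakly supervised method. Its main advantages are: (1) It approaches page-level Chinese text recognition from a new angle: detecting and recognizing individual characters and predicting the reading order between them. (2) The model can be trained with weak supervision. Real data needs only transcript annotations and no bounding-box annotations at all, which greatly reduces annotation cost. (3) Although only transcript annotations are required, the model still outputs detection and recognition results at both the character level and the text-line level. (4) The method studies the reading-order problem in page-level text recognition in depth; the proposed reading order module can handle complex reading orders such as multi-directional and curved text.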
PageNet: Towards End-to-End Weakly Supervised Page-Level Handwritten Chinese Text Recognition
Abstract:
Handwritten Chinese text recognition (HCTR) has been an active research topic for decades. However, most previous studies solely focus on the recognition of cropped text line images, ignoring the error caused by text line detection in real-world applications. Although some approaches aimed at page-level text recognition have been proposed in recent years, they either are limited to simple layouts or require very detailed annotations including expensive line-level and even character-level bounding boxes. To this end, we propose PageNet for end-to-end weakly supervised page-level HCTR. PageNet detects and recognizes characters and predicts the reading order between them, which is more robust and flexible when dealing with complex layouts including multi-directional and curved text lines. Utilizing the proposed weakly supervised learning framework, PageNet requires only transcripts to be annotated for real data; however, it can still output detection and recognition results at both the character and line levels, avoiding the labor and cost of labeling bounding boxes of characters and text lines. Extensive experiments conducted on five datasets demonstrate the superiority of PageNet over existing weakly supervised and fully supervised page-level methods. These experimental results may spark further research beyond the realms of existing methods based on connectionist temporal classification or attention. The source code is available at https://github.com/shannanyinxiang/PageNet.
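To make the "characters plus reading order" idea concrete, here is a minimal sketch of how per-character predictions could be grouped into line-level results. It is not taken from the PageNet source code. The `Char` structure, the `is_line_start` flag, the `next_id` successor pointer, and the `assemble_lines` helper are all hypothetical names chosen for illustration, and the sketch assumes a model has already produced them for each detected character.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass
class Char:
    """One detected character (hypothetical structure, not PageNet's own)."""
    text: str
    box: Box
    is_line_start: bool      # assumed prediction: this character begins a line
    next_id: Optional[int]   # assumed prediction: successor id, None at line end


def assemble_lines(chars: Dict[int, Char]) -> List[dict]:
    """Follow successor pointers from each line start to build text lines."""
    lines = []
    for cid, ch in chars.items():
        if not ch.is_line_start:
            continue
        ids, seen = [], set()
        cur: Optional[int] = cid
        # Stop at the end of the line, at a dangling pointer, or on a cycle.
        while cur is not None and cur in chars and cur not in seen:
            seen.add(cur)
            ids.append(cur)
            cur = chars[cur].next_id
        boxes = [chars[i].box for i in ids]
        lines.append({
            "text": "".join(chars[i].text for i in ids),
            # Line box = union of the character boxes on the path.
            "box": (
                min(b[0] for b in boxes),
                min(b[1] for b in boxes),
                max(b[2] for b in boxes),
                max(b[3] for b in boxes),
            ),
        })
    return lines


# Toy page: one horizontal line and one vertical line.
demo_chars = {
    0: Char("手", (0, 0, 10, 10), True, 1),
    1: Char("写", (12, 0, 22, 10), False, None),
    2: Char("文", (0, 20, 10, 30), True, 3),
    3: Char("本", (0, 32, 10, 42), False, None),
}
demo_lines = assemble_lines(demo_chars)
```

Because lines are recovered by following links between characters rather than by sorting coordinates, the same traversal works whether a line runs horizontally, vertically, or along a curve, which is the flexibility the abstract attributes to predicting reading order.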