Details
Single visual model based on transformer for digital instrument reading recognition (Indexed in SCI-EXPANDED and EI)
Document type: Journal article
English title: Single visual model based on transformer for digital instrument reading recognition
Authors: Li, Xiang; Zeng, Changchang; Yao, Yong; Zhang, Sen; Zhang, Haiding; Yang, Suixian
First author: Li, Xiang
Corresponding author: Zeng, CC[1]
Affiliations: [1]Sichuan Univ, Sch Mech Engn, Chengdu 610065, Sichuan, Peoples R China; [2]Civil Aviat Flight Univ China, Sch Comp Sci, Guanghan 618307, Peoples R China; [3]Natl Inst Measurement & Testing Technol, Chengdu 610056, Sichuan, Peoples R China; [4]Guizhou Inst Technol, Sch Big Data, Guiyang 550003, Guizhou, Peoples R China
First affiliation: Sichuan Univ, Sch Mech Engn, Chengdu 610065, Sichuan, Peoples R China
Corresponding affiliation: Civil Aviat Flight Univ China, Sch Comp Sci, Guanghan 618307, Peoples R China
Year: 2025
Volume: 36
Issue: 1
Journal: MEASUREMENT SCIENCE AND TECHNOLOGY
Indexed in: EI (accession no. 20251017997744); Scopus (accession no. 2-s2.0-85219583248); WOS: SCI-EXPANDED (accession no. WOS:001382473100001)
Funding: This research was funded by the Fundamental Research Funds for the Central Universities (No. PHD2023-028), and in part by the Open Project of the State Key Laboratory of Public Big Data (No. PBD2023-36).
Language: English
Keywords: DIRR; attention mechanism; transformer; single visual model
Abstract: Digital instrument reading recognition (DIRR) technology is crucial for industrial digital transformation and the advancement of industrialisation. However, digital instruments differ in character fonts, styles, spacing, and aspect ratios, and the scarcity of data poses significant challenges to current recognition technologies. To address these challenges, this study proposed a novel single visual model based on the transformer for digital instrument recognition (SVDIR). The SVDIR model primarily comprised a scaled cosine attention mechanism (SC-attention) and a local Transformer block. First, SC-attention was designed to calculate the cosine similarity between two image patches; this renders the attention calculation independent of the input amplitude and produces milder attention weights, alleviating over-concentration. Second, a local Transformer block module was proposed to extract the internal stroke features of characters and the dependencies between character components, yielding fine-grained character features. In addition, a post-norm structure was introduced into the local Transformer block to reduce the accumulation of activation values as the network deepens. Finally, experimental results on two digital instrument datasets demonstrated the effectiveness and superiority of the proposed model.
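The scaled cosine attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a single head, omits any relative position bias, and uses a fixed illustrative temperature `tau` (learnable in the paper's setting). It only shows the key property claimed in the abstract: because scores are cosine similarities, the attention weights do not depend on the magnitude of the inputs.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_cosine_attention(q, k, v, tau=0.1):
    """Attention with cosine-similarity scores.

    Queries and keys are L2-normalised, so scores lie in [-1, 1]/tau
    regardless of input amplitude, giving milder (less peaked)
    weights than dot-product attention.  `tau` is a temperature,
    fixed here purely for illustration.
    """
    qn = q / np.linalg.norm(q, axis=-1, keepdims=True)
    kn = k / np.linalg.norm(k, axis=-1, keepdims=True)
    scores = qn @ kn.T / tau      # cosine similarity, temperature-scaled
    weights = softmax(scores)     # each row sums to 1
    return weights @ v, weights

# Toy example: 3 image-patch tokens of dimension 4.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))
k = rng.normal(size=(3, 4))
v = rng.normal(size=(3, 4))
out, w = scaled_cosine_attention(q, k, v)

# Rescaling the inputs leaves the weights unchanged, unlike
# dot-product attention, whose scores grow with input magnitude.
_, w_scaled = scaled_cosine_attention(10 * q, 10 * k, v)
assert np.allclose(w, w_scaled)
```

The amplitude-invariance shown by the final assertion is the property the abstract credits with producing milder attention weights and alleviating over-concentration.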