
Details

An attribute-weighted isometric embedding method for categorical encoding on mixed data  (indexed in SCI-EXPANDED and EI)   Citations: 1

Document type: Journal article

English title: An attribute-weighted isometric embedding method for categorical encoding on mixed data

Authors: Liang, Zupeng; Ji, Shengfen; Li, Qiude; Hu, Sigui; Yu, Yang

First author: Liang, Zupeng

Corresponding author: Li, QD[1]

Affiliations: [1] Guizhou Med Univ, Sch Biol & Engn, Guiyang 550025, Guizhou, Peoples R China; [2] Guizhou Univ, Sch Math & Stat, Guiyang 550025, Guizhou, Peoples R China; [3] Guizhou Inst Technol, Sch Foreign Language, Guiyang 550003, Guizhou, Peoples R China

First affiliation: Guizhou Med Univ, Sch Biol & Engn, Guiyang 550025, Guizhou, Peoples R China

Corresponding author's affiliation: Guizhou Med Univ, Sch Biol & Engn, Guiyang 550025, Guizhou, Peoples R China.

Year: 2023

Journal: APPLIED INTELLIGENCE

Indexed in: EI (accession no. 20233414620886); Scopus (accession no. 2-s2.0-85168569236); WOS: SCI-EXPANDED (accession no. WOS:001061454900001)

Funding: The work was funded by the National Natural Science Foundation of China (grant no. 62166009), the Guizhou Provincial Natural Science Foundation of China (grant nos. ZK[2021]333 and ZK[2022]350), the Science and Technology Foundation of the Guizhou Provincial Health Commission (grant no. gzwkj2023-258), and the Ph.D. Research Startup Foundation of Guizhou Medical University (grant nos. 2020-051 and 2023-009).

Language: English

Keywords: Mixed data; isometric embedding; attribute weighting; must-link and cannot-link constraints; one dependence value difference metric

Abstract: Mixed data containing categorical and numerical attributes are widely available in the real world. Before analysing such data, it is typically necessary to process (transform/embed/represent) them into high-quality numerical data. The conditional probability transformation method (CPT) provides acceptable performance in the majority of cases, but it is not satisfactory for datasets with strong attribute association. Inspired by the one dependence value difference metric method, the concept of relaxing the conditional independence of attributes has been applied to CPT, but this approach has the drawback of dramatically expanding the attribute dimensionality. We employ the isometric embedding method to tackle the problem of dimensionality expansion. In addition, an attribute weighting method based on must-link and cannot-link constraints is designed to optimize the data transformation quality. Combining these methods, we propose an attribute-weighted isometric embedding (AWIE) method for categorical encoding on mixed data. Extensive experimental results obtained on 16 datasets demonstrate that AWIE significantly improves classification performance (increasing the F1-score by 2.54%, attaining 6/16 best results, and reaching average ranks of 1.94/8) compared with 28 competitors.
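
The abstract does not spell out how CPT encodes a category, so the following is only an illustrative sketch under a stated assumption: that CPT, in the spirit of the value difference metric, replaces each categorical value with its vector of class-conditional probabilities P(class | value) estimated from labelled training data. The function names `fit_cpt` and `transform_cpt`, the toy data, and the uniform fallback for unseen categories are hypothetical and are not taken from the paper; the sketch does not implement AWIE itself.

```python
from collections import Counter, defaultdict


def fit_cpt(values, labels):
    """Estimate P(class | category) for one categorical attribute.

    Assumption: this mirrors a conditional probability transformation
    as used with the value difference metric; it is not the paper's code.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    table = {}
    for value, class_counts in counts.items():
        total = sum(class_counts.values())
        # One probability per class, in the fixed order given by `classes`.
        table[value] = [class_counts[c] / total for c in classes]
    return classes, table


def transform_cpt(values, table, n_classes):
    """Map each category to its probability vector.

    Unseen categories fall back to a uniform vector (an assumption
    made here for illustration only).
    """
    uniform = [1.0 / n_classes] * n_classes
    return [table.get(value, uniform) for value in values]


# Toy data (hypothetical): one categorical attribute and binary labels.
colors = ["red", "red", "blue", "blue", "blue", "green"]
labels = [1, 1, 0, 0, 1, 0]

classes, table = fit_cpt(colors, labels)
encoded = transform_cpt(colors, table, len(classes))
```

Under this assumption each categorical attribute becomes as many numerical columns as there are classes. Conditioning on a second attribute as well, as the one dependence idea suggests, would multiply the number of columns, which is the dimensionality expansion the abstract says the isometric embedding step is meant to address.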

