Details
Using fine-tuned conditional probabilities for data transformation of nominal attributes (indexed in SCI-EXPANDED and EI) Times cited: 10
Document type: Journal article
English title: Using fine-tuned conditional probabilities for data transformation of nominal attributes
Authors: Li, Qiude; Xiong, Qingyu; Ji, Shengfen; Wen, Junhao; Gao, Min; Yu, Yang; Xu, Rui
First author: Li, Qiude
Corresponding author: Xiong, QY [1]; Xiong, QY [2]
Affiliations: [1] Chongqing Univ, Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing, Peoples R China; [2] Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400044, Peoples R China; [3] Guizhou Med Univ, Sch Biol & Engn, Guiyang 550004, Guizhou, Peoples R China; [4] Guizhou Inst Technol, Foreign Language Teaching Ctr, Guiyang 550003, Guizhou, Peoples R China
First affiliation: Chongqing Univ, Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing, Peoples R China
Corresponding affiliations: Chongqing Univ, Minist Educ, Key Lab Dependable Serv Comp Cyber Phys Soc, Chongqing, Peoples R China; Chongqing Univ, Sch Big Data & Software Engn, Chongqing 400044, Peoples R China
Year: 2019
Volume: 128
Pages: 107-114
Journal: PATTERN RECOGNITION LETTERS
Indexed in: EI (accession no. 20193507385636); Scopus (accession no. 2-s2.0-85071468719); WOS: SCI-EXPANDED (accession no. WOS:000498398400015)
Funding: We thank anonymous reviewers for their valuable comments and suggestions. The work was supported by the Key Research Program of Chongqing Science & Technology Commission (grant no. CSTC2017jcyjBX0025), the Science and Technology Major Special Project of Guangxi (grant no. GKAA17129002), the National Natural Science Foundations of China (grant no. 61771077), and the National Key R&D Program of China (grant no. 2018YFF0214706).
Language: English
Keywords: Conditional probability transformation; Fine-tuning algorithm; MIC-based feature selection; Data transformation; Distance measure
Abstract: Most of existing machine learning algorithms do not natively support nominal attributes, so it is essential to develop the data transformation of nominal attributes into high-quality numeric ones. Conditional Probability Transformation (CPT), using conditional probability terms to transform categories in nominal attributes, is competitive with state-of-the-art transformation methods such as One-Hot Encoding (OHE) and Separability Split Value Transformation (SSVT). However, it may be difficult to accurately estimate conditional probability terms when training data is insufficient or there exist strong dependencies among its attributes. Inspired by the fine-tuning method for improving conditional probability terms in distance measures, we proposed a Fine-Tuned Conditional Probability Transformation (FTCPT). In addition, we proposed an Improved SSV (ISSV) based on fine-tuned conditional probability terms, and used our Modified MIC-based Feature Selection method to further improve the performance of FTCPT. Experiment results show that the proposed methods can improve the quality of data transformation, thereby contribute to improving the classification performance of subsequent machine learning algorithm. (C) 2019 Elsevier B.V. All rights reserved.
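Illustrative note: the abstract does not give the CPT formula, so the following is only a minimal sketch of the general idea of replacing each category of a nominal attribute with conditional probability terms. It assumes the terms are P(class | category) estimated from counts with Laplace smoothing; that choice, and the names fit_cpt, transform_cpt, and alpha, are hypothetical and not taken from the paper. It does not implement the fine-tuning step (FTCPT), ISSV, or the MIC-based feature selection.

```python
from collections import Counter, defaultdict


def fit_cpt(values, labels, alpha=1.0):
    """Estimate P(class | category) for one nominal attribute.

    Assumption (not from the paper): Laplace smoothing with strength alpha.
    Returns the sorted class list and a category -> probability-vector table.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1

    table = {}
    for value, by_class in counts.items():
        total = sum(by_class.values()) + alpha * len(classes)
        table[value] = [(by_class[c] + alpha) / total for c in classes]
    return classes, table


def transform_cpt(values, classes, table):
    """Map each category to its probability vector (one number per class).

    Categories unseen during fitting fall back to a uniform vector.
    """
    uniform = [1.0 / len(classes)] * len(classes)
    return [table.get(value, uniform) for value in values]


# Toy data: one nominal attribute and a binary class label.
colors = ["red", "red", "blue", "blue", "blue"]
labels = ["yes", "no", "yes", "yes", "yes"]

classes, table = fit_cpt(colors, labels)
encoded = transform_cpt(colors, classes, table)
```

Each category becomes a numeric vector with one entry per class, so a distance-based or gradient-based learner can consume the attribute directly. With few training examples per category these count-based estimates are noisy, which is the weakness the abstract says FTCPT targets.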