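1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University (Shanghai 200240)
2. Collaborative Innovation Center, Shanghai University of Traditional Chinese Medicine (Shanghai 201203)
CHEN Liang, male, master's degree, mainly engaged in health big data research
SUN Weiqiang, professor, doctoral supervisor; E-mail: sunwq@sjtu.edu.cn
Print publication date: 2024-07-25
Received: 2023-03-15
Revised: 2023-06-07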
CHEN Liang, SUN Weiqiang, DENG Hongyong, et al. Application of Latent Dirichlet Allocation model based on attribute knowledge embedding in traditional Chinese medicine recommendation[J]. Academic Journal of Shanghai University of Traditional Chinese Medicine, 2024, 38(04): 38-47. DOI: 10.16306/j.1008-861x.2024.04.006.
Objective: A data-driven method for recommending medicinal materials can help traditional Chinese medicine (TCM) physicians formulate scientific treatment prescriptions more accurately and intelligently in real clinical practice, and can also provide a scientific basis for the development of TCM diagnosis and treatment.
Methods: A total of 24 127 TCM prescription records were analyzed by text mining. The process of generating a prescription was simulated on the basis of TCM theory, taking into account domain knowledge such as the rich information on symptoms and medicinal materials and the relationships between them. A TCM recommendation method was proposed that is based on a Latent Dirichlet Allocation (LDA) topic model embedded with an attribute knowledge network. The attribute knowledge network contains the rich attribute information of the medicinal materials and the pharmacological effects it implies, and is used to enhance the topic model.
Results: At the optimal embedding coefficient, the model outperformed the baseline topic model in prediction perplexity, accuracy, and average area under the curve (AUC).
Conclusion: The proposed method helps improve model stability and the accuracy of medicinal material recommendation, and is better suited to tasks such as mining diagnosis and treatment patterns.
Keywords: traditional Chinese medicine; data mining; topic model; attribute knowledge network; herb recommendation
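The abstract does not spell out how the attribute knowledge embedding enters the topic model or how perplexity is computed. The sketch below is an illustration only and is not the authors' implementation. It assumes that an embedding coefficient (called `lam` here) linearly mixes the count-based topic-herb distribution with a distribution derived from herb embeddings, and that perplexity is the exponentiated negative mean log-likelihood per herb token. All names (`topic_vecs`, `herb_vecs`, `lam`) and the toy data are hypothetical.

```python
import numpy as np


def embedding_topic_word(topic_vecs, herb_vecs):
    """Softmax over herbs of the dot product between topic and herb vectors."""
    scores = topic_vecs @ herb_vecs.T             # shape (K, V)
    scores -= scores.max(axis=1, keepdims=True)   # stabilise the exponentials
    probs = np.exp(scores)
    return probs / probs.sum(axis=1, keepdims=True)


def mix_topic_word(phi, emb_phi, lam):
    """Convex combination of count-based and embedding-based distributions."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * phi + lam * emb_phi


def perplexity(docs, theta, topic_word):
    """exp(-mean log-likelihood per herb token) over a set of prescriptions."""
    log_lik, n_tokens = 0.0, 0
    for d, herbs in enumerate(docs):
        p = theta[d] @ topic_word[:, herbs]       # p(herb | prescription d)
        log_lik += np.log(p).sum()
        n_tokens += len(herbs)
    return float(np.exp(-log_lik / n_tokens))


# Toy data (hypothetical): 2 topics, 5 herbs, 3-dimensional embeddings.
rng = np.random.default_rng(0)
K, V, dim = 2, 5, 3
phi = rng.dirichlet(np.ones(V), size=K)           # count-based topic-herb rows
theta = rng.dirichlet(np.ones(K), size=2)         # prescription-topic rows
herb_vecs = rng.normal(size=(V, dim))             # stand-in for network embeddings
topic_vecs = rng.normal(size=(K, dim))
docs = [[0, 1, 3], [2, 4]]                        # prescriptions as herb indices

emb_phi = embedding_topic_word(topic_vecs, herb_vecs)
for lam in (0.0, 0.5, 1.0):
    mixed = mix_topic_word(phi, emb_phi, lam)
    print(lam, perplexity(docs, theta, mixed))
```

Under these assumptions, `lam = 0` recovers the baseline topic model and `lam = 1` relies on the embeddings alone, so sweeping `lam` on held-out prescriptions is one way an "optimal embedding coefficient" could be selected.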