With the emergence and widespread adoption of Electronic Health Records (EHR), prediction models built on EHR data can support early detection of and intervention in disease. Heterogeneous attributes are ubiquitous in EHR data but are difficult to exploit in depth. Fusing the heterogeneous attributes of each data sample can therefore provide an information-rich data representation for subsequent model training. This paper designs an efficient two-stage prediction model to address the timeliness and cost problems that arise in critical-illness prediction. The first stage makes a coarse-grained prediction on patient cases and screens out those of low severity, which diverts patients early. The second stage builds on the coarse filtering results of the first stage and makes a finer-grained prediction for the potentially critical cases. Experiments show that, after heterogeneous attribute fusion, when the first 6 time points are selected to construct a non-temporal model, the two-stage model performs well on both initial disease screening and disease prediction.
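The two-stage flow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `flatten_first_k` and `two_stage_predict`, the zero-padding of short records, the `"screened_out"` label, and the 0.5 screening threshold are all assumptions made for illustration. The heterogeneous attribute fusion step is omitted, and the two scoring functions stand in for whatever trained models each stage would actually use.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def flatten_first_k(series: Sequence[Sequence[float]], k: int = 6) -> Vector:
    """Concatenate the first k time points into one non-temporal feature vector.

    Records shorter than k time points are zero-padded. The padding rule is an
    assumption of this sketch, not something stated in the abstract.
    """
    if not series:
        raise ValueError("series must contain at least one time point")
    width = len(series[0])
    flat: Vector = []
    for t in range(k):
        if t < len(series):
            flat.extend(series[t])
        else:
            flat.extend([0.0] * width)
    return flat


def two_stage_predict(
    features: Vector,
    coarse_score: Callable[[Vector], float],
    fine_label: Callable[[Vector], str],
    screen_threshold: float = 0.5,  # hypothetical cut-off
) -> str:
    """Run the coarse screen first; call the fine-grained model only if needed."""
    # Stage 1: a cheap, coarse-grained severity score.
    if coarse_score(features) < screen_threshold:
        return "screened_out"  # low-severity case, diverted early
    # Stage 2: finer-grained prediction for potentially critical cases.
    return fine_label(features)
```

Under these assumptions, the second-stage model runs only for cases that pass the coarse screen, which is where the time and cost savings mentioned in the abstract would come from.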