Affiliations: [1] Department of Industrial Engineering, Tsinghua University, Beijing 100084, China; [2] Department of Gastroenterology, Beijing Friendship Hospital, Capital Medical University, Beijing 100050, China; [3] Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA; [4] School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
Colorectal cancer is a common type of cancer. Owing to its alarming incidence and mortality rates, its early detection and treatment have received increasing attention. Colorectal polyps form and grow in the initial stages of most colorectal cancer cases. Because medical resources are rather constrained and screening compliance is low, it is more desirable in China than in industrialized countries to characterize the relations between colorectal polyp occurrence and various potential determinants, including basic health information, comorbidities, and lifestyle conditions. With these relations characterized, one can better predict polyp incidence for each individual. In this letter, we report a data-driven modeling study to improve binary classification of colorectal polyp occurrence. We apply several machine-learning methods, particularly random forests, to physical examination and screening colonoscopy results of a Chinese cohort to build the classifiers. Our results suggest improved prediction performance with the random forests model. Our study also provides evidence to support the general speculation that emotional status may be an influential risk factor for early colorectal cancer growth in China.
Funding:
National Natural Science Foundation of China (NSFC) [71432002, 71672006, 71501109]; NIH/NCI National [106511]; Center for Data-Centric Management in the Department of Industrial Engineering, Tsinghua University; U.S. National Cancer Institute, United States Department of Health & Human Services, National Institutes of Health (NIH), NIH National Cancer Institute (NCI) [106511]