
Machine Learning in Medicine


Alvin Rajkomar ... and others • April 4, 2019

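A 49-year-old patient notices a painless rash on his shoulder but does not seek care. Months later, his wife asks him to see a doctor, who diagnoses a seborrheic keratosis. Later, when the patient undergoes a screening colonoscopy, a nurse notices a dark macule on his shoulder and advises him to have it evaluated. One month later, the patient sees a dermatologist, who obtains a biopsy specimen of the lesion. The findings reveal a noncancerous pigmented lesion. Still concerned, the dermatologist requests a second reading of the biopsy specimen, and invasive melanoma is diagnosed. An oncologist initiates treatment with systemic chemotherapy. A physician friend asks the patient why he is not receiving immunotherapy.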

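What if every medical decision, whether made by an intensivist or a community health worker, were instantly reviewed by a team of relevant experts who provided guidance whenever the decision seemed amiss? Patients with newly diagnosed, uncomplicated hypertension would receive the medications known to be most effective rather than the one most familiar to the prescriber [1,2]. Inadvertent overdoses and prescribing errors would be largely eliminated [3,4]. Patients with mysterious and rare ailments could be directed to renowned experts in the fields related to the suspected diagnosis [5].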

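Such a system seems far-fetched. There are not enough medical experts to staff it, it would take too long for experts to read through a patient's history, and concerns related to privacy laws would stop the effort before it started [6]. Yet this is precisely the promise of machine learning in medicine: the wisdom contained in the decisions made by nearly all clinicians, together with the outcomes of billions of patients, should inform the care of each patient. That is, every diagnosis, management decision, and therapy should be personalized in real time on the basis of all known information about the patient, incorporating the lessons of collective experience.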

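This framing emphasizes that machine learning is not just a new tool, like a new drug or medical device. Rather, it is the fundamental technology required to meaningfully process data that exceed the capacity of the human brain to comprehend. Increasingly, this overwhelming store of information is found in vast clinical databases and even in the data of a single patient [7].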

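Nearly 50 years ago, a Special Article in the Journal stated that computing would be "augmenting and, in some cases, largely replacing the intellectual functions of the physician" [8]. Yet in early 2019, surprisingly little in health care is driven by machine learning. Rather than report the myriad proof-of-concept models (built on retrospective data) that have been tested, we describe here the core structural changes and paradigm shifts the health care system must make for machine learning in medicine to deliver on its full promise (see video).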





Author Information

Alvin Rajkomar, M.D., Jeffrey Dean, Ph.D., and Isaac Kohane, M.D., Ph.D.
From Google, Mountain View, CA (A.R., J.D.); and the Department of Biomedical Informatics, Harvard Medical School, Boston (I.K.). Address reprint requests to Dr. Kohane at the Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck St., Boston, MA, 02115, or at isaac_kohane@harvard.edu.

 

References

1. Bakris G, Sorrentino M. Redefining hypertension — assessing the new blood-pressure guidelines. N Engl J Med 2018;378:497-499.

2. Institute of Medicine. Crossing the quality chasm: a new health system for the twenty-first century. Washington, DC: National Academies Press, 2001.

3. Lasic M. Case study: an insulin overdose. Institute for Healthcare Improvement (http://www.ihi.org/education/IHIOpenSchool/resources/Pages/Activities/AnInsulinOverdose.aspx).

4. Institute of Medicine. To err is human: building a safer health system. Washington, DC: National Academies Press, 2000.

5. National Academies of Sciences, Engineering, and Medicine. Improving diagnosis in health care. Washington, DC: National Academies Press, 2016.

6. Berwick DM, Gaines ME. How HIPAA harms care, and how to stop it. JAMA 2018;320:229-230.

7. Obermeyer Z, Lee TH. Lost in thought — the limits of the human mind and the future of medicine. N Engl J Med 2017;377:1209-1211.

8. Schwartz WB. Medicine and the computer — the promise and problems of change. N Engl J Med 1970;283:1257-1264.

9. Schwartz WB, Patil RS, Szolovits P. Artificial intelligence in medicine — where do we stand? N Engl J Med 1987;316:685-688.

10. Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge, MA: MIT Press, 2016.

11. Muntner P, Colantonio LD, Cushman M, et al. Validation of the atherosclerotic cardiovascular disease Pooled Cohort risk equations. JAMA 2014;311:1406-1415.

12. Clark J. Google turning its lucrative Web search over to AI machines. Bloomberg News. October 26, 2015 (https://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines).

13. Johnson M, Schuster M, Le QV, et al. Google’s multilingual neural machine translation system: enabling zero-shot translation. arXiv. November 14, 2016 (http://arxiv.org/abs/1611.04558).

14. Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. arXiv. September 1, 2014 (http://arxiv.org/abs/1409.0473).

15. Kannan A, Chen K, Jaunzeikare D, Rajkomar A. Semi-supervised learning for information extraction from dialogue. In: Interspeech 2018. Baixas, France: International Speech Communication Association, 2018:2077-81.

16. Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. arXiv. January 24, 2018 (http://arxiv.org/abs/1801.07860).

17. Escobar GJ, Turk BJ, Ragins A, et al. Piloting electronic medical record-based early detection of inpatient deterioration in community hospitals. J Hosp Med 2016;11:Suppl 1:S18-S24.

18. Grinfeld J, Nangalia J, Baxter EJ, et al. Classification and personalized prognosis in myeloproliferative neoplasms. N Engl J Med 2018;379:1416-1430.

19. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med 2019;25(1):44-56.

20. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut 2019 February 27 (Epub ahead of print).

21. Krause J, Gulshan V, Rahimy E, et al. Grader variability and the importance of reference standards for evaluating machine learning models for diabetic retinopathy. Ophthalmology 2018;125:1264-1272.

22. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016;316:2402-2410.

23. Ting DSW, Cheung CY-L, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA 2017;318:2211-2223.

24. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell 2018;172(5):1122-1131.e9.

25. Poplin R, Varadarajan AV, Blumer K, et al. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nat Biomed Eng 2018;2:158-164.

26. Steiner DF, MacDonald R, Liu Y, et al. Impact of deep learning assistance on the histopathologic review of lymph nodes for metastatic breast cancer. Am J Surg Pathol 2018;42:1636-1646.

27. Liu Y, Kohlberger T, Norouzi M, et al. Artificial intelligence-based breast cancer nodal metastasis detection. Arch Pathol Lab Med 2018 October 8 (Epub ahead of print).

28. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017;318:2199-2210.

29. Chilamkurthy S, Ghosh R, Tanamala S, et al. Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study. Lancet 2018;392:2388-2396.

30. Mori Y, Kudo SE, Misawa M, et al. Real-time use of artificial intelligence in identification of diminutive polyps during colonoscopy: a prospective study. Ann Intern Med 2018;169:357-366.

31. Tison GH, Sanchez JM, Ballinger B, et al. Passive detection of atrial fibrillation using a commercially available smartwatch. JAMA Cardiol 2018;3:409-416.

32. Galloway CD, Valys AV, Petterson FL, et al. Non-invasive detection of hyperkalemia with a smartphone electrocardiogram and artificial intelligence. J Am Coll Cardiol 2018;71:Suppl:A272. abstract.

33. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature 2017;542:115-118.

34. Rajkomar A, Yim JWL, Grumbach K, Parekh A. Weighting primary care patient panel size: a novel electronic health record-derived measure using machine learning. JMIR Med Inform 2016;4(4):e29.

35. Schuster MA, Onorato SE, Meltzer DO. Measuring the cost of quality measurement: a missing link in quality strategy. JAMA 2017;318:1219-1220.

36. Beam AL, Kohane IS. Big data and machine learning in health care. JAMA 2018;319:1317-1318.

37. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436-444.

38. Hinton G. Deep learning — a technology with the potential to transform health care. JAMA 2018;320:1101-1102.

39. Halevy A, Norvig P, Pereira F. The unreasonable effectiveness of data. IEEE Intell Syst 2009;24:8-12.

40. Bates DW, Saria S, Ohno-Machado L, Shah A, Escobar G. Big data in health care: using analytics to identify and manage high-risk and high-cost patients. Health Aff (Millwood) 2014;33:1123-1131.

41. Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning with electronic health records. npj Digital Medicine 2018;1(1):18.

42. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 2018;24:1342-1350.

43. Mandl KD, Szolovits P, Kohane IS. Public standards and patients’ control: how to keep electronic medical records accessible but private. BMJ 2001;322:283-287.

44. Mandl KD, Kohane IS. Time for a patient-driven health information economy? N Engl J Med 2016;374:205-208.

45. Mandel JC, Kreda DA, Mandl KD, Kohane IS, Ramoni RB. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. J Am Med Inform Assoc 2016;23:899-908.

46. Hersh WR, Weiner MG, Embi PJ, et al. Caveats for the use of operational electronic health record data in comparative effectiveness research. Med Care 2013;51:Suppl 3:S30-S37.

47. McGlynn EA, McDonald KM, Cassel CK. Measurement is essential for improving diagnosis and reducing diagnostic error: a report from the Institute of Medicine. JAMA 2015;314:2501-2502.

48. Institute of Medicine, National Academies of Sciences, Engineering, and Medicine. Improving diagnosis in health care. Washington, DC: National Academies Press, 2016.

49. Das J, Woskie L, Rajbhandari R, Abbasi K, Jha A. Rethinking assumptions about delivery of healthcare: implications for universal health coverage. BMJ 2018;361:k1716.

50. Reis BY, Kohane IS, Mandl KD. Longitudinal histories as predictors of future diagnoses of domestic abuse: modelling study. BMJ 2009;339:b3677.

51. Kale MS, Korenstein D. Overdiagnosis in primary care: framing the problem and finding solutions. BMJ 2018;362:k2820.

52. Lindenauer PK, Lagu T, Shieh M-S, Pekow PS, Rothberg MB. Association of diagnostic coding with trends in hospitalizations and mortality of patients with pneumonia, 2003-2009. JAMA 2012;307:1405-1413.

53. Slack WV, Hicks GP, Reed CE, Van Cura LJ. A computer-based medical-history system. N Engl J Med 1966;274:194-198.

54. Ford I, Norrie J. Pragmatic trials. N Engl J Med 2016;375:454-463.

55. Frieden TR. Evidence for health decision making — beyond randomized, controlled trials. N Engl J Med 2017;377:465-475.

56. Ross C, Swetlitz I, Thielking M, et al. IBM pitched Watson as a revolution in cancer care: it’s nowhere close. Boston: STAT, September 5, 2017 (https://www.statnews.com/2017/09/05/watson-ibm-cancer/).

57. Fiore LD, Lavori PW. Integrating randomized comparative effectiveness research with patient care. N Engl J Med 2016;374:2152-2158.

58. Schneeweiss S. Learning from big health care data. N Engl J Med 2014;370:2161-2163.

59. Institute of Medicine. The learning healthcare system: workshop summary. Washington, DC: National Academies Press, 2007.

60. Erickson SM, Rockwern B, Koltov M, McLean RM. Putting patients first by reducing administrative tasks in health care: a position paper of the American College of Physicians. Ann Intern Med 2017;166:659-661.

61. Hill RG Jr, Sears LM, Melanson SW. 4000 Clicks: a productivity analysis of electronic medical records in a community hospital ED. Am J Emerg Med 2013;31:1591-1594.

62. Sittig DF, Murphy DR, Smith MW, Russo E, Wright A, Singh H. Graphical display of diagnostic test results in electronic health records: a comparison of 8 systems. J Am Med Inform Assoc 2015;22:900-904.

63. Mamykina L, Vawdrey DK, Hripcsak G. How do residents spend their shift time? A time and motion study with a particular focus on the use of computers. Acad Med 2016;91:827-832.

64. Oxentenko AS, West CP, Popkave C, Weinberger SE, Kolars JC. Time spent on clinical documentation: a survey of internal medicine residents and program directors. Arch Intern Med 2010;170:377-380.

65. Arndt BG, Beasley JW, Watkinson MD, et al. Tethered to the EHR: primary care physician workload assessment using EHR event log data and time-motion observations. Ann Fam Med 2017;15:419-426.

66. Sinsky C, Colligan L, Li L, et al. Allocation of physician time in ambulatory practice: a time and motion study in 4 specialties. Ann Intern Med 2016;165:753-760.

67. Howe JL, Adams KT, Hettinger AZ, Ratwani RM. Electronic health record usability issues and potential contribution to patient harm. JAMA 2018;319:1276-1278.

68. Lee VS, Blanchfield BB. Disentangling health care billing: for patients’ physical and financial health. JAMA 2018;319:661-663.

69. Haynes AB, Weiser TG, Berry WR, et al. A surgical safety checklist to reduce morbidity and mortality in a global population. N Engl J Med 2009;360:491-499.

70. Steinhubl SR, Kim K-I, Ajayi T, Topol EJ. Virtual care for improved global health. Lancet 2018;391:419.

71. Gabriels K, Moerenhout T. Exploring entertainment medicine and professionalization of self-care: interview study among doctors on the potential effects of digital self-tracking. J Med Internet Res 2018;20(1):e10.

72. Morawski K, Ghazinouri R, Krumme A, et al. Association of a smartphone application with medication adherence and blood pressure control: the MedISAFE-BP randomized clinical trial. JAMA Intern Med 2018;178:802-809.

73. de Jong MJ, van der Meulen-de Jong AE, Romberg-Camps MJ, et al. Telemedicine for management of inflammatory bowel disease (myIBDcoach): a pragmatic, multicentre, randomised controlled trial. Lancet 2017;390:959-968.

74. Denis F, Basch E, Septans AL, et al. Two-year survival comparing web-based symptom monitoring vs routine surveillance following treatment for lung cancer. JAMA 2019;321(3):306-307.

75. Fraser H, Coiera E, Wong D. Safety of patient-facing digital symptom checkers. Lancet 2018;392:2263-2264.

76. Elmore JG, Barnhill RL, Elder DE, et al. Pathologists’ diagnosis of invasive melanoma and melanocytic proliferations: observer accuracy and reproducibility study. BMJ 2017;357:j2813.

77. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med 2018;178:1544-1547.

78. Rajkomar A, Hardt M, Howell MD, Corrado G, Chin MH. Ensuring fairness in machine learning to advance health equity. Ann Intern Med 2018;169:866-872.

79. Institute of Medicine. Unequal treatment: confronting racial and ethnic disparities in health care. Washington, DC: National Academies Press, 2003.

80. Shuren J, Califf RM. Need for a national evaluation system for health technology. JAMA 2016;316:1153-1154.

81. Kesselheim AS, Cresswell K, Phansalkar S, Bates DW, Sheikh A. Clinical decision support systems could be modified to reduce ‘alert fatigue’ while still minimizing the risk of litigation. Health Aff (Millwood) 2011;30:2310-2317.

82. Auerbach AD, Neinstein A, Khanna R. Balancing innovation and safety when integrating digital tools into health care. Ann Intern Med 2018;168:733-734.

83. Amarasingham R, Patzer RE, Huesch M, Nguyen NQ, Xie B. Implementing electronic health care predictive analytics: considerations and challenges. Health Aff (Millwood) 2014;33:1148-1154.

84. Sniderman AD, D’Agostino RB Sr, Pencina MJ. The role of physicians in the era of predictive analytics. JAMA 2015;314:25-26.

85. Krumholz HM. Big data and new knowledge in medicine: the thinking, training, and tools needed for a learning health system. Health Aff (Millwood) 2014;33:1163-1170.

86. Lyell D, Coiera E. Automation bias and verification complexity: a systematic review. J Am Med Inform Assoc 2017;24:423-431.

87. Cabitza F, Rasoini R, Gensini GF. Unintended consequences of machine learning in medicine. JAMA 2017;318:517-518.

88. Castelvecchi D. Can we open the black box of AI? Nature 2016;538:20-23.

89. Jiang H, Kim B, Guan M, Gupta M. To trust or not to trust a classifier. In: Bengio S, Wallach H, Larochelle H, Grauman K, Cesa-Bianchi N, Garnett R, eds. Advances in neural information processing systems 31. New York: Curran Associates, 2018:5541-52.

90. Cohen IG, Amarasingham R, Shah A, Xie B, Lo B. The legal and ethical concerns that arise from using complex predictive analytics in health care. Health Aff (Millwood) 2014;33:1139-1147.

91. arXiv.org Home page (https://arxiv.org/).

92. bioRxiv. bioRxiv: The preprint server for biology (https://www.biorxiv.org/).
