Academic Article

Phenotate: crowdsourcing phenotype annotations as exercises in undergraduate classes
Document Type
Brief Communication
Source
Genetics in Medicine: Official journal of the American College of Medical Genetics and Genomics. 22(8):1391-1400
Subject
rare diseases
phenotype
crowdsourcing
medical education
machine learning
Language
English
ISSN
1098-3600
1530-0366
Abstract
Purpose: Computational documentation of genetic disorders is highly reliant on structured data for differential diagnosis, pathogenic variant identification, and patient matchmaking. However, most information on rare diseases (RDs) exists in freeform text, such as academic literature. To increase availability of structured RD data, we developed a crowdsourcing approach for collecting phenotype information using student assignments.
Methods: We developed Phenotate, a web application for crowdsourcing disease phenotype annotations through assignments for undergraduate genetics students. Using student-collected data, we generated composite annotations for each disease through a machine learning approach. These annotations were compared with those from clinical practitioners and gold standard curated data.
Results: Deploying Phenotate in five undergraduate genetics courses, we collected annotations for 22 diseases. Student-sourced annotations showed strong similarity to gold standards, with F-measures ranging from 0.584 to 0.868. Furthermore, clinicians used Phenotate annotations to identify diseases with comparable accuracy to other annotation sources and gold standards. For six disorders, no gold standards were available, allowing us to create some of the first structured annotations for them, while students demonstrated ability to research RDs.
Conclusion: Phenotate enables crowdsourcing RD phenotypic annotations through educational assignments. Presented as an intuitive web-based tool, it offers pedagogical benefits and augments the computable RD knowledgebase.
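The F-measures reported in the Results compare an annotation set with a gold standard. As a minimal sketch of how such a score can be computed, the Python below treats each annotation as a plain set of phenotype term identifiers and uses exact set matching. The abstract does not describe the authors' actual scoring procedure, so the function name `f_measure`, the `beta` parameter, exact matching, and the placeholder term identifiers are all assumptions made for illustration.

```python
def f_measure(predicted, gold, beta=1.0):
    """F-measure between two sets of phenotype term identifiers.

    Assumption: terms match only when identifiers are identical; the
    published study may have used a different matching scheme.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Also covers empty inputs, avoiding division by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Placeholder identifiers for illustration only, not data from the study.
student_terms = {"TERM:A", "TERM:B", "TERM:C"}
gold_terms = {"TERM:A", "TERM:B", "TERM:D", "TERM:E"}

score = f_measure(student_terms, gold_terms)
print(round(score, 3))
```

Here two of the three student terms appear in the gold standard (precision 2/3) and two of the four gold terms are recovered (recall 1/2), so the balanced F-measure is 4/7.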