e-Article

Real-Time XFEL Data Analysis at SLAC and NERSC: a Trial Run of Nascent Exascale Experimental Data Analysis
Document Type
article
Subject
cs.DC
physics.data-an
Abstract
X-ray scattering experiments using Free Electron Lasers (XFELs) are a powerful tool to determine the molecular structure and function of unknown samples (such as COVID-19 viral proteins). XFEL experiments are a challenge to computing in two ways: i) due to the high cost of running XFELs, a fast turnaround time from data acquisition to data analysis is essential to make informed decisions on experimental protocols; ii) data collection rates are growing exponentially, requiring new scalable algorithms. Here we report our experiences analyzing data from two experiments at the Linac Coherent Light Source (LCLS) during September 2020. Raw data were analyzed on NERSC's Cori XC40 system, using the Superfacility paradigm: our workflow automatically moves raw data between LCLS and NERSC, where it is analyzed using the software package CCTBX. We achieved real-time data analysis with a turnaround time from data acquisition to full molecular reconstruction of as little as 10 min, sufficient time for the experiment's operators to make informed decisions. By hosting the data analysis on Cori, and by automating LCLS-NERSC interoperability, we achieved a data analysis rate which matches the data acquisition rate. Completing data analysis within 10 min is a first for XFEL experiments and an important milestone if we are to keep up with data collection trends.