Harnessing the Power of Scientific Cloud Computing to Fight COVID-19
The University of Victoria (UVic) and Compute Canada are turning the power of Arbutus, Canada’s largest scientific cloud, toward the task of analyzing the protein structure of the SARS-CoV-2 virus, as part of an international collaborative effort to help researchers develop drug therapies for COVID-19.
The Folding@home project is an international distributed computing collaboration that runs protein folding simulations to study disease pathology and support the development of treatments. Folding@home is the first scientific computing system to break the exascale barrier and is now more powerful than all the Top 500 supercomputers in the world combined.
To fight COVID-19, Folding@home is analyzing SARS-CoV-2 viral proteins. These include the distinctive coronavirus spike structure, which binds to the ACE2 protein found on the surface of human lung cells and allows the virus to break into them, and the main viral protease, which facilitates replication of the virus. Finding a compound that binds to or inhibits either of these proteins could lead to therapeutic drugs for treating COVID-19.
Of the more than 250,000 teams participating in the Folding@home project, the Compute Canada team is, as of this writing, among the top 15 contributors in the world by daily simulation volume and in the top 100 by total cumulative contributions, generating thousands of protein folding simulation work units per day. Arbutus is currently the fourth-highest individual contributor in the world. The Cedar cluster at Simon Fraser University is also contributing resources via preemptible batch jobs.
It all started as a test…
Arbutus is one of five Canadian national research computing sites. It is hosted at UVic and operated by the UVic Systems Research Computing Services (RCS) team in partnership with Compute Canada and regional partner WestGrid.
Innovation, Science and Economic Development Canada (ISED), in coordination with Compute Canada, recently funded a major “Phase 3” hardware expansion of Arbutus, significantly augmenting its compute resources, storage facilities, and system capabilities. This expansion was in its final stages of deployment in early March, just as the COVID-19 outbreak began to spread around the world.
Needing a burn-in workload to comprehensively test and commission the hardware expansion, and with contributions to COVID-19 research foremost among the team’s priorities, highly qualified personnel in UVic Systems RCS tested, configured, and launched a small proof-of-concept deployment of the Folding@home application and successfully processed their first simulation work unit in less than one day.
Over the following days and weeks, RCS worked intensively on a massive scale-out using TerraFold, a virtual cluster environment for Folding@home that the team developed with Ansible and Terraform to efficiently provision and manage the large number of systems.
Starting with no prior knowledge of the Folding@home application, RCS scientific computing specialists eliminated several scalability barriers, automated the provisioning and scaling process, and deployed over 10,000 virtual CPUs and 300 virtual GPUs within two weeks.
“Massive computing systems and dynamic large-scale deployments are part of the appeal that first drew me to the field of advanced research computing,” said Ryan Taylor, a senior research computing specialist on the UVic RCS team, “but working on this over the last several weeks has been a particularly compelling experience because of the opportunity to make an immediate impact in a joint effort to tackle a global crisis, which is a bit different from my usual day-to-day work.”
Meanwhile, other Canadian researchers are rapidly advancing other work on COVID-19, and Compute Canada is providing special prioritization for resource allocations supporting these projects. As these new projects ramp up, the elasticity of the Arbutus Cloud environment allows TerraFold to gracefully scale back its utilization, ensuring that as much of the cloud as possible is fully engaged in servicing scientific workloads at all times.
A closer look at the system: What is the Arbutus Cloud?
Arbutus is a purpose-built research cloud for Canadian researchers, the largest of its kind in the country with over 30,000 virtual CPUs, 140 terabytes of RAM, more than 18 petabytes of storage capacity, and 104 Tesla V100-PCIE-32GB GPUs for accelerating highly parallelized data processing workloads such as artificial intelligence, bioinformatics and molecular dynamics.
Cloud computing is a unique form of scientific computing, facilitating highly agile provisioning of resources for opportunistic workloads, maximizing usage efficiency of cloud resources by consuming spare capacity when available, and dynamically scaling back when other workloads require additional resources.
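The scale-up and scale-back behaviour described above can be illustrated with a minimal Python sketch. It is not TerraFold's actual logic, which this article does not describe; the function names, the headroom parameter, and the example numbers are all hypothetical and chosen only to show the idea of an opportunistic workload consuming whatever capacity priority work does not need.

```python
def background_capacity(total_vcpus: int, priority_demand: int, headroom: int = 0) -> int:
    """vCPUs an opportunistic workload may occupy in total.

    Hypothetical model: whatever is not claimed by priority workloads,
    minus an optional safety margin, is considered spare. Never negative.
    """
    return max(total_vcpus - priority_demand - headroom, 0)


def worker_delta(running_workers: int, capacity: int, vcpus_per_worker: int) -> int:
    """Workers to add (positive) or remove (negative) to fit the capacity."""
    target_workers = capacity // vcpus_per_worker
    return target_workers - running_workers


# Illustrative numbers only (assumptions, not measurements from Arbutus).
quiet = background_capacity(total_vcpus=30_000, priority_demand=8_000, headroom=2_000)
busy = background_capacity(total_vcpus=30_000, priority_demand=18_000, headroom=2_000)

grow = worker_delta(running_workers=2_000, capacity=quiet, vcpus_per_worker=8)
shrink = worker_delta(running_workers=2_500, capacity=busy, vcpus_per_worker=8)
```

With these made-up figures, the background workload would grow by 500 workers while the cloud is quiet and shed 1,250 workers once priority demand rises. A real deployment would feed a decision like this into its provisioning tooling.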
Arbutus’ expansion in March augmented the existing compute resources and storage with the addition of more than 13,000 vCPUs, 78 TB of RAM, and 15 PiB of raw disk storage, and provided new capabilities with a pool of 104 GPUs, a 24 PB tape backup system, and clustered filesystem and object storage deployments.
Access to Arbutus’ cloud resources is available free of charge to any faculty member or senior researcher at a Canadian university, college, or research facility (including research hospitals) who conducts academic research. For more information and to apply for access, please visit https://docs.computecanada.ca/wiki/Cloud
How does protein folding simulation help fight COVID-19?
Proteins (such as antibodies, enzymes, and many other types) perform most biological functions in living organisms. Proteins are assembled from chains of amino acids, and each protein’s function is determined by how those chains fold into a particular structure and by how that structure binds to, or changes shape when interacting with, other molecules. However, current experimental techniques cannot observe proteins in motion as they undergo folding and structural transitions, so researchers use molecular dynamics simulations to visualize and understand these processes. These simulations can reveal therapeutic opportunities for treating diseases, including identifying drug candidates that disrupt or modify these processes.
For more information on the Folding@home project and how you can support it using your own computer, visit the project’s website. To join the Compute Canada Folding@home team, install the client on your home computer and enter team number 250396 during setup.