Compute Canada shares plans for meeting big computing needs
Demand for advanced research computing resources exceeds currently available capacity in Canada. Compute Canada’s plans to meet this growing and evolving need were the focus of a June 21 Town Hall session at Compute Canada’s annual High Performance Computing Symposium. Delegates were briefed on the various efforts underway, and the challenges still ahead, to keep Canada competitive in the global utilization of computing power for discovery and innovation.
“It’s about doing things better and smarter. We are modernizing and moving in a good direction, but it’s still not the extreme growth that we need,” Greg Newby, Compute Canada’s Chief Technology Officer, told delegates attending the Town Hall, which was hosted by Compute Canada’s senior management team.
Compute Canada is leading one of the biggest advanced research computing renewals in Canada’s history, driven by a two-and-a-half-fold increase in its user base, new scientific instruments and experiments, and growing demand from a broad range of disciplines.
From 2016 to 2018, it will use $30 million from the Canada Foundation for Innovation (CFI) ($75 million including matching and in-kind funds) to replace ageing legacy equipment with more powerful systems. These investments will consolidate resources, centralize services and expand computing and storage capacity at four national sites: University of Victoria, Simon Fraser University, University of Waterloo and University of Toronto.
The current technology refresh is expected to yield 126,500 CPU cores and approximately 4,000 nodes, and to consume 2,400 kW of electricity for power and cooling. All four sites will have 100 Gb connectivity to CANARIE and its regional networks to facilitate rapid data access.
One of the key upgrades will be a dramatic increase in online storage capacity, from 20 petabytes to 62 petabytes. Capacity for backups will also be significantly expanded.
“We’re backing it all up,” said Dr. Newby. “One of the first questions we get from researchers is, ‘what about my data, data access, long-term retention of data for my research group, my experimental data and observational data?’ Our answer is that we’re going to take really good care of that.”
But that increase in computing power, speed and storage is not enough to keep pace with what Canadian researchers say they will require over the next five years. Compute Canada’s user base is projected to grow by approximately 60% over that period.
That’s why Compute Canada is already looking ahead to the next wave of technology upgrades.
In May, it filed two new funding applications to CFI. The first, to CFI’s Cyberinfrastructure Initiative (Challenge 2, Stage 2) competition, seeks $20 million for the next series of investments in new computing and storage infrastructure. At the same time, Compute Canada applied to CFI’s Major Science Initiatives (MSI) program for five years (2017-2022) of continued operational funding, totalling over $80 million from CFI. Decisions are expected by late September.
Compute Canada’s Chief Science Officer Dugan O’Neil noted that the recently completed 2016 resource allocations competition was very tough, with more than 40 projects receiving no support from Compute Canada, in spite of their already having tri-council funding.
“The 2017 competition will also be tough. We have a whole bunch of new systems coming online but they’re not all going to be available for production use in time for the beginning of 2017. Also, during the same time we will be retiring many of the oldest systems. We hope to see some relief by the 2018 competition when the CFI investments kick in,” said Dr. O’Neil.
A recent survey of Compute Canada users found that, even with these new CFI investments, demand for computing and storage will continue to exceed capacity. For example, results from the survey indicate that data storage will need to increase 15 to 19 times between 2016 and 2021, from just 20 petabytes today to more than 300 petabytes by 2021.
“We are hopeful that there will be more funding programs to address the gap, but it’s not a sure thing,” said Dr. O’Neil.
Delegates heard how Compute Canada is acting on suggestions from researchers on how to improve its resource allocation competition (RAC), including adjusting the schedule and expanding availability of multi-year awards (planned for 2018). Dr. O’Neil said they will also replace a one-size-fits-all competition with separate competition streams, based on the size of the request.
“Today, whether people are asking us for 100, 1000 or 10,000 core years it’s always the same application form and the same review process,” said Dr. O’Neil. “One of the concerns from the research community is that the application and review process is too much work when asking for something like 100 cores: ‘Why are you making me do all this work?’”
Dr. O’Neil also raised the touchy subject of CCVs (the Canadian Common CV), which researchers are now required to supply as part of their resource renewals. This information is being used to measure the impact of advanced research computing on scientific outcomes in Canada. Because the CCV already holds this information, Compute Canada integrated it into its process rather than have researchers re-enter it.
Data were collected on the number of papers published using Compute Canada resources, patents, collaborations with industry and benefits to Canada.
“The reports we got back after using CCV were so much better and so much more complete than anything we had done before in our renewal. It’s amazing,” said Dr. O’Neil. “Previously, in the whole history of Compute Canada, we had less than 4,000 publications reported as enabled by Compute Canada. We now have identified 35,000 publications enabled by Compute Canada.”
“We’re also working to make the CCV process better,” added Mark Dietrich, Compute Canada’s President and CEO.
One of the delegates, an adjunct professor with Laurentian University, noted that of all the CCV applications researchers are required to file with other granting agencies, “yours is the easiest to use.”
2015-2018 technology plans
Distributed across Canada today
- 50 systems
- 27 data centres
- 200,000 cores, 2 Pflops, 20 PB
- 200 experts
Consolidation & Concentration by 2018
- 5-10 data centres
- 300,000 cores, 12 Pflops, 50+ PB
- 200 experts