2018 Resource Allocations Competition Results
– CPU Allocations
– GPU Allocations
– Cloud Allocations
– Persistent Disk Storage Allocations
– Review Process
– Scaling for Compute Requests
– Monetary Value of the 2018 Allocations
Canada’s national advanced research computing (ARC) platform is delivered through the partnership of Compute Canada, regional organizations (WestGrid, Compute Ontario, Calcul Québec and ACENET) and institutions across Canada. Providing researchers with access to the infrastructure and expertise they need to accomplish globally competitive, data-driven, transformative research, it serves the needs of more than 11,000 researchers, including over 3,900 faculty based at Canadian institutions as of January 1, 2018.
Recent investments have enabled a renewal of Canada’s national ARC platform — the incorporation of the new Stage 1 systems, Cedar (SFU), Graham (Waterloo), and Niagara (Toronto), yielded approximately 60PB of new raw storage capacity and 133,552 core years.
However, the dual challenge of the retirement of legacy systems and an ongoing growth in researcher demand for resources meant that demand continued to outstrip supply. The 2018 RAC competition received the highest number of applications in its history with 469 projects applying for an allocation — an increase of 15% over requests made in 2017. Unfortunately, due to the challenges discussed above, this year’s RAC was only able to award 55.1% of the total compute requested, 73% of the total storage requested, and 20.5% of the total GPUs requested.
In general, 80% of resources are reserved for the Resource Allocation Competitions (RAC), leaving 20% for use via the Rapid Access Service (RAS). Those with RAC awards will have a higher priority on the clusters; however, all users have access to modest quantities of compute, storage and cloud resources via RAS as soon as they have a Compute Canada account.
If you have questions about the terminology used in this page, please consult the Technical Glossary.
Table 1: Applications submitted to the Resource Allocation Competitions between 2011 and 2018.
|Year||Total||Year on Year Increase|
Based on available computing resources, 55.1% of CPU (core year) requests were met by RAC 2018. New systems (Cedar, Graham, Niagara), which are faster and have more memory than older systems, provided nearly 75% of the available capacity or approximately 146,000 cores. In addition, CFI made an exception and allowed for a year’s extension of a number of legacy systems to help address the capacity shortfall. This resulted in a modest increase in available cores.
The new systems were allocated at close to 80% capacity, leaving approximately 20% capacity for new users and smaller development projects that did not receive an allocation through the RAC process. As mentioned above, a number of legacy systems, including Briaree, MP2 and Frontenac, were also allocated on an exceptional basis. These are smaller systems, some with older, less capable CPUs, so their allocation rates were closer to 60% of capacity, leaving room for opportunistic and bursty use.
Table 2: 2018 Compute Allocations per System
|CPU||Cores available for allocations (100% capacity)*||Cores requested||Cores Allocated||% of cores allocated vs. available|
As of April 26, 2018
*This provides the total number of available cores. Generally, approximately 80% of these cores are allocated for RAC, leaving 20% for use through RAS for new users and development projects that did not receive a RAC allocation, as well as for system outages and upgrades.
** These systems were added to the allocation pool, on an exceptional basis, to help make some extra cores available to address capacity shortages.
Table 3: Historical Compute Ask vs. Allocation
|Year||Supply: Allocatable CPU CY||Need: Total CY Requested||Provided: Total CY allocated||Shortfall capacity CY||% of the demand awarded|
|2012||189,024||103,845||87,312||16,533||84.10%|
|2011||132,316||72,848||75,471||-2,623||103.60%|
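The last two columns of Table 3 appear to follow directly from the requested and allocated figures. The sketch below recomputes them for the two rows shown above, assuming that the shortfall is requested minus allocated and that the percentage awarded is allocated divided by requested. The function name `shortfall_and_awarded` is illustrative and not part of any Compute Canada tooling.

```python
# Rows copied from Table 3: (year, supply, requested, allocated), in core years (CY).
ROWS = [
    (2012, 189_024, 103_845, 87_312),
    (2011, 132_316, 72_848, 75_471),
]


def shortfall_and_awarded(requested, allocated):
    """Return (shortfall in core years, percent of demand awarded).

    Assumes shortfall = requested - allocated, so a negative value means
    more core years were allocated than were requested.
    """
    shortfall = requested - allocated
    pct_awarded = 100.0 * allocated / requested
    return shortfall, pct_awarded


for year, _supply, requested, allocated in ROWS:
    shortfall, pct = shortfall_and_awarded(requested, allocated)
    print(f"{year}: shortfall {shortfall:,} CY, {pct:.1f}% of demand awarded")
```

Under these assumptions the recomputed values agree with the table rows for 2011 and 2012 to one decimal place.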
GPU resources were more constrained than CPU resources. As Table 4 shows, requests for GPUs have increased sixfold since 2015. In 2018, nearly 1,000 new GPU devices became available as part of the Graham and Cedar clusters. At the same time, however, some older systems were removed from service, so the allocation success rate fell to 20.5% in 2018, compared with 37.5% in 2017.
Table 4: Historical GPU demand vs. supply (GPU years)
|Year||Supply: Allocatable GPUs||Need: GPUs Requested||Provided: Total GPUs allocated||Shortfall capacity GPUs||% of the need awarded|
The Arbutus cluster at the University of Victoria has 10,336 allocatable virtual CPUs. These are available via RAC and RAS, and are also utilized for internal Compute Canada services such as software builds and hosting. A further 36 legacy nodes remain in service as part of Cloud East at l’Université de Sherbrooke. RAC 2018 received a 36% increase in requests for virtual CPUs. Between Arbutus and the additional nodes at Cloud East (UdeS), this year’s RAC was able to allocate 95% of the total virtual CPUs requested. In total, cloud storage was allocated at 94% of its capacity for 2018.
Persistent Disk Storage Allocations
The incorporation of the new Stage 1 systems Cedar, Graham, Arbutus and Niagara yielded approximately 60PB of new raw storage capacity in 2018. “Nearline” capacity, to relocate data from online (disk) to nearline (tape), is under development and is expected to be available for RAC 2019. Until then, some of the nearline capacity requirements are being met by HPSS and Mammouth Archive.
As of early 2018, a total of 45.5PB of persistent disk storage was allocated from the 54.6PB of allocatable supply.
Table 5: 2018 Storage Need v. Supply by Storage Type (TB)
|Storage type||Supply||Need: Storage Requested TB||Provided: Storage allocated||% of the demand awarded|
|Nearline (HPSS & Mammouth Archive)||14,000||17,503||6,607||38%|
Submissions are evaluated for technical feasibility and scientific excellence. For the 2018 competitions, 469 applications were evaluated. Virtually all RAC applicants request resources to support research programs and HQP that are already funded through tri-council and other peer-reviewed sources.
|Technical Review||Technical Staff||
|Scientific Review||Disciplinary peer review panel evaluates each proposal||
Scaling for Compute Requests
As described above, there were insufficient ARC resources to fully meet the allocations requested through RAC 2018.
As a result, a scaling function (see chart below) was applied to RAC 2018 to guide allocation decisions given the insufficient capacity. This function, which is endorsed by the RAC Chairs committee, was set so that only applications with a science score of 2.0 or higher received an allocation, and so that applications with a score of 5 received at most 83% of their total allocation request. Applicants who did not receive a compute allocation can still make opportunistic use of system resources via the Rapid Access Service.
Table 6: Scaling Parameters for Compute Allocations
|Minimum Science Score for an allocation||2.0||2.2|
|% of CPU request allocated at minimum Science Score||10%||16%|
|% of CPU request allocated to 4.0 Science Score||61%||72%|
|% of CPU request allocated to 5.0 Science Score||83%||87.5%|
|Number of applications below minimum allocatable score||18||55|
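The exact shape of the scaling function is not reproduced in this text, so the following is only a rough sketch. It assumes, purely for illustration, that the function is piecewise linear between the three anchor points in the first column of Table 6 (the column matching the 2.0 minimum and 83% maximum quoted for 2018). The name `scaled_fraction` and the interpolation itself are hypothetical and are not the committee's actual formula.

```python
# Anchor points (science score, fraction of CPU request allocated),
# taken from the first column of Table 6.
ANCHORS = [(2.0, 0.10), (4.0, 0.61), (5.0, 0.83)]


def scaled_fraction(score, anchors=ANCHORS):
    """Hypothetical scaling: linear interpolation between the anchors.

    Scores below the minimum receive nothing; scores at or above the
    top anchor receive the top fraction.
    """
    if score < anchors[0][0]:
        return 0.0
    if score >= anchors[-1][0]:
        return anchors[-1][1]
    # Find the segment that contains the score and interpolate within it.
    for (s0, f0), (s1, f1) in zip(anchors, anchors[1:]):
        if s0 <= score <= s1:
            return f0 + (f1 - f0) * (score - s0) / (s1 - s0)
    raise ValueError(f"score {score!r} could not be placed on the scale")


# Example: a request of 1,000 core years at several science scores.
for score in (1.9, 2.0, 3.0, 4.0, 5.0):
    print(score, round(1000 * scaled_fraction(score), 1))
```

The sketch reproduces the table values at scores 2.0, 4.0 and 5.0 by construction. Values between those scores depend entirely on the linear assumption and may differ from the real chart.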
Monetary Value of the 2018 Allocations
These values represent an average across the national ARC platform’s facilities and include total capital and operational costs incurred to deliver the resources and associated services. These are not commercial or market values. For the 2018 competition, the value of the resources allocated was calculated on a per-year basis using the following rates:
Table 7: Historical Financial Value of RAC Awards
|Financial value of award||2018||2017||2016||2015|
|1 core year||$156.78||$188.84||$279.00||$275.00|
|1 GPU year||$2,960.77||$566.52||$1,100.00||$1,100.00|
|1 TB of project storage / year||$36.48||$128.00||$173.00||$190.00|
|1 VCPU year||$91.05||$40.50||NA||NA|
|1 TB of cloud storage (Ceph) / year||$236.81||$178.50||NA||NA|
Costs for CPUs also reflect inclusion of 42,044 legacy CPU cores valued at $279/year each. Capital costs are not included for legacy cores, only operational costs. The GPU methodology was improved from 2017 to be more accurate. Some legacy storage and other resources were allocated for 2018, but costs are not differentiated from the new equipment.
With the exception of GPUs, the valuation of each of these resources decreases each year as older, more expensive, resources are retired and replaced with newer, more cost-effective, resources.
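As a worked illustration of the 2018 rates in Table 7, the sketch below multiplies each allocated quantity by its per-year rate and sums the results. The dictionary keys, the function `award_value` and the sample award are hypothetical; only the dollar rates come from the table.

```python
# 2018 per-year rates from Table 7, in dollars.
RATES_2018 = {
    "core_year": 156.78,
    "gpu_year": 2960.77,
    "project_storage_tb_year": 36.48,
    "vcpu_year": 91.05,
    "cloud_storage_tb_year": 236.81,
}


def award_value(allocation, rates=RATES_2018):
    """Return the dollar value of an allocation for one year.

    `allocation` maps a resource key to the quantity awarded.
    """
    unknown = set(allocation) - set(rates)
    if unknown:
        raise KeyError(f"no rate defined for: {sorted(unknown)}")
    return round(sum(qty * rates[key] for key, qty in allocation.items()), 2)


# Hypothetical award: 100 core years, 2 GPU years, 50 TB of project storage.
sample_award = {
    "core_year": 100,
    "gpu_year": 2,
    "project_storage_tb_year": 50,
}
print(f"${award_value(sample_award):,.2f}")
```

As the section notes, these are averaged platform costs rather than market prices, so the result is best read as an accounting value for the award.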