RPP Examples of Eligible Submissions – Case Studies


The Research Platforms and Portals (RPP) Competition encourages applications from projects that create new or support existing research platforms or portals for a community of researchers.

Groups are encouraged to use the RPP competition if their application falls within any of the following categories:

  • Resources requested on behalf of a large community of users that will be reallocated to individuals and small groups following the award.
  • Applications that provide a public platform that will make use of Compute Canada computing or storage.
  • Groups engaging in international agreements to provide multi-year computing or storage solutions based in Canada.
  • Recipients of the Major Science Initiative from the Canada Foundation for Innovation.
  • Groups that are providing shared data sets accessible using a third party (non-Compute Canada) interface.

A number of high-quality applications were received for the RPP competition’s inaugural year and we expect this number to grow in 2017. The following case studies of successful recipients from the 2015 RPP process demonstrate the diversity in scope, scale and disciplines of projects eligible to apply to this competition. If you have any questions regarding the competition or would like to discuss your interest in applying, please contact us.

Case Studies

International Human Epigenome Consortium (IHEC) - Portal

Project Name:
International Human Epigenome Consortium (IHEC)

Discipline:
Epigenomics

Applicant:
Dr. Guillaume Bourque, Associate Professor, Department of Human Genetics, McGill University; and, Director of Bioinformatics, McGill University & Genome Quebec Innovation Centre

Research Summary:
The International Human Epigenome Consortium (IHEC) is a global consortium with the primary goal of providing free access to high-resolution reference human epigenome maps for normal and disease cell types to the research community. The epigenome reference maps can have an immediate impact on the understanding of many diseases, and will hopefully lead to the discovery of new means to treat or manage them. In addition to this work, many members support related projects to improve epigenomic technologies, investigate epigenetic regulation in disease processes, and explore broader gene-environment interactions in human health. The IHEC also makes available a Data Portal hosted by the Genetics and Genomics Analysis Platform (GenAP) to allow users to view, search and download the data already released by different IHEC-associated projects.

Applicable Criteria:
Applications that provide a public platform that will make use of Compute Canada computing or storage
The IHEC Data Portal was developed to address the need to integrate and distribute datasets produced by various IHEC member consortia. It was created and is maintained by the McGill Epigenomics Data Coordination Centre (EDCC) and the McGill Epigenomics Mapping Centre (EMC). The Portal implements a database and a graphical interface that currently hosts 5,500 epigenomic datasets. It provides an overview of all the whole-genome experiments produced by IHEC members, categorized by providers, tissue types and assay types. It also offers a dynamic grid that can be used to navigate through the datasets and provides links to either visualize the data in a genome browser or download it. Collaborators and the greater scientific community access the data via the Portal, which takes advantage of Compute Canada high-performance computing cluster resources to manage the large volume of data associated with the generation of reference epigenome maps. By facilitating data distribution and interpretation, the IHEC Data Portal helps accelerate the translation of epigenomics knowledge into health and disease applications. So far, the project has made available the data from 393 cell types from different tissues and performed epigenomic analysis of 3,565 datasets on IHEC Core assays: 855 RNA-seq, 163 Bisulfite-seq and 2,547 ChIP-seq (input plus 6 histone marks).

 

Ocean Networks Canada - Portal

Project Name:
Ocean Networks Canada

Discipline:
Earth and Environmental Science

Applicant:
Benoît Pirenne, Director, User Engagement, Ocean Networks Canada

Research Summary:
Ocean Networks Canada (ONC) is a not-for-profit society created in 2007 by the University of Victoria to develop and manage the NEPTUNE and VENUS Observatories, to position Canada as an international leader in the science and technology of ocean observing systems, and to maximize associated economic and societal benefits through the delivery of derived data products, commercialization and outreach. The NEPTUNE, VENUS and Cambridge Bay cabled ocean observatories collect data on physical, chemical, biological, and geological aspects of the ocean over long time periods, supporting research on complex Earth processes in ways not previously possible. The 800-km NEPTUNE observatory, the nearly 50-km VENUS coastal observatory and the Cambridge Bay Arctic observatory—which together make up the Ocean Networks Canada Observatory—stream live data from instruments at key sites off coastal BC and Nunavut via the Internet to scientists, policy-makers, educators and the public around the world. They also provide unique scientific and technical capabilities that permit researchers to operate instruments remotely and receive data at their home laboratories anywhere on the globe in real time.

Applicable Criteria:
Recipients of the Major Science Initiative from the Canada Foundation for Innovation
In October 2012, the governments of Canada and British Columbia announced a total of $41.7 million in new funding to support Ocean Networks Canada through the Canada Foundation for Innovation (CFI) Major Science Initiatives (MSI) program. Of the total, CFI contributed $32.8 million and the BC government contributed $8.9 million. In 2014, Western Economic Diversification Canada contributed another $9 million towards the Smart Oceans™ BC program, and Transport Canada contributed $20 million. The MSI funds are being used to support ONC projects such as the completion of the world’s largest deep-sea tsunami array, new instruments to improve marine safety in the Strait of Georgia, and the first subsea instrument platform in the Arctic. The funds, together with in-kind contributions from IBM Canada, will also support the creation and operational delivery of advanced derived data products.

TAPoR - Portal

Project Name:
TAPoR

Discipline:
Humanities

Applicant:
Geoffrey Rockwell, Professor, Philosophy and Humanities Computing, University of Alberta; and, Director, Kule Institute for Advanced Study

Research Summary:
TAPoR is a place where scholars can experiment with various online tools that allow them to manipulate, analyze and visualize electronic texts. TAPoR was designed as a portal where people can find analysis tools and comment on them. As such, it needs to be “always on” to be useful to the community. Through TAPoR, researchers can:

  • Discover text manipulation, analysis, and visualization tools
  • Discover historic tools
  • Read tool reviews and recommendations
  • Learn about papers, articles and other sources about specific tools
  • Tag, comment, rate and review collaboratively

Applicable Criteria:
Applications that provide a public platform that will make use of Compute Canada computing or storage
As a public platform, the Text Analysis Portal for Research (TAPoR) receives between 150 and 200 online visitors each day and is regularly accessed from more than 100 countries, including the United States, Canada, the United Kingdom, Australia, India, Germany, Brazil, and the Netherlands. Since its launch in 2012, the TAPoR founders have continuously updated and adapted the portal to serve the needs of digital humanists. The TAPoR team applied to the 2015 RPP competition to support a further renewal and expansion of the portal to integrate new functionality for a SSHRC Partnership grant on Text Mining the Novel (TMN), a multi-university Digital Humanities initiative led by Andrew Piper, an Associate Professor in the Department of Art History and Communications Studies at McGill University.

Through the TMN project, TAPoR will become a site where text mining code developed by the distributed network of researchers can be returned to the larger community of textual scholars along with “recipes” or documentation on how to use mining methods. This virtual sandbox for experimenting with mining methods could be transformative for the interpretative humanities. The computing and storage resources requested in TAPoR’s RPP application support the use of:

  • a production portal / web server as the platform’s main interface
  • a work server to execute computation jobs
  • storage capacity for running and saving code

TAPoR team members are also working with Compute Canada’s digital humanities consultants, who advise on best practices for developing and sharing code that can be run on Compute Canada systems. The TMN project has the potential to leverage Compute Canada resources to develop, test, document and share text-mining methods of interest to textual scholars across the country.

ATLAS - Platform

Project Name:
ATLAS

Discipline:
Physics

Applicant:
Reda Tafirout, Research Scientist, TRIUMF

Research Summary:
The ATLAS experiment at the Large Hadron Collider (LHC) is studying proton-proton, proton-lead, and lead-lead collisions at very high energy. The unprecedented energy densities created are allowing researchers to study the structure of matter at much smaller scales than was previously possible, to extend investigations of the fundamental forces of Nature, to understand the origin of matter, and to search for physics beyond the Standard Model (SM) of particle physics. The experiment is collecting several petabytes of raw data each year during the LHC operations, and is producing numerous derived and simulated datasets. The enormous amount of data generated by ATLAS is distributed and analyzed on the Worldwide LHC Computing Grid (WLCG) infrastructure, an international network of high-performance computing centres, for which Canada is a significant contributor and a key player with a Tier-1 centre at TRIUMF and four Tier-2 facilities at Compute Canada.

Applicable Criteria:
Groups engaging in international agreements to provide multi-year computing or storage solutions based in Canada
Nearly 3,000 researchers from 177 institutions in 38 countries participate in ATLAS. The computing resources are a vital instrument for the research program and for making breakthrough discoveries. As such, the overall ATLAS scientific program requires a large amount of disk storage and computing facilities at a global scale. In Canada, portals have been implemented that consist of a Compute Element (CE), which acts as a gateway for ATLAS jobs, and a Storage Resource Manager (SRM), which provides access to disk resources. These portals are fully integrated within ATLAS data management and workload management systems. In addition, the use of two Tier-2 federations at Compute Canada facilities, one in the East and one in the West, allows for load balancing across sites and leverages the knowledge accumulated at these sites over the past 5-7 years of operation and integration into WLCG. ATLAS-Canada has worked with Compute Canada since its inception to fully integrate its facilities into the WLCG, and the required Grid and networking infrastructures have been set up and high-level expertise assembled across Canada. These resources allow ATLAS-Canada to meet its increasing commitments to ATLAS as more data are being generated, and allow Canadians to remain competitive on the world stage. For example, the discovery of the Higgs boson in 2012 would not have been possible without the WLCG infrastructure; in particular, the Canadian Tier-1 centre at TRIUMF and Tier-2 facilities at Compute Canada provided crucial extra computational resources and storage capacity that facilitated this discovery.

CANFAR - Platform

Project Name:
CANFAR

Discipline:
Astronomy

Applicant:
Christopher Pritchet, Professor, Department of Physics and Astronomy, University of Victoria

Research Summary:
The Canadian Advanced Network for Astronomical Research (CANFAR) is a virtual computing environment for astronomers. CANFAR represents the only repository of most of Canada’s telescope data collections and it uses a cloud-based framework to provide access to very large resources for both storage and processing. Its services can be broken down into three major functional areas:

  • Data Discovery and Access Services (based on bulk storage and database systems): These house the data collections from Canada’s major telescopes.
  • VOSpace Service: A user-managed storage service that gives major projects, survey teams, and individuals data sharing and intermediate-term storage.
  • Processing Services: These support a variety of applications including batch cloud processing, persistent Virtual Machines (VMs) that support Software-as-a-Service for visualization and other applications, and VM configuration and management services.

Applicable Criteria:
Groups that are providing shared data sets accessible using a third party (non-Compute Canada) interface
For nearly five years, CANFAR has operated a distributed data management system with storage nodes at the University of Saskatchewan, University of Victoria, and the Canadian Astronomy Data Centre (CADC) in Victoria. Users get data from, and put data into, this storage system without knowing where it is coming from or going to because the system is fully integrated. In the year ending March 31, 2014, a total of 253 Canadians and more than 4,000 international users downloaded data from CANFAR data collections. Data collections, VOSpace, and cloud processing are accessed through web browser interfaces, command lines, a variety of client software, and as web services. The ongoing development of the CANFAR platform is being leveraged to serve not only the astronomy research community, but also to make significant contributions to the design and deployment of research data management systems that serve research communities with emerging needs that are not yet well-served by existing infrastructure. CANFAR is working closely with Compute Canada to develop systems that better support the needs of data-intensive science in astronomy and other fields. As a Big Data community, astronomy has much to contribute to other research areas.

Resources requested on behalf of a large community of users that will be reallocated to individuals and small groups following the award
The two primary client communities for the CANFAR platform are astronomical observatories and university researchers. Most access to the astronomy data collections is anonymous. After receiving a main resource allocation from Compute Canada, CANFAR redistributes portions of its allocation to projects within its community of users. For example, in 2015, CANFAR was allocated access to 400 core years and 720 TB of storage on Compute Canada systems. CANFAR is governed by a Science Management Committee (SMC) that is responsible for setting policy, requesting resource allocations from Compute Canada, requesting project funding from funding agencies such as CANARIE, overseeing operations, and allocating CANFAR resources to users. A sub-committee of the SMC handles the distribution of resource allocations within CANFAR, reviewing each project’s request for storage and/or processing needs. Once resources are allocated, CANFAR’s user support staff work with the project to implement the request.

 

CBRAIN - Platform

Project Name:
CBRAIN

Discipline:
Neuroscience

Applicant:
Dr. Alan Evans, Professor, Neurology and Neurosurgery; Professor, Medical Physics; Professor, Biomedical Engineering, McGill University

Research Summary:
CBRAIN is web-based software that allows neuroimaging researchers to perform computationally intensive analyses on data by connecting them to High-Performance Computing (HPC) facilities across Canada and around the world. CBRAIN connects researchers to the tools and processing power required to handle the large neuroimaging datasets that have become the norm in the field. It does so while at the same time reducing the technical expertise required to use these resources. No computer programming skills are required and it is not necessary to install any software. All that is required is a modern web browser of any kind. A range of neuroimaging analysis tools are available, as well as cutting-edge 2D and 3D real-time visualization to view the brain imaging data.

Applicable Criteria:
Groups that are providing shared data sets accessible using a third party (non-Compute Canada) interface
The CBRAIN web-platform currently interacts transparently with 11 compute clusters and servers and 34 remote data sources across the country and Europe. It should be noted that the CBRAIN platform is not simply a research data- and computation-grid, but more importantly, it is an online collaborative research environment with modern visualization tools, which is transforming how neuroscientists work and interact together. Since 2009, the platform has been used by roughly 270 scientists/labs spread over 50 cities in 20 countries to launch millions of individual operations on brain scans (70% are Canadian users; the others are international collaborators). Most of these portal users do not have any experience interacting with computerized systems using command-line shells. This community now benefits from an unmatched ability to share tools, compute resources and data online, entirely within the ease and comfort of a web browser. This has the immediate benefit of standardizing tools, methods and traceability, thus ensuring better experimental result reproducibility between dispersed teams; a currently problematic area where we are actively promoting unified platforms. The CBRAIN platform is generic in nature: it can accommodate any data type and any scriptable tools, whether in life sciences, astronomy, physics or humanities. CBRAIN also has a web application programming interface (API) which allows any CBRAIN user group to integrate external portals, databases and platforms with CBRAIN services (authentication, data registration and movement, job launching and monitoring, result importation).

Resources requested on behalf of a large community of users that will be reallocated to individuals and small groups following the award
The CBRAIN Portal has been serving a growing neuro-scientific community continually since 2008, and its LORIS Study Management Portal has done so since 2001. National and international collaborations are at the heart of the group’s work. Most of its active studies involve strong collaboration and leadership on both Canadian and international levels. All around the world, the neuro-scientific community is converging towards larger, real-time collaborations, as exemplified by projects such as the International Neuroinformatics Coordinating Facility (INCF), Neugrid4U (EU), LONI (USA), the Human Brain Project (EU), and the current alignment of Brain Canada. As an active partner in these major neuroscience initiatives and in global standardization discussions, CBRAIN is uniquely positioned to create a bridge between Canadian researchers and other neuroscience initiatives around the world. CBRAIN uses Virtual Organizations (VOs) and a fine-grained permission system for all resources, data and tools, allowing relevant sharing and collaboration across geographically distant groups. CBRAIN team members interact with international teams concerning matters of experimental design, pre-processing, analysis, and user support on a weekly basis.

Genetics and Genomics Analysis Platform (GenAP) - Platform

Project Name:
Genetics and Genomics Analysis Platform (GenAP)

Discipline:
Genomics

Applicant:
Dr. Guillaume Bourque, Associate Professor, Department of Human Genetics, McGill University; and, Director of Bioinformatics, McGill University & Genome Quebec Innovation Centre
Dr. Jacques Pierre-Etienne, Assistant Professor, Sherbrooke University

Research Summary:
Starting in 2013, funding from CANARIE and Génome Québec was used to develop the Genetics and Genomics Analysis Platform (GenAP) as a hub to facilitate the installation and the sharing of genetics and genomics state-of-the-art analysis pipelines on Compute Canada’s High Performance Computing (HPC) facilities. GenAP also facilitates the implementation and the distribution of a new generation of user-friendly bioinformatics tools analyzing both private and public data sets. GenAP helps realize the potential of genetics and genomics research and bring new insights to different fields of biological sciences by empowering a user community that traditionally has not been using HPC resources very effectively. GenAP also offers a number of genomics services and tools to the community at large. These include: the International Human Epigenome Consortium (IHEC) Data Portal and a public mirror of the UCSC Genome Browser and of Galaxy.

Applicable Criteria:
Applications that provide a public platform that will make use of Compute Canada computing or storage
Through GenAP, researchers can log in using their Compute Canada ID (CCID) to process and analyze data using advanced genomic tools on HPC resources, drawing on their allocated computing and storage quotas. GenAP also offers a number of genomics services and tools to the community at large, including the International Human Epigenome Consortium (IHEC) Data Portal and public mirrors of the UCSC Genome Browser and Galaxy. These tools have a growing user base both nationally and internationally, with already more than 100 visitors per day. A major outcome of GenAP is a flexible environment in which to install, share and run genetics and genomics tools. This integrated platform also supports data sharing and exchange in unparalleled ways for life science researchers. Some of the services are restricted to national users (with CCID accounts) while others (e.g. IHEC Data Portal) have international reach.
