Compute Canada Report: Increasing Canadian Research Impact
Compute Canada Bibliometric Analysis
Using CVs collected from more than 2,300 Canadian faculty members who are active Compute Canada (CC) users, Compute Canada has begun an analysis of the CC-enabled publications of those faculty members. CC-enabled publications show scientific impact well above both the world and Canadian averages across a broad range of disciplines. When considered institution-by-institution, CC-enabled publications generally have greater impact than the non-CC publications at the same institution. The greatest differences are seen in emerging advanced research computing (ARC) disciplines in the social sciences and humanities. This document uses common bibliometric metrics to describe publication impact over the collected dataset.
Dataset Collection and Processing
In 2016, Compute Canada introduced the Canadian Common CV (CCV) into the account renewal process. This is the same CV system used by the federal granting councils. As such, the reporting of research outcomes is more complete than in previous years. In all, more than 2,300 CCVs were collected between March 2, 2016 and April 8, 2016. From these CCVs, more than 50,000 journal articles and nearly 20,000 conference publications were reported. In addition to the CCV, principal investigators (PIs) were asked to identify which of their papers were enabled by Compute Canada support. These self-reported identifications were accepted as-is. This resulted in more than 25,000 journal articles and more than 8,000 conference publications reported as enabled by Compute Canada support. Summary tables of the collected data are shown below.
| Unique CCVs Submitted | 2,306 |
| Total contributions submitted | 76,420 |
| Contributions enabled by CC | 35,666 |
Publication Specifics (enabled by CC)
| Publication Type | Total | Enabled by CC |
| Clinical Care Guideline | 46 | 18 |
Publications by Research Area
| Research Area | Unique CCVs Submitted | Total Pubs | Enabled by CC | % enabled |
| Biological and Life Sciences | 417 | 12,296 | 4,392 | 36% |
| Chemistry and Biochemistry | 267 | 7,465 | 3,042 | 41% |
| Computer and Information Science | 209 | 5,548 | 2,302 | 41% |
| Environmental and Earth Science | 166 | 5,042 | 2,562 | 51% |
| Mathematics and Statistics | 136 | 3,428 | 1,717 | 50% |
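The "% enabled" column can be reproduced from the counts in the table above. The following is a minimal Python sketch using the reported figures; the function name `percent_enabled` is introduced here for illustration only and is not part of the original analysis.

```python
def percent_enabled(total_pubs: int, enabled_by_cc: int) -> int:
    """Return the share of publications enabled by CC as a whole-number percentage."""
    return round(100 * enabled_by_cc / total_pubs)


# (research area, total publications, publications enabled by CC),
# copied from the "Publications by Research Area" table.
rows = [
    ("Biological and Life Sciences", 12296, 4392),
    ("Chemistry and Biochemistry", 7465, 3042),
    ("Computer and Information Science", 5548, 2302),
    ("Environmental and Earth Science", 5042, 2562),
    ("Mathematics and Statistics", 3428, 1717),
]

for area, total, enabled in rows:
    print(f"{area}: {percent_enabled(total, enabled)}%")
```

Applied to the overall totals (76,420 contributions submitted, 35,666 enabled by CC), the same calculation gives roughly 47%.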
The bibliometric analysis has so far been pursued for the set of journal articles plus conference publications. This represents a starting dataset of more than 70k articles with more than 33k enabled by CC. However, the current version of the dataset was captured about a week before the renewal deadline and so the number of publications is about 10% lower than what is shown above.
In addition, since some of the PIs are collaborators, the reported publication lists above include non-negligible double-counting. Before the bibliometric analysis, we attempted to obtain a Digital Object Identifier (DOI) for each publication using the Crossref DOI API. Duplicate DOIs were then removed, and the analysis proceeded with the set of unique DOIs. Unfortunately, there were some articles for which the tool could not find a DOI, even when a valid DOI exists. As a result, only about 18k CC-enabled articles were available for bibliometric analysis. The approximately 9k articles (including duplicates) for which no DOI was found were examined, and certain broad categories of publication were found to be over-represented in the no-DOI set, including particle physics papers with long author lists. These papers were re-inserted by hand using alternate data sources provided by the corresponding experiments.
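The DOI lookup and de-duplication step can be sketched as follows. This is a minimal Python illustration, not the tool actually used for the report: it assumes the public Crossref REST endpoint `https://api.crossref.org/works` with its `query.bibliographic` parameter, and the helper names (`build_query_url`, `lookup_doi`, `normalize_doi`, `unique_dois`) are hypothetical. It also accepts the top match without checking its quality, which a production pipeline would need to do.

```python
import json
import urllib.parse
import urllib.request
from typing import Iterable, List, Optional

CROSSREF_WORKS_URL = "https://api.crossref.org/works"


def build_query_url(citation: str) -> str:
    """Build a Crossref works query for a free-text citation, requesting one result."""
    params = urllib.parse.urlencode({"query.bibliographic": citation, "rows": 1})
    return f"{CROSSREF_WORKS_URL}?{params}"


def lookup_doi(citation: str) -> Optional[str]:
    """Return the DOI of the top Crossref match, or None if no match (needs network)."""
    with urllib.request.urlopen(build_query_url(citation), timeout=30) as response:
        items = json.load(response)["message"]["items"]
    return items[0]["DOI"] if items else None


def normalize_doi(doi: str) -> str:
    """DOIs are case-insensitive, so lower-case them and drop any resolver prefix."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "https://dx.doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def unique_dois(dois: Iterable[Optional[str]]) -> List[str]:
    """Drop missing DOIs (None) and duplicates, keeping first-seen order."""
    seen = {}
    for doi in dois:
        if doi is not None:
            seen.setdefault(normalize_doi(doi), None)
    return list(seen)
```

For example, `unique_dois(["10.1000/ABC", "https://doi.org/10.1000/abc", None])` collapses the two spellings of the same placeholder DOI into one entry and discards the publication for which no DOI was found.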
The list of CC-enabled articles was then processed using SciVal (from Elsevier). At this time, 15,874 CC-enabled articles have been fully processed and are included in the results below.
Metrics Considered – FWCI
In addition to counting articles and citations, we have made extensive use of the metric known as Field-Weighted Citation Impact (FWCI). The description of FWCI below is taken from the Snowball Metrics Recipe Book, available from http://www.snowballmetrics.com.
The FWCI is the ratio of the total citations received to the total citations that would be expected based on the average for the subject field. This means that:
- An FWCI of exactly 1 means that the output performs just as expected for the global average
- An FWCI of more than 1 means that the output is more cited than expected according to the global average
- An FWCI of less than 1 means that the output is less cited than expected according to the global average.
As an example, a score of 1.6 means that the paper received 60% more citations than the world average for that discipline.
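The definition above can be written out as a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the citation counts (16 received against 10 expected) are made-up numbers chosen to reproduce the 1.6 example.

```python
def fwci(citations_received: float, citations_expected: float) -> float:
    """Ratio of citations received to citations expected for the subject field."""
    if citations_expected <= 0:
        raise ValueError("expected citations must be positive")
    return citations_received / citations_expected


def percent_vs_world_average(score: float) -> float:
    """Percentage above (positive) or below (negative) the world average of 1.0."""
    return (score - 1.0) * 100


# Illustrative numbers only: 16 citations received where 10 would be expected.
score = fwci(16, 10)
print(f"FWCI = {score:.2f}, {percent_vs_world_average(score):.0f}% vs. world average")
```

A score below 1 yields a negative percentage, for example an FWCI of 0.8 corresponds to 20% fewer citations than expected.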
The advantage of FWCI is that it accounts for disciplinary differences in publication rates, citation rates and collaborative authorship. It also takes into account year of publication, counting citations obtained for up to 3 full calendar years after the publication date. When combining different fields, a harmonic average is used.
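To illustrate the harmonic average, the sketch below assumes it is applied to the expected citation counts of the fields a paper is classified in; the report itself only states that a harmonic average is used when combining fields. The function names and the numbers (field averages of 10 and 30, a paper with 30 citations) are hypothetical.

```python
from statistics import harmonic_mean
from typing import Sequence


def expected_citations_multi_field(field_averages: Sequence[float]) -> float:
    """Combine per-field expected citations with a harmonic mean (assumed usage)."""
    return harmonic_mean(field_averages)


def fwci_multi_field(citations_received: float, field_averages: Sequence[float]) -> float:
    """FWCI of a paper classified in several fields, under the assumption above."""
    return citations_received / expected_citations_multi_field(field_averages)


# Illustrative numbers only: a paper in two fields with averages of 10 and 30.
averages = [10, 30]
print(expected_citations_multi_field(averages))  # harmonic mean, lower than the arithmetic mean of 20
print(fwci_multi_field(30, averages))
```

Because the harmonic mean is pulled toward the smaller value, a paper that spans a low-citation and a high-citation field is not held to the high-citation field's standard alone.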
Distribution of Articles by Discipline and by Institution
Compute Canada-supported researchers are highly collaborative, both with their Canadian colleagues and with colleagues around the world. The map below shows the number of collaborating institutions involved with the CC-enabled publications, broken down by country.
Impact by Discipline
It is possible to obtain the FWCI for the CC-enabled dataset, broken down by discipline, and to compare it to both the Canadian and world averages in each discipline. The chart below shows the average FWCI for a broad range of disciplines in which there are at least 100 CC-enabled publications reported. In all disciplines, the CC-enabled average FWCI is substantially above the world average (represented by the vertical yellow line at 1.0). Further, in all disciplines the CC-enabled average FWCI exceeds the Canadian average (represented by the grey bars). The chart is ordered from top to bottom by the difference between the CC-enabled average FWCI (represented by the blue bars) and the Canadian average.
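The selection and ordering used for the chart can be sketched in a few lines of Python. Everything in this example is hypothetical: the type `DisciplineImpact`, the function `chart_order`, and the placeholder disciplines and FWCI values, which are not results from this analysis.

```python
from typing import List, NamedTuple

MIN_PUBLICATIONS = 100  # threshold stated in the text


class DisciplineImpact(NamedTuple):
    name: str
    publications: int       # number of CC-enabled publications
    cc_enabled_fwci: float  # average FWCI of CC-enabled publications
    canadian_fwci: float    # Canadian average FWCI for the discipline

    @property
    def gap(self) -> float:
        """Difference between the CC-enabled and Canadian average FWCI."""
        return self.cc_enabled_fwci - self.canadian_fwci


def chart_order(rows: List[DisciplineImpact]) -> List[DisciplineImpact]:
    """Keep disciplines with enough publications, largest gap first."""
    eligible = [r for r in rows if r.publications >= MIN_PUBLICATIONS]
    return sorted(eligible, key=lambda r: r.gap, reverse=True)


# Placeholder values for illustration only, not data from the report.
sample = [
    DisciplineImpact("Discipline A", 500, 2.0, 1.2),
    DisciplineImpact("Discipline B", 1200, 1.5, 1.4),
    DisciplineImpact("Discipline C", 150, 3.0, 1.5),
    DisciplineImpact("Discipline D", 90, 4.0, 1.0),  # below the threshold, excluded
]

for row in chart_order(sample):
    print(f"{row.name}: gap = {row.gap:.2f}")
```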
Generally speaking, the largest difference above the Canadian average is seen in emerging disciplines. This implies that in disciplines where ARC adoption is not yet widespread, those researchers who take advantage of CC resources see a large gain in impact, while disciplines with wide ARC adoption naturally see an impact which is closer to the field average.
The impact of CC-enabled publications can also be compared to the average impact in comparator countries. The chart below shows a discipline-by-discipline comparison of the impact of CC-enabled papers with the average impact in 6 different countries.
Impact by Institution
It is also possible to break down the publication impact results by Canadian institution. The chart below shows the CC-enabled average FWCI (dark blue bars) compared to the institution's overall average (light blue bars). The number written within each bar represents the total number of publications included in the average. In nearly all cases, the CC-enabled average FWCI exceeds the overall institutional average.
Compute Canada has enabled more than 33,000 research publications since 2010. A bibliometric analysis of approximately half of those publications reveals that the impact of CC-supported research is above both the world and Canadian averages in all disciplines with at least 100 publications reported. The strongest differences in impact are seen in disciplines in which ARC adoption is at a relatively early stage, in particular the social sciences and humanities.