Categorising grants by theme to help a major donor understand their impact

Grants tagger

CLIENT

Wellcome Trust

NLP

CONTENT CATEGORIZATION

MODEL

TRANSFORMERS, TF-IDF, SVM, CNN, BiLSTM

Objective

The Wellcome Trust needed clearer visibility of where its research grants were being spent, so that it could understand the grants' impact and use that understanding to inform ongoing strategic decisions.

Problem

The Wellcome Trust awards more than 800 research grants each year, totalling over £1 billion. However, Wellcome had no structured way to break down this large portfolio into fine-grained, meaningful categories, such as specific diseases like malaria or broader themes like infectious diseases.



Every time Wellcome wanted to answer nuanced questions about funded research in a specific domain (e.g. a specific disease), they had to fall back on a labour-intensive process of manually categorising grants.

Solution

The biomedical sector has a number of systems for classifying information according to its theme. One of the most comprehensive is the National Library of Medicine's Medical Subject Headings (MeSH).



We worked with the Wellcome Trust to implement an automated solution using natural language processing (NLP). The solution classifies Wellcome's grants by their content using the MeSH system and achieves 65% accuracy, which is close to the state-of-the-art figure of 71%.
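As a rough illustration of what content-based tagging involves, the sketch below assigns MeSH-style labels to a grant abstract using TF-IDF vectors and a nearest-centroid rule. It is not the production model: the training examples, label names, function names and the 0.3 threshold are all hypothetical, and the nearest-centroid step is a deliberately simple stand-in for the SVM and neural models listed above. It uses only the Python standard library.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples: (grant abstract, set of MeSH-style labels).
TRAIN = [
    ("malaria parasite transmission by mosquitoes in sub-saharan africa", {"Malaria"}),
    ("drug resistance in plasmodium malaria parasite", {"Malaria", "Drug Resistance"}),
    ("antibiotic drug resistance in bacterial infections", {"Drug Resistance"}),
    ("neural circuits underlying memory in the hippocampus", {"Memory"}),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def fit_idf(texts):
    """Compute a smoothed inverse document frequency for each corpus token."""
    n_docs = len(texts)
    doc_freq = Counter(tok for text in texts for tok in set(tokenize(text)))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0 for tok, df in doc_freq.items()}


def normalise(vec):
    """Scale a sparse vector (dict) to unit L2 length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()} if norm else {}


def tfidf(text, idf):
    """Return the unit-length TF-IDF vector of a text, ignoring unseen tokens."""
    counts = Counter(tok for tok in tokenize(text) if tok in idf)
    return normalise({tok: n * idf[tok] for tok, n in counts.items()})


def fit_centroids(examples, idf):
    """Build one unit-length centroid per label from its training abstracts."""
    sums = defaultdict(lambda: defaultdict(float))
    for text, labels in examples:
        vec = tfidf(text, idf)
        for label in labels:
            for tok, weight in vec.items():
                sums[label][tok] += weight
    return {label: normalise(vec) for label, vec in sums.items()}


def tag(text, idf, centroids, threshold=0.3):
    """Return every label whose centroid has cosine similarity >= threshold."""
    vec = tfidf(text, idf)
    tags = set()
    for label, centroid in centroids.items():
        score = sum(w * centroid.get(tok, 0.0) for tok, w in vec.items())
        if score >= threshold:
            tags.add(label)
    return tags


idf = fit_idf([text for text, _ in TRAIN])
centroids = fit_centroids(TRAIN, idf)
print(tag("malaria transmission by mosquitoes", idf, centroids))
```

Because each label is scored independently against the threshold, a single grant can receive several labels or none, which matches the multi-label nature of MeSH tagging. A real system would need far more training data and a stronger classifier than this toy rule.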



To best support Wellcome, we also built a dashboard to provide a quick-glance summary of the grant classification.

Impact

Anyone at the Wellcome Trust can now access an easy-to-use, standardised grant classification dashboard, which includes free-text search and granular filters to help locate grants or grant themes. This visibility of grant classification supports better-informed decision-making and allows for finer-grained analysis of the grant portfolio.



The solution we developed with Wellcome runs automatically in the cloud, continuously categorising all the new grants Wellcome awards. We integrated the solution with Wellcome’s existing infrastructure to minimise any disruption, and the new classifications are saved directly to Wellcome’s existing data warehouse.
