The Research, Condition, and Disease Categorization (RCDC) System


RCDC - How the Process Works

The RCDC computer-based process sorts NIH-funded projects into categories of research area, disease, or condition. The four main steps in the RCDC categorization process are outlined below.

Step One: Choose the Category

A category can be a research area such as neuroscience, a disease such as diabetes, or a condition such as chronic pain. The RCDC process will continue to report on the more than 280 categories the NIH has historically reported to Congress and the public.

The categories are listed on the Categorical Spending page.

Step Two: Create the Category Definition

Scientific experts from across the NIH Institutes and Centers worked together to define each category. They followed these four steps to create the definitions.

1. Choose the terms and concepts
A category definition is a series of the concepts most relevant to the category. These concepts are chosen from the RCDC thesaurus, which consists of more than 180,000 biomedical concepts and synonyms. The RCDC thesaurus combines terms and concepts from several sources:

  • National Library of Medicine’s MeSH (Medical Subject Headings) thesaurus
  • CRISP thesaurus
  • The National Cancer Institute’s thesaurus
  • Metathesaurus
  • Jablonski’s dictionary
  • Other specific types of concepts from NIH Institutes and Centers
  • Additional words or phrases added by NIH scientific experts to ensure capture of specific areas
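
One way to picture a thesaurus like this is as a lookup from synonyms to a single preferred concept. The sketch below is illustrative only: the entries and the `normalize` helper are hypothetical and are not taken from the actual RCDC thesaurus.

```python
from typing import Optional

# Hypothetical miniature thesaurus: each synonym maps to one preferred concept.
# The entries are invented for illustration; the real thesaurus is far larger.
THESAURUS = {
    "diabetes": "diabetes",
    "diabetes mellitus": "diabetes",
    "chronic pain": "chronic pain",
    "persistent pain": "chronic pain",
}


def normalize(term: str) -> Optional[str]:
    """Return the preferred concept for a term, or None if it is not listed."""
    return THESAURUS.get(term.strip().lower())
```

Here `normalize("Persistent Pain")` and `normalize("chronic pain")` resolve to the same concept, which is what lets synonyms count as one match.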

2. Add a weight
The scientific experts can add a weight (using a mathematical formula) to each term or concept. The weight helps show the relative significance of that term or concept to the overall category definition. The weight also helps RCDC sort projects into the most appropriate categories.

3. Set a threshold
The scientific experts also set a threshold for each category. The threshold is the minimum number of matched terms and concepts between the category definition and a funded project. If a project meets or exceeds a category's threshold, RCDC includes it in that category. Thresholds reduce the chance that a funded grant or contract will be included in an unrelated category.
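
A category definition with weights and a threshold could be represented roughly as follows. This is a minimal sketch under stated assumptions: the class name, the weight scale, and the sample values are hypothetical, and the threshold is modeled as a numeric score because the actual formula is not given here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryDefinition:
    """Hypothetical container for one category definition."""
    name: str
    weights: dict[str, float]  # concept -> relative significance (assumed scale)
    threshold: float           # minimum match score for inclusion (assumed)

    def meets_threshold(self, score: float) -> bool:
        """A project is included when it meets or exceeds the threshold."""
        return score >= self.threshold


# Illustrative values only; real definitions are set by NIH scientific experts.
chronic_pain = CategoryDefinition(
    name="Chronic Pain",
    weights={"chronic pain": 1.0, "neuropathy": 0.6, "analgesic": 0.4},
    threshold=1.0,
)
```

The comparison uses `>=` so that a project sitting exactly at the threshold is included, matching the "meets or exceeds" wording.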

4. Validate the definition
The final step is to validate the category definition. Validation is an important part of the categorization process. NIH scientific experts want to be sure the RCDC process is as sensitive and specific as possible. To validate a definition:

  • All projects in the entire database are tested against the category definition.
  • Experts review the resulting list of projects assigned to the category.
  • They then suggest refinements to the category's terms, concepts, weights, and thresholds so that the list of categorized grants and contracts is as valid as possible.
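
Sensitivity and specificity have standard definitions that can be computed once experts have judged which projects truly belong in a category. The sketch below applies those standard formulas; how NIH experts actually measure validity is not described here, and the function name and project IDs are hypothetical.

```python
def sensitivity_specificity(
    assigned: set[str], relevant: set[str], all_projects: set[str]
) -> tuple[float, float]:
    """Compare automated assignments with expert judgments for one category."""
    true_pos = len(assigned & relevant)    # assigned and truly relevant
    false_neg = len(relevant - assigned)   # relevant but missed
    false_pos = len(assigned - relevant)   # assigned but not relevant
    true_neg = len(all_projects - assigned - relevant)
    sensitivity = true_pos / (true_pos + false_neg) if relevant else 0.0
    negatives = true_neg + false_pos
    specificity = true_neg / negatives if negatives else 0.0
    return sensitivity, specificity


# Hypothetical project IDs for illustration.
everything = {"P1", "P2", "P3", "P4", "P5"}
sens, spec = sensitivity_specificity(
    assigned={"P1", "P2", "P3"}, relevant={"P1", "P2", "P4"}, all_projects=everything
)
```

Low sensitivity suggests missing terms or a threshold that is too high; low specificity suggests terms that are too broad or a threshold that is too low.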

Once the scientific experts have chosen the terms and concepts, added weights, set thresholds, and run through the validation testing, the RCDC categories are defined. Scientific experts will periodically review and update the category definitions to account for new science or other changes.

Step Three: Create the Project Summary

Project summaries are lists of scored terms and concepts that the RCDC process uses to describe NIH-funded projects. The RCDC process creates a project summary for each funded grant and contract listed in the NIH database. RCDC includes the following types of funded grants and contracts:

  • Grants awarded to scientists outside the NIH campus (extramural grants)
  • Research and development contracts
  • Research projects conducted by NIH staff scientists on the NIH campus (intramural research)

To create a project summary, RCDC:

  1. Searches through the project's title, abstract, specific aims, and public health relevance section to find terms and concepts, or their synonyms, that match the RCDC thesaurus.
  2. Ranks the matching terms and concepts based on how often they occur within the searched sections of the project. The more times a term or concept appears, the higher the score that term or concept gets. Terms and concepts found in the title always get the highest weight no matter how often they occur.

The resulting list of scored terms and concepts is the RCDC project summary for that NIH-funded project.
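
The two steps above can be sketched as a frequency count with a boost for title matches. This is an assumption-laden illustration, not the RCDC scoring method: the concept list, the 0-to-1 scale, and the function names are hypothetical, and `body` stands in for the abstract, specific aims, and public health relevance text combined.

```python
import re

# Hypothetical concept list standing in for the RCDC thesaurus.
CONCEPTS = ["chronic pain", "diabetes", "inflammation", "neuropathy"]


def count_matches(concept: str, text: str) -> int:
    """Count whole-phrase, case-insensitive occurrences of a concept in text."""
    pattern = r"\b" + re.escape(concept) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def project_summary(title: str, body: str) -> dict[str, float]:
    """Map each matched concept to a score between 0 and 1 (assumed scale)."""
    in_title = {c: count_matches(c, title) for c in CONCEPTS}
    totals = {c: in_title[c] + count_matches(c, body) for c in CONCEPTS}
    top = max(totals.values(), default=0)
    summary = {}
    for concept, n in totals.items():
        if n == 0:
            continue  # concept not mentioned in this project
        # Assumption: title concepts get the top score; others scale with frequency.
        summary[concept] = 1.0 if in_title[concept] else n / top
    return summary
```

With this scheme a concept that appears once in the title scores as high as the most frequent concept in the body, which mirrors the rule that title terms always rank highest.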

Step Four: Match the Projects to the Categories

The RCDC system compares the project summary to the category definition to determine how closely they match. If the RCDC project summary meets the threshold score set by NIH scientific experts, RCDC assigns that grant or contract to that category.
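
The comparison can be illustrated as a weighted overlap between the two lists of concepts. The formula below (sum of category weight times project score over shared concepts) is an assumption chosen for simplicity; the actual RCDC matching formula is not specified here, and all names and values are hypothetical.

```python
def match_score(summary: dict[str, float], weights: dict[str, float]) -> float:
    """Sum weight x project score over concepts present in both inputs."""
    shared = summary.keys() & weights.keys()
    return sum(weights[c] * summary[c] for c in shared)


def is_assigned(
    summary: dict[str, float], weights: dict[str, float], threshold: float
) -> bool:
    """Assign the project when its match score meets or exceeds the threshold."""
    return match_score(summary, weights) >= threshold


# Hypothetical project summary and category definition.
summary = {"chronic pain": 1.0, "neuropathy": 0.5, "diabetes": 0.25}
weights = {"chronic pain": 1.0, "neuropathy": 0.6, "analgesic": 0.4}
```

In this example only "chronic pain" and "neuropathy" appear on both sides, so the score is 1.0 × 1.0 + 0.6 × 0.5 = 1.3, and the project would be assigned under a threshold of 1.0 but not under a threshold of 1.5.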

RCDC compiles a list of all the funded grants and contracts that fit into each category. These lists, called project listings, also include details such as funding amounts.
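
Compiling project listings amounts to grouping assigned projects by category and carrying their funding amounts along. The sketch below assumes a simple input of (category, project ID, funding amount) records; the record layout, IDs, and dollar figures are hypothetical.

```python
from collections import defaultdict


def build_listings(
    assignments: list[tuple[str, str, int]]
) -> dict[str, list[tuple[str, int]]]:
    """Group (category, project_id, amount) records into per-category listings."""
    listings = defaultdict(list)
    for category, project_id, amount in assignments:
        listings[category].append((project_id, amount))
    return dict(listings)


def category_total(listing: list[tuple[str, int]]) -> int:
    """Total funding across the projects in one category listing."""
    return sum(amount for _, amount in listing)


# Hypothetical assignment records for illustration.
records = [
    ("Chronic Pain", "P1", 250_000),
    ("Diabetes", "P2", 400_000),
    ("Chronic Pain", "P3", 100_000),
]
listings = build_listings(records)
```

Each listing keeps the individual projects, so totals per category can be recomputed from the same data that is shown to readers.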

The RCDC process enables the NIH to apply the latest technology to report consistently on how America's tax dollars are spent to support medical research. The computer technology (a knowledge management application) allows the NIH to categorize funded research consistently across all of its Institutes and Centers. With the RCDC process, the NIH is able to provide direct public access on a website to detailed and complete project listings.