Frequently Asked Questions about Research, Condition, and Disease Categorization (RCDC) and the NIH Categorical Spending Reports
- What is RCDC?
- How does RCDC work?
- What is a category?
- What is the difference between a category and a concept?
- What is an RCDC project listing?
- How is a category created?
- Are there categories that do not follow the standard RCDC process?
- Why weren't all categories automated in 2009 when RCDC launched?
- Can a category definition or name be modified over time?
- Historically manual categories are not published once automated. How can I get these data?
- How was the RCDC process developed?
- Why did the NIH develop RCDC?
- Did RCDC change the way NIH funds research?
- How is the RCDC process different from previous NIH reporting systems?
- What are the benefits of the RCDC reports?
- How are the RCDC research area, disease, and condition categories chosen?
- Does RCDC affect the way researchers apply for funding?
- Do the numbers in RCDC reports add up to the total NIH budget?
- What percentage of project dollars are reported in each category?
- Can categories be related or overlap?
- Why might reported dollars change for a category from year to year?
- Can the categorization of individual projects be reviewed and edited if erroneous?
- How are the NIH categorical spending estimates calculated?
- How might the data differ between the NIH categorical spending reports and RePORTER search results?
- How are prevalence and mortality data calculated?
1. What is RCDC?
RCDC (Research, Condition, and Disease Categorization) is a computerized reporting system the National Institutes of Health (NIH) has used to categorize its funding in biomedical research since fiscal year 2008. RCDC reports NIH funding in more than 300 research, condition, and disease categories.
RCDC reports on several types of NIH funding:
- Research grants (extramural research)
- Research and development contracts and Inter-Agency Agreements
- Research conducted in NIH's own laboratories and clinics (intramural research)
- Other Transactions Authority
2. How does RCDC work?
RCDC uses an automated indexing tool that analyzes text from a project's title, abstract, public health relevance, and specific aims. These sections are used because they are consistent across most types of projects and produce uniform, precise analyses of NIH's projects.
The indexing tool matches project text against scientific terminology, referred to as concepts, from RCDC's biomedical thesaurus. Matched concepts create a "project index," which is a weighted list of the concepts found in the project. Each category has a "category fingerprint," which is a list of concepts that are highly relevant to that topic. The tool compares the project index to the category fingerprint to create a "match score." If this match score is above an empirical threshold value set for the category, the project is reported in the category.
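The comparison described above can be sketched in a few lines of Python. This is an illustration only: the FAQ does not state how RCDC computes the match score, so the scoring rule here (a sum of weight products over shared concepts) is an assumption, and every weight, the threshold, and the function names are hypothetical. The concept names are borrowed from the Tobacco example elsewhere in this FAQ.

```python
# Illustrative sketch only. RCDC's actual scoring formula, concept weights,
# and category thresholds are not published in this FAQ; all values are made up.

def match_score(project_index: dict[str, float], fingerprint: dict[str, float]) -> float:
    """Assumed scoring rule: sum of weight products over concepts found in both lists."""
    shared = project_index.keys() & fingerprint.keys()
    return sum(project_index[c] * fingerprint[c] for c in shared)


def is_reported(project_index: dict[str, float], fingerprint: dict[str, float],
                threshold: float) -> bool:
    """A project is reported in a category when its score is above the threshold."""
    return match_score(project_index, fingerprint) > threshold


# Hypothetical category fingerprint: concept -> relevance weight.
tobacco_fingerprint = {"e-cigarette": 1.0, "nicotine abuse": 0.75, "oral tobacco": 0.5}

# Hypothetical project index: concept -> weight within the project text.
project_index = {"e-cigarette": 0.5, "nicotine abuse": 0.25, "adolescent": 0.25}

TOBACCO_THRESHOLD = 0.5  # hypothetical empirical threshold

score = match_score(project_index, tobacco_fingerprint)
print(score, is_reported(project_index, tobacco_fingerprint, TOBACCO_THRESHOLD))
```

In this sketch, the concept "adolescent" contributes nothing to the score because it is not in the fingerprint, and a project that shares no concepts with the fingerprint scores zero and is not reported.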
3. What is a category?
A category is any particular subject encompassing a:
- Research area (e.g., Coronaviruses, Neurosciences, Prevention, Opioids)
- Disease or disorder (e.g., Asthma, Heart Disease, Bipolar Disorder, Pediatric Cancer)
- Condition (e.g., Chronic Pain, Spinal Cord Injury, Infertility)
To see a list of categories the NIH reports on its website, go to the Categorical Spending Page.
4. What is the difference between a category and a concept?
Concepts are terms or keywords found in project text that are useful for categorization. These concepts can be anything ranging from specific diseases, gene names, types of diagnostic tools, therapeutic techniques, to patient populations. A category comprises a list of concepts that describe an entire research area. For example, "e-cigarette," "nicotine abuse," and "oral tobacco" are examples of concepts used in the Tobacco category.
5. What is an RCDC project listing?
A project listing provides details for the totals listed on the categorical spending table. The NIH funding amount for each project, along with administrative data, is available on the project listing. These projects can be extramural grants, contracts, interagency agreements, or intramural projects. Projects can be reported in more than one category because categories are not mutually exclusive.
6. How is a category created?
Categories are created based on NIH-wide discussions with NIH subject matter experts. From these discussions, the scientific areas to include or exclude are determined (called category parameters). Parameters are periodically reviewed and updated to account for new or evolving scientific topics.
Based on the category parameters, a category fingerprint is developed using a list of concepts from the RCDC thesaurus that are relevant to the category. Each concept in the fingerprint is weighted based on its relevance to the topic. The thesaurus is curated to add or remove concepts and synonyms based on scientific trends (see How does RCDC work?).
Once the project listing is produced, NIH subject matter experts validate the relevance of the projects using the category parameters.
7. Are there categories that do not follow the standard RCDC process?
Yes. Some categories do not follow the automated RCDC process. These are manually curated by NIH subject matter experts due to unique official reporting requirements.
8. Why weren't all categories automated in 2009 when RCDC launched?
While most RCDC categories were automated at launch, some categories remained manual until feasibility analyses to address unique reporting requirements for those categories could be completed. As analyses were completed, these categories were converted to the automated process. Only a handful of manual categories remain, and NIH will continue to convert categories that can be automated.
9. Can a category definition or name be modified over time?
Existing categories are refined periodically to reflect scientific advancements. Category names or parameters may be modified with NIH-wide consensus to keep pace with current terminology and knowledge of the scientific subject matter.
10. Historically manual categories are not published once automated. How can I get these data?
Since manual and automated figures are not comparable, the historical funding numbers for manual categories will no longer be posted once automated categories are created on the topic, but the data are available upon request by contacting rcdc@mail.nih.gov.
11. How was the RCDC process developed?
RCDC is a complex process that categorizes ongoing NIH-supported research. NIH technical and scientific experts helped create the RCDC methods and processes. This input from NIH staff laid the groundwork for developing parameters to categorize NIH-supported research. While the system is automated, NIH expertise is used to formulate the categories and validate the results.
All NIH Institutes and Centers are actively engaged in the process from establishing new NIH reporting policies to creating the category definitions. The more than 300 RCDC categories include those that have been requested by Congress and other federal agencies for reporting to the public.
12. Why did the NIH develop RCDC?
The NIH, funded by U.S. tax dollars, supports biomedical research across the country and around the world. The American people want to know how the NIH spends their tax dollars.
With advances in data science and text mining technologies, the NIH recognized that it could transform its process for developing spending reports. Prompted by two National Academy of Sciences reports, NIH reviewed options for improving research categorization processes. In 2004, the NIH tested the small-scale application of a new computerized process that demonstrated the potential to accurately sort NIH-funded research projects into categories. Section 402B of the National Institutes of Health Reform Act of 2006 required the implementation of a searchable electronic system to uniformly categorize research grants and activities from all NIH Institutes or Centers. In its current form, the RCDC system enables NIH to create, maintain, and validate categories for official reporting and analysis.
13. Did RCDC change the way NIH funds research?
No. The way the NIH funds research remains the same. NIH does not expressly budget by category. NIH receives its budget from Congress at the beginning of each fiscal year, and funds the most meritorious investigator-initiated scientific research proposals using a two-level peer review process. At the end of each fiscal year, the NIH reports how much it spent in more than 300 categories.
14. How is the RCDC process different from previous NIH reporting systems?
The RCDC process is different in two important ways:
First, RCDC applies a consistent categorization process for each research area, disease, or condition and provides a uniform report for all of the NIH. Before 2008, each NIH Institute or Center (IC) categorized its funding based on its own mission. Reports were not always produced using the same definitions across the NIH, even though many ICs do research in related areas. The RCDC process uses NIH-wide category definitions and applies them uniformly to all types of research across all ICs.
Second, the public can access the RCDC categorical spending reports on a public website. RCDC's funding reports are detailed so that the public can see a complete list of projects by title in each category and the associated dollars spent.
15. What are the benefits of the RCDC reports?
RCDC offers the public, scientists, and NIH staff a quick and easy way to get a complete list of research projects funded in more than 300 specific research areas, diseases, or conditions. RCDC produces consistent, reliable reports with all NIH Institutes or Centers using the same NIH-wide category definitions and categorization process for each research area, disease, or condition.
RCDC reports, posted on the public RePORT website with estimates for the next two years, provide the following detailed information for each category:
- A total dollar amount for a category
- An exportable list of all projects reported in a category that can be sorted by:
  - Dollar amounts for each project within the category
  - Title of the research project
  - Name(s) of the principal investigator(s)
  - Name of the organization conducting the research
  - NIH project identifier number (e.g., grant number)
  - Funding institute or center
16. How are the RCDC research area, disease, and condition categories chosen?
The more than 300 categories include those that were, over time, requested by Congress, the White House, advocacy groups, and NIH leadership for reporting to government and the public.
17. Does RCDC affect the way researchers apply for funding?
No, RCDC has no impact on the grant application and review process.
18. Do the numbers in RCDC reports add up to the total NIH budget?
No. All of the categories added together will not equal the total NIH appropriation for the following reasons:
- Research projects are often reported to more than one RCDC category.
- RCDC categories are by their nature overlapping (for example, Brain Disorders, Neurosciences, and Mental Health categories may have projects in common).
- RCDC categories do not encompass all types of biomedical research.
As a result, some NIH-funded projects might fall into several of the more than 300 reported categories, whereas other projects will not be categorized at all. For example, a hypothetical project titled "Depression in older men with diabetes" could fall into four categories: Depression, Aging, Mental Health, and Diabetes.
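A small worked example makes the arithmetic concrete. The sketch below uses the hypothetical project from the answer above plus two invented projects; all dollar amounts are made up for illustration and do not reflect real NIH data.

```python
from collections import defaultdict

# Made-up projects for illustration: (title, dollars, categories matched).
projects = [
    ("Depression in older men with diabetes", 400_000,
     ["Depression", "Aging", "Mental Health", "Diabetes"]),
    ("Hypothetical project matching one category", 250_000, ["Diabetes"]),
    ("Hypothetical project matching no category", 100_000, []),
]

# Each project's dollars are counted once in every category it appears in.
category_totals = defaultdict(int)
for _title, dollars, categories in projects:
    for category in categories:
        category_totals[category] += dollars

total_funding = sum(dollars for _title, dollars, _categories in projects)
sum_of_categories = sum(category_totals.values())

print(f"Total funding:          ${total_funding:,}")
print(f"Sum of category totals: ${sum_of_categories:,}")
```

Here the category totals sum to $1,850,000 while the three projects received only $750,000 in total: the first project is counted four times, and the third is not counted in any category.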
19. What percentage of project dollars are reported in each category?
For automated categories, each project is reported with 100% of the project's dollars in each category in which it appears. For manually collected categories, the direct appropriated dollars may be prorated to account for the percentage of the project meeting the category definition.
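The difference between the two rules can be sketched as follows. The function name, its parameters, and the dollar figures are hypothetical, and the sketch assumes a manual category applies a single relevance percentage per project; the FAQ does not describe how that percentage is determined.

```python
def reported_dollars(project_dollars: int, manual: bool = False,
                     relevant_fraction: float = 1.0) -> float:
    """Dollars reported for one project in one category (illustrative only).

    Automated categories report 100% of the project's dollars.
    Manual categories may prorate by the fraction of the project that
    meets the category definition (assumed to be a value from 0 to 1).
    """
    if not manual:
        return float(project_dollars)
    if not 0.0 <= relevant_fraction <= 1.0:
        raise ValueError("relevant_fraction must be between 0 and 1")
    return project_dollars * relevant_fraction


# A made-up $200,000 project:
print(reported_dollars(200_000))                                        # automated category
print(reported_dollars(200_000, manual=True, relevant_fraction=0.25))   # manual, 25% relevant
```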
20. Can categories be related or overlap?
In some cases, an entire category's project listing can be contained within another category. For example, Breast Cancer and other cancer-related categories are reported in the project listing for Cancer.
21. Why might reported dollars change for a category from year to year?
The projects included in a category may change as new research is funded and projects end. Additionally, category definitions are updated as science advances and research evolves; these parameter modifications impact how many projects are included in the category, leading to increases or decreases in reported annual obligations.
22. Can the categorization of individual projects be reviewed and edited if erroneous?
Because the information in the RCDC reports is frozen, corrections are not made after the reports are public. However, categories are reviewed annually, so concerns regarding specific projects can be addressed in future years.
23. How are the NIH categorical spending estimates calculated?
The annual funding levels estimated for future years are derived from a combination of budget policies articulated in Congressional Justifications, funding priorities reflected in enacted appropriations (or sustained in Continuing Resolutions), and baseline funding associated with the continuation of projects from previous fiscal years.
24. How might the data differ between the NIH categorical spending reports and RePORTER search results?
The RePORT Expenditures and Results (RePORTER) system is an electronic tool that allows users to search a repository of NIH-funded and administered research projects and access publications and patents resulting from NIH funding.
RePORTER is updated weekly and provides the most up-to-date information possible on funded projects. Changes in the administrative details of prior awards can occur in real time (e.g., when a principal investigator changes institutions, or an award is provided a no-cost extension). By contrast, RCDC data in the categorical spending reports are not updated once the fiscal year data are released to the public.
25. How are prevalence and mortality data calculated?
NIH reports estimates of disease burden data alongside NIH's categorical funding estimates in order to provide the public and policymakers with information that is helpful for understanding the NIH research portfolio and its relationship to public health needs.
Estimates of United States disease burden are from CDC's National Center for Health Statistics (NCHS) and are provided as mortality and prevalence figures. Figures for mortality data are drawn from the National Vital Statistics System, and indicate the number of deaths in which a particular disease or condition was mentioned on a deceased individual's death certificate. Prevalence figures are drawn from National Health Interview Survey responses and report the percentage of respondents who indicated that they were affected by a particular health condition. Standard error, a measure of the uncertainty in the estimate, is presented in parentheses alongside the prevalence estimate. Estimates of mortality and prevalence are not available for all NIH spending categories.