Appendix B: Methodology
RAND (Critical Technologies Institute) Methodology
RAND's Critical Technologies Institute (CTI) derived an estimate of the fiscal year 1995 Federal research and development on children and adolescents primarily from the RaDiUS (Research and Development in the United States) database that it is developing. This database contains information submitted annually to the U.S. Office of Management and Budget (OMB) by all Federal agencies about their research and development (R&D) projects and seeks to place all project data in a common format. It currently contains approximately 80 percent of all Federal domestic R&D projects. Prior to RaDiUS, there was no centralized R&D database across Government agencies. Rather, each agency tracked its own R&D projects with varying degrees of centralization, commonality of data elements, and consistency with OMB definitions. The Children's Initiative was one of the first major efforts to utilize the RaDiUS database to develop Government-wide estimates for an area of research. As such, it revealed both the strengths and limitations of the current database, identified areas for improvement, and, in many cases, enabled the testing of RaDiUS estimates against agency estimates.
For each R&D project in the database, RaDiUS collects information on overall budget levels, fiscal year budget level, length of contract, project title and abstract, and responsible contracting institution. The database contains approximately 200,000 projects across Federal cabinet departments and independent research agencies (e.g., the National Science Foundation).
Estimating the amount of R&D on children and adolescents would have been fairly straightforward if four conditions were met:
An estimate of Government-wide research on children and adolescents could then be obtained by either reading all abstracts or sampling a sufficient number of abstracts and classifying them as either directed or not directed toward children and adolescents. The complexity of the methodology used was necessary to account for the four above conditions not being satisfied.
The first condition was not met since RaDiUS currently contains about 80 percent of Government-wide R&D projects. For some projects in RaDiUS, a key data element needed to classify the project or to estimate its fiscal year 1995 budget is missing.[1] Research and development estimates include the estimated amount of R&D that was missing from RaDiUS or for which data were incomplete. Estimates were based on the assumption that the proportion of an agency R&D budget devoted to children and adolescents was the same for projects with and without complete information. For instance, if there was complete information on projects which represented 80 percent of an agency's R&D budget, and 10 percent of those projects were classified as directed toward children and adolescents, then it was assumed that 10 percent of the missing budget was also directed toward children.
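The adjustment for missing data can be illustrated with a minimal sketch. The function name, argument names, and budget units below are hypothetical and do not come from RaDiUS; the sketch assumes the 10 percent figure in the example is a share of budget, and it encodes only the equal-proportion assumption stated above.

```python
def estimate_with_missing(total_rd_budget, complete_budget, child_budget_in_complete):
    """Extend the child-directed share observed in complete-information
    projects to the part of the agency budget without complete information.

    All names and units are hypothetical, chosen for illustration only.
    """
    if not 0 < complete_budget <= total_rd_budget:
        raise ValueError("complete_budget must be positive and no larger than the total")
    # Share of the complete-information budget classified as child-directed.
    child_share = child_budget_in_complete / complete_budget
    # Budget for which RaDiUS has no project or incomplete data.
    missing_budget = total_rd_budget - complete_budget
    # Equal-proportion assumption: the same share applies to the missing budget.
    return child_budget_in_complete + child_share * missing_budget


# Example from the text: 80 of 100 budget units have complete information,
# and 10 percent of those 80 units (8 units) are directed toward children.
estimate = estimate_with_missing(100.0, 80.0, 8.0)
```

Under these assumed figures, 2 of the 20 missing units are attributed to children, so the agency estimate is about 10 units, or 10 percent of the total budget.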
In general, about one-quarter of the total estimate across agencies pertains to missing projects or data. The assumption that missing projects placed the same proportional emphasis on children and adolescents as projects with complete information introduces more uncertainty into agency estimates than into Government-wide estimates. For some agencies, the missing projects were not random, but represented all R&D from a particular subagency. In some cases, the subagency might place either much less or much greater emphasis on children. Better agency estimates could be made through more research to determine the source of missing data and the mission of that particular agency.
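One way to see why non-random missing data matters more for a single agency is to vary the share assumed for the missing budget. The sketch below is illustrative only and is not part of the CTI methodology: the function name and the low and high shares are hypothetical.

```python
def estimate_range(total_rd_budget, complete_budget, child_budget_in_complete,
                   low_share, high_share):
    """Bound an agency estimate when the missing budget may have a different
    child-directed share than the complete-information budget.

    low_share and high_share are hypothetical bounds supplied by the analyst.
    """
    if not 0 <= low_share <= high_share <= 1:
        raise ValueError("shares must satisfy 0 <= low_share <= high_share <= 1")
    missing_budget = total_rd_budget - complete_budget
    low = child_budget_in_complete + low_share * missing_budget
    high = child_budget_in_complete + high_share * missing_budget
    return low, high


# Hypothetical agency: 80 of 100 budget units have complete information,
# 8 of them child-directed; the missing subagency's share is unknown.
low, high = estimate_range(100.0, 80.0, 8.0, low_share=0.0, high_share=0.3)
```

With these assumed inputs the agency estimate could fall anywhere between roughly 8 and 14 units, compared with 10 units under the equal-proportion assumption.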
The second condition was not always met because different agencies use different classifications for what is included in R&D. Two examples are evaluation projects and "training" projects. Some agencies classify major evaluation projects as R&D and some do not. Similarly, projects involving training are classified differently among agencies. In general, these differences show up between R&D projects submitted to OMB and internal agency estimates of R&D. RaDiUS uses the OMB classification, but in some cases cannot identify which agency projects have been designated as meeting the OMB guidelines. Because training and evaluation projects constitute a small portion of R&D, these differences across agencies probably make only small differences in overall estimates. However, they can introduce larger uncertainty into particular agency estimates.
The third condition was not met because initial sampling of abstracts and classification by several researchers revealed that several issues would arise with regard to what should and should not be included as R&D devoted to children. Examples include research using animals but directed toward children's health problems, research on children outside the United States, and topics indirectly involving or benefiting children such as divorce, teacher quality, community policing and curriculum development for high schools. In general, each agency presented a unique set of classification issues. Our approach was to classify projects into three categories:
For each agency, several types of projects included in Category 2 were defined. Thus, estimates can be made that include or exclude particular types of projects. The base estimate for child and adolescent research, which includes only projects classified in Category 1, was $1.8 billion. This includes a major research category of animal research directed toward issues of children and adolescents. The estimate for Category 2 projects was approximately $700 million. The $2 billion estimate given in the body of the report thus would include a portion of projects classified as Category 2.
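The relationship among these figures can be checked with simple arithmetic. The sketch below uses the rounded figures quoted above, expressed in millions; the constant and function names are hypothetical, and the implied fraction is only as precise as the rounded inputs.

```python
CATEGORY_1 = 1800  # base estimate (Category 1 only), in millions
CATEGORY_2 = 700   # approximate Category 2 estimate, in millions
REPORTED = 2000    # estimate given in the body of the report, in millions


def total_estimate(category_2_fraction):
    """Total estimate when a given fraction of Category 2 is included."""
    return CATEGORY_1 + category_2_fraction * CATEGORY_2


# Range of possible totals, from excluding to fully including Category 2.
lowest = total_estimate(0)
highest = total_estimate(1)

# Fraction of Category 2 implied by the rounded report figure.
implied_fraction = (REPORTED - CATEGORY_1) / CATEGORY_2
```

On these rounded figures, the total ranges from 1,800 to 2,500 million, and the reported figure corresponds to including somewhat under one-third of Category 2.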
Finally, Condition 4 was not met since the time and resources for the project were not sufficient to read every abstract in RaDiUS for classification. Thus a sampling strategy was utilized based on two considerations. CTI wanted to focus more effort on reading abstracts of projects with large budgets than on reading abstracts of projects with small budgets. CTI also wanted to focus more effort on reading abstracts of projects more likely to be related to children. Implementing the first consideration meant reading all abstracts for larger budget projects in each agency and reading only samples for smaller budget projects. "Larger budget" projects were defined as projects having a fiscal year 1995 budget above the average project budget for the agency.
The second consideration was implemented by identifying two groups of projects within each agency -- those having a high likelihood of being directed at children and adolescents and those having a small likelihood. High-likelihood projects were identified through key word searches of abstracts for words that would identify most of the projects related to children. CTI read all abstracts for larger budget and high-likelihood projects, but only sampled the remaining smaller budget and low-likelihood projects. The sampling ratios varied by agency depending on the number of projects in the latter category, but were typically from 1 in 3 to 1 in 8. CTI attached an appropriate weight to each of the sampled projects that was used in estimating the total amount of research related to children and adolescents.
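The two considerations can be sketched as a stratified selection within one agency. This is a minimal illustration under stated assumptions, not CTI's actual procedure: the key word list and the Project fields are hypothetical, every k-th project is taken as a stand-in for whatever sampling scheme was really used, and "larger budget and high-likelihood" is read as covering any project that is either above the agency's average budget or matched by a key word.

```python
from dataclasses import dataclass

# Hypothetical key words; the report does not list the actual search terms.
KEYWORDS = ("child", "adolescent", "infant", "youth")


@dataclass
class Project:
    title: str
    abstract: str
    fy1995_budget: float


def assign_weights(projects, sample_every):
    """Return (project, weight) pairs for the abstracts to read in one agency."""
    if sample_every < 1:
        raise ValueError("sample_every must be at least 1")
    if not projects:
        return []
    # "Larger budget" means above the agency's average project budget.
    mean_budget = sum(p.fy1995_budget for p in projects) / len(projects)
    read_all, remainder = [], []
    for p in projects:
        larger = p.fy1995_budget > mean_budget
        likely = any(k in p.abstract.lower() for k in KEYWORDS)
        if larger or likely:
            read_all.append(p)   # every abstract in this group is read
        else:
            remainder.append(p)  # smaller budget and low likelihood
    selected = [(p, 1.0) for p in read_all]
    # Assumed scheme: take every k-th project from the remainder.
    sampled = remainder[::sample_every]
    if sampled:
        # Each sampled project stands in for this many remainder projects.
        weight = len(remainder) / len(sampled)
        selected += [(p, weight) for p in sampled]
    return selected


demo = [
    Project("Large grant", "Materials science program", 100.0),
    Project("Small grant A", "Reading skills of children in grade 3", 10.0),
    Project("Small grant B", "Soil chemistry", 10.0),
    Project("Small grant C", "Bridge corrosion", 10.0),
    Project("Small grant D", "Crop yields", 10.0),
    Project("Small grant E", "Wind tunnels", 10.0),
    Project("Small grant F", "Signal processing", 10.0),
]
selected = assign_weights(demo, sample_every=3)
```

In this made-up agency, the large grant and the key word match are read with weight 1, and two of the five remaining projects are sampled with weight 2.5 each, so the weights add up to the seven projects in the agency.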
To develop final estimates for research related to children and adolescents, CTI computed weighted sums of budgets across all projects classified as Category 1 or 2, and then divided by the proportion of total agency R&D budget authority represented in RaDiUS projects with complete information. Estimates were also made for each agency, although more uncertainty was attached to individual agency estimates than to RaDiUS estimates. This uncertainty pertains primarily to some agencies for which representation in RaDiUS was 60 percent or below. For those agencies that had representation of 80 percent or above and for which there were few classification issues, CTI believes the estimates were much better. For some agencies with very little children's research, but with very large research budgets, estimates were quite uncertain since there were few high-likelihood projects and large numbers of low-likelihood projects. For these agencies, large sampling fractions were used because very few projects were likely to be classified as Category 1 or 2. Thus, large uncertainty may be attached to those agencies with very little research on children relative to that in other agencies.[2] No efforts were made to improve these estimates because such improvements would not have affected the estimates of total research.
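The final calculation for one agency can be sketched as follows. The function name, the tuple layout, and the use of integer labels for categories are hypothetical; the sketch assumes that any category other than 1 or 2 is excluded from the estimate.

```python
def agency_estimate(reviewed, coverage):
    """Weighted estimate of child-directed R&D for one agency.

    reviewed: iterable of (fy1995_budget, weight, category) tuples, one per
              abstract read; the layout is hypothetical.
    coverage: proportion of the agency's total R&D budget authority that is
              represented in RaDiUS projects with complete information.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    # Weighted sum of budgets over projects classified as Category 1 or 2.
    weighted_sum = sum(budget * weight
                       for budget, weight, category in reviewed
                       if category in (1, 2))
    # Scale up to the full agency budget, including missing projects.
    return weighted_sum / coverage


# Hypothetical agency: one fully read Category 1 project, one sampled
# Category 2 project (weight 4), one sampled project in neither category,
# and 80 percent coverage in RaDiUS.
reviewed = [(5.0, 1.0, 1), (2.0, 4.0, 2), (3.0, 4.0, 3)]
estimate = agency_estimate(reviewed, coverage=0.8)
```

Here the weighted sum is 13 budget units, and dividing by 0.8 gives about 16.25 units. Dividing by the coverage proportion applies the same equal-proportion assumption used for projects missing from RaDiUS, which is why estimates for agencies with low coverage carry more uncertainty.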
Federal Departments and Agencies