
NRC Assessment FAQ

Note: More information will become available after the NRC report is released on September 28; this page will be updated accordingly.

The NRC Assessment of Research Doctorate Programs is a national study that aims to evaluate the quality of PhD programs across the United States. It was conducted by the National Research Council (NRC). The rankings were released on September 28, 2010. In all, 4,838 doctoral programs at 212 universities in 62 fields were rated. The data focus on many dimensions of doctoral programs to facilitate comparisons among programs in the same field.

The 2010 report is the third time such an assessment has been conducted; the first two were published in 1982 and 1995. The current project was conducted between 2005 and 2010. The data were collected in 2006-07 and describe each university's students and faculty during the 2005-06 academic year.

There are substantial differences between the 2010 and 1995 rankings, from data collection to statistical analysis to the format of the rankings themselves. These differences are addressed throughout this FAQ.

An overview of the differences can also be found here.

The University’s public position on the NRC Assessment is:

“Stanford University does not comment on the specific rating or ranking of the individual departments and programs included in the National Research Council’s report.

Many aspects of a high quality education cannot be reduced to quantitative measures. Every student’s assessment of where to pursue graduate studies should be based on his or her own careful analysis of what the program has to offer. The decision of where to enroll should not be based on a rating or ranking from any organization.

Stanford takes great pride in the quality of our many graduate programs, which have long track records of innovation and excellence across a broad range of fields. Stanford is particularly known for promoting interdisciplinary research and education that has yielded generations of outstanding leaders in academia, industry and government.”

Forty-seven of Stanford’s doctoral programs were rated. Several Stanford programs were ranked in the same NRC-defined field. For example, Geological & Environmental Sciences and Geophysics are both ranked in the NRC field Earth Sciences. The complete list of rated programs, and their NRC fields, is found here.

The NRC used 20 variables (see pp. 7-8 of this overview) that it considers “indicators of program quality.” Variables include measures of faculty research activity, student support and outcomes, and faculty and student demographics. The indicators come from the extensive data provided by the institutions themselves as well as some data collected by the NRC (e.g., faculty awards, publications, and citations).

Each program received five ranges of rankings:

  • Overall S-rankings (“Survey”): Based on 20 variables, weighted according to field-specific faculty opinions of the relative importance of the various program factors.
  • Overall R-rankings (“Regression”): Based on 20 variables, weighted according to field-specific faculty rankings of actual programs.
  • Research Activity subscale: Based on 4 variables used in the Overall rankings.
  • Student Support and Outcomes subscale: Based on 5 variables: 4 used in the Overall rankings, plus “Whether the program collects student outcome/placement data.”
  • Diversity of the Academic Environment subscale: Based on 5 variables used in the Overall rankings.

More information on how the ranges were calculated, how the variables are defined, and the data sources is below.

The rankings are presented in a different form than most other rankings. Rather than receiving a single ranking (e.g., 1st, 5th, 32nd), each program’s five sets of rankings are presented in ranges. The ranges mark 90% confidence intervals.

A program’s range of rankings might be, for example, 2-8 or 4-27 or 13-37. These ranges reflect the inherent uncertainty in ranking a particular program due to differences among raters, statistical uncertainty, and variability in year-to-year data. Presenting ranges rather than single ranks is intended to make this statistical uncertainty explicit. A range of 2-8 should be read, “It is 90% certain that the program is ranked between 2nd and 8th in this field.”

More information on how the ranges were calculated is below.

Overall, Stanford did well. However, evaluating a program’s actual “quality” is inherently complex. These rankings reflect that complexity.

Full results for all rated programs at all universities are available, in an Excel spreadsheet, directly from the NRC. Keep in mind that the quantitative data are only proxies for program quality: departments with very similar ranges of rankings may or may not differ meaningfully.

The ranges of rankings were produced from a complex statistical analysis. A brief summary follows. The complete methodology is found here.

Overall S- and R- ranges of rankings are derived from the values of the indicators and the field-specific weights for each variable. The S- and R- weights differ by field, recognizing that faculty members in different disciplines value different aspects of doctoral programs. The 20 variables (see pp. 7-8 of this overview) are weighted to produce quantitative estimates of program quality. The field-specific weights are based on two faculty opinion surveys conducted in spring 2007.

  • S-weights, based on surveys: The first survey asked all faculty across all fields to rate the importance of 21 variables that influence overall program quality.
  • R-weights, based on regressions: The second survey, the “implicit” or “anchoring” study, asked a subset of faculty to rate a sample of programs in their field. Regression analysis was then used to determine which quantitative variables, at what weights, most closely predicted the program rankings in each field (a simplified numerical sketch follows this list).
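Under both schemes, a program’s overall score is essentially a weighted sum of its standardized indicator values; the S- and R-approaches differ only in where the weights come from. A minimal sketch with hypothetical indicator names and invented weights (not the NRC’s actual values) is below.

```python
# Sketch of a weighted overall score for one program. Indicator names,
# standardized values, and weights are hypothetical, not the NRC's.

# Standardized (z-scored) values of a few indicators for one program.
indicators = {
    "publications_per_faculty": 1.2,
    "citations_per_publication": 0.8,
    "median_time_to_degree": -0.3,    # sign handling is purely illustrative
    "pct_students_with_funding": 0.5,
}

# Field-specific weights: from the faculty importance survey (S-weights)
# or from the regression on the anchoring study (R-weights).
weights = {
    "publications_per_faculty": 0.35,
    "citations_per_publication": 0.25,
    "median_time_to_degree": 0.15,
    "pct_students_with_funding": 0.25,
}

overall_score = sum(weights[name] * value for name, value in indicators.items())
print(f"Weighted overall score: {overall_score:.3f}")
```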

Ranges of rankings: Rankings from many raters were aggregated and ordered to yield ranges of rankings. The NRC study used a “random halves” procedure in which weights are calculated from the responses of a randomly selected half of the faculty respondents. This is repeated 500 times, producing 500 sets of rankings. The 500 resulting rankings for each program are ordered from best to worst, and the bottom five percent and top five percent are dropped. The result is two values for each program covering the middle 90% of the 500 rankings.
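The resampling-and-percentile logic of the random-halves procedure can be sketched in a few lines. The example below uses entirely synthetic data and a deliberately simplified weighting step (a plain average of each half’s importance ratings); it illustrates only the repeated-halving and middle-90% calculation described above, not the NRC’s actual computation.

```python
import random

random.seed(0)

# Synthetic data: 10 programs with 4 standardized indicator values each,
# and 200 faculty respondents who each rate the importance of the 4 variables.
programs = {f"Program {i}": [random.gauss(0, 1) for _ in range(4)]
            for i in range(1, 11)}
respondents = [[random.random() for _ in range(4)] for _ in range(200)]

ranks = {p: [] for p in programs}
for _ in range(500):
    # Draw a random half of the respondents and derive weights from it
    # (here: a plain average of their importance ratings per variable).
    half = random.sample(respondents, len(respondents) // 2)
    weights = [sum(r[j] for r in half) / len(half) for j in range(4)]
    # Score and rank every program under this draw's weights.
    scores = {p: sum(w * v for w, v in zip(weights, vals))
              for p, vals in programs.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    for rank, p in enumerate(ordered, start=1):
        ranks[p].append(rank)

# Report the middle 90% of the 500 rankings for each program.
for p, rs in sorted(ranks.items()):
    rs.sort()
    low, high = rs[25], rs[474]   # drop the best 5% and worst 5% of 500 draws
    print(f"{p}: ranked {low}-{high} (middle 90% of 500 rankings)")
```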

The data categories and definitions used by the NRC are often different from those used in most Stanford reports. The data may not coincide with numbers in Stanford fact books and other university information sources. Therefore, in understanding and checking your program’s data, it is important to understand the details of how each variable is defined, what it measures and how it was calculated. Data definitions are given briefly on the sample program data sheet, and in detail in Appendix E of the NRC's final report.

The data generally reflect the academic year 2005-06. Many characteristics of Stanford’s rated programs have changed substantially over the past five years; this is discussed below.

Some of the most important data definitions are:

  • Faculty data are based on the number of Core, New, Associated, and Allocated Faculty for each program, as defined by NRC. “Core” faculty members are generally Academic Council members with a primary, secondary, or joint appointment in the department. “New” faculty members are like Core faculty members, but with an appointment beginning between 2003 and 2006. “Associated” faculty members are affiliated with the program through a Courtesy, Acting or other similar appointment, or through dissertation advising.
    Assignment of Core, New, and Associated faculty was done by Stanford’s Institutional Coordinator, in consultation with School Coordinators, department chairs, and program directors, based on the NRC definitions. The NRC then determined the number of “Allocated Faculty” using an algorithm that draws on dissertation committee supervision and membership data to distribute each faculty member proportionally across all programs with which he or she is affiliated (a simplified sketch of this allocation follows this list).
  • Student data are based on an NRC-defined set of entry cohorts and criteria for continuous enrollment, which may differ from the population generally thought of as associated with a particular program.
  • The 18 student activities measures (e.g., “Is there an orientation for graduate students in this program?”) give each program credit for each activity provided either by the university (answered centrally, for all programs) or by the program (answered by each program). Each program received credit for nine activities provided by Stanford university-wide, plus any others provided by the program.
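The proportional allocation mentioned above can be pictured as splitting each faculty member across programs according to his or her dissertation committee activity. The sketch below uses invented names and committee counts and illustrates only the idea of proportional allocation, not the NRC’s actual algorithm.

```python
from collections import defaultdict

# Illustration of allocating each faculty member proportionally across
# programs based on dissertation committee activity. Names and counts
# are invented; the NRC's actual algorithm is more detailed.

# Number of dissertation committees served on, per faculty member, per program.
committee_counts = {
    "Prof. A": {"Earth Sciences": 6, "Civil Engineering": 2},
    "Prof. B": {"Earth Sciences": 3},
}

allocated = defaultdict(float)
for prof, counts in committee_counts.items():
    total = sum(counts.values())
    for program, n in counts.items():
        allocated[program] += n / total   # fraction of this person credited here

for program, fte in sorted(allocated.items()):
    print(f"{program}: {fte:.2f} allocated faculty")
```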

Stanford participated in the data collection process by providing data about its programs, faculty and students to the NRC in 2006-07. Some data were also developed directly by the NRC, including data on publications, citations and grants. The full list of data used in these rankings, with sources for each, is available on the sample program data sheet.

Some Stanford data were generated centrally by staff in the offices of the University Registrar and Institutional Research and Decision Support. Other data were provided by programs. All data were checked by School Coordinators designated by each school; these coordinators worked directly with programs to ensure accuracy. A list of the School Coordinators is found on the final page of Stanford’s overview of the NRC assessment.

Rana Glasgal, Associate Vice Provost for Institutional Research and Decision Support (rana@stanford.edu; (650) 725-1327) is available to discuss the NRC methodology and Stanford’s data.

There is no mechanism for updating the data in the NRC report.

Across the country, most programs have changed since 2005-06, the time period reflected in the study. These changes may include demographic shifts, policy changes, or departmental reorganizations. At Stanford, 462 faculty members have been hired and 325 have left since the end of spring 2006 (out of a university-wide faculty of approximately 1,900).

All faculty members in programs participating in this assessment were asked to complete the “Faculty Questionnaire” (available on pages 204-221 of the NRC's final report). A subset of faculty were also asked to complete the “Survey of Program Quality,” also known as the “anchoring study” (available on pages 194-203 of the NRC's final report).

Data from these faculty surveys contributed primarily to developing the variable weights used in the two Overall rankings, as described above.

The general Faculty Questionnaire also supplied data for the final rankings on how many faculty members in each program are supported by grants. Other data from this questionnaire, like much of the data provided by programs and universities, were ultimately not used. (Institutional and Program Questionnaires are available on pp. 131-193 of the NRC's final report.) This is the result of the NRC’s statistical process for identifying a small set of variables (ultimately 20) that it proposes as indicators of program quality.

The NRC selected the fields to be ranked. Many programs at Stanford were not included in the data collection phase of this assessment. A separate assessment of Education programs is now underway by the American Education Research Association (AERA) and the National Academy of Education (NAEd).

In some fields, although data were collected, there were ultimately not enough programs for the NRC to be able to calculate statistically valid rankings. This is the case for a group of programs in “Languages, Societies and Cultures” (at Stanford, East Asian Languages and Cultures, Modern Thought and Literature, and Slavic Languages and Literatures). The NRC may make some of the data collected for these fields available for comparisons across programs.

For those programs categorized as “Emerging Fields” (at Stanford, Biomedical Informatics and Computational Mathematics and Engineering), the NRC knew from the outset that not enough data would be available to calculate valid rankings. In order to provide some comparative information on these programs, a small amount of data was collected. See pages 194-197 of the NRC's final report for the 3-item Emerging Fields questionnaire.

These rankings may help programs compare characteristics of their students, faculty, and program features with those of other programs in their field. For example, a program can determine whether it has more or fewer female faculty members than its peers, or how its mean time to degree compares. Even so, this should be done with caution.

Be aware that each of these data items is very precisely defined by the NRC, and that the definitions are not necessarily intuitive, or the same as those used in most Stanford data reports. Data definitions are described above.

Patricia J. Gumport, Vice Provost for Graduate Education (gumport@stanford.edu; (650) 736-0775), is available to further discuss how these rankings may be interpreted and used by Stanford programs.

One possible use of these rankings is to allow prospective students considering doctoral studies to compare programs. Every student’s assessment of the best place to pursue graduate studies should be based on his or her own analysis of what the program will have to offer when he or she plans to pursue the degree. The decision of where to enroll should not be based on a rating or ranking from any organization.

Prospective students could use the information in the study to help them consider and inquire further about different dimensions of a particular program, for example in Research Activity, Student Support, and Diversity, and then place more weight on those program characteristics that are more important to each individual. The values from which these rankings were generated are available from the NRC, allowing students to analyze which programs might be the best match for their own priorities.

Current graduate students considering academic careers, much like prospective graduate students, may use these data to compare and inquire further about characteristics of programs at different universities.

Graduate students may also use the data to put their educational experiences in a wider context. For example, graduate students may have a good understanding of the demographics of their particular program, or of the average number of publications by their program’s faculty. They may not, however, know whether these properties are typical of programs in their field. The comparative data provided by the NRC can assist students in contextualizing their experiences.