An official website of the U.S. government

Data Sources and Methodologies


Step 1: Identifying Institutions

First, we identified educational institutions across the United States by downloading data from the National Center for Education Statistics’ Integrated Postsecondary Education Data System (IPEDS). We gathered metadata from IPEDS on all 2-year and 4-year, public and private postsecondary institutions, including:

  • Institution Type: 2yr/4yr, public/private
  • Undergraduate Population
  • Graduate Population
  • State
  • UnitID
  • Latitude/Longitude

Step 2: Connecting Institutions to Spending

Next, we connected the institutions to federal spending data using information available on USAspending and DUNS numbers. The USAspending data output consisted of:

  • Institution
  • Agency/Sub agency
  • Procurement/Assistance Obligation
  • Product Service Code (PSC)/Catalog of Federal Domestic Assistance (CFDA)

To begin, we derived a unique identifier between the IPEDS data in step 1 and the data listed above. IPEDS lists several unique identifiers for each institution, and we normalized the name and state associated with each institution for consistency. To normalize the name, we connected any variation of an institution name to one common name. Data normalization is a process in which values measured on different scales are adjusted to a common scale to enable an accurate analysis and/or comparison of values.
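
The name normalization described above can be sketched as a lookup from cleaned name variants to one common name. This is only an illustration, not the code used for the analysis: the `ALIASES` table, the `clean` and `normalize_name` helpers, and the sample institution are all hypothetical.

```python
import re
from typing import Optional

# Hypothetical alias table: each (cleaned name variant, state) pair maps to
# the one common name used for that institution. Entries are illustrative.
ALIASES = {
    ("univ of example", "CA"): "University of Example",
    ("university of example", "CA"): "University of Example",
    ("the university of example", "CA"): "University of Example",
}


def clean(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def normalize_name(raw_name: str, state: str) -> Optional[str]:
    """Return the common name for a raw variant, or None if it is unknown."""
    return ALIASES.get((clean(raw_name), state.strip().upper()))
```

Returning `None` for an unknown variant, rather than guessing, leaves room for the manual judgment calls described next.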

Next, we normalized the names of the institutions, grouping records by their DUNS number and the unique_recipient_id field. To easily identify a recipient name as an institution of higher education, the records were paired with the IPEDS version of the institution name. However, two situations required a judgment call:

  1. In situations where institutional systems are listed but a specific campus is not identified, the record was attributed to the main campus of that system. (For example, if listed as Regents of the University of California, it is attributed to University of California - Berkeley.)
  2. If it was not clear whether the recipient name corresponded to an institution of higher education but the Business Type field indicated the record belonged to a college or university, the DUNS number was used in a reverse lookup to collect the address registered to that DUNS number. The address was then researched.
    • If the address was on a college campus, that campus was used.
    • If the address related to a research facility or to a research hospital, the main campus was identified as the recipient.
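
The two judgment-call rules above could be expressed roughly as follows. This is a sketch under stated assumptions: the function names and the `address_kind` labels are hypothetical, and the only mapping entry is the University of California example from the text. In practice the address research was manual.

```python
from typing import Optional

# System-level recipient name -> main campus. The single entry is the example
# given in the text; a real table would need one entry per system.
SYSTEM_MAIN_CAMPUS = {
    "regents of the university of california": "University of California - Berkeley",
}


def attribute_system_record(recipient_name: str,
                            campus: Optional[str] = None) -> Optional[str]:
    """Rule 1: with no campus identified, fall back to the system's main campus."""
    if campus:
        return campus
    return SYSTEM_MAIN_CAMPUS.get(recipient_name.strip().lower())


def attribute_by_address(address_kind: str,
                         campus_at_address: Optional[str],
                         main_campus: Optional[str]) -> Optional[str]:
    """Rule 2: pick the recipient from the address registered to a DUNS number."""
    if address_kind == "campus":
        return campus_at_address
    if address_kind in ("research_facility", "research_hospital"):
        return main_campus
    return None  # the rules above do not cover other kinds of address
```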

Step 3: Identifying Institutions in the USAspending data

The third step was to identify institutions in the USAspending data. We began by pulling 2017 USAspending contract and assistance data. Then, we filtered and isolated the unique recipient name and DUNS number combinations into a new data set. Lastly, we identified and standardized the institution names.

Once we had the institution names, we filtered the 2017 USAspending data on the Business Type field, keeping records with the values higher_education, public_institution_of_higher_education, and private_institution_of_higher_education. For records where the recipient name clearly identified an institution of higher education, the names were added to the DUNS list. However, for records where it was not clear whether the recipient name corresponded to an institution of higher education, we conducted a reverse lookup of the DUNS number to obtain the address. Two common scenarios emerged:

  1. If the address belonged to a college campus, the campus name was normalized to the appropriate institution and the DUNS number was collected.
  2. If the address belonged to a research center, research hospital, college or university satellite office, or some other location, further research was conducted.
    • If the entity website listed the entity as being affiliated with a specific college or university, the name was normalized to the appropriate institution and the DUNS number was collected.
    • If there was no mention of affiliation, the record was not used in this analysis.
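
The filtering and the two scenarios above can be outlined in code. This is a rough sketch, not the actual process, which relied on manual research. All function names, parameter names, and DUNS values are hypothetical. Since every Business Type value named above ends in `higher_education`, the sketch matches on that suffix as a shortcut.

```python
from typing import Dict, Optional


def is_higher_ed(business_type: str) -> bool:
    """True when a Business Type value ends in 'higher_education'."""
    return business_type.strip().lower().endswith("higher_education")


def resolve_unclear(address_kind: str,
                    campus_institution: Optional[str] = None,
                    website_affiliation: Optional[str] = None) -> Optional[str]:
    """Return the institution to normalize to, or None to drop the record."""
    if address_kind == "campus":
        return campus_institution   # scenario 1: address is on a campus
    return website_affiliation      # scenario 2: None means no affiliation found


def collect_duns(duns_list: Dict[str, str], duns: str,
                 institution: Optional[str]) -> None:
    """Add a DUNS number to the list only when an institution was resolved."""
    if institution is not None:
        duns_list[duns] = institution


# Illustrative use with made-up DUNS numbers.
duns_list: Dict[str, str] = {}
collect_duns(duns_list, "000000001",
             resolve_unclear("campus", campus_institution="Example College"))
collect_duns(duns_list, "000000002",
             resolve_unclear("research_center"))  # no affiliation: not used
```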

Next, we filtered 2017 USAspending data on the DUNS numbers that were identified in the previous three passes, and normalized the names of the selected records. The remaining records were analyzed again to ensure no minor or unusual entities were missed.

  • If a record involved an unusual entity, we did a reverse lookup on its DUNS number and researched the associated address to determine whether the entity was connected with a qualifying institution.
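
A minimal sketch of this pass, assuming each record is a dictionary with hypothetical `duns` and `recipient_name` keys: records whose DUNS number is on the collected list get the normalized name, and everything else is set aside for the manual review described above.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def split_by_duns(records: List[Record],
                  duns_list: Dict[str, str]) -> Tuple[List[Record], List[Record]]:
    """Split records into matched (name normalized) and remaining (needs review)."""
    matched: List[Record] = []
    remaining: List[Record] = []
    for record in records:
        institution = duns_list.get(record["duns"])
        if institution is None:
            remaining.append(record)
        else:
            # Copy the record so the input list is left unchanged.
            matched.append({**record, "recipient_name": institution})
    return matched, remaining


# Illustrative data only; the DUNS numbers and names are made up.
sample = [
    {"duns": "000000001", "recipient_name": "EXAMPLE COLL"},
    {"duns": "999999999", "recipient_name": "EXAMPLE RESEARCH FOUNDATION"},
]
matched, remaining = split_by_duns(sample, {"000000001": "Example College"})
```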

To update this data to fiscal year 2018, the process outlined in the paragraph above was repeated and the new identifiers were added to the fiscal year 2017 data. We also pulled student aid data for the 2017-2018 academic year, which included student loan programs, the Federal Pell Grant program, the Teacher Education Assistance for College and Higher Education program, and the Iraq and Afghanistan Service Grant program. This data was converted to the federal fiscal year. For annually reported federal student aid programs, such as Perkins Loans, Federal Supplemental Educational Opportunity Grants, and the Federal Work Study program, the most recent data available is for the 2016-2017 academic year. This data was used as a proxy for 2018 data in our calculated totals.

Step 4: Sorting Investments by Category

The fourth step was to sort the USAspending award data into one of three investment categories: contracts, grants, and research grants (a subset of grants that were awarded for research purposes).
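
As an illustration of this sorting step, the sketch below assigns each award to one category. The text does not say how research grants were distinguished from other grants, so the `RESEARCH_CFDA` set is purely an assumption with placeholder program numbers, and the `award_type` labels are hypothetical.

```python
from typing import Optional

# Placeholder CFDA numbers standing in for research programs. These are not
# real program numbers; how research grants were flagged is an assumption.
RESEARCH_CFDA = {"00.001", "00.002"}


def investment_category(award_type: str, cfda: Optional[str] = None) -> str:
    """Sort an award into contracts, grants, or research grants."""
    if award_type == "procurement":
        return "contracts"
    if award_type == "assistance":
        # Research grants are the subset of grants awarded for research.
        return "research grants" if cfda in RESEARCH_CFDA else "grants"
    raise ValueError(f"unknown award type: {award_type!r}")
```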

We created a chart with two rings that aggregates contracts and grants by category (inner ring) and program (outer ring).

  • The detail table shows the top five awards and the recipient institution by category or program.
  • The file available for download includes all data used to form the chart, including the awarding federal agency and sub agency.
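
The aggregation behind the two rings and the detail table might look like the sketch below. The record keys (`category`, `program`, `institution`, `amount`) and the sample values are hypothetical and chosen only to show the grouping.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Award = Dict[str, Any]


def ring_totals(awards: List[Award]) -> Tuple[Dict[str, float],
                                              Dict[Tuple[str, str], float]]:
    """Sum amounts by category (inner ring) and by category and program (outer ring)."""
    inner: Dict[str, float] = defaultdict(float)
    outer: Dict[Tuple[str, str], float] = defaultdict(float)
    for award in awards:
        inner[award["category"]] += award["amount"]
        outer[(award["category"], award["program"])] += award["amount"]
    return dict(inner), dict(outer)


def top_awards(awards: List[Award], category: str, n: int = 5) -> List[Award]:
    """Return the n largest awards in a category, as in the detail table."""
    picked = [a for a in awards if a["category"] == category]
    return sorted(picked, key=lambda a: a["amount"], reverse=True)[:n]


# Illustrative data only.
AWARDS = [
    {"category": "contracts", "program": "Program A",
     "institution": "Example University", "amount": 100.0},
    {"category": "grants", "program": "Program B",
     "institution": "Example College", "amount": 50.0},
    {"category": "grants", "program": "Program B",
     "institution": "Example University", "amount": 25.0},
    {"category": "research grants", "program": "Program C",
     "institution": "Example University", "amount": 75.0},
]
inner, outer = ring_totals(AWARDS)
```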

Step 5: Converting Academic Year Data to Fiscal Year

The last step was to convert the academic year information into the federal fiscal year. We began by retrieving federal student assistance data from the Department of Education, which was organized by quarter, and reorganizing it according to the federal fiscal calendar.
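
One way to picture the conversion is the sketch below. It assumes the quarterly data follow an award year that starts on July 1, so quarter 1 covers July through September; that alignment is an assumption, not something stated here. The federal fiscal year runs from October 1 through September 30 and is named for the year in which it ends. The function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def to_fiscal(award_year_start: int, quarter: int) -> Tuple[int, int]:
    """Map (award year start, award-year quarter) to (fiscal year, fiscal quarter).

    Assumes award-year Q1 is July-September. That quarter closes the fiscal
    year ending September 30; Q2 through Q4 open the next fiscal year.
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    if quarter == 1:
        return award_year_start, 4
    return award_year_start + 1, quarter - 1


def fiscal_totals(rows: Iterable[Tuple[int, int, float]]) -> Dict[int, float]:
    """Sum (award year start, quarter, amount) rows by federal fiscal year."""
    totals: Dict[int, float] = defaultdict(float)
    for award_year_start, quarter, amount in rows:
        fiscal_year, _ = to_fiscal(award_year_start, quarter)
        totals[fiscal_year] += amount
    return dict(totals)
```

Under these assumptions, fiscal year 2018 draws on quarters 2 through 4 of the 2017-2018 award year plus quarter 1 of the 2018-2019 award year.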

However, we encountered a few areas where data could not be converted, such as institution data for the Federal Work Study, Perkins Loans, and Federal Supplemental Educational Opportunity Grant programs. In these instances, data is reported annually by calendar year.

