Merative’s MarketScan


Merative advances health and social care by providing innovative healthcare data and technology solutions, collaborating with thousands of providers and major organizations 1,2. Their MarketScan Research Databases offer longitudinal, patient-level data on healthcare costs and outcomes, supporting diverse research applications with data from over 273 million patients and more than 2,600 peer-reviewed publications. MarketScan’s detailed and HIPAA-compliant data enhance research across disease areas, backed by powerful analytic tools 3.

At Yale, the MarketScan database is licensed for research use by the Yale Biomedical Informatics and Computing (YBIC) office, with support from the Harvey Cushing/John Hay Whitney Medical Library and the Yale Center for Clinical Investigation (YCCI). Yale researchers can access and analyze MarketScan data by submitting a request form for assistance from the YBIC team 4,5.

Updated: October 10th, 2025

Overview

Merative advances health and social care by providing innovative healthcare data and technology solutions for patients, program managers, radiologists, and researchers. The company collaborates with over 4,500 healthcare providers, top US health plans, government agencies, Fortune 100 employers, and all leading life sciences companies to drive shared health progress 1,2.

Merative’s MarketScan Research Databases provide longitudinal, patient-level data that cover the full continuum of healthcare costs and outcomes, including detailed prescription drug information. These databases, with data from over 273 million unique patients across diverse care points, support numerous research applications such as pharmacoeconomic outcome evaluations, economic burden studies, and therapeutic pathway analyses, documented in more than 2,600 peer-reviewed publications 3.

MarketScan offers several advantages for longitudinal research: comprehensive patient-level detail, coverage of the full continuum of care, and detailed prescription drug information. Its large sample size supports studies of unique patient populations, and its linked data support research across disease areas while maintaining appropriate claims linkages and HIPAA compliance. Analytic tools for efficient data exploration help researchers understand disease progression, treatment patterns, and health outcomes for patients, employers, health plans, and government entities 3,5.

The YBIC license includes the following MarketScan datasets 5:

  • Commercial Database (CCAE): Contains data from active employees, early retirees, COBRA continuees, and dependents with employer-sponsored plans, including lab results. Its table structure includes Inpatient Admissions, Facility Header, Inpatient Services, Outpatient Services, Outpatient Pharmaceutical Claims, Annual Enrollment Summary, Enrollment Detail, and Lab Results. This dataset represents approximately 69 million patients and 8.7 billion records.

  • Medicare Database (MDCR): Originally designed for Medicare-eligible retirees with employer-sponsored Medicare Supplemental and Medicare Advantage plans, primarily containing fee-for-service plan data. Its table structure matches that of the Commercial Database and includes both Medicare-paid and employer-paid supplemental insurance amounts, limited to plans where both types of payments are available and evident on claims. This dataset represents approximately 4 million patients and 1.55 billion records.

  • Medicaid Database (MDCD): Captures healthcare service use for Medicaid enrollees across various states, covering both fee-for-service and managed care plans. It includes records of inpatient services, admissions, outpatient services, prescription drug claims, long-term care, and demographic variables such as age, gender, Federal Aid, Disability, TANF, and race. This dataset represents approximately 28 million patients and 8.26 billion records.

  • Dental Database: An independent product that can be linked to specific years and versions of the Merative MarketScan Commercial Database and the MarketScan Medicare Database. This dataset represents approximately 28 million patients and 1.73 billion records.

  • Commercial Insurance Weights Database: The Merative MarketScan Commercial (CCAE) and Medicare (MDCR) Databases contain data on individuals with employer-sponsored insurance (ESI), either as primary or supplemental coverage. The MarketScan Commercial Insurance Weights, created using the Public Use Microdata Sample (PUMS) from the American Community Survey (ACS), project these data to the national population with ESI.
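
As a rough illustration of what projecting sample data to a national population means, the sketch below applies per-person weights to a toy sample. It is a minimal example of a generic weighted estimate, not Merative's documented method: the field names (`enrolid`, `weight`, `has_condition`) and all values are hypothetical, and the actual weight variables and recommended procedure should be taken from the MarketScan documentation.

```python
from dataclasses import dataclass


@dataclass
class Enrollee:
    enrolid: int         # hypothetical person identifier
    weight: float        # hypothetical weight: number of people in the national
                         # ESI population that this enrollee represents
    has_condition: bool  # hypothetical flag derived from claims


def weighted_total(rows):
    """Projected population size: the sum of the weights."""
    return sum(r.weight for r in rows)


def weighted_prevalence(rows):
    """Projected share of the population with the condition."""
    total = weighted_total(rows)
    if total == 0:
        raise ValueError("weights sum to zero")
    cases = sum(r.weight for r in rows if r.has_condition)
    return cases / total


# Toy sample with made-up values, for illustration only.
sample = [
    Enrollee(1, 120.0, True),
    Enrollee(2, 80.0, False),
    Enrollee(3, 200.0, False),
    Enrollee(4, 100.0, True),
]

projected_population = weighted_total(sample)
projected_prevalence = weighted_prevalence(sample)
print(projected_population, projected_prevalence)
```

In this toy sample, half of the enrollees have the condition, but the weighted estimate differs because each enrollee stands in for a different number of people in the target population.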

Gaining Access

Do I Qualify?

The YBIC has licensed the MarketScan database for Yale community members to use in their research.

Typical Timeline

Timelines may vary by applicant.

Step-by-Step Guide

To request access, researchers complete an application that lists the dataset and population of interest, describes the research plan, discloses funding sources, and includes a signed data use agreement.

After the application is approved, the Harvey Cushing/John Hay Whitney Medical Library, the Yale Center for Clinical Investigation (YCCI), and YBIC collaborate to assist with data retrieval, analysis, and compliance with data use agreements. The request form is accessible on the MarketScan Database DataMed webpage.

Publications

This section presents a selection of PubMed articles that use the dataset and were authored by individuals affiliated with the Yale School of Public Health. These articles are provided to inspire researchers and students to use the data in their own work.

References

1.
2. Merative. About Merative.
3.
4. Yale Biomedical Informatics and Computing (YBIC). MarketScan YBIC page.
5. Yale Biomedical Informatics and Computing (YBIC). DataMed.