Data Hub

About the Data Hub

Hematology encompasses a wide variety of diseases, many of which are rare. For rare diseases especially, it is difficult to bring together a critical mass of data, because patient populations are small and their information is stored in locations around the world. As a result, questions about these populations, the natural history of disease at the individual patient level, practice patterns, and the quality of care often go unanswered.

Because this limitation slows the pace of discovery, the ASH Research Collaborative (ASH RC) Data Hub was created to facilitate the sharing of data on hematologic conditions in support of scientific inquiry and discovery. The American Society of Hematology (ASH) is uniquely positioned as a trusted convener to collaborate with different stakeholders to grow the data in the Data Hub and to provide access to researchers and clinicians working to conquer blood diseases.

The Data Hub can ingest a wide variety of data (e.g., clinical and laboratory data, genomic or molecular correlates, patient-reported outcomes, aggregated population data) from disparate sources (e.g., inpatient and outpatient clinical sites, industry or government datasets, other registries in the U.S. or abroad, and patients themselves). It can accommodate data in different formats (e.g., structured electronic health record data, trial datasets, patient-reported instruments, manual chart abstraction, legacy dataset import), and it supports both retrospective and prospective data. What is ultimately collected will vary from disease to disease, depending on what is available and what is prioritized for a given condition. Existing data sources in hematology tend to be narrow in scope and do not offer this flexibility.

The Data Hub uses state-of-the-art technology so that data collection can be automated wherever possible (to minimize the burden of data submission), so that it can scale up (in the types of data captured and in the diseases covered), and so that it can readily draw on the data it contains to answer research questions (e.g., through data exploration, advanced querying, and integration with tools for graphically displaying data).

Access to the data and analyses housed in the Data Hub will be governed by policy and overseen by project leadership. Initially, stakeholders with clinical research and industry perspectives will be the core consumers. As the resource grows, bioinformaticians and clinicians are also expected to take an interest in accessing Data Hub data.

Hematology encompasses both malignant and non-malignant diseases; accordingly, the initial focus areas of the Data Hub, multiple myeloma and sickle cell disease, are representative of these two domains. It is anticipated that additional diseases will be added in the future.