Stephanie Niño de Rivera, Yihong Zhao, Shalom Omollo, Sarah Eslami, Natalie Benda, Yashika Sharma, Meghan Reading Turchioe, Marianne Sharko, Lydia S Dugdale, Ruth Masterson Creber, Integrating public preferences to overcome racial disparities in research: findings from a US survey on enhancing trust in research data-sharing practices, JAMIA Open, Volume 8, Issue 3, June 2025, ooaf031, https://doi.org/10.1093/jamiaopen/ooaf031
Abstract
Data-sharing policies are rapidly evolving toward increased data sharing. However, participants’ perspectives are not well understood and could have an adverse impact on participation in research. We evaluated participants’ preferences for sharing specific types of data with specific groups, and strategies to enhance trust in data-sharing practices.
In March 2023, we conducted a nationally representative online survey with 610 US adults and used logistic regression models to assess sociodemographic differences in their willingness to share different types of data.
Our findings highlight notable racial disparities in willingness to share research data with external entities, especially health policy and public health organizations. Black participants were significantly less likely to share most health data with public health organizations, including mental health (odds ratio [OR]: 0.543, 95% CI, 0.323-0.895) and sexual health/fertility information (OR: 0.404, 95% CI, 0.228-0.691), compared to White participants. Moreover, 63% of participants expressed that their trust in researchers would improve if given control over the data recipients.
Participants exhibit reluctance to share specific types of personal research data, emphasizing strong preferences regarding external data access. This highlights the need for a critical reassessment of current data-sharing policies to align with participant concerns.
It is imperative for data-sharing policies to integrate diverse patient viewpoints to mitigate risk of distrust and a potential unintended consequence of lower participation among racial and ethnic minority participants in research.
Lay Summary
As data sharing for research becomes more common, much remains unknown about how people feel about it. This uncertainty can influence whether they choose to participate in research studies. Our study explored the types of personal data people are willing to share, whom they are comfortable sharing it with, and how trust in researchers can be strengthened. We surveyed 610 US adults in March 2023 and found that racial differences influence the willingness to share sensitive data. Black participants were significantly less likely to share health data, notably mental and sexual health information, with public health organizations compared to White participants. Most respondents (63%) said they would trust researchers more if they had control over who accessed their data. These findings suggest that data-sharing policies need to better consider diverse perspectives to prevent reduced participation in research, especially from racial and ethnic minoritized communities.
Background and significance
Data-sharing policies are evolving in the current research environment, presenting an opportunity for researchers to establish data-sharing policies that prioritize transparency and foster trust in research. Recent shifts, including increased data-sharing requirements from funding agencies like the National Institutes of Health (NIH) and mandates from leading journals, now require that datasets be made available in public repositories, with varying levels of accessibility.1 Individuals who decide to participate in research often grant broad consent for secondary uses and frequently do not have a clear understanding of how their data might be used and by whom in the future. This lack of clarity has the potential to erode trust between researchers and participants who perceive limited transparency and control over how their research data is shared, especially when it involves more sensitive data such as sexual health, mental health, and genomic data.2,3 While there is a substantial body of literature exploring public attitudes toward data sharing in research, particularly regarding specific types of data (eg, genetic, biospecimen4–6) and the varying entities involved (eg, health-technology companies, medical professionals, governmental organizations, commercial entities),7–10 there is a notable gap in examining perspectives on data-sharing practices in the current landscape, especially among racial and ethnic underrepresented communities.
Researchers’ data-sharing practices influence participants’ trust in researchers.3,11 Trust in research is crucial to promoting diverse participation in research, and it is important that current data-sharing practices do not contribute to participants’ hesitancy to participate in research. Previous studies support that trust in researchers can increase among participants when there are transparent efforts to inform participants about the future intended use of their data.3,12–14 Data sharing is important to promote transparency and reproducibility in science, but it must be done in an inclusive manner that incorporates the preferences of the diverse communities with whom researchers work. Racial and ethnic minorities have historically been and continue to be underrepresented in research due to concerns surrounding exploitation, fears of discrimination, and cultural and language barriers.15–20 These experiences have contributed to a perception among these communities that researchers are not transparent about potential conflicts of interest with external industries.21,22 This distrust can contribute to lower representation of racial and ethnic minorities in health-related research,23 which limits the generalizability of research findings to these populations, creating cycles of inequity.
To ensure that an inclusive research environment is established, one that does not perpetuate the exclusion of underrepresented communities or cause intervention-generated inequities,24 it is necessary to examine the views of a diverse and representative group of participants. Addressing concerns about sharing sensitive data is essential to protecting participants’ autonomy and ensuring their priorities are reflected in data-sharing policies, especially with the shift toward public repositories. This study aims to fill this gap by examining how these changes in data-sharing policies affect participants’ trust, using a representative sample from the United States to ensure that research practices are inclusive of diverse populations.
Objective
In this study, we evaluated public data-sharing preferences regarding the intersection of: (1) the type of research data shared (eg, sexual health and fertility data, mental health data, imaging data) and (2) with whom the data would be shared (eg, health policy institutions, public health organizations, health-technology companies). Given the importance of inclusiveness in data-sharing policies, our analysis focused on assessing sociodemographic differences in data-sharing preferences. We also evaluated the impacts on willingness to participate and trust in research if participants could decide on the potential external recipients of their research data.
Materials and methods
Ethics statement
The Institutional Review Board at Columbia University approved this study. Participants were administered an information sheet about the study and provided informed consent by checking a box on the online survey. Participants were able to stop the survey at any time and withdraw participation.
Study design
A cross-sectional survey was administered in March 2023 using Prolific,25 an online survey sampling tool, to recruit a sample balanced on age, gender, and race to match the US census. Participants were verified Prolific users aged 18 or older with English proficiency. The survey was first piloted with 10 participants, and then administered to an additional 600. In total, 610 participants completed the survey with no missing data.
Survey items
Survey questions were developed by team members with experience in prior work evaluating the impact of trust on data sharing,3 a pediatrician, and a medical ethicist. The survey had 4 primary areas of focus: (1) types of research data participants are willing to share with external recipients, (2) types of research data participants want to be returned to them, (3) the impact of returning data on willingness to participate in research, and (4) changes in trust toward researchers when participants can choose who their data is shared with.
First, participants selected the types of data they would want to access after participating in a research study. The 7 different types of data presented for the questions included: (1) biological, (2) genetic, (3) clinical, (4) mental health, (5) sexual health and fertility, (6) imaging, and (7) consumer-generated data. Each type of data presented was based on the NIH guideline for types of data researchers should share in data repositories.26 Examples for each type of data were provided for clarity. Additional information regarding the descriptions for the types of data presented in the survey can be found in the Supplementary Material. Next, participants were asked to select whether their willingness to participate in research would change if they could access the data researchers collected about them.
Using lay terms, participants were then informed about the possibility of research studies sharing their anonymous data with outside groups for secondary uses. An example of secondary uses was also provided: “An example of an outside group includes health-technology companies who make applications for health monitoring.” Then, participants had the option to select the types of research data they would share with different external recipients. The 5 types of external recipients included: (1) doctors and nurses, (2) health policy institutions, (3) public health foundations, (4) health-technology companies, and (5) private foundations. Next, participants were asked whether they wanted to have the option to review, remove, or do nothing with their data before sharing it with external recipients. Participants were also asked whether trust or willingness to participate in research would change if they could select which outside groups had access to their data. Finally, participants answered sociodemographic questions including age, gender, race, ethnicity, socioeconomic information, and health literacy.27
Data collection
We created the survey using Qualtrics, a Health Insurance Portability and Accountability Act (HIPAA) compliant survey development tool provisioned by Columbia University. Prolific recruited eligible individuals to participate, and upon consenting to participate in the study, participants were directed to a link with the Qualtrics survey. Participants were able to review their answer choices before submitting the survey and were compensated $15 per hour.
Statistical analysis
Descriptive statistics on sociodemographic variables and closed-ended survey responses were generated for the overall sample. Bivariate associations between survey responses and self-reported race were assessed with Pearson’s Chi-squared test or Fisher’s exact test. Participants who identified as Native American or Alaska Native, Native Hawaiian or Pacific Islander, Multirace, or a race not listed were combined into an “Another race” category due to the small number of participants in these race categories. Logistic regression models were used to assess whether there were racial differences in participants’ willingness to share different types of data with health policy institutions and public health institutions. Age, gender, education level, and health literacy status were included in the models as control variables. All tests were 2-sided with a statistical significance level set at 0.05.
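To make the bivariate step concrete, the following is a minimal sketch of Pearson’s Chi-squared test for a race-by-response table, written with the Python standard library only. It is not the authors’ analysis code. The counts are approximations reconstructed from the reported percentages for sharing imaging data with health policy institutions and the group sizes in Table 1, so they are illustrative rather than actual study data; the function names `chi2_sf` and `pearson_chi2` are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        term = math.exp(-z)
        total = 0.0
        for k in range(df // 2):
            total += term
            term *= z / (k + 1)
        return total
    # Odd df: complementary error function plus a finite correction sum.
    total = math.erfc(math.sqrt(z))
    term = 2.0 * math.sqrt(z / math.pi) * math.exp(-z)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= z / (j + 0.5)
    return total


def pearson_chi2(table):
    """Return (statistic, df, p_value) for a rows-by-columns table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, chi2_sf(statistic, df)


# ASSUMPTION: approximate [willing, not willing] counts rebuilt from reported
# percentages (33%, 42%, 48%, 58%) and group sizes (78, 36, 21, 475).
illustrative_counts = [
    [26, 52],    # Black/African American
    [15, 21],    # Asian
    [10, 11],    # Another race
    [275, 200],  # White
]

stat, df, p = pearson_chi2(illustrative_counts)
print(f"chi2 = {stat:.2f}, df = {df}, p = {p:.4f}")
```

With small expected cell counts, Fisher’s exact test (also named in the methods) would be the usual alternative; this sketch does not implement it.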
Qualitative analysis
A thematic analysis was conducted for the open-ended survey responses on perspectives about what makes a researcher trustworthy. Two members of the research team identified relevant codes and themes for analysis. Interrater reliability (IRR) was assessed with Cohen’s kappa, calculated on the 10% of qualitative responses that were double-coded. After establishing the IRR, the remaining responses were independently coded and any discrepancies between coders were resolved by reaching a consensus through discussion.
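As an illustration of the interrater reliability step, the sketch below computes Cohen’s kappa for two coders who each assign one code per response. It assumes a single label per response, which may not match the study’s coding scheme if responses could carry multiple themes; the function name `cohens_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label proportions.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; agreement is total.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical double-coded subset (labels are illustrative only).
coder_1 = ["transparency", "privacy", "reputation", "transparency", "oversight"]
coder_2 = ["transparency", "privacy", "reputation", "transparency", "oversight"]
print(cohens_kappa(coder_1, coder_2))
```

Kappa equals 1 only when the two coders agree on every item, and it falls toward 0 as agreement approaches what chance alone would produce.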
Results
Sample characteristics
Participant characteristics are summarized in Table 1. The analytic sample included 610 participants with a mean age of 46 years (±16; range 19-93). Among the sample, 50% were female, 79% identified as White, and 7% identified as Hispanic/Latinx. Over half of participants completed a college degree or higher, and 31% of participants reported financial instability/not having enough resources to make ends meet. The majority of participants (95%) had adequate health literacy and 5% of participants had limited or marginal health literacy. The average time of completion for the survey was 13 minutes and 55 seconds.
Demographic characteristic | Overall (N = 610) |
---|---|
Age (mean, SD) | 46 (16) |
Gender (n, %) | |
Female | 303 (50%) |
Male | 293 (48%) |
Nonbinary or gender diverse | 14 (2%) |
Race (n, %) | |
American Indian or Alaska Native | 3 (0.5%) |
Asian | 36 (5.9%) |
Black/African American | 78 (12.8%) |
Native Hawaiian or Pacific Islander | 1 (0.2%) |
White | 475 (77.9%) |
Multirace | 11 (1.8%) |
Identifies as a race not listed | 6 (1.0%) |
Ethnicity | |
Hispanic or Latino origin | 43 (7%) |
Non-Hispanic or non-Latino origin | 567 (93%) |
Education (n, %) | |
High school graduate or less | 86 (14%) |
Some college or Bachelor’s | 422 (69%) |
Graduate degree | 102 (17%) |
Financial resources (n, %) | |
Not enough | 192 (31%) |
Enough | 347 (57%) |
More than enough | 71 (12%) |
Health literacy^a (n, %) | |
Adequate | 582 (95%) |
Limited/marginal | 28 (5%) |
^a Health literacy was measured using a standardized instrument.20
Participant preferences for sharing data
Participants’ willingness to share research data differed based on the type of data and which group it was shared with. Regardless of research data type (eg, biological, genetic, clinical, mental health, sexual health and fertility, imaging, and consumer-generated data), participants were most willing to share with their doctors and nurses (Table S1). For example, 77% of participants would share their mental health data with doctors and nurses, compared to 40% with health-technology companies and 28% with private foundations. Similarly, 88% of participants would share their imaging data with their health-care team, compared to 54% with health policy institutions and 49% with health-technology companies. There were also significant differences by race in willingness to share with doctors and nurses. Compared to other racial groups, fewer Black/African American participants wanted to share their biological (70%, P < .001), genetic (70%, P = .019), clinical symptom (81%, P = .014), imaging (73%, P < .001), and mental health data (59%, P < .001) with doctors and nurses.
Figure 1 presents data-sharing preferences with external entities across different racial categories. Racial differences were most prominent for data shared with health policy institutions and public health organizations, compared to health-technology companies and private foundations. For example, a lower proportion of Black/African American participants (33%), Asian participants (42%), and participants who identified with another race (48%) were willing to share their imaging data with health policy institutions compared to White participants (58%) (P < .001). Similarly, fewer Black participants (59%), Asian participants (58%), and participants who identified with another race (62%) were inclined to share their clinical symptom data with health policy institutions compared to White participants (72%) (P = .035). Similar trends were observed for willingness to share with public health organizations. A smaller percentage of Asian participants (11%) expressed willingness to share sexual health data with health-technology companies compared to Black participants (29%), White participants (34%), and participants who identified with another race (38%) (P = .020).

Bar charts showing preferences for sharing specific types of data (sexual health and fertility, clinical symptom, imaging, and genetic data) with external recipients, by racial group.
Figure 2 shows the adjusted odds ratios (ORs) and 95% CIs for willingness to share different data types with public health institutions and health-technology companies by self-reported race. When deciding to share research data with public health organizations, Black participants were significantly less likely than White participants to share most of the data types presented (biological, OR: 0.562 [95% CI, 0.342-0.920]; clinical symptom, OR: 0.455 [95% CI, 0.275-0.756]; mental health, OR: 0.543 [95% CI, 0.323-0.895]; sexual health and fertility data, OR: 0.404 [95% CI, 0.228-0.691]; imaging data, OR: 0.488 [95% CI, 0.258-0.714]; and consumer-generated data, OR: 0.432 [95% CI, 0.258-0.714]). Similarly, participants who identified as another racial group were significantly less likely to share their sexual health or fertility data (OR: 0.534 [95% CI, 0.284-0.971]), imaging data (OR: 0.466 [95% CI, 0.255-0.832]), and consumer-generated data (OR: 0.493 [95% CI, 0.271-0.885]) with public health organizations compared to White participants. Furthermore, Black/African American participants were significantly less likely to share consumer-generated data (OR: 0.539 [95% CI, 0.324-0.885]) and imaging data (OR: 0.480 [95% CI, 0.285-0.794]) with health-technology companies than their White counterparts. Similar trends were noted for willingness to share with health policy institutions, with all racial minority participants exhibiting lower likelihood of sharing different types of data than White participants (Figure S1).

Odds ratios (95% confidence intervals) of participants’ willingness to share different types of research data with health-technology companies and public health organizations. The “Another Race” category comprises individuals identifying as Asian, Multirace, Native American or Alaska Native, Native Hawaiian or Pacific Islander, or a race not listed.
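For readers interpreting the reported odds ratios, the sketch below converts an OR and its 95% CI back to the log-odds scale on which logistic regression coefficients are estimated. It assumes the intervals are Wald intervals (symmetric on the log scale), which the article does not state, so the recovered standard error is approximate. The function name `log_odds_summary` is hypothetical; the two inputs are values reported above for Black participants sharing with public health organizations.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal distribution (about 1.96).
Z_975 = NormalDist().inv_cdf(0.975)


def log_odds_summary(odds_ratio, ci_low, ci_high):
    """Summarize a reported OR and 95% CI on the log-odds scale.

    ASSUMPTION: the CI is a Wald interval, exp(beta +/- z * se).
    """
    beta = math.log(odds_ratio)  # logistic regression coefficient
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return {
        "beta": beta,
        "se": se,
        # For a Wald interval the geometric midpoint should be close to the OR.
        "geometric_midpoint": math.sqrt(ci_low * ci_high),
        # An interval that excludes 1 corresponds to P < .05 (2-sided).
        "excludes_one": ci_high < 1.0 or ci_low > 1.0,
    }


reported = {
    "mental health": (0.543, 0.323, 0.895),
    "sexual health and fertility": (0.404, 0.228, 0.691),
}
for data_type, (or_value, low, high) in reported.items():
    summary = log_odds_summary(or_value, low, high)
    print(data_type, {key: round(value, 3) if isinstance(value, float) else value
                      for key, value in summary.items()})
```

An OR below 1 with an interval that excludes 1 indicates significantly lower odds of willingness to share relative to the reference group (White participants), after adjustment for the model covariates.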
Participant preferences for types of data they want returned to them
Participants indicated the types of research data they would want returned to them after participating in a research study (Figure 3). The majority of participants (93%) wanted researchers to return at least one of the indicated data types, with the highest proportion of participants selecting clinical symptom data (75%). Racial differences were not found among the types of data participants preferred to receive.

Types of research data participants want returned to them after participating in a research study.
Data sharing and impacts on participation and trust
Participants’ willingness to engage in research and their trust in researchers were influenced by being able to choose who gets access to their data (Figure 4). When asked how their willingness to participate in research would change if they could select which external groups could access their data, 52% of participants reported that it would increase, 44% reported no change, and 4% reported that it would decrease. Moreover, when asked how their trust toward researchers would change if they could select which external recipients could access their data, 63% of participants reported that their trust would increase, 35% reported no change, and 2% reported that it would decrease. Sixty percent of participants reported that their willingness to participate in research would increase if their data were returned to them. Racial differences were not found for changes in willingness to participate or changes in trust.

Changes in willingness to participate in research and trust toward researchers.
Finally, 34% of participants wanted to review their data before sharing, 37% wanted to be able to remove some of their data, and 30% did not feel it was necessary to review or remove their data before sharing with external groups. A significantly greater proportion of Black/African American (47%) and Asian (50%) participants wanted to be able to remove their data from data repositories compared to White participants (35%) and participants who identified with another race (24%).
Qualitative results
Cohen’s kappa was high (κ = 1.0), indicating strong agreement between coders. Participants were asked to describe what they believe makes a researcher trustworthy. Table 2 shows the four core themes identified in the responses: (1) transparency (59.0%), (2) practices surrounding data management, sharing, and privacy (23.7%), (3) institutional reputation (45.8%), and (4) ethical oversight (8.0%). A small share of responses (6.9%) were missing or could not be categorized.
Open-ended responses regarding themes for the question: What do you believe makes a researcher trustworthy?
Emergent theme or category (# of responses) | Description | Exemplary quote |
---|---|---|
Transparency (n=361) | Participants wanted transparency around the purpose of the research study, any risks, and clear information on potential conflicts of interest | “Full disclosure of conflicts of interest, up front about risks for participants” |
| | “They are honest and tell you exactly what the data will be used for. They don't make promises they can't keep, and they respect your identity as a person. They don't treat you like a means to an end.” |
Data management, sharing, and privacy (n=145) | Participants wanted clear, easy-to-read information about how their research data will be used in the study, and who may have access to the data (and for what purpose) after the study concludes. Some participants wanted the ability to remove all identifying data if they chose | “Someone who will not take your data and use it for advertisement purposes and are honest and upfront on how they will use your data.” |
| | “[…] Clearly outlined research guidelines including protections for the researcher/the patient/the data, clearly stated guidelines for who can access the data and why, clear instructions on how and what control I retain over my data over time (i.e, can I withdraw it from the research pool in the future?).” |
Institutional reputation (n=280) | Participants stated that they trusted academic research institutions | “I think what makes a researcher trustworthy is who they are doing it for and [the] institutions behind it. If [the] government was involved I would be less likely to trust it as opposed to an academic researcher.” |
Ethical oversight (n=49) | Participants wanted assurance of IRB approval of research studies and expressed concerns around data security | “Having some type system to make sure they do not take advantage of participants (IRB) and strong security around data.” |
Abbreviation: IRB, Institutional Review Board.
Discussion
Inclusion of research participants’ perspectives regarding their preferences on external data sharing presents an opportunity for researchers to cultivate trust in data-sharing practices. Despite the evolution of data-sharing policies geared toward increased data sharing,1 there is limited data on participants’ viewpoints regarding updated data-sharing policies. Our findings demonstrate that the perceived sensitivity of the type of data and the identity of the recipient significantly contribute to participants’ willingness to share data. Across diverse data categories, our results are consistent with hesitancy among racial minority communities to share data with external recipients, such as health policy institutions and public health institutions. To enhance trust in research practices, it is important to implement data-sharing approaches that address participants’ concerns and promote initiatives that provide participants with more control over the sharing of their data. These efforts can be particularly beneficial in establishing trust within underrepresented communities in research.
Participants generally felt most comfortable sharing their research data with health-care providers. However, we found that Black participants showed less willingness to share certain data types, such as genetic and mental health data, with their health-care providers. This reluctance may stem from historical discrimination and abuse by health professionals against the Black community.28–31 In addition, ongoing discrimination in health-care settings can contribute to fears that researchers may exploit their genetic data32–35 alongside the stigmatization of mental health conditions.36 Previous research has highlighted hesitancy in sharing genetic data among Black communities due to fear of exploitation and mistrust.37–40
Across all racial groups, participants were less willing to share genetic data with external entities including health policy institutions, health-technology companies, and private foundations, possibly because of heightened concerns regarding the uniqueness and traceability of genetic data, impact on progeny, and the ambiguity of future uses.41–44 These apprehensions are justified, as even with efforts to ensure de-identification, genetic data can still be linked to individuals due to its unique nature.45 Furthermore, advancements in technologies such as facial recognition have sparked additional fears around the potential for reidentification through facial images.46
In addition, participants across all racial groups expressed greater hesitancy in sharing sexual health and fertility data compared to other data types. In particular, Black and Asian participants reported the least comfort in sharing sexual health and fertility data with nonhealth-care providers. This finding aligns with current public perspectives given that health-technology companies acquiring and collecting sexual health and fertility data have been scrutinized for their privacy practices and commercialization of data.47–49 However, we did not expect participants to share this hesitancy with groups that are often perceived as serving the public good (such as public health institutions and health policy institutions). Increasing fears following the overturning of Roe v. Wade could contribute to this broader reluctance, as information about fertility or menstruation could expose individuals to penalties and legal consequences.50–52
Black and Asian participants were also less comfortable sharing imaging data with health policy institutions and public health organizations compared to White participants. We hypothesize that along with distrust, participants may perceive imaging data to be more sensitive given that medical images may capture inherently identifiable features. For instance, with advances in artificial intelligence and facial recognition, it is more feasible to match images generated from MRI scans to photographs of an individual.53
To further investigate the increased hesitancy to share with organizations established for the public good, we controlled for age, gender, educational attainment, and health literacy. Even after this adjustment, Black/African American and other racial minority participants remained less likely to share various types of data with public health organizations and health policy institutions compared to health-technology companies. Heightened distrust in governmental institutions post-COVID could have contributed to a reluctance to share with entities that are associated with the medical system and the government.54,55 In particular, racial and ethnic minorities were significantly impacted by the COVID-19 pandemic due to disparities in medical treatment and systemic racism.56–58
With increasing distrust in health policy institutions, it is crucial to develop data-sharing practices that do not further exacerbate distrust in research. Previous research has demonstrated that returning participant data can serve as a method to increase trust in researchers.59 Our findings align in that most participants want access to their research data.60 This adds to the importance of discovering methods of returning data to participants in a manner that is comprehensible and inclusive of health literacy and numeracy, allowing patients to participate in research.
Previous research underscores the impact of data sharing on research participation, highlighting a decrease in trust when participants perceive a lack of transparency in the use of their data when shared with others.3 Therefore, we assessed whether having the choice to select external recipients of their data would influence research participants’ trust in researchers. We found that granting greater autonomy to select external data recipients can influence trust in research and participation. Black and Asian participants expressed a desire for greater autonomy over their data, with a higher proportion indicating a preference for the option to review or remove some of their research data before sharing it with external recipients. This discovery aligns with literature emphasizing participants’ desire for more customization when consenting to participate in a research study.61–64 Furthermore, our findings of racial differences in the desire for greater autonomy in data-sharing practices highlight an opportunity for researchers to build trust through data sharing with racial and ethnic minorities.17 However, current research practices, particularly those funded by the NIH, promote broad consent for secondary uses of data.56–59,65 Many participants in studies lack a clear understanding of how their data might be used in the future.66–68
Our qualitative results highlighted that participants believe transparency regarding future uses of their research data is a critical attribute of a trusted researcher. Supporting our findings, the Pew Research Center conducted a survey in 2022 on trust in medical research, revealing that racial minorities were more likely to believe that, despite existing strict de-identification practices, researchers could escape consequences for data breaches resulting in public disclosure of identifiable data.69 Such findings make it imperative for researchers to consider strategies to improve participants’ understanding of data-sharing practices, with an emphasis on confidentiality and the de-identification process of personal health data.70 This will also help promote a trustful, transparent research community, and not further perpetuate the exclusion of racial and ethnic minorities from research studies. Therefore, data-sharing practices must consider the priorities and concerns of racial and ethnic minority communities to ensure current practices are guided by diverse perspectives.
Our findings demonstrate the importance of researchers moving toward a participant-centered approach that is inclusive of participants’ concerns and priorities regarding how they want their data to be shared with external entities.70,71 The patient perspective on evolving data-sharing practices requires further exploration, with an emphasis on obtaining a better understanding of the concerns shared among racial minorities and participants from diverse backgrounds (eg, participants with low health literacy or low socioeconomic status). Including these perspectives will help the research community develop best practices for data-sharing policies that are both patient-informed and trust-building, especially with underrepresented communities.
Our work emphasizes the need to move beyond broad consent by offering customizable options that allow individuals to specify which data can be shared and with whom. However, achieving this level of customization requires a robust informatics infrastructure to support individualized consent mechanisms that warrant further exploration. By enhancing transparency and control, such systems could help alleviate fears of data misuse. Additionally, the informed consent process must communicate data-sharing practices clearly and accessibly to ensure participants are empowered and fully informed about the benefits and risks of sharing their data. This approach may bridge the trust gap through transparency, inclusivity, and participant autonomy.
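As one illustration of what such an individualized consent mechanism might look like, the following Python sketch models a per-participant consent record in which nothing is shared unless the participant has explicitly allowed a given data type to go to a given recipient. This is a minimal sketch under stated assumptions, not a description of any system used in this study: the `ConsentRecord` class, the `shareable` helper, and the category labels are hypothetical names chosen to mirror the data types and recipients discussed above.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Illustrative labels based on the data types and recipients discussed in this
# article; they are assumptions, not a standard vocabulary.
DATA_TYPES = {"genetic", "mental_health", "sexual_health_fertility", "imaging"}
RECIPIENTS = {
    "health_care_provider",
    "public_health_organization",
    "health_policy_institution",
    "health_technology_company",
    "private_foundation",
}


@dataclass
class ConsentRecord:
    """Hypothetical per-participant consent record with default-deny semantics."""

    participant_id: str
    # Each entry is a (data_type, recipient) pair the participant has allowed.
    allowed: set[tuple[str, str]] = field(default_factory=set)

    def _validate(self, data_type: str, recipient: str) -> None:
        if data_type not in DATA_TYPES:
            raise ValueError(f"unknown data type: {data_type}")
        if recipient not in RECIPIENTS:
            raise ValueError(f"unknown recipient: {recipient}")

    def grant(self, data_type: str, recipient: str) -> None:
        """Allow one data type to be shared with one recipient."""
        self._validate(data_type, recipient)
        self.allowed.add((data_type, recipient))

    def revoke(self, data_type: str, recipient: str) -> None:
        """Withdraw a previously granted permission (no error if absent)."""
        self.allowed.discard((data_type, recipient))

    def permits(self, data_type: str, recipient: str) -> bool:
        """Return True only if this exact pair was explicitly granted."""
        return (data_type, recipient) in self.allowed


def shareable(records: list[dict], consents: dict[str, ConsentRecord],
              recipient: str) -> list[dict]:
    """Keep only records whose owner permits sharing that data type with recipient."""
    kept = []
    for record in records:
        consent = consents.get(record["participant_id"])
        if consent is not None and consent.permits(record["data_type"], recipient):
            kept.append(record)
    return kept
```

In this sketch, a participant who grants `("imaging", "health_care_provider")` but nothing else would have imaging records released to health-care providers only, and a later `revoke` call would withdraw that permission for future sharing requests. A production system would also need auditing, versioned consent forms, and secure storage, none of which are modeled here.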
Limitations
Limitations of this study include the use of the online survey platform Prolific, with the survey administered in English. To use Prolific, respondents must have a high degree of English proficiency, access to the Internet, and familiarity with technology. These individuals tend to have higher levels of health, graph, and subjective numerical literacy. Additionally, they likely have higher trust in researchers, given that they are participating in research. The Prolific sample is representative of the US population based on census data; however, it is representative only with respect to age, race, and sex, and not ethnicity (Hispanic or Latinx). Respondents to this survey were provided only with a summary of the current NIH Data Management and Sharing policy and were not fully informed about specific data privacy and security measures for deidentified or anonymized data. As a result, participants answered the questions based on their own knowledge or assumptions. We also acknowledge the limitations in the way we measured and presented results by race. The NIH sets 5 categories for race (American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White). In our data analysis, due to limitations with our sample size, we collapsed these into 4 race categories (Asian, Black or African American, Another race, and White). Given these limitations of using Prolific, further research focusing on sample populations with lower health and technology literacy and different language skills should be carried out.
Author contributions
Stephanie Niño de Rivera (Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Visualization, Writing—original draft), Yihong Zhao (Data curation, Formal analysis, Methodology, Software, Visualization, Writing—original draft, Writing—review & editing), Shalom Omollo (Formal analysis, Visualization, Writing—review & editing), Sarah Eslami (Visualization, Writing—review & editing), Yashika Sharma (Writing—review & editing), Natalie C. Benda (Visualization, Writing—review & editing), Meghan Reading Turchioe (Writing—review & editing), Marianne Sharko (Writing—review & editing), Lydia Dugdale (Conceptualization, Writing—review & editing), and Ruth Masterson Creber (Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing—original draft, Writing—review & editing).
Supplementary material
Supplementary material is available at JAMIA Open online.
Funding
This work was supported by the National Institute of Neurological Disorders and Stroke, R01NS123639-03S2, 5R01NS123639-03, R01NS123639 and National Heart, Lung, and Blood Institute Division of Intramural Research, R01HL161458, R01HL152021. The funders had no role in the study design, data collection, analysis, decision to publish, or preparation of the manuscript.
Conflicts of interest
Our team has the following competing interests to disclose: MRT: Boston Scientific (consulting) and Iris OB Health (equity). The remaining authors have no conflicts of interest to disclose. This does not alter our adherence to JAMIA policies on sharing data and materials.
Data availability
The dataset underlying this study cannot be publicly shared in a repository because of current restrictions imposed by the Columbia University Institutional Review Board (IRB) and because participants did not consent to the public sharing of their data. However, the data are available upon request from the Columbia University IRB for researchers who meet the criteria for accessing confidential information. Please contact the corresponding author to request the dataset.