Learn from artificial intelligence: the pursuit of objectivity
Fengyi Wang, Angeliki Marouli, Pisit Charoenwongwatthana, Chien-Yi Chang
Letters in Applied Microbiology, Volume 78, Issue 3, March 2025, ovaf021, https://doi.org/10.1093/lambio/ovaf021
Abstract
Humans continuously face threats from emerging novel pathogens and antimicrobial-resistant bacteria and fungi, which urgently require efficient solutions. Conversely, microbes also produce compounds and chemicals of high value to humans, whose yields require continuous refinement and improvement. Artificial intelligence (AI) is a promising tool in the search for solutions to combat disease and facilitate productivity, underpinned by robust research providing accurate information. However, the extent of AI's credibility is yet to be fully understood. In terms of human bias, AI could arguably act as a means of ensuring scientific objectivity to increase accuracy and precision; however, whether this is achievable has not been fully discussed. Human bias and error can be introduced at any step of the research process, from conducting experiments and data processing through to clinical applications. Despite AI's contribution to advancing knowledge, the question remains: is AI able to achieve objectivity in microbiological research? Here, the benefits, drawbacks, and responsibilities of AI utilization in microbiological research and clinical applications are discussed.
Here, we discuss the good and the bad of artificial intelligence (AI) in reducing human bias within scientific research and clinical diagnosis, highlighting both the transformative benefits of AI and the concerns about new biases introduced through AI development. The insights offered aim to inform the integration of AI into scientific and clinical practice while acknowledging the need for ongoing vigilance in AI application.
Introduction
Scientists aim to uncover objective truths. In the pursuit of these truths, biases, commonly understood as systematic errors arising from human bias and the objective limitations of methodologies, can occur at various stages of fundamental and clinical research. Such biases include distortions in data collection, analysis, and interpretation, influenced by researchers' perceptions, cognitions, and preferences. The extent to which they influence research depends greatly on the nature of the procedures performed and on human intervention. For instance, selection and interpretation of data influenced by expectations and hypotheses may lead to skewed results. Over the years, scientists have developed various methods to reduce human bias in research, including blinded designs, control settings, experimental replicates, peer review, standardized protocols, and data verification. Scientists embark on a never-ending pursuit of minimizing subjectivity in their research; therefore, some would argue that artificial intelligence (AI) utilization for research purposes can substantially reduce the introduction of biases.
AI involves the simulation of human intelligence to perform problem-solving tasks that are of great benefit to research, rendering it a driving force in both industrial and research settings. A significant example of AI application in research is the recent introduction of AlphaFold 3. This tool is considered a milestone in computational biology, allowing fast and accurate prediction of protein-molecule complexes compared with existing experimental methods, and consequently has great potential to shorten the drug discovery period in the future (Abramson et al. 2024). Furthermore, AI has been integrated into different branches of microbiological research to a broad extent. CRISPR-Cas9, originally discovered in Streptococcus pyogenes, is a powerful gene-editing tool allowing the selective addition and deletion of specific genetic information at targeted genomic sites for various purposes, e.g. gene function studies and personalized therapy. Several review articles have discussed the advances of AI technology in enhancing gene-editing technologies, including identifying disease-associated genes (e.g. in cancer) and designing guide RNAs of CRISPR-Cas systems for precision medicine (Bhat et al. 2022, Dixit et al. 2023). AI development and involvement accelerate the development of novel and robust gene-editing tools that are in ever-increasing demand. Through protein engineering with AI inputs, robust and precise molecular ‘scissors’ to cleave target DNA can be designed, generated, and confirmed in the laboratory, overcoming the natural limitations of gene-editing systems (Thean et al. 2022, Ruffolo et al. 2024). Based on advances in machine learning combined with massive datasets of whole genomes of microorganisms and of diverse protein families, new gene-editing systems can be identified (Madani et al. 2023, Nguyen et al. 2024). From molecules to bio-systems, AI has been used in engineering biology to refine microbial enzymatic and metabolic pathways to boost the production of valuable compounds (Jang et al. 2022).
The World Health Organization (WHO) has brought attention to the growing emergence of pathogenic microorganisms resistant to antimicrobials, which poses an important human health risk. The emergence of infectious diseases and the necessity for novel antimicrobial agents and vaccines are topics of wide concern in the field of microbiology. This was also highlighted during the COVID-19 pandemic, when rapid pathogen characterization, strategic planning, and drug and vaccine development were urgently required. It is therefore understood that AI is in high demand and expanding in healthcare and clinical research. However, this progress naturally requires addressing certain questions, one of extreme importance being: can AI, to a certain extent, replace humans' scientific workload and provide more accurate and unbiased results to inform decision-making? Here, we discuss this question across three key aspects of microbiological research, conducting experiments, data processing, and clinical diagnosis, to obtain a broader understanding of how AI is reshaping the landscape of research, addressing both the opportunities and challenges it brings.
AI and human bias: conducting experiments
Experimentation is the cornerstone of research. An important attribute of experiments is reproducibility: researchers can replicate results and verify conclusions, thereby enhancing reliability and scientific validity. However, in practice, the lack of reproducibility of significant results by other researchers has been a notable issue (Baker 2016). Several factors likely contribute to this, a significant example being the lack of detail in the methodological descriptions published in papers, which often hinders reproducibility of the work. Further aggravating factors include technical limitations, human errors (e.g. unintentional misconduct and incorrect recordkeeping), and a metric-driven research culture that devalues findings that can turn out to be impactful. In terms of human errors, the research sector has gradually turned towards automated platforms to conduct low-complexity experiments in a desired manner, assuring consistency and precision and ultimately reducing human bias-associated variables (Holowko et al. 2021). The involvement of AI has further transformed experimental workflows, allowing machines to mimic human decision-making while maintaining consistency and accelerating processes that are often difficult to achieve (Wiggins et al. 2023). One of the significant advantages of using AI in conducting experiments is the reduction of variability introduced by cognitive bias, a subtle but potentially very impactful factor in experimental outcomes. For example, in a scenario where researchers are required to select the area of live/dead fluorescent staining images for cell analysis, researchers' decisions may vary greatly: one researcher might exclude cells that are aggregated together, while another may have a personal preference for a certain number of cells to be used. These personally biased decisions can lead to variability in results and subsequently in conclusions, ultimately affecting the reproducibility of a study. AI can be trained to have a standardized and consistent decision-making process, by which the same criteria are applied across different instances. Although current AI is not optimal, it can continually learn from new data and refine its performance. Moreover, AI has been trained to systematically review new findings and discoveries from the literature and draw conclusions (de la Torre-lopez et al. 2023, dos Santos et al. 2023). Researchers can learn from AI-drawn conclusions and explore new hypotheses that may open new research areas.
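Returning to the live/dead staining example, the following minimal sketch illustrates how code-defined selection criteria can be applied identically to every image. It is not taken from any cited study; the size thresholds and the use of scikit-image are illustrative assumptions only.

```python
# Sketch: fixed, code-defined criteria replacing ad hoc human choices when
# selecting cells from a fluorescence image. Thresholds are illustrative.
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def select_cells(image: np.ndarray, min_area: int = 50, max_area: int = 500):
    """Segment a single-channel image and keep regions within a fixed size range.

    Every image is processed with exactly the same rules, so the selection
    cannot drift with an individual researcher's preferences.
    """
    mask = image > threshold_otsu(image)      # identical thresholding rule for all images
    regions = regionprops(label(mask))
    # Exclude very large objects (likely aggregates) and debris below min_area.
    return [r for r in regions if min_area <= r.area <= max_area]

# Example: apply the same criteria to a whole batch of images.
# counts = [len(select_cells(img)) for img in images]
```

Because the selection rules live in code rather than in an individual's judgment, any change to the criteria is explicit and applied uniformly across a study, which is the consistency argument made above.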
The ability to combine the complexity of human-like decision-making with the uniformity of machine operations gives AI the potential to become an invaluable tool in experimentation, one that not only enhances the efficiency of experimental operations but also reduces human bias. The concept of combining AI with automated research platforms to create fully or partially automated research pipelines in microbiology, without human intervention, is not novel (Dama et al. 2023). AI-driven automated research pipelines are certain to improve research efficiency and potentially quality. However, an area to consider is the capacity for, and need for, creativity in these pipelines. Similar concerns have been addressed by other researchers, who examined how the probabilistic mechanisms that govern large language models (LLMs) may constrain creativity (Bender et al. 2021, Hamid 2024). They argued that LLMs' reliance on probability-based decision-making limits their ability to produce truly original thoughts and creative adaptation, as their responses are bound to pre-existing patterns within the training data. This, in turn, impacts LLMs' capacity to innovate or navigate ambiguity as effectively as human researchers (Hamid 2024). This is a reminder that AI's capacity to emulate creative processes remains fundamentally limited, which should be considered when we place emphasis on the significance of AI applications in science.
AI and human bias: data processing
Following experimentation and data acquisition, data analysis is an integral part of research, as it provides a key step towards understanding and forming conclusions from the data obtained. It is a time-consuming process that requires precision and meticulous data handling to ensure appropriate analysis. Over the years, manual data analysis has been associated with limitations, including the volume of data that can be handled, the time required for processing, and the vulnerability to bias. Technological innovations have led to substantial advancements in AI and its potential application to sophisticated data analysis.
Machine learning, an important branch of AI, uses algorithms to extract features or patterns from large-scale data. In the context of research, machine learning has been exploited as a tool to investigate the correlation between genome and resulting phenotype from the analysis of large datasets (Makrai et al. 2023, van Hilten et al. 2024). Indeed, the ability to perform multiomics-derived analysis on large datasets provides an important tool in the context of microbiological research. For example, antimicrobial resistance (AMR) is one of the most important challenges that the healthcare industry currently faces. The significant risk posed calls for urgent solutions to tackle AMR microbes. To battle this threat, genome screening and phenotypic prediction of AMR-associated factors are pivotal. AI has been increasingly employed to statistically learn the linkage between known genomes and corresponding antimicrobial phenotypes to predict AMR in pathogens (Kim et al. 2022, Signoroni et al. 2023, Zagajewski et al. 2023, Carl et al. 2024). Experimental genome-derived data have also been used to design and develop algorithms enhancing AI-powered precision diagnostics in the medical field (Bajwa et al. 2021, Wang et al. 2024). However, this faces several obstacles, such as the requirement for a vast amount of known data to support the identification of predictors, the need for continuous database updates for machine learning due to the emergence of mutated variants, and the paucity of data analysis to support structural and functional linkages between genes and phenotypes. Moreover, it requires human intervention when genome data are transformed and selected, introducing selection and reporting bias.
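As a purely illustrative sketch of the genome-to-phenotype prediction described above (not the pipeline of the cited studies), the example below trains a classifier on gene presence/absence features and estimates predictive performance by cross-validation. The feature encoding, model choice, and placeholder data are all assumptions.

```python
# Sketch: learning the genotype-phenotype linkage for AMR prediction.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Rows: bacterial isolates; columns: presence/absence of candidate resistance genes
# (in practice derived from assembled genomes). Labels: 1 = resistant, 0 = susceptible.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 50))   # placeholder for real genomic features
y = rng.integers(0, 2, size=200)         # placeholder for lab-confirmed phenotypes

model = RandomForestClassifier(n_estimators=200, random_state=0)

# Cross-validation estimates how well genotype predicts phenotype, but only within
# the distribution of the training isolates; new variants require database updates.
scores = cross_val_score(model, X, y, cv=5, scoring="balanced_accuracy")
print(f"Balanced accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

Note that the human choices of which genes to encode as features and which isolates to include are exactly the points at which selection and reporting bias can enter, as discussed above.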
Analytically, AI-driven data processing is performed in a standardized and programmed manner; however, it is arguable whether objectivity is completely achievable. One concern is whether bias identification and removal are integral to AI-driven analysis: multiple repeats using different datasets must be analysed to successfully detect patterns of potential bias introduced during processing. Another concern regards trust in algorithmic programming. When it comes to data analysis, AI and its programs are advantageous tools that can enhance precision and time-effectiveness while allowing the simultaneous analysis of single or multiple large datasets, yet concerns remain about the accuracy of the analysis and trust in the algorithm.
AI and human bias: clinical diagnosis
AI has emerged as a transformative tool in the healthcare system, supporting medical professionals across various applications, including disease screening, diagnosis, outcome prognosis, treatment planning, and drug discovery and development. Researchers have demonstrated the efficiency of AI in precisely diagnosing imaging data, such as chest radiographic images and histopathological images (Bajwa et al. 2021). The increasing adoption of AI in microbial diagnosis holds significant promise (Cheng et al. 2023, Zhang et al. 2024). AI can enhance the accuracy and speed of microbial identification in clinical samples. For instance, a convolutional neural network model has shown remarkable efficacy in detecting the presence or absence of intestinal parasites in specimens, achieving ∼98% true positive and true negative rates (Mathison et al. 2020). High-quality input data from humans are still necessary for training AI models to achieve optimal performance. However, a significant challenge lies in potential bias within any AI algorithm. This bias can arise from insufficient data utilized to ‘train’ and develop the AI tool. If the data are not representative of an entire population due to demographic biases at the development stage, misdiagnosis or misinterpretation may result (Norori et al. 2021, Mittermaier et al. 2023). To break through the limitations in the process of AI development, model-centric AI focuses on optimizing algorithms, whereas data-centric AI focuses on utilizing comprehensive data to train effective models. While high-quality input data from human sources are essential, addressing biases remains a complex issue that necessitates a combination of both data-centric and model-centric approaches (Hamid 2023). It is also important to note that AI systems are susceptible to errors. One study reported the incidence of diagnostic errors generated by an AI-powered automated history-taking system at 11% (Kawamura et al. 2022). Additionally, data privacy raises significant challenges. Developing AI models necessitates vast amounts of data, which may include sensitive patient information (Chu et al. 2023). Patients might not consent to the use of their data for AI training, leading to limits in data collection. These restrictions can introduce bias into algorithm development, potentially affecting AI performance. To reduce errors made by AI, healthcare professionals still require a comprehensive understanding of patients and diseases. Thus, decisions made by humans remain necessary in the optimization of AI technology, ensuring it assists effectively and addresses gaps in current medical practice. For an AI development team, it is essential to ensure that data are continuously updated to represent current information. This ongoing action is crucial to decrease the risk of errors in AI utilization within clinical settings. Nonetheless, this requirement presents a challenge, as it demands a perpetual process of data collection.
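The demographic-representativeness problem described above can be made concrete with a simple audit. The hedged sketch below (an assumed illustration, not drawn from the cited studies) reports a diagnostic model's sensitivity separately for each subgroup, so performance gaps attributable to under-represented groups become visible.

```python
# Sketch: per-subgroup audit of a diagnostic model's sensitivity.
import numpy as np
from sklearn.metrics import recall_score

def per_group_sensitivity(y_true: np.ndarray, y_pred: np.ndarray, groups: np.ndarray):
    """Report sensitivity (true-positive rate) separately for each subgroup."""
    results = {}
    for g in np.unique(groups):
        idx = groups == g
        results[g] = recall_score(y_true[idx], y_pred[idx])
    return results

# y_true, y_pred: reference diagnoses and model outputs for a held-out test set;
# groups: a demographic attribute such as age band or sex (illustrative labels).
# Large gaps between subgroups would suggest the training data did not represent
# the whole population on which the model is deployed.
```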
As AI continues to evolve and emerge, it is essential to establish guidelines for AI usage in clinical settings. For instance, the Translational Evaluation of Healthcare AI (TEHAI) is a valuable framework to guide AI applications in healthcare practice. This guideline includes capability, utility, and adoption as the main domains for evaluating the use of AI comprehensively (Reddy et al. 2021). In addition to considerations of AI usage, credibility and accountability need to be addressed as outcome-related issues. AI has the potential to serve clinicians by enhancing their expertise to aid clinical diagnosis and treatment planning. The research mentioned above has demonstrated AI's ability to assist medical professionals in making more precise and faster diagnoses. However, when a treatment outcome is successful, the credit rightfully belongs to the clinicians. Conversely, if AI assistance leads to a misdiagnosis and subsequent treatment failure, the question arises: who bears the responsibility, the clinician, the AI development team, or the AI itself due to its self-learning? This question probes the mutual relationship between human expertise and machine intelligence in terms of liability, a matter of medical ethics.
Discussion: the good and the bad
We are currently in an era of rapid technological advancement, an integral and significant aspect of which is the introduction of AI into our lives and, inevitably, into science and research. Researchers have made significant advances in AI technologies and AI-supported biomedical research, including molecular-level discovery, the engineering and improvement of systemic bio-pathways, and clinical diagnosis. The application of AI in research and clinical practice presents both opportunities and challenges. One of the advantages of AI is its consistency and precision, reducing the need for human intervention and performing tasks uniformly. This is notably important in accomplishing complex and repetitive tasks, in which minor variations can significantly skew results. In terms of data processing, AI is revolutionizing the ability to process vast experimental datasets, showing huge potential for massive data analysis, or even experimental design. Although the influence of AI across industries is undoubtedly expanding, the technology should be implemented thoughtfully, considering its limitations and underlying blind spots beyond its current success. For instance, AI can reduce human bias, but the extent of this effect is difficult to determine. Observable biases might be reduced, but unnoticed and latent biases, such as algorithmic bias, may persist in the shadows. Human bias can still be introduced at various stages of AI development. Therefore, it is reasonable to ask: when existing biases are removed, are new biases introduced? Does AI minimize human bias after all? There is no doubt that AI provides solutions in research, but it is not a panacea, since it involves biases and uncertainties as well. Several repeats in different experimental settings, within different research groups, are still required to reach objective and credible conclusions.
A major concern is the growing tendency of subconscious reliance on technology for solutions, and blind trust that advanced technology intrinsically provides better outcomes than humans can. This notion could lead to complacency in the application of AI and ultimately to misuse of this tool. The fact that AI, at its core, is a human creation is often overlooked. In considering AI's development, it is important to remember that the underlying algorithmic programming, training, and direction of AI are fundamentally developed and designed by humans themselves. When most scientists rely on one technology created by a few people, whether it is ‘trained’ by the masses or not, there is a high risk of reducing diversity, which is itself a bias. This ultimately raises another important and perhaps controversial question: would it ever be possible to eliminate human bias when human intervention is so deeply rooted in the core of AI development? Undoubtedly, AI offers promising possibilities and enhances the accuracy of scientific research and clinical diagnosis. However, it is crucial to maintain an awareness of the human factors influencing AI development, application, and decision-making, ensuring that the scientific pursuit of objectivity remains vigilant and discerning.
Acknowledgements
We thank Prof. Mark Geoghegan and Dr Luisa Wakeling at Newcastle University for providing insightful comments and proofreading. C.Y.C. thanks the NIHR Newcastle Biomedical Research Centre (BRC) for the infrastructural support of his research lab in the School of Dental Sciences at Newcastle University. F.W. thanks Newcastle University for her PhD studentship. A.M. thanks NIHR Newcastle BRC and Newcastle University for supporting her PhD studentship. P.C. thanks Newcastle University and Mahidol University for his PhD studentship. The views expressed are those of the authors and not necessarily those of the NIHR or the Department of Health and Social Care.
Author contributions
Fengyi Wang (Conceptualization [equal], Writing – original draft [equal], Writing – review & editing [equal]), Angeliki Marouli (Conceptualization [equal], Writing – original draft [equal], Writing – review & editing [equal]), Pisit Charoenwongwatthana (Conceptualization [equal], Writing – original draft [equal], Writing – review & editing [equal]), and Chien-Yi Chang (Conceptualization [equal], Funding acquisition [lead], Resources [lead], Supervision [lead], Writing – original draft [equal], Writing – review & editing [equal])
Conflict of interest
No AI was involved in writing this article. All authors have no competing interests and approve the submission.
Funding
The funding agencies have no role in the preparation of this manuscript.
Data availability
No new data were generated or analysed in support of this Opinion article. All data and materials are published in the cited references.