Integrating bioinformatic strategies in spatial life science research

Numerous spaceflight biology studies have been dedicated to identifying threats to organismal health in space [11]. They measured biological adaptation to space environment factors from multiple perspectives and provided multidimensional data, including but not limited to multi-omics data. These data were incorporated into a variety of spaceflight biodata resource platforms [1], such as NASA’s GeneLab database (https://genelab.nasa.gov/) [12] and the Life Sciences Data Archive (https://lsda.jsc.nasa.gov/). GeneLab is a comprehensive space-related omics database that provides access to data from experiments that explore the molecular response of terrestrial biology to the spaceflight environment. The Life Sciences Data Archive is a publicly accessible active archive of data from spaceflight, flight-analog and ground-based life sciences research investigations. In addition, Earth-based human space simulation research [13, 14], which is more affordable and accessible than space missions [6], will continue to produce more experimental data. How to leverage these accumulated data resources to reveal more comprehensive patterns of biological adaptation is the current challenge, making it important to manage and integrate data across multiple platforms and then to analyze and interpret them to achieve biological understanding and provide countermeasures.

In this review, we summarized the multi-level adaptive changes that occur in the living systems in response to space environment factors and their intrinsic connections, including molecular, cellular and systemic changes at the physiological level, as well as psychological outcomes. We also revealed many unknown parts that remain to be complemented. Furthermore, we compiled accessible metrics at each level, including omics and phenotypic data, and outlined common challenges in data analysis. Accordingly, we proposed some optional bioinformatic strategies and assessed related models/tools to provide a reference framework for the analysis of space biological data (Figure 1).

Biological adaptations to space environment factors

The effects of environmental factors such as radiation, microgravity, confinement/isolation and distance from Earth on organisms have now been explored in a number of ways, including single-factor and multi-factor studies [6, 9, 15]. We summarized the biological effects of these widely investigated environmental factors in space. In addition to causing the global environmental shift, ‘the distance from Earth’ has been singled out for its significant impact on human psychology. These studies have shown that these factors may cause both psychological effects, such as increased stress and mood disturbances, and physical health problems, such as altered musculoskeletal structure and function, sensory-motor impairment and cardiovascular dysfunction [7, 8]. Moreover, the combined effects of these factors differ from their individual effects and require further investigation. The numerous findings related to these effects need to be systematically sorted out to reveal their intrinsic connections. Thus, we summarized the multi-level adaptive changes in living systems in response to space environment factors and their intrinsic connections (Figure 2).

Figure 2

A collection of biological adaptations at various levels, including molecular features, cellular responses and systemic changes. The white arrows and lines represent cause–effect relationships or potential associations between different changes. CNS, central nervous system.

Biological effects of microgravity

All life has evolved to form its present organismal structure under constant gravity on Earth. In the microgravity environment of space, the balance between cellular structure and external forces is disturbed, leading to extensive changes at the cellular and subcellular levels [16]. Studies on mice after spaceflight found significantly altered gene expression. For example, Gridley et al. [17] reported that the expression of apoptosis-related genes, as well as genes involved in extracellular matrix proteins and stem cell signaling proteins, was significantly altered in mouse lung cells. In addition, Hammond et al. [15] reported that the expression of genes involved in apoptosis and cell death was significantly upregulated in mouse kidneys and liver. Microgravity has different impacts on apoptosis in different cell types, mediated by different signal transduction processes [18]. Changes in signal transduction in microgravity-induced apoptosis have led to new insights into the underlying regulatory mechanisms of apoptosis and have pointed cancer researchers to a new direction for cancer therapy. In most (but not all) tumor cell lines, microgravity has been shown to trigger apoptosis [19, 20]. However, under some circumstances, apoptosis of some cancer cells can be reduced in microgravity environments [21–23]. Overall, the mechanisms and outcomes of microgravity affecting different tumor cell types vary and need to be further investigated.

In addition to the increased probability of apoptosis, cellular changes under microgravity exposure include differentiation, adhesion, migration and proliferation. By promoting apoptosis or other changes in various cell types, microgravity may affect multiple physiological systems in astronauts, including the musculoskeletal system [24], the cardiovascular system [25], the immune system [26], the digestive system [27] and the central nervous system [28, 29]. It has also been linked to eye problems (e.g. cataracts) after space missions [30, 31]. In conclusion, microgravity requires more investigation, given the significant effects on numerous aspects of living systems.

Biological effects of radiation

Radiation exposure in spaceflight poses a major potential risk to astronauts’ health in the long run. The main effect of radiation exposure is the damage to DNA, including base damage, single-strand breaks (SSBs), double-strand breaks (DSBs), chromosomal aberrations, micronuclei and genomic instability [32]. While SSBs can be repaired by excision repair [33], DSBs involve a more complex repair process. The repair process may be subject to misrepair, further causing cell cycle arrest, cell death, mutations and chromosomal rearrangements [34, 35]. The cellular responses to DNA damage differ depending on the cell type, cell cycle stage and degree of damage [32]. Damage at varying levels in cell types causes multi-system damage, including the central nervous system, musculoskeletal system, cardiovascular system [36, 37] and immune system [38]. The carcinogenic risk of space radiation is also a major health concern for astronauts because ionizing radiation-induced genomic instability is a driving factor for radiogenic carcinogenesis [39, 40]. The degree of carcinogenic risk varies by tissue type, radiation type and age at exposure. Single particle responses have been examined more widely, whereas the impacts of mixed radiation types are less clear and lack appropriate study support. Moreover, since outer space radiation occurs in a microgravity environment, it is unknown if clustered DNA damage occurs and is repaired under their dual action.

Combined effects of multiple space environment factors

Biological effects in outer space are responses of organisms when they are exposed to multiple space environment factors simultaneously, while most studies only examine the effects of individual factors in a static environment. To fully comprehend the biological effects in space, it is necessary to accurately assess the combined effects of multiple factors. The performance of cells [41] and mouse models [9] exposed to radiation and microgravity simultaneously revealed that the dual effect posed a greater health risk than radiation alone. According to Xu et al. [9], heavy ion radiation-induced human B lymphocyte apoptosis increased in microgravity. We compiled a list of biological responses resulting from the combined effects of all environment factors in space, including the physiological changes and the psychological consequences.

Oxidative stress and redox imbalance are typical molecular features of spaceflight, induced by radiation and microgravity, which may also trigger DNA damage. And DNA damage is often correlated to apoptosis when there are defects in the DNA repair system [42]. At the physiological level, oxidative stress and redox imbalance lead to dysregulation of the cardiovascular, immune, neurological and metabolic systems. Additionally, oxidative stress is closely associated with mitochondrial dysfunction. Mitochondrial dysfunction is characterized by a reduction in the expression of the mitochondrial oxidative phosphorylation (OXPHOS) gene encoded by nuclear DNA. Moreover, oxidative stress can induce epigenetic changes through chromatin relaxation and thus regulate gene expression. Dynamic alterations in telomere length have also been observed during spaceflight, which has been linked to age-related disorders including dementia, cardiovascular disease and cancer, all of which have the potential to influence astronaut health and performance during and after long-term missions [6]. The space environment can also cause a shift in the microbiome [43, 44]. Interactions between the microbiome and the host affect key human physiological processes, including inflammatory responses, metabolic functions, hormone levels, disease susceptibility and pathogenesis [45]. The gut microbiome, for example, is implicated in the pathogenesis of numerous digestive diseases [46].

Aside from the major molecular features listed above, a number of functional pathways have been linked to spaceflight health. The NF-κB pathway, for example, has been linked to recognized spaceflight-related health hazards such as immunological dysfunction, bone loss, muscle atrophy, central nervous system dysfunction and space radiation dangers [47]. Accordingly, we suggest that the space environment induces a wide range of adaptive changes at the molecular level, many of which remain to be discovered. Research on the combined effects of multiple space environment factors is also still at a preliminary stage, and more multi-factor studies are needed.

Biological effects of confinement and isolation

In long-term confined/isolated environments, such as the Mars-500 mission [48] and the 180-day controlled ecological life support system (CELSS) experiment [14], many aspects of human health may be affected, including mental–emotional disturbances [49, 50], reduced muscle activity [13], changes in immune responses [51], gut microbiota [52] and metabolism [44]. In addition, mood disorders such as anxiety brought on by long-term isolation are associated with abnormal bone metabolism [53]. Confinement/isolation also disrupts circadian rhythms [54], the disruption of which may affect mood, cognition and performance [55] and further lead to additional health disturbances.

Furthermore, prolonged isolation could trigger psychological stress, which might shift biological vulnerability to radiation. In studies in which mice were subjected to both psychological stress and low linear energy transfer radiation, stress increased the radiation susceptibility of bone marrow in some susceptible animals, but it did not affect hematological toxicity or genotoxicity in wild-type mice [32]. The mechanisms by which psychological stress modulates radiation susceptibility have not yet been elucidated. Hence, more experiments are needed to produce additional data for further research.

Advances in technology will enable exploration at greater distances from Earth, where the capacity to deal with medical and surgical events will be limited, thus endangering the safety of astronauts. As exploration missions move further from Earth, the crew may experience communication delays; a Mars mission could involve communication delays of up to 20 min with Earth. There will also be many unknown environmental factors, such as higher doses of radiation and changes in the light and dark cycles [56]. Astronauts are therefore likely to be more stressed as they travel further away from Earth, although the exact impact remains to be established by relevant studies.

Multi-level measurements/data types

Multifaceted experiments, including neuroimaging, electrophysiology, biochemistry, systems biology and clinical questionnaires, have been conducted to explore the effect of space environment factors on organismal health, producing large amounts of high-dimensional data. These data include but are not limited to the following: multi-omics measurements at the molecular level (epigenomics, transcriptomics, proteomics, metabolomics, microbiomics, etc.), measurements at the systems level (biochemical index data, image data, electrophysiological data) and measurements at the psychological level (stress surveys, mood). In this review, we have compiled various measurements and the biological issues they can reflect (Figure 3), which will assist in designing experiments to dissect the biological effects in space.

Figure 3

Multi-level measurements/data types that can be used to investigate biological effects. Multi-level measurements include multi-omics measurements at the molecular level, phenotype (Pheno) measurements at the system level and measurements of psychological (Psycho) impact. Multi-level measurements will provide a more comprehensive understanding of biological adaptations in the space environment.

Multi-omics measurement

Whether in spaceflight simulations or actual spaceflight experiments, space biologists around the world are increasingly reliant on omics approaches due to their ability to maximize the knowledge gained from rare spaceflight experiments [57]. We reviewed common omics measurements used in space biology, focusing on the biological issues they can reflect and the available detection platforms.

Epigenomics

Epigenomics can be used to detect space environment-induced modifications at the DNA or RNA level, such as DNA methylation, histone acetylation and RNA methylation. Such modifications perform critical regulatory roles in gene transcription and subsequent cellular functions [58]. They can also be used as biomarkers. For example, one of the earliest events in the DSB damage response is the phosphorylation of histone H2AX to produce γ-H2AX, which can be used as a sensitive tool for detecting DSBs [59–61]. Space environment factors can trigger alterations in cell fate by changing these modifications, which are sometimes reversible and sometimes permanent [62]. Related techniques for quantifying epigenetic changes include next-generation sequencing (NGS) and the EPIC array [63].

Transcriptomics

Transcriptomics examines genome-wide changes in RNA levels caused by the space environment. Up to 80% of the genome is transcribed to produce RNA, including both coding and non-coding RNA [64]. RNA-Seq studies enable the discovery of RNA molecules with critical roles in many physiological adaptations [65, 66] and their potential use as biomarkers or therapeutic targets. Related techniques include probe-based arrays [67, 68] and RNA-Seq [69, 70]. Furthermore, nanopore sequencing technology is quickly improving in terms of accuracy. It can be used to sequence single DNA and RNA molecules, with extra-long read lengths and high throughput [71, 72]. Instrument mass and volume, crew operating time and instrument functioning are all restricted in space. Nanopore sequencing techniques are more portable and have simpler sample preparation processes, suggesting that they might be used to perform DNA sequencing during space flights to closely monitor crew health in the future [73].

Proteomics

Proteomics allows quantification of peptide abundance, modifications and interactions. These measurements reflect functional changes at the cellular level and can thus be linked to changes at the systemic level. Mass spectrometry (MS)-based approaches are commonly used for protein analysis and quantification [74]. Protein modifications such as glycosylation, phosphorylation and ubiquitination [75–77] can also be measured directly by MS by comparing the corresponding changes in protein mass before and after the modification [78]. Protein interactions can be discovered using unbiased approaches (e.g. MS, yeast two-hybrid assays) or affinity purification methods (using antibodies or genetic tags). Affinity methods can also examine overall interactions between proteins and nucleic acids (e.g. ChIP-Seq).

Metabolomics

Metabolomics simultaneously quantifies multiple small molecule metabolic function products in cells, including amino acids, fatty acids, carbohydrates and other small molecules. Metabolite levels and relative ratios reflect metabolic functions, and deviations from the normal range are usually associated with diseases. Small molecule abundance can be quantified using MS-based methods [79–82].

Microbiome

The space environment, irregular diet and disrupted circadian rhythms may lead to changes in the ecosystem of the microbiome [83, 84], including the environmental microbiome [85], the skin microbiome, the oral microbiome [86] and the gut microbiome. The microbiome can be analyzed by amplifying and sequencing certain highly variable regions of bacterial 16S rRNA genes, or by shotgun metagenomic sequencing, which sequences total DNA. Several analytical tools for NGS data targeting 16S or metagenomic analysis have been developed, such as QIIME (Quantitative Insights into Microbial Ecology) [87], which can be used to identify taxa associated with diseases or other phenotypes of interest [88].

Phenotype measurements

Phenotypes are the observable characteristics or traits of an organism and can provide valuable explanations for the consequences of living system responses to space environments. Phenotypic data can be used to link genotype and phenotype. Phenomics is a field that deals with high-dimensional phenotypic data at the organismal scale and is an important complement to genomics. Current phenotyping throughput is low, and technological advances that reduce costs could increase it [89]. For space response assessments, we compiled commonly used multi-system phenotypic metrics.

Skeletal muscular system measurements

This includes both bone and skeletal muscle. Bone strength is reflected by measuring bone mineral density or bone mineral content [90], and changes in bone mass are interpreted using markers of bone status assessment (such as osteocalcin, OC; procollagen type I N-terminal propeptide, P1NP; procollagen type I C-terminal propeptide, P1CP; bone alkaline phosphatase, BAP; calcitonin, CT; osteoprotegerin, OPG; tartrate resistant acid phosphatase, TRAP). Changes in skeletal muscle mass, function and muscle fibers are assessed by measuring the maximum voluntary isometric contraction of the calf (mainly type I fibers) and the maximum voluntary isometric force of the quadriceps/hamstrings (mainly type II fibers) [13]. Reliable non-invasive measurements of muscle properties, such as muscle fiber type composition, muscle fiber size and cross-sectional area, can be performed using surface electromyography [91].

Cardiovascular system measurements

Cardiovascular function is reflected by measuring heart rate variability (HRV), cardiac and macrovascular morphology and function, and endothelial status [92]. HRV is recorded using a 24-h electrocardiogram (EKG), and autonomic activity is assessed by time- and frequency-domain indices of HRV analysis [93]. Left ventricular diastolic volume, stroke volume, cardiac output, aortic velocity and myocardial thickness are estimated to characterize cardiac morphology and function. Carotid intima-media thickness, carotid artery distensibility and portal vein diameter are estimated to characterize the morphology and function of the great vessels.
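As a hedged illustration of the time-domain HRV indices mentioned above, the following Python sketch computes mean heart rate and two widely used indices, SDNN (standard deviation of RR intervals) and RMSSD (root mean square of successive differences), from a list of RR intervals. The function name and the input values are hypothetical and are not taken from the cited studies; frequency-domain indices would additionally require spectral estimation, which is omitted here.

```python
import math
import statistics


def hrv_time_domain(rr_ms):
    """Return (mean_hr_bpm, sdnn_ms, rmssd_ms) for RR intervals given in milliseconds.

    Illustrative helper only; real recordings need artifact and ectopic-beat
    removal before these indices are meaningful.
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")
    mean_rr = statistics.fmean(rr_ms)
    mean_hr = 60000.0 / mean_rr  # beats per minute
    sdnn = statistics.stdev(rr_ms)  # sample standard deviation of the intervals
    # Differences between each interval and the one before it
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(statistics.fmean([d * d for d in diffs]))
    return mean_hr, sdnn, rmssd


# Hypothetical RR intervals (ms) for demonstration
hr, sdnn, rmssd = hrv_time_domain([800, 810, 790, 800])
print(round(hr, 1), round(sdnn, 2), round(rmssd, 2))
```

SDNN summarizes overall variability, whereas RMSSD emphasizes beat-to-beat changes, so the two are usually reported together.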

Immune system measurements

Immune cells, cytokines, chemokines, proinflammatory and regulatory proteins are all involved in immune regulation and induction of inflammation in the body. Absolute leukocyte counts and percentages of each type of leukocyte are measured in whole blood samples by a hematology analyzer, and peripheral blood immunophenotyping is performed by flow cytometry [51].

Measurement of brain change

Numerous studies have revealed that spaceflight influences the brain’s macrostructure as well as the microstructure and connectivity of brain tissue. Of these, the integrity of the central nervous system and the brain is the primary concern [94]. Cortical activity before and after exercise is recorded using electroencephalography (EEG) [95]. Neuronal and especially axonal integrity is assessed using diffusion tensor imaging [96]. Non-invasive ultrasound and lumbar puncture are used to assess intracranial pressure [29]. The cognitive abilities (Wechsler Memory Scale), visuospatial working memory (Corsi Cubes test) and spatial reasoning (Kohs Cubes test) of subjects are also measured [97].

Sleep–wake cycle measurements

The duration of active arousal, sleep or wakeful rest is recorded using a wrist activity recorder [54]. Drowsiness and alertness are assessed using the Karolinska Sleepiness Scale and the Brief Psychomotor Vigilance Test.

Investigation of psychological impact

Stress levels

A study examining the relationship between stress and simulated flight performance assessed changes in stress awareness using the Stress Rating Questionnaire and evaluated crews’ acute psychological stress state using heart rate and HRV [98]. In a Mars 105-day isolation experiment, stress levels were evaluated by tonic cortisol levels, which were measured using urinary free cortisol test-kit DKO018 Lot 1730 from DIAMETRA, Milan, Italy and the Perceived Stress Scale questionnaire. These researchers also recorded sleep EEG to investigate the relationships between stress and sleep during isolation [97].

Emotional state

Subjects’ emotional state is usually measured in the form of questionnaires and can be reflected by some hormone levels [50]. During the Mars 520-day mission, crewmembers completed a series of psychological measures including the Social Desirability Scale 17, Visual Analog Scales, Profile of Mood States—Short Form, Beck Depression Inventory and Conflict Questionnaire, which described the crews’ subjective ratings of mood, psychological distress, health, stress, fatigue, sleep quality and workload [99]. In addition, levels of four plasma hormones, cortisol, 5-hydroxytryptamine, dopamine and norepinephrine were also collected and analyzed [49]. A test run with 105 days of isolation was performed prior to 520 days of isolation, and mood assessments were made using MoodMeter®, which included three dimensions: perceived physical state (PEPS), psychological state (PSYCHO) and motivational state (MOT). Meanwhile, EEG data were recorded and correlation analysis revealed a significant relationship between mood data and electrocortical activity [50].

Challenges in space biological effect data analysis

We highlighted the common characteristics of data generated from longitudinal experiments, which are also the major challenges faced in data analysis. For individual variables (e.g. the expression values of a gene at different time points), we considered the time-series properties arising from the environmental adaptation experimental design, as well as the range and trend of fluctuations. We believe that the fluctuation pattern of a time series can reflect the process of biological adaptation to the environment. In addition, several other obstacles must be overcome in space life science data analysis, including but not limited to complex influencing factors, small sample size, high dimensionality, data heterogeneity and asynchronously changing features.

Limited experimental subjects

Owing to the extraordinary expense of space launch payload delivery systems and the limitations of orbital platform capacity, the number of experimental replicates and variables in spaceflight is very limited. Although relatively inexpensive Earth-based experiments provide support, scientific evidence is still restricted by the limited number of experimental subjects. Small replicate numbers constrain statistical power, in which case the impact of interindividual variability on statistical outcomes must be carefully evaluated. More experiments, in space or on Earth, are therefore needed to advance the field. Notably, each individual is usually sampled at multiple time points for various measurements in environmental adaptation experiments.

Characteristics of individual variables produced by longitudinal studies

Time-series experiments

To detect adaptive changes in living systems due to the space environment, multi-level performance is usually tracked and measured before, during and after spaceflight, as in the Mars-500 mission [48], the 180-day CELSS experiment [14] and the NASA twin study [11]. The resulting data are time series rather than the product of the common case/control experimental design. Time-series experiments sample the same individual at different times and obtain multiple samples with strong autocorrelation between the measured values; more specifically, the values measured at a given time are correlated with the values measured over the previous period. In contrast, static experiments assume that multiple samples are measured simultaneously and that the resulting values are independent. As a result, conventional statistical analysis tools established for static data are not directly applicable to time-series data in operations such as difference analysis, clustering analysis and missing value filling. More specialized analysis procedures need to be built or introduced.
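The autocorrelation property described above can be checked directly. The following Python sketch computes the sample autocorrelation of a single variable at a given lag; the function name and the example series are hypothetical, and equally spaced time points are assumed.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of an equally spaced series at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Covariance between the series and a copy of itself shifted by `lag`
    num = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Hypothetical measurements of one variable at consecutive time points
print(autocorrelation([1, 2, 3, 4, 5], lag=1))  # steadily drifting series
print(autocorrelation([1, -1, 1, -1], lag=1))   # alternating series
```

A value far from zero indicates that consecutive measurements are not independent, which is the situation in which tests designed for independent samples can be misleading. With the few time points typical of spaceflight studies, the estimate itself is noisy and should be read with caution.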

Changing trends within the normal range

The majority of biological adaptations induced by the space environment do not necessarily progress to a pathological state in a short period, but rather show a pattern of progressive changes within the normal range [11]. However, these changes are still notable, given that these changes may break the threshold of the normal level in longer stays of future space travel missions [100]. On the other hand, effects within the normal range, although not pathological, can still cause stress in the body and thus increase the risk of pathogenesis [101].
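To make the idea of a within-range trend concrete, the following Python sketch fits a least-squares line to repeated measurements and projects when the fitted line would reach the upper limit of a reference range. It assumes, purely for illustration, that the drift is approximately linear and continues unchanged; the function names, reference limit and values are hypothetical.

```python
def linear_trend(times, values):
    """Ordinary least-squares slope and intercept of values against times."""
    n = len(times)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def time_to_upper_limit(times, values, upper_limit):
    """Time at which the fitted line reaches upper_limit, or None if not rising."""
    slope, intercept = linear_trend(times, values)
    if slope <= 0:
        return None
    return (upper_limit - intercept) / slope


# Hypothetical marker measured on mission days 0, 30, 60 and 90,
# with an assumed upper reference limit of 15 (arbitrary units)
print(time_to_upper_limit([0, 30, 60, 90], [10, 11, 12, 13], upper_limit=15))
```

Such an extrapolation is only a screening heuristic: adaptation may plateau or accelerate, so the projected crossing time should be treated as a prompt for closer monitoring rather than as a prediction.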

Overall characteristics of the datasets derived from multiple measurements

Comprehensive and multidimensional data types

Because biological function requires synergistic control at multiple levels, measurements of different systems at multiple levels yield multidimensional data, ranging from molecular to systemic. The Mars-500 mission [48], for example, measured not only multiple omics data (e.g. epigenomics, transcriptomics) but also various biochemical indexes (e.g. cortisol levels), as well as psychological assessments (e.g. mood), with a variety of data types (discrete, continuous). Organizing the multi-level datasets and extracting the interactions between them is one of the challenges of such large studies.

Asynchronous changes in different features

Changes in organismal systems do not always happen simultaneously, and even alterations of two genes with regulatory links are not completely synchronized. Analyzing cascading changes at different levels might provide more information on causal associations. Therefore, when evaluating the connection between distinct features, the issue of time delay should be considered.
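One simple way to account for time delay is to correlate one feature with a shifted copy of another and pick the shift with the strongest correlation. The following Python sketch does this with Pearson correlation; it is an illustrative assumption rather than a method prescribed by the studies cited here, and the function names and data are hypothetical. Equally spaced time points are assumed.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(leader, follower, max_lag):
    """Return (lag, r) with the largest |r| between leader[t] and follower[t + lag]."""
    if len(leader) != len(follower):
        raise ValueError("series must have the same length")
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = leader[:len(leader) - lag]
        y = follower[lag:]
        if len(x) < 3:  # too few overlapping points to correlate
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical regulator and a target that repeats its pattern two time points later
regulator = [0, 1, 0, 2, 5, 1, 0, 3, 1, 0]
target = [0, 0, 0, 1, 0, 2, 5, 1, 0, 3]
print(best_lag(regulator, target, max_lag=4))
```

Because each additional lag shortens the overlapping window, the usable range of lags is small for short series, and a high correlation at some lag suggests, but does not establish, a regulatory relationship.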

Proposed research directions and methods

To provide solutions to the main challenges faced in longitudinal space experiment data analysis, we have compiled relevant bioinformatic strategies as well as available models/tools for data analysis. Where sample sizes are limited, it should first be determined whether individual differences mask environmental effects and therefore need to be pre-treated (Figure 4). Analyses can then be conducted, including forecasting and difference analysis methods for univariate time series, the integration and classification of multiple variables and the identification of their regulatory relationships. All of these analysis methods take into account the temporal attributes of variables from longitudinal space experiments.

Figure 4

A suggested process for mining both general pattern and individual characteristics.

Mining both general pattern and individual characteristics

Mining the general adaptation pattern of experimental subjects

It is generally accepted that a larger sample size is more beneficial for mining commonalities between samples. However, due to the specificity of the spaceflight environment, the number of subjects that can participate in an experiment is limited, and each individual may be sampled several times during the process. In this case, the presence of individual differences must be carefully evaluated. We offer a perspective here: treating samples from different individuals as separate batches. The degree of interindividual differences can then be assessed like batch effects, and if significant individual differences do exist, batch effect removal methods can be used to eliminate them (Figure 4). We have compiled a list of common approaches for evaluating and correcting batch effects (Figure 5).

Figure 5

Challenges in biological effect measurement data analysis and proposed solutions, including forecasting and difference analysis methods for univariate time series, the integration and classification for multiple variables, and the evaluation of whether individual differences mask environmental effects and should be preprocessed.

Principal Variance Component Analysis (PVCA) [102] and Uniform Manifold Approximation and Projection (UMAP) [103] can be used to evaluate and visualize batch effects. A commonly used approach to removing batch effects from gene expression data is empirical Bayes adjustment; the ComBat method, which is based on it, is more effective for small sample data [104, 105]. It can be implemented using the ComBat function of the sva package [106] in R. The Removing Unwanted Variation approach, which relies on negative control genes and duplicate samples to remove unwanted variance from microarray gene expression data, is more suitable for large-scale datasets [107]. BatchServer is a web server that includes autoComBat, a modified version of ComBat, as well as PVCA and UMAP, which can be used to evaluate, visualize and correct batch effects [108].

Without batch-effect correction, batch-correlated variation may skew the analysis in two ways: false positives and false negatives. With batch-effect correction, the results may be skewed by the way the batch effects are removed, e.g. the batch-group design and the completeness and appropriateness of the batch-effect removal. In a multi-category sample analysis, variation across samples can come from a variety of causes, but we are only interested in differences that result from experimental factors. If additional non-experimental factors cause significant batch effects, we may be unable to isolate the differences of interest. In these situations, removing batch effects properly is helpful, whereas excessive batch-effect correction may make slight differences significant, leading to false positive results. Therefore, repeated tests are needed to determine whether it is appropriate to remove the batch effect, and the degree of correction achieved by different methods should be compared to choose the most suitable one.
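The perspective of treating each subject as a batch can be illustrated with a deliberately simplified, location-only adjustment: for one feature, subtract each subject's own mean and add back the grand mean. This is not ComBat (there is no empirical Bayes shrinkage and no scale adjustment) and it does not reproduce the sva interface; the function name and data below are hypothetical.

```python
from collections import defaultdict


def center_by_subject(values, subjects):
    """Remove per-subject mean offsets from one feature, preserving the grand mean.

    values   : measurements of a single feature across all samples
    subjects : subject label for each sample (treated as the batch)
    """
    if len(values) != len(subjects):
        raise ValueError("values and subjects must have the same length")
    grand_mean = sum(values) / len(values)
    groups = defaultdict(list)
    for v, s in zip(values, subjects):
        groups[s].append(v)
    subject_mean = {s: sum(vs) / len(vs) for s, vs in groups.items()}
    return [v - subject_mean[s] + grand_mean for v, s in zip(values, subjects)]


# Two hypothetical subjects sampled at three time points each; subject B has a
# baseline about 10 units higher, but both rise by 1 unit per time point
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
subjects = ["A", "A", "A", "B", "B", "B"]
print(center_by_subject(values, subjects))
```

After centering, the shared upward trend is visible in both subjects. The same example shows the risk noted above: this kind of adjustment is only safe when every subject is measured under all conditions of interest, because otherwise the condition effect is removed together with the subject offset.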

Explore individual adaptation patterns for each subject

Gaining insight into general patterns is of great importance; depicting individual characteristics also matters, as health assessment and early warning will be highly personalized during spaceflight. To address this issue, both data accumulation and methods development will be crucial.

On the one hand, with the accumulation of spaceflight data, there will be a sufficient amount of cohort data as reference, it will be more easily and more directly to exact individual characteristics from general pattern of spaceflight cohort; thus, the limits of small sample size will be overcome eventually. However, it puts forward higher demands for the experiment design and data type consistency throughout sequential spaceflight missions. On the other hand, analysis methods aiming to model with insufficient data would help. In each specific analysis step, we mentioned some of the analysis methods applicable to small sample size data (Figure 5).

Univariate analysis method

Forecasting methods on time-series biological data

Although many tools for analyzing biological datasets with time-series properties are available, irregularities in input data from space experiments, such as missing values [109], unequal time intervals and an unequal number of time points across features [110], often lead to inaccurate clustering results. Time-series forecasting methods may help with these problems. Forecasting can estimate values for missing data points and predict the behavior of specific genes at future time points for which experimental values are not available.
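As a simple baseline for the irregularities described above, a series with missing values and unequal intervals can be mapped onto a common time grid by piecewise-linear interpolation. This is a much simpler technique than the forecasting models discussed in this section, and it does not extrapolate beyond the observed range; the function name `interpolate`, the sampling days and the values are hypothetical, and sampling times are assumed to be distinct.

```python
def interpolate(times, values, query_times):
    """Piecewise-linear interpolation on an irregular time grid.

    Missing observations are given as None and skipped.
    Query times outside the observed range return None (no extrapolation).
    """
    observed = sorted((t, v) for t, v in zip(times, values) if v is not None)
    if len(observed) < 2:
        raise ValueError("need at least two observed time points")
    result = []
    for q in query_times:
        if q < observed[0][0] or q > observed[-1][0]:
            result.append(None)
            continue
        # Find the pair of neighbouring observations that brackets q.
        for (t0, v0), (t1, v1) in zip(observed, observed[1:]):
            if t0 <= q <= t1:
                result.append(v0 + (v1 - v0) * (q - t0) / (t1 - t0))
                break
    return result


# Hypothetical sampling days with one missing measurement (day 14).
days = [0, 14, 45, 90]
levels = [2.0, None, 5.0, 8.0]
regular = interpolate(days, levels, [0, 30, 60, 90])
```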

Few studies are dedicated to the prediction of time-series gene expression data, but many statistical and machine learning-based methods have been developed for time-series forecasting in other fields. ARIMA (autoregressive integrated moving average) [111] and Holt–Winters (triple exponential smoothing) [112] are two of the most popular and widely used statistical forecasting methods. ARIMA combines an autoregressive model, a moving average model and differencing to describe the autocorrelation in historical data and predict the future. It assumes that the future will repeat the historical trend, which requires the time series to be stationary after differencing [111]. The Holt–Winters model is suitable for non-stationary time series containing linear trends and periodic fluctuations; it uses exponentially weighted moving averages so that the model parameters gradually adapt to changes in the non-stationary series [112].

In addition, there are time-series forecasting models based on deep learning. For example, Gluon Time Series (GluonTS), developed by Alexandrov et al. [113], is a toolkit for probabilistic time-series modeling that focuses on deep learning-based models, including generative, discriminative and autoregressive models. Long Short-Term Memory (LSTM) is a recurrent neural network architecture with the advantage of being relatively insensitive to gap length. Tripto et al. [114] evaluated Holt–Winters, ARIMA, LSTM, Artificial Neural Network (ANN) and GluonTS feedforward neural networks for forecasting five temporal gene expression datasets of different sizes, and found that ARIMA and ANN performed better.
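The exponential smoothing idea can be sketched with Holt's linear-trend method, which keeps a smoothed level and a smoothed trend. This is a simplification of Holt–Winters: the seasonal component is omitted, and the smoothing constants `alpha` and `beta` as well as the toy series are arbitrary assumptions rather than recommended settings.

```python
def holt_forecast(series, alpha=0.5, beta=0.3, horizon=2):
    """Holt's linear-trend exponential smoothing (no seasonal term).

    alpha smooths the level, beta smooths the trend.
    Returns forecasts for the next `horizon` time steps.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # New level: blend the observation with the previous one-step forecast.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # New trend: blend the latest level change with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


# Hypothetical, perfectly linear expression trajectory over five time points.
forecast = holt_forecast([1.0, 2.0, 3.0, 4.0, 5.0], horizon=2)
```

For a perfectly linear input the method simply continues the line; on real, short and noisy series the forecasts depend strongly on the smoothing constants, so they should be tuned or validated rather than taken from this sketch.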

Differential expression analysis of time-series data

Since the values of time-series data from longitudinal space experiments are probably not independent of each other, commonly used differential expression analysis methods such as t-tests are no longer applicable, and tools dedicated to differential expression analysis of time-series data have therefore been developed.

maSigPro (significant gene expression profile differences in time course microarray data) [115] is an R package for analyzing time-series data that supports experiments with a single time series as well as complex designs with both time series and grouping. It fits the relationship between gene expression and factors such as time and experimental condition with a multiple linear regression model and then uses stepwise regression to find the best combination of independent variables. It can identify genes with statistically significant expression changes and cluster genes whose expression changes significantly over time. ImpulseDE2 [116] is another differential expression algorithm for time-course sequencing experiments; it models temporal changes with a simple continuous function, the impulse (single-pulse) model. ImpulseDE2 employs a noise model specific to count data from multiple batches and combines it with a likelihood ratio test, leading to faster and more accurate inference. In some comparative reviews, it performed best at identifying differentially expressed genes in time-course data [117]. In addition, the R package limma [118], which uses linear models to determine the size and direction of changes in gene expression, is widely used in differential expression analysis. By borrowing information across genes, it keeps analyses stable even for experiments with a small number of samples. It can also handle time-series data with group information.

In summary, maSigPro is suitable when samples are grouped (e.g. from male and female astronauts). ImpulseDE2 may perform better when grouping is not considered, while limma provides a further option. In practice, given the poor robustness caused by small sample sizes, comparing results across different kinds of methods could help.

Multivariate integration analysis method

Organization of multidimensional data types

Available sequencing technologies and computational methods make it possible to measure a wide range of analytes from the molecular to the macroscopic level. For example, at the cellular level, lymphocytes can be counted directly by cell sorting, and the proportions of various lymphocyte types can also be estimated computationally from tissue transcriptome sequencing data. A commonly used tool is CIBERSORT [119], which estimates the relative abundance of multiple immune cell types with a deconvolution algorithm. Ultimately, direct measurements or estimates can be obtained at multiple levels, including the molecular, cellular, tissue-organ and system levels. The challenge is to organize these datasets so that complex changes at the various levels can be resolved.

Data combination and scaling

One important reason why measurements of different analytes cannot be analyzed together directly is that they differ in magnitude. In the NASA twin study [11], different data types were combined and scaled for subsequent analysis in order to identify complex changes over time across analyte classes. In fact, time-series analysis usually focuses on trends of change rather than on specific measurements, so combining features from different levels after removing the differences in magnitude makes it easier to observe them together. A simple operation is to normalize the data, which changes only the range of the values without altering the shape of their distribution.
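A minimal sketch of such scaling is the z-score transformation applied to each feature separately, shown here with two hypothetical features that follow the same trend at very different magnitudes. The feature names and values are invented for illustration; other scalings (e.g. min-max) would serve the same purpose.

```python
from statistics import mean, pstdev


def zscore(series):
    """Scale one feature to zero mean and unit variance across its time points."""
    mu = mean(series)
    sd = pstdev(series)
    if sd == 0:
        return [0.0 for _ in series]  # a constant feature carries no trend
    return [(x - mu) / sd for x in series]


# Hypothetical features measured at the same four time points.
features = {
    "gene_expression_tpm": [120.0, 150.0, 180.0, 210.0],
    "cytokine_pg_per_ml": [1.2, 1.5, 1.8, 2.1],
}
scaled = {name: zscore(values) for name, values in features.items()}
```

After scaling, the two trajectories become numerically comparable, so features from different analyte classes can be clustered by their pattern of change rather than by their units.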

Clustering time-series biological data

Static experiments typically focus on common patterns among samples, and the most common analysis is to cluster samples based on their profiles. Time-series experiments instead focus on patterns that change over time, which requires clustering numerous time-series features. In the NASA twin study [11], c-means clustering was performed to identify features with the same pattern of change. Several tools designed for clustering multiple time series according to their patterns of variation are already available and can be used to analyze multi-omics data or other biological datasets with time-series properties. A few commonly used tools are listed as follows:

The R package Mfuzz (http://mfuzz.sysbiolab.eu) [120] is a clustering tool based on fuzzy c-means clustering, a soft clustering algorithm that tolerates noise better than hard clustering algorithms. It can be used to analyze transcriptomic and proteomic data with time-series properties, to obtain temporal trends of gene or protein expression and to cluster genes or proteins with similar expression patterns. The TCseq package has functions similar to Mfuzz and offers more options for time-series clustering, including fuzzy c-means, hierarchical and k-means clustering. Short Time-series Expression Miner (STEM) [121], a commonly used tool for clustering temporal expression patterns, is a Java program that can cluster, compare and visualize gene expression data from short time series (typically up to eight time points). STEM uses its own clustering algorithm: a distinct and representative set of temporal expression profiles (model profiles) is selected first, and each gene is then assigned to the model profile closest to its expression profile [122]. STEM can also perform functional enrichment analysis on gene sets with the same temporal expression pattern.

In addition to the above tools, machine learning algorithms can also be considered. Few-shot learning builds a model from a limited number of samples; its key step is to reduce the parameter dimension and combine regularization with the loss function to counter overfitting. It can be performed with tools such as Torchmeta [123], Meta-Transfer Learning for Few-Shot Learning [124] and LibFewShot [125]. Transfer learning reuses a model pre-trained on a different but related task. It has developed rapidly in deep learning because it allows training with much less data, which fits the spaceflight scenario well. Transfer learning methods can be used for time-series classification [126] and have been applied to a variety of biological problems, including but not limited to medical image analysis [127], drug discovery [128], cancer morbidity prediction [129] and cancer classification [130].

The above tools visualize the multidimensional features after clustering, which helps in understanding the dynamic patterns of these biological molecules over time. Based on the resulting clusters, interesting sets of genes or other features can be identified from the plots, such as clusters of genes that show an expected increasing or decreasing trend over time or a clear inflection at a certain time point.
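The soft-clustering principle behind fuzzy c-means can be illustrated with a minimal standard-library sketch. It is not the Mfuzz or TCseq implementation: the deterministic initialization, the fuzzifier `m = 2`, the fixed number of iterations and the six standardized toy profiles are all assumptions chosen to keep the example short.

```python
def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=50):
    """Minimal fuzzy c-means. Returns (centers, memberships).

    memberships[i][k] is the degree (0..1) to which profile i belongs to cluster k;
    each row sums to 1.
    """
    # Deterministic start: take evenly spaced input profiles as initial centers.
    step = max(1, len(points) // n_clusters)
    centers = [list(points[k * step]) for k in range(n_clusters)]
    n_dims = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5 for c in centers]
            if any(d == 0 for d in dists):
                # A profile sitting exactly on a center belongs fully to it.
                row = [1.0 if d == 0 else 0.0 for d in dists]
            else:
                row = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(row)
            memberships.append([r / total for r in row])
        # Center update: average of all profiles weighted by u_ik ** m.
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            weight_sum = sum(weights)
            centers[k] = [
                sum(w * p[j] for w, p in zip(weights, points)) / weight_sum
                for j in range(n_dims)
            ]
    return centers, memberships


# Hypothetical standardized profiles over four time points: three rising, three falling.
profiles = [
    [-1.2, -0.4, 0.4, 1.2],
    [-1.0, -0.5, 0.5, 1.0],
    [-1.3, -0.3, 0.3, 1.3],
    [1.2, 0.4, -0.4, -1.2],
    [1.0, 0.5, -0.5, -1.0],
    [1.3, 0.3, -0.3, -1.3],
]
centers, memberships = fuzzy_c_means(profiles, n_clusters=2)
```

Unlike hard clustering, every profile receives a membership value for every cluster, so features that fit no pattern well can be recognized by their low maximum membership and filtered out.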

Gene set scoring

In differential expression analyses, a stringent significance threshold is usually chosen, and subtle changes in gene expression are ignored. Such changes are generally considered insignificant, but if a set of genes that perform similar functions are all slightly altered, the function itself may change significantly. Therefore, detecting overall differences in the activity of a functionally important gene set can compensate for subtle changes missed by single-gene differential analysis. Four unsupervised, single-sample enrichment methods have been developed: Gene Set Variation Analysis [131], Pathway Level Analysis of Gene Expression [132], single sample GSEA [133] and the combined z-score [134]. For small datasets (fewer than 25 samples), the singscore method may help. All genes are first ranked by expression level, and an enrichment score is then calculated from the positions of the gene set members in the overall ranking, which can be used to assess the activity of each gene set in each sample. Once the gene set activities are obtained, their changes over time can be analyzed. In addition, the idea of integrating features with similar meanings can be extended to other types of high-dimensional biological data.
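The ranking step described above can be sketched as follows. This is a simplified rank-based score (mean rank of the gene set members divided by the number of genes, centered at zero), not the singscore implementation; the function name `rank_score`, the gene identifiers and the expression values are hypothetical, and ties are broken by input order.

```python
def rank_score(expression, gene_set):
    """Simplified single-sample gene set score based on expression ranks.

    expression: dict mapping gene name to expression value in one sample.
    Returns a value in (-0.5, 0.5]; positive values mean the set sits
    in the upper half of the ranking.
    """
    ordered = sorted(expression, key=expression.get)  # ascending expression
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    members = [g for g in gene_set if g in rank]
    if not members:
        raise ValueError("no gene of the set is present in the sample")
    mean_rank = sum(rank[g] for g in members) / len(members)
    return mean_rank / len(ordered) - 0.5


# Hypothetical expression values of six genes in one sample.
sample = {"G1": 1.0, "G2": 2.0, "G3": 3.0, "G4": 4.0, "G5": 5.0, "G6": 6.0}
score_high = rank_score(sample, ["G5", "G6"])  # set of highly expressed genes
score_low = rank_score(sample, ["G1", "G2"])   # set of lowly expressed genes
```

Because the score is computed within a single sample, it can be evaluated at every time point of every subject and then treated as one more time-series feature.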

Association prediction between different features

In addition to tracking trends in features over time, it is valuable to broadly predict associations between different analytes, such as transcriptional regulatory networks. In biology, constructing regulatory networks is a typical approach to questions of causation and correlation. Predicting regulatory relationships from time-series expression data has unique advantages and challenges. Compared with static expression data of the same size, time-series expression data contain additional information in their temporal order, which can be exploited to build regulatory networks. However, there are also challenges. First, time-series expression data usually cover only a few time points, which strongly affects the estimation of model parameters. Another issue is the time difference between changes in multiple dimensions; even transcriptional regulation between genes involves a delay. Many tools for predicting gene regulatory networks from time-series gene expression data are now available, and we describe some representative ones (Table 1).

Table 1

Available methods/tools for regulatory network prediction

Method	Type	Open source availability	Short summary	Link
LEAP	Correlation	Yes	LEAP infers gene regulatory networks based on gene co-expression relationships and considers possible lags in time.	https://cran.r-project.org/web/packages/LEAP/index.html
dynGENIE3	Regression	Yes	dynGENIE3 extends GENIE3 by considering changes in expression over time and building dynamic models based on ordinary differential equations.	https://github.com/vahuynh/dynGENIE3
Inferelator	Regression	Yes	Inferelator infers gene regulatory networks by selecting the regulators whose levels are most predictive of gene expression based on a LASSO regression model.	https://github.com/baliga-lab/cMonkeyNwInf
SWING	Granger causality	Yes	SWING is a gene regulatory network inference framework based on multivariate Granger causality and sliding window regression.	https://github.com/bagherilab/SWING
DREM	Probabilistic graph model	Yes	DREM integrates time-series gene expression data and static or dynamic transcription factor–gene interaction data (e.g. ChIP-seq data) and produces as output a dynamic regulatory map.	http://sb.cs.cmu.edu/drem/

LEAP, lag-based expression association for pseudotime-series; dynGENIE3, dynamical GENIE3; SWING, sliding window inference for network generation; DREM, Dynamic Regulatory Events Miner.

One class of regulatory network prediction methods is based on correlation. An example is LEAP (lag-based expression association for pseudotime series) [135], which considers the time-lagged correlation of one gene with another and can therefore predict directed regulatory relationships. Positive and negative coefficients represent activation and repression between genes, respectively. LEAP considers all possible lags to find the maximum correlation for each gene pair and uses these to construct the regulatory network. Several regression-based methods are also available for determining dynamic interactions in time-series expression data. These include dynGENIE3 (dynamical GENIE3) [136], a modification of GENIE3 (gene network inference with ensemble of trees), which is a model-free method for inferring networks from static expression data [137]. GENIE3 uses random forest (RF) regression to determine the regulators of each target gene, providing excellent scalability and ease of use because of its non-parametric nature. dynGENIE3 models changes in expression over time with ordinary differential equations (ODEs) and then learns the putative gene interactions within an RF regression framework. A similar tool is Inferelator [138], which combines regression and ODEs to reveal gene regulatory relationships. The Granger causality test can also be used for regulatory prediction from time series. It is a statistical hypothesis test based on the autoregressive model in regression analysis and can be used to test whether one time series helps predict another. SWING (sliding window inference for network generation) is a tool based on this statistical method [139]. Probabilistic graphical models are another widely used approach for inferring interaction networks from time-series data. Based on this approach, Dynamic Regulatory Events Miner (DREM) [140] integrates time-series gene expression data and static protein–DNA interaction data (e.g. ChIP-Seq data) using input–output hidden Markov models to produce dynamic regulatory maps. These maps highlight the major divergence events in the time-series expression data and the transcription factors that may be responsible for them.
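The lag search used by correlation-based methods can be sketched for a single gene pair: the candidate regulator is shifted against the target and the lag with the largest absolute Pearson correlation is kept. This only illustrates the principle and is not the LEAP package; the function names, the maximum lag and the toy series (in which the target copies the regulator two time points later) are assumptions.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lag(regulator, target, max_lag=3):
    """Return (lag, correlation) with the largest absolute lagged correlation.

    At lag L, regulator[t] is compared with target[t + L], i.e. the regulator leads.
    """
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]
        y = target[lag:]
        if len(x) < 3:
            break  # too few overlapping points for a meaningful correlation
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical series: the target repeats the regulator two time points later.
regulator = [0, 1, 3, 2, 0, 1, 4, 2, 1, 0]
target = [0, 0] + regulator[:-2]
lag, correlation = best_lag(regulator, target)
```

The sign of the retained correlation would be read as putative activation or repression. With only a handful of time points, each additional lag shortens the overlap, so large lags are supported by very few observations.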

All of these methods can predict gene regulatory networks from temporal data and take the time delay into account. Correlation-based methods (e.g. LEAP) are the fastest and are better suited to large datasets, while regression-based methods (e.g. dynGENIE3, Inferelator, SWING) and probabilistic graphical models (e.g. DREM) are more computationally intensive but are expected to be more accurate. Because DREM also incorporates protein–DNA interaction data, its predicted regulatory relationships are expected to be more reliable. The other methods, which consider only associations between expression values, may be extended to predict associations between different measures (e.g. between gene expression and phenotype).

As health disorders in spaceflight are complex, it is crucial to uncover the underlying biological mechanism of each individual. Liu et al. [141] developed a sample-specific network analysis method to meet this demand, which implements personalized characterization of disorders.

Future directions in spaceflight biology research

In the future, the costs and hazards of manned spaceflight may decrease enough to support burgeoning space tourism. Larger sample sizes and more diverse study populations will provide unprecedented opportunities for spaceflight biology research. Some humans may leave Earth and establish permanent bases and larger settlements on the Moon, Mars or elsewhere. Under longer exposure, slight changes observed in short-term space missions may develop into health hazards [100]. When space migration programs become a reality, human populations in the new environment may evolve distinct genotypes, at which point an immigration genome project may even be conducted. For these ambitious frontiers, developments in space biology and aerospace medicine are crucial enablers. Furthermore, multi-omics, longitudinal profiling can capture the combined effects of multiple space environment factors as well as interactions between multiple levels, paving the way for a thorough examination of space biological adaptations.

Many mechanisms of space biological adaptation remain unknown, such as the mechanisms of telomere length dynamics and their long-term consequences. Moreover, some trends within the normal range that have been overlooked in previous studies may also be of interest. Additional studies and systematic research protocols will provide more comprehensive insights. Transforming measured data into interpretable results still takes considerable effort, and a systematic analysis workflow will speed this up. In compiling the available data analysis methods, this review makes clear that each method has a specific range of applicability. Appropriate methods must be chosen based on the data characteristics of a given study. In particular, determining whether interindividual differences should be removed requires careful assessment of the data distribution and consideration of whether doing so would disrupt the time trends of a single individual.

In bioinformatics, machine learning has become a popular and successful method for extracting knowledge from big data. While traditional machine learning relies on feature selection, deep learning overcomes this limitation and has demonstrated advanced performance in bioinformatics problems [142], such as splice site discovery from DNA sequences [143], finger joint identification from X-ray images [144] and error detection from EEG signals. However, because most deep learning approaches require sufficient and balanced data to optimize the numerous weight parameters of a neural network, they are usually not applicable to the limited and unbalanced data common in bioinformatics [120]. Biological studies usually have small sample sizes that limit statistical power, in which case simple models with fewer parameters may be more suitable, whereas more parameters may introduce additional errors and overfitting. Efforts to improve the interpretability of deep learning are also still ongoing. Both the assessment of the applicability of existing methods and the proposal of new, improved methods are necessary for human spatial-omics analysis.

The study of biological adaptations has led to a deeper understanding of the needs of astronauts. In response to these needs, researchers have made many attempts to improve the quality of life of astronauts, which is the ultimate goal of future biological research in space. For example, space synthetic biology aims to leverage local resources to manufacture critical products for the crew. The Space Synthetic Biology (SynBio) project conducted at NASA’s Ames Research Center in California’s Silicon Valley is concentrating on developing in-space nutrient production methods and microbial biomanufacturing technologies that chemically convert carbon dioxide (CO₂) and water into organic compounds for ‘feeding’ microbes to produce food, pharmaceuticals, plastics, etc.

Currently, most studies of responses to the space environment are scattered across tissues or systems and lack consideration of temporality. The emergence of spatiotemporal molecular medicine promises to provide more comprehensive insights by integrating clinical spatialization, temporalization, phenomics and molecular multi-omics to present a four-dimensional dynamic picture of disease [125]. The spatial perspective encompasses genetics, population distribution and intra-individual location. The temporal perspective considers the initiation and progression of disease, changes in clinical phenotype over time and patient response to treatment. When depicting overall body changes in space, it is essential to note that they are multisystemic and depend on the duration of exposure. Applying perspectives from spatiotemporal molecular medicine to space physiopathology studies may provide a holistic and dynamic picture. Some aging research programs that combine temporal, spatial (structural organization) and molecular processes [145] may also serve as references for studying temporal changes.

Key Points

The compilation of previous biological response investigation results not only contributes to refining the process of biological adaptation to the spaceflight environment but also reveals many parts to be complemented.
The collation of multi-level measurements, data types and the biological functions they reflect can be referenced by researchers in designing biological experiments.
A summary of common features of data generated from longitudinal biological experiments related to space environment factors suggests challenges or caveats in data analysis.
This review provides strategies and models/tools to address the challenges in data analysis from a bioinformatics perspective for different analytical goals.

Funding

Space Medical Experiment Project of China Manned Space Program (HYZHXM01004); State Key Laboratory of Space Medicine Fundamentals and Application (SMFA19A03, SMFA19C01, SMFA19B01); National Natural Science Foundation of China (31871322, 31900473).

Author Biographies

Yangyang Hao is a PhD student at the Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China.

Liang Lu is an associate researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Anna Liu is a master's student at the Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China.

Xue Lin is an associate professor at the Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China. Her research focuses on bioinformatics, data mining and machine learning.

Li Xiao is an associate researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Xiaoyue Kong is a master's student at the Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China.

Kai Li is an assistant researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Fengji Liang is an assistant researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Jianghui Xiong is a researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Lina Qu is a researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Yinghui Li is a researcher at the State Key Laboratory of Space Medicine Fundamentals and Application, China Astronaut Research and Training Center, No. 26 Beiqing Road, Haidian District, Beijing, 100094, China.

Jian Li is a professor at the Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China. His research interests lie in bioinformatics, genomics and big data computing.

References

1. Warren. International Space Station open-source data. Patterns 2020; 100172.
2. Pei, Chai, et al. Current status of space radiobiological studies in China. Life Sci Space Res 2019.
3. SpaceX's astronaut launch is a boost for the International Space Station. Nature 2020; 582.
4. Di Battista. Towards a unified eulerian modeling framework for two-phase flows: geometrical small scale phenomena and associated flexible computing strategies. Institut Polytechnique de Paris, 2021. https://www.theses.fr/2021IPPAX055.
5. Hassler, Zeitlin, Wimmer-Schweingruber, et al. Mars' surface radiation environment measured with the Mars Science Laboratory's Curiosity rover. Science 2014; 343: 1244797.
6. Afshinnekoo, Scott, MacKay, et al. Fundamental biological features of spaceflight: advancing the field to enable deep-space exploration. Cell 2020; 183: 1162–.
7. Kandarpa, Schneider, Ganapathy. Human health during space travel: an overview. Neurol India 2019; S176–S181.
8. Y-H, L-N, Chen H-L. Space stress injury and related protective measures. Sheng Li Ke Xue Jin Zhan 2013; 354–.
9. Zhao, et al. The combined effects of X-ray radiation and hindlimb suspension on bone loss. J Radiat Res 2014; 720–.
10. Ruyters, Braun, Stang. Success stories: incremental progress and scientific breakthroughs in life science research. In: Breakthroughs in Space Life Science Research. Springer, 2021; –113.
11. Garrett-Bakelman, Darshi, Green, et al. The NASA Twins Study: a multidimensional analysis of a year-long human spaceflight. Science 2019; 364: eaau8650.
12. Berrios, Galazka, Grigorev, et al. NASA GeneLab: interfaces for the exploration of space omics data. Nucleic Acids Res 2021; D1515–.
13. Gaffney, Fomina, Babich, et al. The effect of long-term confinement and the efficacy of exercise countermeasures on muscle strength during a simulated mission to Mars: data from the Mars500 study. Sports Med Open 2017.
14. Yuan, Custaud M-A, et al. Multi-system adaptation to confinement during the 180-day controlled ecological life support system (CELSS) experiment. Front Physiol 2019; 575.
15. Hammond, Allen, Birdsall. Effects of space flight on mouse liver versus kidney: gene pathway analyses. Int J Mol Sci 2018; 4106.
16. Ingber. How cells (might) sense microgravity. FASEB J 1999; (Suppl).
17. Gridley, Mao, Tian, et al. Genetic and apoptotic changes in lungs of mice flown on the STS-135 mission in space. In Vivo 2015; 423–.
18. Prasad, Grimm, Strauch, et al. Influence of microgravity on apoptosis in cells, tissues, and other systems in vivo and in vitro. Int J Mol Sci 2020; 9373.
19. Topal, Zamur, CJSCI. Microgravity, stem cells, and cancer: a new hope for cancer treatment. Stem Cells Int 2021; 2021: 5566872.
20. Nassef, Melnik, Kopp, et al. Breast cancer cells in microgravity: new aspects for cancer research. Int J Mol Sci 2020; 7345.
21. Ambrosini, Adida, Altieri. A novel anti-apoptosis gene, survivin, expressed in cancer and lymphoma. Nat Med 1997; 917–.
22. Masiello, Cucina, Proietti, et al. Phenotypic switch induced by simulated microgravity on MDA-MB-231 breast cancer cells. Biomed Res Int 2014; 2014: 652434.
23. Pietsch, Wehland, et al. Differential gene expression profile and altered cytokine secretion of thyroid cancer cells in space. FASEB J 2014; 813–.
24. Grimm, Grosse, Wehland, et al. The impact of microgravity on bone in humans. Bone 2016.
25. Shen, Frishman. Effects of spaceflight on cardiovascular physiology and health. Cardiol Rev 2019; 122–.
26. Blaber, Marcal, Burns. Bioastronautics: the influence of microgravity on astronaut health. Astrobiology 2010; 463–.
27. Yang, Jiang, et al. The effects of microgravity on the digestive system and the new insights it brings to the life sciences. Life Sci Space Res 2020.
28. Panesar, Fernandez-Miranda, Kliot, et al. Neurosurgery and manned spaceflight. Neurosurgery 2020; 317–.
29. Swinney, Allison. Spaceflight and neurosurgery: a comprehensive review of the relevant literature. World Neurosurg 2018; 109: 444–.
30. Mader, Gibson, Pass, et al. Optic disc edema, globe flattening, choroidal folds, and hyperopic shifts observed in astronauts after long-duration space flight. Ophthalmology

2011

;

118

2058

–

31.

Philpott

Corbett

Turnbill

, et al.

Cosmic ray effects on the eyes of rats flown on Cosmos No. 782, experimental K-007

Aviat Space Environ Med

1978

;

–

32.

Furukawa

Nagamatsu

Nenoi

, et al.

Space radiation biology for "living in space"

Biomed Res Int

2020

;

2020

4703286

33.

Hall

Giaccia

AJP

Radiobiology for the Radiologist

, 6th edn.

J Radiother Pract

2006;

:237–237.

Google Preview

34.

Sridharan

Asaithamby

Bailey

, et al.

Understanding cancer development processes after HZE-particle exposure: roles of ROS, DNA damage repair and inflammation

Radiat Res

2015

;

183

–

35.

Costes

Chiolo

Pluth

, et al.

Spatiotemporal characterization of ionizing radiation induced DNA damage foci and their relation to chromatin organization

Mutat Res

2010

;

704

–

36.

Willey

Britten

Blaber

, et al.

The individual and combined effects of spaceflight radiation and microgravity on biologic systems and functional outcomes

J Environ Sci Health C Toxicol Carcinog

2021

;

129

–

37.

Davis

Allen

Bowles

Consequences of space radiation on the brain and cardiovascular system

J Environ Sci Health C Toxicol Carcinog

2021

;

180

–

218

38.

Todd

Pecaut

Fleshner

Combined effects of space flight factors and radiation on humans

Mutat Res

1999

;

430

211

–

39.

Limoli

Ponnaiya

Corcoran

, et al.

Genomic instability induced by high and low LET ionizing radiation

Adv Space Res

2000

;

2107

–

40.

Imaoka

Nishimura

Daino

, et al.

Risk of second cancer after ion beam radiotherapy: insights from animal carcinogenesis studies

Int J Radiat Biol

2019

;

1431

–

41.

Dang

Yang

Zhang

, et al.

Simulated microgravity increases heavy ion radiation-induced apoptosis in human B lymphoblasts

Life Sci

2014

;

123

–

42.

De Zio

Cianfanelli

Cecconi

New insights into the link between DNA damage and apoptosis

Antioxid Redox Signal

2013

;

559

–

43.

Voorhies

Mark Ott

Mehta

, et al.

Study of the impact of long-duration space missions at the International Space Station on the astronaut microbiome

Sci Rep

2019

;

9911

44.

Feng

Lan

, et al.

Time series analysis of microbiome and metabolome at multiple body sites in steady long-term isolation confinement

Gut

2021

;

1409

–

45.

Siddiqui

Akbar

Khan

Gut microbiome and human health under the space environment

J Appl Microbiol

2021

;

130

–

46.

Ananthakrishnan

Singal

Chang

LJCG

, et al.

The gut microbiome and digestive health–a new frontier

Clin Gastroenterol Hepatol

2019

;

215

–

47.

Zhang

Moreno-Villanueva

Krieger

, et al.

Transcriptomics, NF-kappaB pathway, and their potential spaceflight-related health consequences

Int J Mol Sci

2017

;

:1811.

48.

Liang

Wang

, et al.

Personalized epigenome remodeling under biochemical and psychological changes during long-term isolation environment

2019

;

:932.

49.

Wang

Jing

, et al.

During the long way to Mars: effects of 520 days of confinement (Mars500) on the assessment of affective stimuli and stage alteration in mood and plasma hormone levels

PLoS One

2014

;

e87087

50.

Schneider

Brummer

Carnahan

, et al.

Exercise as a countermeasure to psycho-physiological deconditioning during long-term confinement

Behav Brain Res

2010

;

211

208

–

51.

Rykova

Feuerecker

, et al.

520-d Isolation and confinement simulating a flight to Mars reveals heightened immune responses and alterations of leukocyte phenotype

Brain Behav Immun

2014

;

203

–

52.

Turroni

Rampelli

Biagi

, et al.

Temporal dynamics of the gut microbiota in people sharing a confined environment, a 520-day ground-based space simulation, MARS500

Microbiome

2017

;

53.

Yang

Liu

Chen

, et al.

A GABAergic neural circuit in the ventromedial hypothalamus mediates chronic stress-induced bone loss

J Clin Invest

2020

;

130

6539

–

54.

Basner

Dinges

Mollicone

, et al.

Mars 520-d mission simulation reveals protracted crew hypokinesis and alterations of sleep duration and timing

Proc Natl Acad Sci U S A

2013

;

110

2635

–

55.

Guo

Chen

, et al.

Keeping the right time in space: importance of circadian clock and sleep for physiology and performance of astronauts

Mil Med Res

2014

;

56.

Palinkas

Suedfeld

Psychosocial issues in isolated and confined extreme environments

Neurosci Biobehav Rev

2021

;

126

413

–

57.

Silveira

Fazelinia

Rosenthal

, et al.

Comprehensive multi-omics analysis reveals mitochondrial stress as a central biological hub for spaceflight impact

Cell

2020

;

183

1185, e1120

–

201

58.

Piunti

Shilatifard

Epigenetic balance of gene expression by Polycomb and COMPASS families

Science

2016

;

352

aad9780

59.

Chowdhury

Keogh

Ishii

, et al.

Gamma-H2AX dephosphorylation by protein phosphatase 2A facilitates DNA double-strand break repair

Mol Cell

2005

;

801

–

60.

Fernandez-Capetillo

Lee

Nussenzweig

, et al.

H2AX: the histone guardian of the genome

DNA Repair

2004

;

959

–

61.

Rogakou

Pilch

Orr

, et al.

DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139

J Biol Chem

1998

;

273

5858

–

62.

Gut

Verdin

The nexus of chromatin regulation and intermediary metabolism

Nature

2013

;

502

489

–

63.

Roadmap Epigenomics

Kundaje

Meuleman

, et al.

Integrative analysis of 111 reference human epigenomes

Nature

2015

;

518

317

–

64.

Nature

EPCJ

An integrated encyclopedia of DNA elements in the human genome

Nature

2012

;

489

65.

Knoll

Lodish

Sun

Long non-coding RNAs as regulators of the endocrine system

Nat Rev Endocrinol

2015

;

151

–

66.

Alvarez-Dominguez

Bai

, et al.

De novo reconstruction of adipose tissue transcriptomes reveals long non-coding RNA regulators of brown adipocyte development

Cell Metab

2015

;

764

–

67.

Schulze

Downward

Navigating gene expression using microarrays--a technology review

Nat Cell Biol

2001

;

E190

–

68.

Duggan

Bittner

Chen

, et al.

Expression profiling using cDNA microarrays

Nat Genet

1999

;

–

69.

Wang

Gerstein

Snyder

RNA-Seq: a revolutionary tool for transcriptomics

Nat Rev Genet

2009

;

–

70.

Ozsolak

Milos

RNA sequencing: advances, challenges and opportunities

Nat Rev Genet

2011

;

–

71.

Wang

Zhao

Bollas

, et al.

Nanopore sequencing technology, bioinformatics and applications

Nat Biotechnol

2021

;

1348

–

72.

Lin

Hui

Mao

Nanopore technology and its applications in gene sequencing

Biosensors

2021

;

:214.

73.

McIntyre

ABR

Rizzardi

, et al.

Nanopore sequencing in microgravity

Microgravity

2016

;

16035

74.

Selevsek

Chang

Gillet

, et al.

Reproducible and consistent quantification of the Saccharomyces cerevisiae proteome by SWATH-mass spectrometry

Mol Cell Proteomics

2015

;

739

–

75.

Beck

Nielsen

Matthiesen

, et al.

Quantitative proteomic analysis of post-translational modifications of human histones

Mol Cell Proteomics

2006

;

1314

–

76.

Mann

Jensen

Proteomic analysis of post-translational modifications

Nat Biotechnol

2003

;

255

–

77.

Haas

Dephoure

, et al.

A large-scale method to measure absolute protein phosphorylation stoichiometries

Nat Methods

2011

;

677

–

78.

Choudhary

Mann

Decoding signalling networks by mass spectrometry-based proteomics

Nat Rev Mol Cell Biol

2010

;

427

–

79.

Dettmer

Aronov

Hammock

Mass spectrometry-based metabolomics

Mass Spectrom Rev

2007

;

–

80.

Patti

Yanes

Siuzdak

Innovation: metabolomics: the apogee of the omics trilogy

Nat Rev Mol Cell Biol

2012

;

263

–

81.

Steuer

Review: on the analysis and interpretation of correlations in metabolomic data

Brief Bioinform

2006

;

151

–

82.

Madsen

Lundstedt

Trygg

Chemometrics in metabolomics--a review in human disease diagnosis

Anal Chim Acta

2010

;

659

–

83.

Dong

Chen

, et al.

Simulated manned Mars exploration: effects of dietary and diurnal cycle variations on the gut microbiome of crew members in a controlled ecological life support system

PeerJ

2019

;

e7762

84.

Senatore

Mastroleo

Leys

, et al.

Effect of microgravity & space radiation on microbes

Future Microbiol

2018

;

831

–

85.

Milojevic

Weckwerth

Molecular mechanisms of microbial survivability in outer space: a systems biology approach

Front Microbiol

2020

;

923

86.

Bacci

Mengoni

Emiliani

, et al.

Defining the resilience of the human salivary microbiota by a 520-day longitudinal study in a confined environment: the Mars500 mission

Microbiome

2021

;

152

87.

Caporaso

Kuczynski

Stombaugh

, et al.

QIIME allows analysis of high-throughput community sequencing data

Nat Methods

2010

;

335

–

88.

Org

Mehrabian

Lusis

Unraveling the environmental and genetic interactions in atherosclerosis: central role of the gut microbiota

Atherosclerosis

2015

;

241

387

–

89.

Houle

Govindaraju

Omholt

Phenomics: the next challenge

Nat Rev Genet

2010

;

855

–

90.

Swaffield

Neviaser

Lehnhardt

Fracture risk in spaceflight and potential treatment options

Aerosp Med Hum Perform

2018

;

1060

–

91.

Meigal

Fomina

Electromyographic evaluation of countermeasures during the terrestrial simulation of interplanetary spaceflight in Mars500 project

Pathophysiology

2016

;

–

92.

Yuan

Custaud

, et al.

Multi-system adaptation to confinement during the 180-day Controlled Ecological Life Support System (CELSS) experiment

Front Physiol

2019

;

575

93.

Vigo

Tuerlinckx

Ogrinz

, et al.

Circadian rhythm of autonomic cardiovascular control during Mars500 simulated mission to Mars

Aviat Space Environ Med

2013

;

1023

–

94.

Roy-O’Reilly

Mulavara

Williams

A review of alterations to the brain during spaceflight and the potential relevance to crew in long-duration space exploration

NPJ Microgravity

2021

;

95.

Schneider

Abeln

Popova

, et al.

The influence of exercise on prefrontal cortex activity and cognitive performance during a simulated space flight to Mars (MARS500)

Behav Brain Res

2013

;

236

–

96.

Brem

Lutz

Vollmar

, et al.

Changes of brain DTI in healthy human subjects after 520 days isolation and confinement on a simulated mission to Mars

Life Sci Space Res

2020

;

–

97.

Gemignani

Piarulli

Menicucci

, et al.

How stressful are 105 days of isolation? Sleep EEG patterns and tonic cortisol in healthy volunteers simulating manned flight to Mars

Int J Psychophysiol

2014

;

211

–

98.

Dai

Wang

Yang

, et al.

The effects of emotional trait factors on simulated flight performance under an acute psychological stress situation

Int J Occup Saf Ergon

2021

;

–

99.

Basner

Dinges

Mollicone

, et al.

Psychological and behavioral changes during confinement in a 520-day simulated interplanetary mission to mars

PLoS One

2014

;

e93298

100.

Wan

, et al.

Needs and challenges of space medicine in China's follow-up manned space missions

Manned Spaceflight

2007

;

–

101.

Cagampang

Poore

Hanson

Developmental origins of the metabolic syndrome: body clocks and stress responses

Brain Behav Immun

2011

;

214

–

102.

Bushel

Chu

, et al.

Principal variance components analysis: estimating batch effects in microarray gene expression data

Sources and Solutions

2009

;

141

–

103.

McInnes

Healy

JJAPA

Umap: uniform manifold approximation and projection for dimension reduction

2018

104.

Johnson

Rabinovic

Adjusting batch effects in microarray expression data using empirical Bayes methods

Biostatistics

2006

;

118

–

105.

Müller

Schillert

Röthemeier

, et al.

Removing batch effects from longitudinal gene expression - quantile normalization plus ComBat as best approach for microarray transcriptome data

PLoS One

2016

;

e0156594

106.

Leek

Johnson

Parker

, et al.

The sva package for removing batch effects and other unwanted variation in high-throughput experiments

Bioinformatics

2012

;

882

–

107.

Jacob

RUV for normalization of expression array data

Bioconductor

2014

;

108.

Zhu

Sun

Zhang

, et al.

BatchServer: a web server for batch effect evaluation, visualization, and correction

J Proteome Res

2020

;

1079

–

109.

Bar-Joseph

Gerber

Gifford

, et al. A new approach to analyzing gene expression time series data. In:

Proceedings of the sixth annual international conference on Computational biology

RECOME

2002;39–48.

110.

Rueda

Bari

Ngom

. Clustering time-series gene expression data with unequal time intervals. In:

Transactions on Computational Systems Biology X. Springer

2008

;

5410

100

–

111.

Tseng

F-M

Tzeng

G-H

A fuzzy seasonal ARIMA model for forecasting

Fuzzy Sets Syst

2002

;

126

367

–

112.

Kalekar PSJKRsoiT

Time series forecasting using holt-winters exponential smoothing

Kanwal Rekhi school of information Technology

2004

;

4329008

–

113.

Alexandrov

Benidis

Bohlke-Schneider

, et al.

GluonTS: probabilistic and neural time series modeling in Python

J Mach Learn Res

2020

;

–

114.

Tripto

Kabir

Bayzid

, et al.

Evaluation of classification and forecasting methods on time series gene expression data

PLoS One

2020

;

e0241686

115.

Conesa

Nueda

Ferrer

, et al.

maSigPro: a method to identify significantly differential expression profiles in time-course microarray experiments

Bioinformatics

2006

;

1096

–

102

116.

Fischer

Theis

Yosef NJNar

Impulse model-based differential expression analysis of time course sequencing data

Nucleic Acids Res

2018

;

e119

–

117.

Spies

Renz

Beyer

, et al.

Comparative analysis of differential gene expression tools for RNA sequencing time course data

2019

;

288

–

118.

Ritchie

Phipson

, et al.

Limma powers differential expression analyses for RNA-sequencing and microarray studies

Nucleic Acids Res

2015

;

e47

–

119.

Newman

Liu

Green

, et al.

Robust enumeration of cell subsets from tissue expression profiles

Nat Methods

2015

;

453

–

120.

Kumar

Mfuzz: a software package for soft clustering of microarray data

Bioinformation

2007

;

–

121.

Ernst

Bar-Joseph

STEM: a tool for the analysis of short time series gene expression data

BMC Bioinformatics

2006

;

191

122.

Ernst

Nau

Bar-Joseph

Clustering short time series gene expression data

Bioinformatics

2005

;

(

Suppl 1

i159

–

123.

Deleu

Würfl

Samiei

et al.

Torchmeta: a meta-learning library for pytorch

arXiv preprint

2019;arXiv:1909.06576.

124.

Zhang

Ren

, et al. Computer vision and pattern recognition.

Int J Comput Math

2016;

:1265–1266.

125.

Dong

Tian

, et al.

LibFewShot: a comprehensive library for few-shot learning

arXiv preprint

2021;arXiv:2109.04898.

126.

Fawaz

Forestier

Weber

, et al. Transfer learning for time series classification. In:

2018 IEEE international conference on big data (Big Data)

IEEE

2018

;

1367

–

127.

Yari

Nguyen

HTJIA

Deep learning applied for histological diagnosis of breast cancer

IEEE Access

2020

;

162432

–

128.

Turki

Wei

Wang

JTJIA

Transfer learning approaches to improve drug sensitivity prediction in multiple myeloma patients

IEEE Access

2017

;

7381

–

129.

Song

Zheng

Y-J

Sheng

W-G

, et al.

Tridirectional transfer learning for predicting gastric cancer morbidity

IEEE Trans Neural Netw Learn Syst

2020

;

561

–

130.

Singh

Ahmed

Kumar

, et al.

Imbalanced breast cancer classification using transfer learning

IEEE/ACM Trans Comput Biol Bioinform

2020

;

–

131.

Hänzelmann

Castelo

Guinney

GSVA: gene set variation analysis for microarray and RNA-Seq data

BMC Bioinformatics

2013

;

132.

Tomfohr

Kepler

Pathway level analysis of gene expression using singular value decomposition

BMC Bioinformatics

2005

;

225

133.

Barbie

Tamayo

Boehm

, et al.

Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1

Nature

2009

;

462

108

–

134.

Lee

Chuang

H-Y

Kim

J-W

, et al.

Inferring pathway activity toward precise disease classification

PLoS Comput Biol

2008

;

e1000217

135.

Specht

JJB

LEAP: constructing gene co-expression networks for single-cell RNA-sequencing data using pseudotime ordering

Bioinformatics

2017

;

764

–

136.

Huynh-Thu

PJSR

dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data

Sci Rep

2018

;

–

137.

Petralia

Wang

Yang

, et al.

Integrative random forest for gene regulatory network inference

Bioinformatics

2015

;

i197

–

205

138.

Bonneau

Reiss

Shannon

, et al.

The Inferelator: an algorithm for learning parsimonious regulatory networks from systems-biology data sets de novo

Genome Biol

2006

;

–

139.

Finkle

Bagheri NJPotNAoS

Windowed Granger causal inference strategy improves discovery of gene regulatory networks

Proc Natl Acad Sci U S A

2018

;

115

2252

–

140.

Schulz

Devanny

Gitter

, et al.

DREM 2.0: improved reconstruction of dynamic regulatory networks from time-series expression data

BMC Syst Biol

2012

;

–

141.

Liu

Wang

, et al.

Personalized characterization of diseases using sample-specific networks

Nucleic Acids Res

2016

;

e164

–

142.

Min

Lee

Yoon

Deep learning in bioinformatics

Brief Bioinform

2017

;

851

–