Summary of advantages and disadvantages of different data fusion strategies.
Aspect | Early fusion | Intermediate fusion | Late fusion |
---|---|---|---|
Description | Features from all modalities are merged with no distinction of which features come from which modality | | Every modality is processed separately by its own sub-model, and the individual outcomes are combined to get a single prediction |
Pros | Uses cross-modality correlations and interactions; lower computational complexity than other fusion strategies because fusion occurs at the input level | Balances the use of cross-modality and within-modality correlations and interactions, optimizing the number of parameters required; moderate to high computational complexity, depending on the complexity of the fusion mechanism employed; robust to missing modalities; flexibility to choose the level at which specific modalities are fused; may capture interactions between modalities more efficiently than early fusion while avoiding excessively high-dimensional input spaces; more robust than early fusion to noisy or incomplete data because fusion happens at intermediate layers | Easy to implement computationally; relatively low computational complexity; more robust to noisy or incomplete data than early fusion because each modality is processed independently |
Cons | High computational cost due to a high number of connections; risk of learning spurious cross-modality correlations; high number of parameters and neural connections; less robust to noisy or incomplete data because modalities are combined directly at the input level | Risk of losing information from cross-modality correlations | Loss of information from potential interactions and cross-modality correlations; may require more training time than early fusion because each modality is processed separately |
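
As a rough illustration of where the fusion step sits in each strategy from the table, the following minimal sketch uses two toy modalities and tiny, untrained networks. All function names, layer sizes, and weights are hypothetical. Averaging the sub-model predictions in `late_fusion` is only one assumed way of combining the individual outcomes, not a rule taken from the table.

```python
import random


def linear(weights, x):
    """Dot product of a weight vector and a feature vector."""
    return sum(w * v for w, v in zip(weights, x))


def relu_layer(matrix, x):
    """One hidden layer: each row of `matrix` is a weight vector, followed by ReLU."""
    return [max(0.0, linear(row, x)) for row in matrix]


def early_fusion(x_a, x_b, hidden, head):
    """Concatenate the raw features, then apply a single joint model."""
    joint_input = x_a + x_b  # list concatenation: modalities are no longer distinguished
    return linear(head, relu_layer(hidden, joint_input))


def intermediate_fusion(x_a, x_b, enc_a, enc_b, head):
    """Encode each modality separately, then fuse the hidden features in a joint head."""
    fused = relu_layer(enc_a, x_a) + relu_layer(enc_b, x_b)
    return linear(head, fused)


def late_fusion(x_a, x_b, model_a, model_b):
    """Run one full sub-model per modality and combine only the predictions.

    Averaging is an assumption made for this sketch.
    """
    p_a = linear(model_a["head"], relu_layer(model_a["hidden"], x_a))
    p_b = linear(model_b["head"], relu_layer(model_b["hidden"], x_b))
    return (p_a + p_b) / 2


def random_matrix(rng, rows, cols):
    """Hypothetical helper: untrained weights drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    x_a = [0.2, 0.7, 0.1]  # toy features for modality A
    x_b = [1.5, 0.3]       # toy features for modality B

    print("early:", early_fusion(x_a, x_b, random_matrix(rng, 4, 5), random_matrix(rng, 1, 4)[0]))
    print("intermediate:", intermediate_fusion(
        x_a, x_b, random_matrix(rng, 2, 3), random_matrix(rng, 2, 2), random_matrix(rng, 1, 4)[0]))
    print("late:", late_fusion(
        x_a, x_b,
        {"hidden": random_matrix(rng, 2, 3), "head": random_matrix(rng, 1, 2)[0]},
        {"hidden": random_matrix(rng, 2, 2), "head": random_matrix(rng, 1, 2)[0]}))
```

In this sketch only `early_fusion` and `intermediate_fusion` have weights that can mix information across modalities. `late_fusion` never sees both modalities in the same layer, which corresponds to the loss of cross-modality interactions listed under its cons.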