Model pipeline: (i) data processing: replaces NA values, applies a |$\log_{2}(n+1)$| transformation, and normalizes the data; (ii) cluster quality assessment: uses silhouette scores to select well-clustered gene sets; (iii) model training: trains the diffusion model on the selected gene sets; (iv) gene augmentation/gene perturbation: performs data augmentation or gene perturbation, depending on the task type, and displays the outcomes with UMAP plots; (v) evaluation: validates the core genes with gene function enrichment analysis.
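
Steps (i) and (ii) of the pipeline can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `preprocess` and `mean_silhouette` are hypothetical, and two choices are assumptions not stated in the caption: NA values are replaced with 0, and normalization is per-gene min-max scaling. The silhouette score uses Euclidean distances and expects at least two clusters.

```python
import numpy as np


def preprocess(counts, fill_value=0.0):
    """Step (i): replace NA values, apply log2(n + 1), then normalize.

    Assumptions (not stated in the source): NA values are replaced with
    `fill_value`, and normalization is per-gene min-max scaling to [0, 1].
    Rows are samples, columns are genes.
    """
    x = np.asarray(counts, dtype=float)
    x = np.where(np.isnan(x), fill_value, x)   # replace NA values
    x = np.log2(x + 1.0)                       # log2(n + 1) transformation
    lo = x.min(axis=0, keepdims=True)
    span = x.max(axis=0, keepdims=True) - lo
    span[span == 0] = 1.0                      # constant genes map to 0
    return (x - lo) / span


def mean_silhouette(x, labels):
    """Step (ii): mean silhouette score (Euclidean), needs >= 2 clusters."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    clusters = np.unique(labels)
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        same[i] = False                        # exclude the point itself
        if not same.any():
            continue                           # singleton cluster scores 0
        a = dist[i, same].mean()               # mean intra-cluster distance
        b = min(dist[i, labels == c].mean()    # nearest other cluster
                for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return scores.mean()
```

A gene set would then be kept when `mean_silhouette(preprocess(counts), labels)` exceeds a chosen threshold. The threshold is a further assumption, since the caption does not give one.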