A diagram explaining the method in broad strokes, like explained in the caption.
Gene expression perturbation from tumor state back to normal.


How it works

Using tumor-state gene expression as input, we apply a transformer-based diffusion model, conditioned on normal-state gene expression, to transform tumor gene expression back to a normal state. GEMDiff can also generate synthetic bulk RNA gene expression data starting from Gaussian noise. The model pipeline has five steps: 1. Data processing: replace NA values, apply a log transformation, and normalize the data; 2. Cluster quality assessment: use silhouette scores to select well-clustered gene sets; 3. Model training: train the diffusion model on the selected gene sets; 4. Gene augmentation/gene perturbation: perform data augmentation or gene perturbation, depending on the task type, and display the outcomes with UMAP plots; 5. Evaluation: validate the core genes with gene function enrichment analysis.
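The first two pipeline steps (data processing and cluster quality assessment) can be illustrated with a minimal NumPy sketch. This is not GEMDiff's actual code: the function names `preprocess` and `silhouette`, the choice of 0 as the NA fill value, the `log2(x + 1)` transformation, the per-gene min-max scaling to [-1, 1], and the Euclidean distance are all assumptions made for illustration, since the description above does not specify them.

```python
import numpy as np


def preprocess(expr):
    """Replace NA values, log-transform, and normalize a samples x genes matrix."""
    expr = np.asarray(expr, dtype=float)
    # Assumption: missing values are filled with 0.
    expr = np.nan_to_num(expr, nan=0.0)
    # Assumption: log2(x + 1) is used as the log transformation.
    expr = np.log2(expr + 1.0)
    # Assumption: each gene (column) is min-max scaled to [-1, 1].
    lo = expr.min(axis=0)
    hi = expr.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant genes: avoid division by zero
    return 2.0 * (expr - lo) / span - 1.0


def silhouette(x, labels):
    """Mean silhouette score over all samples, using Euclidean distance."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    # Pairwise distance matrix between all samples.
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    scores = []
    for i in range(len(x)):
        same = labels == labels[i]
        n_same = int(same.sum()) - 1
        if n_same == 0:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # Mean distance to the other members of the same cluster
        # (dist[i, i] is 0, so the sum already excludes the sample itself).
        a = dist[i, same].sum() / n_same
        # Smallest mean distance to the members of any other cluster.
        b = min(
            dist[i, labels == other].mean()
            for other in np.unique(labels)
            if other != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return float(np.mean(scores))


# Toy data (hypothetical): 4 samples x 3 genes, with one missing value.
raw = np.array([
    [100.0, 5.0, np.nan],
    [120.0, 4.0, 30.0],
    [3.0, 90.0, 28.0],
    [2.0, 110.0, 31.0],
])
labels = np.array(["tumor", "tumor", "normal", "normal"])

x = preprocess(raw)
score = silhouette(x, labels)
print(x.shape, round(score, 3))
```

A score closer to 1 indicates that tumor and normal samples separate well on the chosen genes, which is the kind of signal a gene-set selection step could use. In practice, a library routine such as scikit-learn's `silhouette_score` would likely replace the hand-written function.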



Augmentation Results

The results cover gene feature counts ranging from 8 to 256.