Sha Cao


2014.01-2017.05         Ph.D., Statistics, University of Georgia

2011.09-2014.01         Ph.D. student, Bioinformatics, University of Georgia (incomplete)

2007.09-2011.07        B.S., Mathematics, Beijing Normal University          


Professional Experience

2017.08-present         Assistant Professor, School of Medicine, Indiana University, Indianapolis, IN

2011.09-2017.07         Research Assistant, University of Georgia, Athens, GA


Contact Information

410 West 10th Street, Suite 3073

Indianapolis, IN  46202

E-mail: shacao(at); Office: 317-274-2602

I am interested in developing and applying statistical, mathematical, and computational methods for mining cancer tissue omics data to address fundamental and challenging questions in cancer biology, including the following areas:

  • Construction of Bayesian networks to infer the dependencies among micro-environmental stresses, key metabolic changes, and the explosive growth of post-metastatic cancers
  • Bayesian biclustering-based derivation of cell type-specific expression modules to digitally dissect mixed-tissue expression into the sum of its component cell types’ contributions
  • Markov chain Monte Carlo-based metabolic flux estimation for cancer cells, with priors on normal cells’ flux and cancer-specific kinetic parameters
  • Large-scale association studies of epigenomic markers in cancers with detected micro-environmental stress markers of various types, based on integrated studies of epigenomic and transcriptomic data

For the full publication list, please go to:


Selected publications:

  1. Cao S+, Zhou Y+, Wu Y, Song TC, Alsaihati B, Xu Y. Transcription Regulation by DNA Methylation under Stressful Conditions in Human Cancer. Quantitative Biology. (2017) (In revision).
  2. Cao S+, Zhu X+, Zhang C, Qian H, Schuttler HB, Gong JP, Xu Y. Competition between DNA methylation, nucleotide synthesis and anti-oxidation in cancer versus normal tissues. Cancer Res. (2017) DOI: 10.1158/0008-5472.CAN-17-0262.
  3. Cao S+, Zhang C+, Xu Y. Somatic Mutations May Not Be the Primary Drivers of Cancer Formation. International Journal of Cancer. (2015) DOI: 10.1002/ijc.29639.


Manuscripts in Preparation

  1. Cao S, Yao F, Xu Y. De-convolution of tissue-based gene-expression data to cell type specific contributions and application to cancer tissue gene-expression data analyses, in preparation.
  2. Cao S, Dong N, Song TC, Xu Y. Two major contributors to cancer tissue-based gene-expression data: biological functions and anti-oxidation, and application to reliable prediction of gene-expression levels of protein complexes, in preparation.
  3. Cao S, Zhang C, Liu C, Ji F, Szemprich A, Zhang Y, Yuan Y, Teng Q, Wang C, Jiang J, Gu J, Xu Y. Oxidized Cholesterol Plays a Key Role in Driving the Rapid Growth of Metastatic Cancer, in preparation.


Department of Biostatistics | 410 W. Tenth St., Suite 3000 | Indianapolis, IN 46202 | Ph: (317) 274-2661 | Fax: (317) 274-2678