I am new to Stack Overflow. I am a biologist with very basic knowledge of statistics and R, and I find myself with a problem that needs an expert opinion.
I have data from an assay. The assay was performed for 6 different cell culture conditions (columns) and for 6 genes (rows). It was also repeated 4 times for each gene (for statistical purposes), so every gene has 4 replicate rows.
An example of our data:
Genes | Condition 1 | Condition 2 |
---|---|---|
Gene 1 replicate 1 | 0.22 | 120 |
Gene 1 replicate 2 | 0.34 | 122 |
Gene 1 replicate 3 | 0.45 | 119 |
Gene 1 replicate 4 | 0.55 | 90 |
Gene 2 replicate 1 | 12 | 0.1 |
Gene 2 replicate 2 | 19.3 | 0.45 |
Gene 2 replicate 3 | 23.4 | NA |
Gene 2 replicate 4 | 11.7 | 0.89 |
And so on… for 6 genes and 6 conditions. Unfortunately we also have 2 NA values. These are not zeros: the signal was simply below the detection limit of the machine.
With these data I need to make a heatmap, and for that I need z-scores. I know how to calculate a z-score, but I am not sure what to do when there are multiple replicates. Do you take the mean of the replicates first? In another example I saw something involving Pearson (?).
In the end, the idea is to get one z-score for each gene in each condition, so that I can make a heatmap. Moreover, there is an outlier (replicate 4), so it would probably be more meaningful to use the median rather than the mean.
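To make the question concrete, this is roughly the calculation I think I need. It is only a minimal sketch and I am not sure it is the right approach. It assumes the data are reshaped to long format with column names I made up (`gene`, `condition`, `value`), takes the median of the replicates while ignoring the NA, and then z-scores each gene across the conditions:

```r
# Toy data taken from the example table above, in long format.
# The column names gene / condition / value are my own choice (an assumption).
dat <- data.frame(
  gene      = rep(c("Gene 1", "Gene 2"), each = 8),
  condition = rep(rep(c("Condition 1", "Condition 2"), each = 4), times = 2),
  value     = c(0.22, 0.34, 0.45, 0.55,   120, 122, 119, 90,
                12, 19.3, 23.4, 11.7,     0.1, 0.45, NA, 0.89)
)

# Step 1: collapse the 4 replicates into one value per gene and condition.
# median() with na.rm = TRUE skips the NA instead of treating it as 0.
med <- with(dat, tapply(value, list(gene, condition), median, na.rm = TRUE))

# Step 2: z-score each gene (row) across the conditions (columns).
# scale() standardises columns, so transpose, scale, and transpose back.
z <- t(scale(t(med)))

print(med)
print(z)
```

With only two conditions, as in this toy example, every gene necessarily ends up with z-scores of about -0.71 and +0.71; the real data have 6 conditions. If this is correct, I assume the matrix `z` could then go into something like `heatmap(z, scale = "none")`, since base R's `heatmap()` would otherwise apply its own row scaling by default.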
The final heatmap will show how the expression of these genes changes across the different conditions. Basically, we want to compare the conditions based on their gene expression and be able to state “this cell culture condition has this gene pattern, which is different from that cell culture condition”.
I tried a couple of code snippets to calculate the z-scores, but I’m not sure they are correct.
It was difficult for me to tell R which mean (or median) to use for the z-scores, since there are replicates, and then to pull the values together.
Hope you can help me!