I’m a biologist, and I’m interested in morphological differences between individuals.
Currently, I’m working with the aspect ratio of the caudal fin. I want to determine whether individuals of similar size have different aspect ratios.
However, aspect ratio has long been known to correlate positively with fish size. Therefore, my first step was to remove the influence of size on aspect ratio by creating a new variable: the residuals (resid) from a simple linear model of aspect ratio on fish size.
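A minimal sketch of this size-correction step, using simulated data because the real measurements are not shown here. The data frame `fish` and the column names `size` and `aspect_ratio` are assumptions made for illustration; only `resid` corresponds to a name used in this post.

```r
# Simulated stand-in data; `fish`, `size` and `aspect_ratio` are hypothetical names.
set.seed(1)
fish <- data.frame(size = runif(100, min = 10, max = 50))
fish$aspect_ratio <- 1.5 + 0.03 * fish$size + rnorm(100, sd = 0.2)

# Simple linear model of aspect ratio on size.
size_fit <- lm(aspect_ratio ~ size, data = fish)

# Residuals: the part of aspect ratio that the linear size trend does not explain.
fish$resid <- resid(size_fit)

# Least-squares residuals are uncorrelated with the predictor by construction,
# so this correlation is zero up to floating-point error.
cor(fish$resid, fish$size)
```

The residuals only remove a linear size effect. If the relationship is curved (for example allometric), a transformed model might be needed before the residuals can be read as size-free.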
Next, I wanted to identify distinct groups in my datasets based on these residuals. For that, I created a subset containing only the resid column, and I thought clustering would be the best way to assign individuals to groups.
My datasets:
| Dataset | nrows |
| ------- | ------- |
| 1 | 101 |
| 2 | 55 |
| 3 | 102 |
Since my datasets (one per species) are small, I used the NbClust function with a hierarchical method to determine the optimal number of clusters (k).
library(NbClust)
NbClust(subset, min.nc = 2, max.nc = 10, method = "ward.D2")
Once I had the number of clusters (num_clusters), I ran the analysis, plotted the dendrogram, removed the outliers, and repeated the process.
dist_matrix <- dist(subset)
hc <- hclust(dist_matrix, method = "ward.D2")
plot(hc, main = paste("Dendrogram with", num_clusters, "Clusters"),
     xlab = "Observations", ylab = "Euclidean distance")
rect.hclust(hc, k = num_clusters, border = "red")
Then I turned the clusters into a categorical variable in my original dataset:
clusters <- cutree(hc, k = num_clusters)
hast$Cluster <- clusters
hast$Cluster<-factor(hast$Cluster)
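The clustering and labelling steps can be sketched end to end on simulated residuals. The data frame `fish` and the value `num_clusters <- 3` are assumptions for illustration; in the post, the number of clusters comes from NbClust and the data frame is `hast`. The length check is there because removing outliers from the clustered subset but not from the original data frame would misalign the labels.

```r
# Simulated residuals; `fish` is a hypothetical stand-in for the real data frame.
set.seed(2)
fish <- data.frame(resid = rnorm(60))
num_clusters <- 3  # assumed value, for illustration only

# Ward hierarchical clustering on Euclidean distances between residuals.
resid_only <- fish["resid"]  # one-column data frame
hc <- hclust(dist(resid_only), method = "ward.D2")

# Cut the tree into num_clusters groups: one integer label per row.
clusters <- cutree(hc, k = num_clusters)

# The labels must line up row for row with the data frame they are attached to.
stopifnot(length(clusters) == nrow(fish))
fish$Cluster <- factor(clusters)

# Number of individuals per cluster.
table(fish$Cluster)
```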
Finally, I wanted to compare the clusters of individuals:
# hast is the data frame that received the Cluster column above
anova_fit <- aov(resid ~ Cluster, data = hast)
TukeyHSD(anova_fit)
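A self-contained sketch of this comparison on simulated residuals, so the shape of the output can be inspected. The data frame `fish` and the choice of three clusters are assumptions for illustration. In this sketch the clusters are built from the same variable that the ANOVA then tests, so differences between cluster means are expected by construction.

```r
# Simulated residuals and clusters; `fish` and k = 3 are hypothetical.
set.seed(3)
fish <- data.frame(resid = rnorm(60))
hc <- hclust(dist(fish["resid"]), method = "ward.D2")
fish$Cluster <- factor(cutree(hc, k = 3))

# One-way ANOVA of the residuals across clusters.
anova_fit <- aov(resid ~ Cluster, data = fish)
summary(anova_fit)

# Tukey's honest significant differences for all pairwise cluster comparisons.
tukey <- TukeyHSD(anova_fit)

# One row per pair of clusters, with columns diff, lwr, upr and "p adj".
tukey$Cluster
```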
I don’t know whether my approach is correct, or whether clustering is the best option, so I would appreciate help from people more experienced in statistics.