I am a beginner with R, but I have been analysing a large GPS data set: approximately 100 unique individuals (name) and more than 1,000,000 rows in total. Each unique name has a varying number of coordinates (lat and lng) and belongs to either group a or group b. So far I have done a polygon point count analysis to compare use of the site between groups a and b. I now want to do a hierarchical cluster analysis within group a and within group b to analyse interactions within each group, and then between groups a and b.
I have been advised to use a for loop to get the mean of the coordinates for each unique ‘name’, and I think I can then use those means for a hierarchical cluster analysis (in either R or QGIS?). A sample of my data is below.
structure(list(lat = c(50.39761959, 50.39757382, 50.39760433,
50.39742123, 50.39768063, 50.39740597, 50.39757382, 50.39769589,
50.39763485, 50.39763485), lng = c(-4.888685435, -4.888639658,
-4.888685435, -4.888746471, -4.88860914, -4.888883803, -4.888670176,
-4.88860914, -4.888563363, -4.888181888), time_stamp = c("15/10/2021 00:21",
"15/10/2021 00:50", "15/10/2021 01:51", "15/10/2021 02:21", "15/10/2021 02:51",
"15/10/2021 03:21", "15/10/2021 03:51", "15/10/2021 04:21", "15/10/2021 04:51",
"15/10/2021 05:21"), name = c("300005", "300005", "300005", "300005",
"300005", "300005", "300005", "B100", "B100", "B100"),
breed = c("a", "a", "a", "a", "a", "a", "a", "b", "b", "b"
)), row.names = c(NA, -10L), class = c("data.table", "data.frame"
))
I’m especially struggling with the for-loop to get the mean coordinates.
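To make the goal concrete, here is a minimal sketch of the per-name mean step, using only base R and the ten sample rows above rebuilt as a plain data.frame. The object names `gps`, `ids`, `centroids`, `centroids_agg` and `hc` are illustrative and not part of the real data set. Two assumptions are worth flagging: a simple arithmetic mean of lat/lng is treated as a reasonable "centre" for each individual, and plain Euclidean distance on degrees is used for the clustering, which is only a rough approximation over a small site.

```r
# Sample rows from the dput() output above, as a plain data.frame
gps <- data.frame(
  lat = c(50.39761959, 50.39757382, 50.39760433, 50.39742123, 50.39768063,
          50.39740597, 50.39757382, 50.39769589, 50.39763485, 50.39763485),
  lng = c(-4.888685435, -4.888639658, -4.888685435, -4.888746471, -4.88860914,
          -4.888883803, -4.888670176, -4.88860914, -4.888563363, -4.888181888),
  name  = c(rep("300005", 7), rep("B100", 3)),
  breed = c(rep("a", 7), rep("b", 3)),
  stringsAsFactors = FALSE
)

# Explicit for loop: one row of mean coordinates per unique name
ids <- unique(gps$name)
centroids <- data.frame(name = ids, breed = NA_character_,
                        mean_lat = NA_real_, mean_lng = NA_real_,
                        stringsAsFactors = FALSE)
for (i in seq_along(ids)) {
  rows <- gps[gps$name == ids[i], ]        # all fixes for this individual
  centroids$breed[i]    <- rows$breed[1]   # assumes one breed per name
  centroids$mean_lat[i] <- mean(rows$lat)
  centroids$mean_lng[i] <- mean(rows$lng)
}

# Same means without a loop, via base R aggregate()
centroids_agg <- aggregate(cbind(lat, lng) ~ name + breed, data = gps, FUN = mean)

# Hierarchical clustering on the mean coordinates
# (assumption: Euclidean distance on raw degrees, average linkage)
hc <- hclust(dist(centroids[, c("mean_lat", "mean_lng")]), method = "average")
```

On the full data set, the within-group analysis would presumably run the `hclust()` step on `centroids[centroids$breed == "a", ]` and `centroids[centroids$breed == "b", ]` separately. The sample has only one individual per group, so the sketch clusters both together.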