I am trying to convert my dataset from long to wide format in order to create a matrix. The data are currently structured as:
year | data.type | plot | species | presence |
---|---|---|---|---|
2024 | nc | 35 | Acer rubrum | 1 |
2024 | nc | 35 | Carya glabra | 1 |
2024 | nc | 36 | Acer rubrum | 1 |
2024 | nc | 118 | Carya aquatica | 1 |
2024 | nc | 39 | Acer rubrum | 1 |
2024 | nc | 116 | Carya glabra | 1 |
2024 | nc | 33 | Acer rubrum | 1 |
2024 | nc | 112 | Carya glabra | 1 |
This is a simplified version, however: the real dataset contains hundreds of unique species, and the same species is often recorded in many different plots.
I would like to restructure the data so that each unique species becomes its own column and each cell holds presence/absence (1/0) for that plot, such as:
year | data.type | plot | Acer rubrum | Carya aquatica | Carya glabra |
---|---|---|---|---|---|
2024 | nc | 35 | 1 | 0 | 1 |
2024 | nc | 118 | 0 | 1 | 0 |
I assume that I will need to write a loop for this, but I’m not sure where to start or whether melt, reshape, or another similar function would be best for this sort of thing.
I have tried some basic code with melt, but I expect that this dataset needs something more sophisticated. Thanks in advance!
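
For reference, this kind of long-to-wide reshape usually does not need a loop. Below is a minimal sketch, not a definitive solution, assuming the tidyr package is installed. The data frame name `dat` and the result names `wide` and `mat` are hypothetical, and `dat` is rebuilt here from the sample rows above:

```r
library(tidyr)  # assumption: tidyr (>= 1.1.0) is installed

# Hypothetical data frame rebuilt from the sample rows in the question
dat <- data.frame(
  year      = 2024,
  data.type = "nc",
  plot      = c(35, 35, 36, 118, 39, 116, 33, 112),
  species   = c("Acer rubrum", "Carya glabra", "Acer rubrum", "Carya aquatica",
                "Acer rubrum", "Carya glabra", "Acer rubrum", "Carya glabra"),
  presence  = 1
)

# One row per year/data.type/plot, one column per species.
# Species not recorded in a plot are filled with 0 (absence).
wide <- pivot_wider(
  dat,
  names_from  = species,
  values_from = presence,
  values_fill = 0,
  names_sort  = TRUE  # order the species columns alphabetically
)

# Optional: keep only the species columns as a numeric matrix,
# using the plot numbers as row names
mat <- as.matrix(wide[, -(1:3)])
rownames(mat) <- wide$plot
```

If the real data contain the same species more than once within a single plot, the duplicate rows would need to be collapsed first (for example with `unique(dat)`), otherwise `pivot_wider` will warn and return list-columns.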