I have two CSV files that I have read into R. The first few lines of each look like this:
> head(file1)
X target_id gene_name
1 1 ENSG00000223972.4 DDX11L1
2 2 ENST00000456328.2 DDX11L1
3 6 ENST00000515242.2 DDX11L1
4 10 ENST00000518655.2 DDX11L1
5 15 ENST00000450305.2 DDX11L1
6 22 ENSG00000227232.4 WASH7P
>
>
> head(file2)
target_id length eff_length est_counts tpm
1 ENST00000632684.1 12 13 0 0
2 ENST00000434970.2 9 10 0 0
3 ENST00000448914.1 13 1 0 0
4 ENST00000415118.1 8 9 0 0
5 ENST00000631435.1 12 13 0 0
6 ENST00000390567.1 20 8 0 0
I am trying to merge (join) them to get something like the following:
target_id gene_name length eff_length est_counts tpm
ENSG00000000003.10 TSPAN6 0 7 7 2
ENSG00000000005.5 TNMD 1 1 8 3
ENSG00000000419.8 DPM1 2 5 1 4
ENSG00000000457.9 SCYL3 1 6 8 5
ENSG00000000460.12 C1orf112 5 7 4 3
ENSG00000000938.8 FGR 3 8 0 0
But what I am getting looks like this:
target_id X gene_name length eff_length est_counts tpm
ENSG00000000003.10 2573196 TSPAN6 NA NA NA NA
ENSG00000000005.5 2573172 TNMD NA NA NA NA
ENSG00000000419.8 2430409 DPM1 NA NA NA NA
ENSG00000000457.9 180498 SCYL3 NA NA NA NA
ENSG00000000460.12 179894 C1orf112 NA NA NA NA
ENSG00000000938.8 45152 FGR NA NA NA NA
The code I am using in R is:
merged_data <- merge(file1, file2, by = "target_id", all.x = TRUE)
Do you know how to fix it?
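One diagnostic that may help narrow this down, offered as a sketch rather than a confirmed fix: count how many target_id values the two files actually share, and drop the X column before merging. The toy data frames below are rebuilt only from the head() output shown above, so they stand in for the real files; n_shared and n_unmatched are illustrative names.

```r
# Toy data rebuilt from the head() output above (assumption: the real
# files have the same column names and store target_id as text).
file1 <- data.frame(
  X = c(1, 2, 6, 10, 15, 22),
  target_id = c("ENSG00000223972.4", "ENST00000456328.2", "ENST00000515242.2",
                "ENST00000518655.2", "ENST00000450305.2", "ENSG00000227232.4"),
  gene_name = c("DDX11L1", "DDX11L1", "DDX11L1", "DDX11L1", "DDX11L1", "WASH7P"),
  stringsAsFactors = FALSE
)
file2 <- data.frame(
  target_id = c("ENST00000632684.1", "ENST00000434970.2", "ENST00000448914.1",
                "ENST00000415118.1", "ENST00000631435.1", "ENST00000390567.1"),
  length = c(12, 9, 13, 8, 12, 20),
  eff_length = c(13, 10, 1, 9, 13, 8),
  est_counts = 0,
  tpm = 0,
  stringsAsFactors = FALSE
)

# How many IDs in file1 have an exact match in file2?
n_shared <- sum(file1$target_id %in% file2$target_id)

# Merge on target_id, leaving out the X index column from file1.
merged_data <- merge(file1[, c("target_id", "gene_name")], file2,
                     by = "target_id", all.x = TRUE)

# Rows of file1 that found no partner in file2 show up as NA here.
n_unmatched <- sum(is.na(merged_data$length))
```

If n_shared is small relative to nrow(file1), the NA values would simply be file1 rows with no counterpart in file2, which all.x = TRUE keeps. Since merge() sorts the result by the key column by default, unmatched ENSG identifiers would appear first in head() even if ENST rows further down did match.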