Table_1
looks like this:
full_name | user_id | random_column_1 |
---|---|---|
John Smith | 1234 | blah |
Joe Smith | 5678 | blah |
Jane Doe | 5978 | blah |
Mark Long | 5971 | blah |
Table_2
looks like this:
user | user_id | random_column_2 |
---|---|---|
Jane Doe | blah | |
Jane Doe | blah | |
Mark Long | blah | |
Mark long | blah |
What I want to do is populate the user_id
column in Table_2
based on the user_id
in Table_1
.
Both of these tables have additional columns that I don’t want to modify in any way.
I am running this query:
UPDATE Table_2
SET user_id = (SELECT user_id
FROM Table_1
WHERE Table_2.user = Table_1.user_id;
But I am getting this error message:
Single-row subquery returns more than one row
I know this is because there are duplicate entries for users in Table_2
. It’s not an option to de-duplicate the users in Table_2
.
Does anyone know how I can fix this?
3
considering that there is some matching value in user and full_name dataset,
UPDATE Table_2 t2
JOIN Table_1 t1
ON t2.user = t1.full_name
SET t2.user_id = t1.user_id;
1
update T
set T.user_id=X.user_id
from Table_2 as T
join
(
select full_name,max(user_id)as user_id
from table_1
group by full_name
)as x on T.user=x.full_name
This syntax is valid for MS SQL Server. Something similar can be written also for PostgreSQL
or you can also use MERGE
MERGE INTO TABLE_2 AS TGT
USING
(
select full_name,max(user_id)as user_id
from table_1
group by full_name
)AS SRC
ON TGT.user=SRC.full_name
WHEN MATCHED THEN UPDATE SET
TGT.User_id=SRC.USER_ID