I couldn’t think of a better title for this but here’s my situation. I have three tables: conversations
, user_messages
and system_messages
. Their basic structure (irrelevant columns removed for brevity) is:
conversations
:
column | type
-----------------
id | int (PK)
user_messages
:
column | type
--------------------------
id | int (PK)
conversation_id | int (FK)
content | text
created_at | timestamp
systen_messages
:
column | type
--------------------------
id | int (PK)
conversation_id | int (FK)
content | text
created_at | timestamp
I am trying to generate an export of “system said X” and “player responded with Y” rows for a given conversation. The issue I’m having is finding a way to ensure that the “user responded with Y” message corresponds to the chronologically next message in the conversation, following the system message. So output would look something like:
system_said | player_said
-----------------------------------------------
Hi! | Hello
How's it going? | I'm alright, you?
I'm good! | Nice.
Attempt 1
This works, but with a poor and inefficient query, using a sub-query that references outer values.
SELECT
content AS system_said, (
SELECT content
FROM user_messages
WHERE conversation_id = sm.conversation_id AND created_at > sm.created_at
ORDER BY created_at
LIMIT 1
) AS player_said
FROM system_messages sm
WHERE sm.conversation_id = ?
Attempt 2
This is where I’m stuck. I’m trying to rewrite the above more efficiently using a proper join, but I can’t force the join to give me the chronologically next message, so I’m getting mixed up pairs like:
system_said | player_said
-----------------------------------------------
Hi! | Nice.
How's it going? | Hello
I'm good! | I'm alright, you?
Here’s the query:
SELECT
DISTINCT ON (sm) sm.id AS message,
sm.content AS system_said,
um.content AS player_said
FROM system_messages sm
JOIN (
SELECT conversation_id, created_at, content
FROM user_messages
ORDER BY created_at
) um ON um.conversation_id = sm.conversation_id AND um.created_at > sm.created_at
ORDER BY sm
WHERE c.id = ?
(I found I had to add in DISTINCT ON ()
otherwise I got duplicate rows.)
I’m sure there’s a way to do this with some sort of clever joining, just me SQL/Postgres knowledge doesn’t stretch that far.