Basically I need to keep track of two clients and need pass messages between the two. I am thinking of creating a tcp connection between the clients and the server and using the server to manage this connection between the two and relay messages as needed. I need to know if my architecture is realistic in the sense that it needs to be able to scale to hold thousands of these relays.
Yes, this is feasible. In fact there are numerous real-world examples (more complex, but the same core idea) that prove it.
For example, any instant messaging network that is not P2P (e.g. Lync) does precisely this. Modern multiplayer games are similar, often using UDP/IP instead of TCP/IP. IRC servers do this.
It is a solid high-level design, but the devils are always in the details. How do you efficiently track many users? You say you want this to scale. You will likely need to consider concurrent data structures to track user connections that perform well under high load.