We are using Redis Streams as messaging middleware between two processes. Each entry typically has a single field, "event", whose value is a JSON payload:

XADD stream * event '{ "foo": "bar", ... }'
Lately, as we have scaled, the memory requirement of our Redis server has grown considerably. I’ve been running tests where I gzip-compress the payload before adding it to the stream, which brings each payload down to roughly 10% of its uncompressed size.
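For context, the test setup is simple; a rough shell equivalent of what the producer does (our real producer compresses in-process, and payload.json here is just a stand-in sample file) is:

$ wc -c < payload.json                 # uncompressed size
$ gzip -c < payload.json | wc -c       # compressed size; our production payloads land around 10%
$ gzip -c < payload.json | redis-cli -x XADD test.stream '*' event   # -x passes stdin as the final argument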
The problem:

One of the tools we use to debug issues is redis-cli. With uncompressed payloads, running it against the stream produces output similar to the following:
$ redis-cli XRANGE test.stream - + COUNT 3
1) 1) "1722441037024301192-0"
   2) 1) "event"
      2) "{ \"time\": \"2024-07-31 15:50:37.024200355\", \"data\": \"fttKY6Y6DGvbpjHTw9YxrxNUd8zEbA3f12DGVGL7PiCRQ5wb24\" }"
2) 1) "1722441038024875918-0"
   2) 1) "event"
      2) "{ \"time\": \"2024-07-31 15:50:38.024695189\", \"data\": \"MfP8JvPZmNPAtKXk1v5X37cod5x67ChX6o22k5IDShbJZ4cjlI\" }"
3) 1) "1722441039025362443-0"
   2) 1) "event"
      2) "{ \"time\": \"2024-07-31 15:50:39.025331292\", \"data\": \"zEx5rPzy10ASbVRFIfyRV9B3YQ34cDvkmRcCWhUWBDwqrANpLJ\" }"
A workflow we’ve become used to is to grep for a field we know is present in the payload (e.g. "time") and pipe the result to jq:

$ redis-cli XRANGE test.stream - + COUNT 3 | grep time | jq
{
  "time": "2024-07-31 15:50:37.024200355",
  "data": "fttKY6Y6DGvbpjHTw9YxrxNUd8zEbA3f12DGVGL7PiCRQ5wb24"
}
{
  "time": "2024-07-31 15:50:38.024695189",
  "data": "MfP8JvPZmNPAtKXk1v5X37cod5x67ChX6o22k5IDShbJZ4cjlI"
}
{
  "time": "2024-07-31 15:50:39.025331292",
  "data": "zEx5rPzy10ASbVRFIfyRV9B3YQ34cDvkmRcCWhUWBDwqrANpLJ"
}
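(A side note that matters later: redis-cli only produces the numbered, quoted formatting when writing to a terminal. When stdout is a pipe it defaults to raw mode, one element per line, which is why grep and jq see clean JSON. What the pipe actually carries looks roughly like this, reproducible by forcing a pipe through cat:)

$ redis-cli XRANGE test.stream - + COUNT 1 | cat
1722441037024301192-0
event
{ "time": "2024-07-31 15:50:37.024200355", "data": "fttKY6Y6DGvbpjHTw9YxrxNUd8zEbA3f12DGVGL7PiCRQ5wb24" }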
With compression enabled, the output of redis-cli instead looks like the following:
$ redis-cli XRANGE test.stream - + COUNT 3
1) 1) "1722441173181340999-0"
   2) 1) "event"
      2) "\x1f\x8b\b\x00\x00\x00\x00\x00\x00\xff\xabVP*\xc9\xccMU\xb2RP2202\xd150\xd756T04\xb525\xb225\xd63\xb40460\xb7\xb4\xb4P\xd2QPJI,I\x04)4\xf0\xcfr\xf4\xf3JJt\x89bwr(L3t\xf6nM\xce\x0e\xf6\xcb\xf2\x0cq\xf4\xaenr\xf44\xaeJ,5.6tK\xf1\xcb\x8dn\xc8b\x0fL2QR\xa8\x05\x00\xbe\x82\xa2\x8di\x00\x00\x00"
2) 1) "1722441174181899637-0"
   2) 1) "event"
      2) "\x1f\x8b\b\x00\x00\x00\x00\x00\x00\xff\xabVP*\xc9\xccMU\xb2RP2202\xd150\xd756T04\xb525\xb225\xd13\xb40\xb40\xb3\xb402P\xd2QPJI,I\x04)\xacp\xf7s\xf6t.\xcb\xce1\x0ctv.M\x8e0\xadH\x0f\xc8b\xc8w-6\xae,)0+-\xf4\xf1\xf6\xf5tw\xcaNn\xf3\xca\xb2\xf4\xc9bbOOUR\xa8\x05\x00\x14\x8d\x80\x9di\x00\x00\x00"
3) 1) "1722441175182477580-0"
   2) 1) "event"
      2) "\x1f\x8b\b\x00\x00\x00\x00\x00\x00\xff\xabVP*\xc9\xccMU\xb2RP2202\xd150\xd756T04\xb525\xb225\xd53\xb4021\xb6045S\xd2QPJI,I\x04)\xf4\xc8\xf3\xc9\xf2\xca4\xf7\xcd5r\xact\xcc3s*\xf1\x0cO\x8b2\x0f\xb1tL/t\xf6\xcd*nr\xcb\x8fbqO\xcf\xc8+\xf1/r\xc8(\xf5\xcd2NQR\xa8\x05\x00\x12\x97Kai\x00\x00\x00"
We can no longer grep for "time". But since, as noted above, piped redis-cli output puts each element on its own line (ID, field name, value), I have found that I can use awk to print only every 3rd line, i.e. just the payloads:
$ redis-cli XRANGE test.stream - + COUNT 3 | awk 'NR % 3 == 0'
(L3t���MU�RP2202�50�56T04�25�25�3�0460���P�QPJI,I)4��r��JJtw
L2QR�����i
��VP*��MU�RP2202�50�56T04�25�25�3�0�0��02P�QPJI,I)�p�s� .��1
tv.M�0�H�w-6�,)0+-����tw�N
event
My idea was to then pipe this to gzip -d, but it doesn’t work: gzip fails with a CRC error and a length error:
$ redis-cli XRANGE test.stream - + COUNT 3 | awk 'NR % 3 == 0' | gzip -d
{ "time": "2024-07-31 15:52:53.181307998", "data": "0OjANJbaDXWEPqf1CJPhWQb4" }
gzip: stdin: invalid compressed data--crc error
gzip: stdin: invalid compressed data--length error
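My suspicion is that the compressed payloads contain raw newline bytes, so awk's line-based filtering both miscounts records and strips bytes, which would explain why the first record decompresses while the rest fail the CRC check. A quick check consistent with that theory: if each entry really occupied exactly three lines, the following would print 9:

$ redis-cli XRANGE test.stream - + COUNT 3 | wc -l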
Question:

Is there any way we can continue to use redis-cli to pull out stream contents and decompress them on the command line?
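In other words, I’m hoping for something in the spirit of the pipeline below, where <extract-payload-bytes> is a placeholder for the step I’m missing (recovering the raw compressed bytes of each entry intact):

$ redis-cli XRANGE test.stream - + COUNT 3 | <extract-payload-bytes> | gzip -d | jq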