I want to transform this:
dir1 file01
dir1 file02
dir1 file03
dir2 file04
dir2 file05
dir3 file06
dir4 file07
dir4 file08
dir4 file09
dir4 file10
Into the following:
dir1 file01 file02 file03
dir2 file04 file05
dir3 file06
dir4 file07 file08 file09 file10
I do this sort of thing often enough that I would be surprised if there isn’t already a Linux command that does this; I just don’t know what it is or how to search for it. What would you call this? I would call it coalesce, but maybe that’s just me.
Anyway, I can assume that the first column is already sorted, but if it were not, it would be OK if coalesce was dumb about processing the input and only merged adjacent lines that share a key.
For example:
$ cat <<EOF | coalesce
a 1
a 2
b 3
a 4
EOF
a 1 2
b 3
a 4
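To make the "dumb" behaviour concrete, here is a minimal sketch of what I have in mind, written as a shell function around awk. The name coalesce is hypothetical (it is not an existing command), and the sketch assumes whitespace-separated fields with the key in the first column. It only merges adjacent lines, so an unsorted key simply starts a new group.

```shell
# Hypothetical helper: "coalesce" is a made-up name, not an existing command.
# Assumes whitespace-separated fields with the key in column 1.
coalesce() {
  awk '
    # New key (or very first line): print the finished group, start a new one.
    NR == 1 || $1 != key {
      if (NR > 1) print line
      key = $1 ""          # force a string so "01" and "1" stay distinct
      line = $0
      next
    }
    # Same key as the previous line: append every field after the key.
    {
      for (i = 2; i <= NF; i++) line = line " " $i
    }
    END { if (NR > 0) print line }
  '
}
```

Fed the four-line input above, this should print the same three lines shown in the example.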
Ideally, this command would also optionally indent the coalesced values on a separate line.
cat <<EOF | coalesce --level-indent 2
a 1
a 2
a 3
b 4
b 5
b 6
EOF
a
  1 2 3
b
  4 5 6
I suppose it would also be neat if it supported multiple levels of keys, too.
cat <<EOF | coalesce --level-indent 2
a b 1
a b 2
a b 3
a c 4
a c 5
d b 6
d b 7
d b 8
d c 1
EOF
a
  b
    1 2 3
  c
    4 5
d
  b
    6 7 8
  c
    1
I’m thinking that each line of tokens could be treated as a combination of “levels” to coalesce (at the front of the line) and “tokens” to group together.
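A rough sketch of that levels-plus-tokens idea, again as awk inside a shell function. Everything here is an assumption for illustration: the name coalesce_levels is hypothetical, the first argument is the number of leading key columns, the second is the indent width, and every line is assumed to have at least that many fields.

```shell
# Hypothetical helper: coalesce_levels LEVELS INDENT
#   LEVELS = number of leading key columns, INDENT = spaces per level.
coalesce_levels() {
  awk -v levels="$1" -v indent="$2" '
    # Build the indentation for a given depth.
    function pad(depth,   s, n) {
      s = ""
      for (n = depth * indent; n > 0; n--) s = s " "
      return s
    }
    # Print the tokens collected for the current group, if any.
    function emit() {
      if (tokens != "") print pad(levels) tokens
      tokens = ""
    }
    {
      # Find the first key column that differs from the previous line.
      changed = 0
      for (i = 1; i <= levels; i++)
        if (NR == 1 || ($i "") != prev[i]) { changed = i; break }
      if (changed) {
        emit()
        # Print the changed key and every key below it, one per line.
        for (i = changed; i <= levels; i++) {
          print pad(i - 1) $i
          prev[i] = $i ""
        }
      }
      # Collect the remaining fields as tokens of the current group.
      for (i = levels + 1; i <= NF; i++)
        tokens = (tokens == "" ? $i : tokens " " $i)
    }
    END { emit() }
  '
}
```

Under these assumptions, `coalesce_levels 2 2` corresponds to the two-key example above, and `coalesce_levels 1 2` to the single-key indented one.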
Finally, it would be useful to have an “all levels”, or “non-grouping”, mode in which the remaining tokens are not merged across lines, so each line’s remainder is printed on its own line under its keys. When combined with a maximum number of levels to coalesce, it would produce something like the following.
cat <<EOF | coalesce --level-indent 2 --non-grouping --levels 3
2024-05-25 01:23:45 INFO: Something happened
2024-05-25 01:23:46 INFO: Something happened
2024-05-25 01:23:46 WARN: Something strange happened
2024-05-25 01:23:46 DEBUG: Something typical happened
EOF
2024-05-25
  01:23:45
    INFO: Something happened
  01:23:46
    INFO: Something happened
    WARN: Something strange happened
    DEBUG: Something typical happened
I could probably write something using awk that retains the seen levels from previous lines and collects matching tokens to be emitted once the level changes, but I’m hoping there’s already a command that does this that I just don’t know about.
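For reference, this is roughly the kind of awk I mean for the non-grouping case. It is only a sketch under my own assumptions: coalesce_lines is a hypothetical name, its first argument counts only the leading key columns (so the log example would use 2, for date and time), and the remainder of each line is rejoined with single spaces.

```shell
# Hypothetical helper: coalesce_lines LEVELS INDENT
#   LEVELS = number of leading key columns, INDENT = spaces per level.
coalesce_lines() {
  awk -v levels="$1" -v indent="$2" '
    # Build the indentation for a given depth.
    function pad(depth,   s, n) {
      s = ""
      for (n = depth * indent; n > 0; n--) s = s " "
      return s
    }
    {
      # Compare the key columns with the ones remembered from the previous line.
      changed = 0
      for (i = 1; i <= levels; i++)
        if (NR == 1 || ($i "") != prev[i]) { changed = i; break }
      # Print only the keys from the first changed level downwards.
      if (changed)
        for (i = changed; i <= levels; i++) {
          print pad(i - 1) $i
          prev[i] = $i ""
        }
      # Non-grouping: the rest of every line is printed on its own line.
      rest = ""
      for (i = levels + 1; i <= NF; i++)
        rest = (rest == "" ? $i : rest " " $i)
      print pad(levels) rest
    }
  '
}
```

Running the log lines above through `coalesce_lines 2 2` should give the indented output shown in that example.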