I’ve been trying the following SED statement (sedex1) in seddy.dev:
sed '/^$/d ; s/^[ \t]*// ; s/[ \t]*$// ; /::/N ; s/::/: / ; s/\n//'
and the output I get differs depending on whether I run it as one SED command or split it in two: the first half strips blank lines and excess whitespace, and the second half joins the lines with minor substitutions to get the desired end result.
My end goal is to use this in an AppleScript on macOS. Our team needs to process text like this many times a day (from client registration forms) and we need a fast and painless way to do it. We’re thinking of using pbcopy and pbpaste, but obviously we’ll get to that once we have the SED thing under control.
Using that SED command (sedex1) on this input text (in1):
thingOne::
some text for thingOne
thingTwo::
some text for thingTwo
thingThree::
some text for thingThree
thingFour::
some text for thingFour
I get this output text (out1):
thingOne: some text for thingOne
thingTwo: some text for thingTwo
thingThree:
some text for thingThree
thingFour:
some text for thingFour
but what I’m trying to get is this output (wantedText):
thingOne: some text for thingOne
thingTwo: some text for thingTwo
thingThree: some text for thingThree
thingFour: some text for thingFour
If I use the same SED patterns but in two stages, then everything behaves as expected. Specifically, running this (sedex2):
sed '/^$/d ; s/^[ \t]*// ; s/[ \t]*$//'
on in1 produces this output (out2):
thingOne::
some text for thingOne
thingTwo::
some text for thingTwo
thingThree::
some text for thingThree
thingFour::
some text for thingFour
and then using that (out2) as input, this SED command (sedex3) does the rest:
sed '/::/N ; s/::/: / ; s/\n//'
producing the desired end result (wantedText).
What am I not understanding and/or missing about sedex1 that it doesn’t produce the desired result (wantedText) when those very same patterns used consecutively in sedex2 and then sedex3 do produce wantedText?
tfloatcp0 is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
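Basically, what you are doing in two stages is:
- First, handling the blank lines and trimming whitespace:
sed '/^$/d; s/^[ \t]*//; s/[ \t]*$//'
- Then, joining the lines and adjusting the colons:
sed '/::/N; s/::/: /; s/\n//'
- When combined, the pattern space behaves differently: the line that N appends arrives after /^$/d and the two trimming substitutions have already run for that cycle, so a blank or indented line following a :: line is never cleaned up before it is joined. In the two-stage version, the first sed has already cleaned every line before the second sed sees any of them. Let’s combine the logic in a way that processes the input consistently.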
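To see the difference concretely, here is a minimal sketch that runs both forms on a small made-up sample in which a blank line sits between a key and its value. The sample text and the variable names are assumptions for illustration, and [[:blank:]] is swapped in for [ \t] only because it is spelled the same way in GNU and BSD sed.

```shell
# Hypothetical sample: a blank line separates the key from its indented value.
sample='thingOne::

   some text for thingOne'

# One pass: N appends the blank line after /^$/d has already run in this cycle,
# so the blank line is joined to the key and the value is left on its own line.
one_pass=$(printf '%s\n' "$sample" |
  sed '/^$/d ; s/^[[:blank:]]*// ; s/[[:blank:]]*$// ; /::/N ; s/::/: / ; s/\n//')

# Two passes: the first sed cleans every line before the second sed joins them.
two_pass=$(printf '%s\n' "$sample" |
  sed '/^$/d ; s/^[[:blank:]]*// ; s/[[:blank:]]*$//' |
  sed '/::/N ; s/::/: / ; s/\n//')

printf 'one pass:\n%s\n\ntwo passes:\n%s\n' "$one_pass" "$two_pass"
```

The one-pass result keeps the key and the value on separate lines, while the two-pass result is the single joined line.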
Proposed sed command
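- There are many ways to write this script; a correction of your first command would be:
sed '/^$/d; s/^[ \t]*//; s/[ \t]*$//; /::/N; s/::/: /; s/\n[ \t]*//'
- /^$/d : Delete empty lines.
- s/^[ \t]*// : Remove leading whitespace from each line.
- s/[ \t]*$// : Remove trailing whitespace from each line.
- /::/N : Append the next line to the current line if the current line contains ::
- s/::/: / : Replace :: with a colon followed by a space.
- s/\n[ \t]*// : Remove the newline and any whitespace that follows it. The separating space was already added by the previous substitution, so inserting another one here would leave two spaces after the colon.
Note that a blank line directly after a :: line is still appended by N, so this assumes each value immediately follows its key.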
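If blank lines can also appear between a key and its value, one possible single-pass variant is sketched below. It is only a sketch under stated assumptions: the function name join_pairs is hypothetical, keys are assumed to end in :: once trailing whitespace is trimmed, and [[:blank:]] replaces [ \t] for portability between GNU and BSD sed.

```shell
# join_pairs (hypothetical name): read key/value text on stdin, write "key: value" lines.
join_pairs() {
  sed '
    # Trim the current line, then drop it if nothing is left.
    s/^[[:blank:]]*//
    s/[[:blank:]]*$//
    /^$/d
    # Lines that are not keys are printed as they are.
    /::$/!b
    :next
    # Pull in the following line, unless this is the last line of input.
    $!N
    # If the appended line is blank, discard it and try the next one.
    /\n[[:blank:]]*$/{
      s/\n.*//
      b next
    }
    # Join key and value with a single space, then trim the tail.
    s/::\n[[:blank:]]*/: /
    s/[[:blank:]]*$//
  '
}

printf '  thingOne::  \n\n   some text for thingOne  \nthingTwo::\nsome text for thingTwo\n' | join_pairs
```

On macOS this could presumably be wired to the clipboard as pbpaste | join_pairs | pbcopy, which is the workflow the question mentions.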
YassineLbk is a new contributor to this site. Take care in asking for clarification, commenting, and answering.