I am trying to find a Linux command that gives me the text between [ and ]. I tried standard grep and sed, but neither has helped so far. Say the text is in LOGS, like:
Starting mongo client..
Connecting to truststore.pki.rds.amazonaws.com (99.84.66.9:443)
global-bundle.pem 100% |********************************| 183k 0:00:00 ETA
MongoDB shell version v4.0.6
connecting to: mongodb://XXXx.amazonaws.com:27017/?authSource=XXX&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("XXXXXXX") }
MongoDB server version: 5.0.0
WARNING: shell and server versions do not match
----------------------------------------------------------------------------------------------------
Listing Existing set of collections
----------------------------------------------------------------------------------------------------
[
"10w",
"11w",
"Test1",
"Test2",
"QA_Testing",
"QueuesTest",
"TestPagination",
]
I want only the collection names inside the [ ].
What I have tried so far:
echo "$LOGS" | grep -o "\[.*\]"
echo "$LOGS" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
echo "$LOGS" | grep -o '\[\K[^]]+'
echo "$LOGS" | awk -F'[][]' '{for(i=2;i<=NF;i+=2) print $i}'
None of them has helped so far. They do not show any error, but they do not produce any output either. Can anyone please help with the correct command?
6
You can extract the strings between the brackets and then arrange them afterwards. For example:
- Using awk: echo "$logs" | awk -F'[][]' '{print $2}'
- Using sed: echo "$logs" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
- Using grep -oP: echo "$logs" | grep -oP '(?<=\[)[^]]*(?=\])'
They return only “apple, pineapple, wood-apple”. If necessary, you can add the brackets back after you retrieve the text.
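As a minimal sketch of the "retrieve first, add the brackets back afterwards" idea, the sed variant can be captured into a variable. The sample value of logs, including the surrounding prefix and suffix text, is an assumption made for illustration.

```shell
# Hypothetical single-line input; the prefix and suffix are made up.
logs='some prefix [apple, pineapple, wood-apple] some suffix'

# Capture only the text between the brackets (POSIX basic regex).
inner=$(printf '%s\n' "$logs" | sed -n 's/.*\[\([^]]*\)\].*/\1/p')

printf '%s\n' "$inner"     # the bare text
printf '[%s]\n' "$inner"   # the same text with the brackets added back
```

Using printf instead of echo avoids surprises if the input ever starts with a dash or contains backslashes.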
Edit: Based on the poster’s edit and comment, I have added a new solution.
Your multi-line input in $logs can be processed as follows:
- Using awk:
echo "$logs" | awk '/\[/{flag=1; next} /\]/{flag=0} flag {printf "%s", $0}'
The /\[/{flag=1; next} part sets a flag when [ is encountered, and /\]/{flag=0} unsets it when ] is encountered. The final part prints lines while the flag is set, effectively concatenating the multi-line input. Note that $logs must be quoted, otherwise the shell collapses the newlines before echo sees them.
- Using sed:
echo "$logs" | sed -n '/\[/,/\]/p' | sed ':a;N;$!ba;s/\n//g' | sed 's/.*\[\(.*\)\].*/\1/'
You have to use sed multiple times. The first one selects the lines from [ to ], the second one joins all lines together, and the last one extracts the content between [ and ].
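- Using grep and tr:
echo "$logs" | grep -ozP '(?<=\[)[\s\S]*(?=\])' | tr -d '\n\0'
The grep part matches all characters between [ and ], including newlines. The -z option is needed for that: grep normally matches line by line, and -z makes GNU grep treat the whole input as a single record. The tr part removes the newline characters, and the NUL byte that -z appends, to create a single line.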
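If one collection name per line is more useful than a single concatenated string, the awk flag approach above can be adapted as in the following sketch. The function name list_collections and the shortened sample log are assumptions for illustration; the sketch also assumes the brackets sit at the start of their own lines, as in the question.

```shell
# Hypothetical sample, shortened from the log in the question.
logs='Listing Existing set of collections
----------
[
"10w",
"Test1",
"QA_Testing",
]'

# Print each collection name on its own line, without quotes or commas.
list_collections() {
  printf '%s\n' "$1" | awk '
    /^\[/ { inside = 1; next }          # opening bracket: start collecting
    /^\]/ { inside = 0 }                # closing bracket: stop collecting
    inside { gsub(/[",]/, ""); print }  # strip quotes and commas, then print
  '
}

list_collections "$logs"
```

The output can then be read line by line, for example with a while read loop.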
4
burcu’s answer is perfect. As I prefer using the if statement in awk, here is my answer:
awk '{
    if ($0 == "[") {
        in_bracket = 1
        next
    } else if ($0 == "]") {
        in_bracket = 0
    }
    if (in_bracket) {
        print $0
    }
}' a.txt
2
At least for your value of LOGS (you didn’t specify the general format of the text), the following should work in bash:
LOGS='[apple, pineapple, wood-apple]'
echo ${LOGS//[][]/}
The // causes all occurrences of the pattern to be removed. The pattern is a wildcard (i.e. glob) pattern, in this case a character class [ ... ] containing the two characters ][.
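When there is other text around the brackets, deleting the bracket characters alone is not enough. A possible variation, sketched here with an assumed sample value, uses prefix and suffix removal instead; it relies only on POSIX parameter expansion, so it should also work outside bash.

```shell
# Hypothetical input with text around the brackets.
logs='prefix [apple, pineapple, wood-apple] suffix'

inner=${logs#*\[}     # drop everything up to and including the first [
inner=${inner%%\]*}   # drop the first remaining ] and everything after it

printf '%s\n' "$inner"
```

No external process is started, which can matter inside a loop over many lines.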
$ echo "$logs" | grep -o '\[.*\]'
[apple, pineapple, wood-apple]
$ echo "$logs" | sed -n 's/.*\(\[.*\]\).*/\1/p'
[apple, pineapple, wood-apple]
$ echo "$logs" | awk -F'[][]' '{print "[" $2 "]"}'
[apple, pineapple, wood-apple]
$ echo "$logs" | awk 'match($0,/\[.*\]/){print substr($0,RSTART,RLENGTH)}'
[apple, pineapple, wood-apple]
$ echo "$logs" | gawk 'match($0,/\[.*\]/,a){print a[0]}'
[apple, pineapple, wood-apple]
$ echo "$logs" | gawk '{print gensub(/.*(\[.*\]).*/, "\\1", 1)}'
[apple, pineapple, wood-apple]
$ [[ "$logs" =~ \[.*\] ]] && echo "${BASH_REMATCH[0]}"
[apple, pineapple, wood-apple]
The grep command requires the non-POSIX -o extension as provided by GNU grep, and the final 2 awk commands require GNU awk for the non-POSIX extensions of gensub() and the 3rd argument to match(). Everything else will work in any version of the tools, but since you’re on Linux you probably have GNU tools anyway.
The above will do what you asked for given a single pair of matched [...] on a line as shown in your question, but will do different things for unmatched, multiple, nested, and/or overlapping pairs of [...] on a line. If you can have any of those situations and can’t solve that problem yourself, then ask a new question about that, showing how you’d want those cases handled.
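To illustrate how the result can differ with more than one pair on a line, here is a small sketch. The sample line is an assumption, and the second pattern is only one possible way to handle multiple pairs, not necessarily the behaviour you want.

```shell
# Hypothetical line containing two bracket pairs.
logs='[apple] and [pear]'

# Greedy: .* runs from the first [ to the last ] on the line.
greedy=$(printf '%s\n' "$logs" | grep -o '\[.*\]')

# Per pair: [^]]* cannot cross a ], so each pair is matched separately.
per_pair=$(printf '%s\n' "$logs" | grep -o '\[[^]]*\]')

printf 'greedy:\n%s\n' "$greedy"
printf 'per pair:\n%s\n' "$per_pair"
```

The greedy pattern returns the whole span including " and ", while the per-pair pattern prints [apple] and [pear] on separate lines.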
I’m using logs instead of LOGS for the reasons described at Correct Bash and shell script variable capitalization.
1