This is my first post here on Stack Overflow, so if I’m doing something wrong, please tell me!
I have already searched for hours and can’t find a full solution: I can get either one part or the other, but not both.
I don’t want to pipe one awk command into another.
Instead I want to use a single awk command.
I have the curl output of a web page, which I want to process with awk (regular awk, not gawk, mawk, etc.).
I want to get all lines after the words Downstream Channel Status and before the first occurrence of </table> that follows them.
Then, within that range, all line breaks between <tr align='center'> and </tr> should be removed. The lines containing these matches don’t need to be printed, but I can work around it if a solution cannot leave them out.
This is the (shortened) output of curl (which is saved in a variable).
<html>
<head>
</head>
<body>
<blockquote>
<p>
<table border="1" cellpadding="4" cellspacing="0">
<tr><th colspan=3><b>Startup Procedure</b></th></tr>
</table><br>
</p>
<p>
<table border='1' cellpadding='4' cellspacing='0'>
<tr><th colspan=13><b>Downstream Channel Status</b></th></tr>
<tr align='center'>
<td class='hd'>Channel Index</td>
<td class='hd'>Channel ID</td>
</tr>
<tr align='center'>
<td>1</td>
<td>10</td>
</tr>
<tr align='center'>
<td>2</td>
<td>1</td>
</tr>
<tr align='center'>
<td>3</td>
<td>2</td>
</tr>
<tr align='center'>
<td>4</td>
<td>3</td>
</tr>
</table><br><br>
</p>
<p>
<table border='1' cellpadding='4' cellspacing='0'>
<tr><th colspan=9><b>Upstream Channel Status</b></th></tr>
<tr align='center'>
<td class='hd'>Channel Index</td>
<td class='hd'>Channel ID</td>
</tr>
<tr align='center'>
<td>1</td>
<td>9</td>
</tr>
<tr align='center'>
<td>2</td>
<td>10</td>
</tr>
<tr align='center'>
<td>3</td>
<td>11</td>
</tr>
</table><br><br>
</p>
</blockquote>
</body>
</html>
I almost had a solution, but then my PC crashed and now my bash history is gone.
I tried to combine the answers from "How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?" and "Remove all occurrence of new line between two patterns (sed or awk?)".
In the end I want to end up with these lines:
<td class='hd'>Channel Index</td><td class='hd'>Channel ID</td>
<td>1</td><td>10</td>
<td>2</td><td>1</td>
<td>3</td><td>2</td>
<td>4</td><td>3</td>
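To make the goal concrete, here is a minimal sketch of how the two steps (select the range, then join each row) might fit into one POSIX awk program. It is an illustration under assumptions, not a confirmed solution: the function name extract_downstream_rows and the variables sample and result are hypothetical, the sample is a cut-down copy of the page above standing in for the real curl output, and the lines are assumed to have no leading whitespace. The dots in the pattern <tr align=.center.> stand in for the single quotes so the program can sit inside a single-quoted shell string.

```shell
#!/bin/sh
# Hypothetical helper: reads HTML on stdin, prints one joined line per table row.
extract_downstream_rows() {
  awk '
    # Start collecting after the heading line; the heading itself is skipped.
    /Downstream Channel Status/ { in_table = 1; next }
    # Stop at the first closing table tag seen after the heading.
    in_table && /<\/table>/     { exit }
    # Ignore everything before the heading.
    !in_table                   { next }
    # A row opens: reset the buffer and do not print the opening tag.
    /<tr align=.center.>/       { row = ""; in_row = 1; next }
    # A row closes: print the buffered cells as a single line.
    /<\/tr>/                    { if (in_row) print row; in_row = 0; next }
    # Inside a row: append the line without a line break.
    in_row                      { row = row $0 }
  '
}

# Cut-down stand-in for the curl output (assumption: same layout as the real page).
sample=$(cat <<'EOF'
<table border="1" cellpadding="4" cellspacing="0">
<tr><th colspan=3><b>Startup Procedure</b></th></tr>
</table><br>
<table border='1' cellpadding='4' cellspacing='0'>
<tr><th colspan=13><b>Downstream Channel Status</b></th></tr>
<tr align='center'>
<td class='hd'>Channel Index</td>
<td class='hd'>Channel ID</td>
</tr>
<tr align='center'>
<td>1</td>
<td>10</td>
</tr>
</table><br><br>
<table border='1' cellpadding='4' cellspacing='0'>
<tr><th colspan=9><b>Upstream Channel Status</b></th></tr>
<tr align='center'>
<td>1</td>
<td>9</td>
</tr>
</table><br><br>
EOF
)

result=$(printf '%s\n' "$sample" | extract_downstream_rows)
printf '%s\n' "$result"
```

The order of the rules matters in this sketch: the heading and </table> checks come first so that the </tr> on the heading line and anything in the Upstream table are never treated as row content.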
If you have a solution for me, could you explain how it works?
Thank you!