I am writing a PHP scraper. It runs without errors, but the scraped result differs slightly from what I expect.
Here is my script:
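// Fetch the page as a string instead of printing it directly
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $eng_SCCW_array["Here is my website"]);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($ch);
// Parse the HTML; @ suppresses warnings about malformed markup
$doc = new DOMDocument();
@$doc->loadHTML($html);
// textContent concatenates all text nodes; <br> elements contribute no text
$elements_content = $doc->textContent;
echo $elements_content . "<br><br>";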
And here is the scraping result:
The problem is that some whitespace is lost, because the script ignores every <br> tag. This makes processing the data later on very complicated. I want to split the scraped result at each line break, as in the image below. How can I do that?
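
One possible approach, as a minimal sketch rather than a definitive fix: replace each <br> element with a newline text node before reading the text, then split on the newlines. The sample markup in $html and the variable names $breaks and $lines are assumptions made for illustration; in the real script $html would come from curl_exec().

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body><p>Line one<br>Line two<br/>Line three</p></body></html>';

libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// getElementsByTagName() returns a live list, so copy it into an array
// before replacing nodes; otherwise some elements would be skipped.
$breaks = iterator_to_array($doc->getElementsByTagName('br'));
foreach ($breaks as $br) {
    $br->parentNode->replaceChild($doc->createTextNode("\n"), $br);
}

// Split on the inserted newlines, trim each piece, and drop empty pieces.
$pieces = explode("\n", $doc->documentElement->textContent);
$lines = array_values(array_filter(array_map('trim', $pieces), 'strlen'));

print_r($lines);
```

This only handles <br>. Block-level elements such as <p> or <div> would still run together, so they may need the same kind of treatment if the page uses them to separate entries.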