I am using Beautiful Soup in a fairly standard program that parses HTML and retrieves data based on tags and their CSS classes. The issue is that when I print a tag in raw form, it contains child tags that I cannot access. They are also not available via .contents or .children, even though I know they’re there.
What am I missing here? I've never run into this before.
The elements I'm looking for are the green items in BOTH the yellow and red boxes, but I am only getting the elements in the red box. It's almost as if the yellow box doesn't exist:
STRUCTURE
When I run the following:
fifth_level = child.div.table.find_all('tr', class_='odd')
for l in fifth_level:
    print(l)
Printing l gets me this:
<tr class="odd"><td style="width: 15%;">database_clip_id</td><td>IOP_AUD_DB_CLIP_DA40_ALERT</td><td>IOP_audio_clip_t32</td><td>database clip id</td><td></td><td></td></tr>
<tr class="odd"><td style="width: 15%;">attenuation</td><td>9</td><td>uint8</td><td>attenuation value</td><td></td><td></td></tr>
To me, it's clear that there are two distinct elements, each with its own set of descendants. Where I'm having an issue is that I want the text from one of the elements nested within each row. But when I access each row using this code:
fifth_level = child.div.table.find_all('tr', class_='odd')
for l in fifth_level:
    # print(l)
    ltitle = l.td.text
    value = l.td.next_sibling.text
    print('TITLE:' + str(title) + '\t Name: ' + str(ltitle) + '\t VALUE: ' + str(value))
I get this result:
TITLE:clip_volume[0]: IOP_AUD_DB_CLIP_DA40_ALERT Name: attenuation VALUE: 9
Referencing the attached image, why does l.td jump to the second <tr> tag? The expected result is that Name should be database_clip_id and VALUE should be IOP_AUD_DB_CLIP_DA40_ALERT.
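To make the expectation concrete, here is a minimal sketch of the behavior I would expect. It assumes the two rows are plain siblings inside one table, exactly as they appear in the printed output; the real page may be structured differently. The helper name first_two_cells is made up for illustration, and html.parser is just the parser picked for this sketch.

```python
from bs4 import BeautifulSoup

# Assumption: the two rows are siblings in one table, as the printed output suggests.
html = """
<table>
<tr class="odd"><td style="width: 15%;">database_clip_id</td><td>IOP_AUD_DB_CLIP_DA40_ALERT</td><td>IOP_audio_clip_t32</td><td>database clip id</td><td></td><td></td></tr>
<tr class="odd"><td style="width: 15%;">attenuation</td><td>9</td><td>uint8</td><td>attenuation value</td><td></td><td></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def first_two_cells(row):
    """Hypothetical helper: return the text of the first two <td> cells of a row."""
    # recursive=False restricts the search to the row's direct children,
    # so cells belonging to any nested row are not picked up.
    cells = row.find_all("td", recursive=False)
    return cells[0].get_text(strip=True), cells[1].get_text(strip=True)


rows = soup.find_all("tr", class_="odd")
pairs = [first_two_cells(row) for row in rows]

for name, value in pairs:
    print("Name: " + name + "\t VALUE: " + value)
```

With this simplified markup each row yields its own name and value, which is what I expected from the real page as well.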
What's even more interesting is that when I look at the .contents or .children of l, the first <tr> and all of its descendants are nowhere to be found! Please help!
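One way to check where each row actually sits in the parsed tree is to list its ancestors, its direct children, and any rows nested inside it. The sketch below assumes a trimmed version of the two rows shown earlier; describe_row is a hypothetical helper, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Assumed markup: a trimmed version of the two rows printed earlier.
html = (
    "<table>"
    '<tr class="odd"><td>database_clip_id</td><td>IOP_AUD_DB_CLIP_DA40_ALERT</td></tr>'
    '<tr class="odd"><td>attenuation</td><td>9</td></tr>'
    "</table>"
)

soup = BeautifulSoup(html, "html.parser")


def describe_row(row):
    """Hypothetical helper: summarize where a row sits in the parsed tree."""
    return {
        # Names of all ancestors, nearest first.
        "parents": [parent.name for parent in row.parents],
        # Names of direct child tags only (find_all(True) matches every tag).
        "children": [child.name for child in row.find_all(True, recursive=False)],
        # Number of <tr> tags found anywhere inside this row.
        "nested_rows": len(row.find_all("tr")),
    }


reports = [describe_row(row) for row in soup.find_all("tr", class_="odd")]

for report in reports:
    print(report)
```

If nested_rows is non-zero for a row in the real document, or a row's parents include another tr, the rows are nested rather than siblings, and l.td would then return the first td found anywhere below l.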