I have an XML document that is loaded into a webpage and represents a single client. It looks like this:
<!--?xml version="1.0" encoding="UTF-8" ?-->
<html>
<head></head>
<body>
<document>
<Name>Pablo</Name>
<Surname>Salamanca</Surname>
<Age>68</Age>
<Gender>M</Gender>
</document>
</body>
</html>
The resulting page shows only the data and nothing more, like this:
Pablo Salamanca 68 M
Things to keep in mind before answering:
- There are nearly 15 different templates for the XML structure, so there is no standardization. One person may have the information in a different order, and another may have entirely different information.
- Depending on the template, each client has between 68 and 278 (exact figures) attributes/elements in their XML.
I need a List<string> into which I would parse only the tag names for the person (not the values inside the tags), so that I end up with a list like this:
List[0] = "Name"
List[1] = "Surname"
List[2] = "Age"
List[3] = "Gender"
and so on ...
This is the code I have so far:
// Download the page as a UTF-8 string.
var url = _urlMaker.GetUrl();
WebClient client = new WebClient();
client.Encoding = System.Text.Encoding.GetEncoding("utf-8");
string xml = client.DownloadString(url);
// Drop the first n entries (see the side note below for why).
int n = 2;
List<string> xmlSplit = xml
    .Split(Environment.NewLine.ToCharArray())
    .Skip(n)
    .ToList();
// Glue the remaining lines back into a single string.
xml = string.Join(Environment.NewLine, xmlSplit);
Here I need a universal way of parsing the tags (that is, the tag names) into a list of strings. Ideally only the tags describing the person would be parsed, so that names like "document", "html" etc. are left out, but I can probably work with a list that still contains them. Among several other approaches, I tried splitting the document into nodes, but that did not work.
Side note:
The reason for splitting the string into a list of strings and removing the first two lines is an issue pointed out to me by another software engineer. The document I am parsing is apparently HTML, and Visual Studio refuses to parse it as it is. So I forced it to look like XML, which works, and the parsing of values in a different part of my code is now successful.
I could just use XmlDocument.GetElementsByTagName like this:
...code...
...loading data into something like a XML document....
XmlNodeList elemList = doc.GetElementsByTagName("title");
But I cannot, since I have no idea what to replace "title" with.
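To make the goal concrete, here is a minimal sketch of the kind of result I am after. It enumerates child elements without knowing their names in advance, using LINQ to XML. It assumes the cleaned-up string is well-formed XML and that the person's fields are direct children of a `document` element, as in the sample above. `TagNameSketch`, `GetTagNames` and the `containerName` parameter are hypothetical names for illustration, not part of my existing code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagNameSketch
{
    // Hypothetical helper: returns the names of the direct child elements
    // of the first element called containerName ("document" in the sample).
    // Assumes xml is well-formed; XDocument.Parse throws otherwise.
    public static List<string> GetTagNames(string xml, string containerName = "document")
    {
        XDocument doc = XDocument.Parse(xml);

        // Find the wrapper element, so "html", "head", "body" are never listed.
        XElement container = doc.Descendants(containerName).FirstOrDefault();
        if (container == null)
        {
            return new List<string>();
        }

        // Elements() yields direct children only, in document order.
        return container.Elements()
                        .Select(e => e.Name.LocalName)
                        .ToList();
    }
}
```

For the sample document above, `TagNameSketch.GetTagNames(xml)` would give "Name", "Surname", "Age", "Gender" in that order. If some templates nest fields more deeply, `container.Descendants()` could probably be used in place of `container.Elements()`, at the cost of also listing any intermediate grouping elements.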