
Parsing XML with C# .NET2



    OK this should be easy but I am missing a trick here. I have an XML feed (example file here http://www.stormtrack.co.uk/test.xml )

    For each child element, <topwindspeed>, <topmintemperature> and <topmaxtemperature> I need to get the 10 locations and wind/temp values into a list for now

    I can get the 30 locations but not by child element.

    Can anyone point me in the right direction?? [NB. This is private hobby work]

    This is what I have so far...

    --------------------------------------------------------
    XmlDocument xDoc = new XmlDocument();
    xDoc.Load("http://www.stormtrack.co.uk/test.xml");

    List<string> locList = new List<string>();
    XmlNodeList locNodes = xDoc.GetElementsByTagName("location");
    for (int i = 0; i < locNodes.Count; i++)
    {
        locList.Add(locNodes[i].InnerXml); // gives 30 values
    }

    List<string> windList = new List<string>();
    XmlNodeList windNodes = xDoc.GetElementsByTagName("windspeed");
    for (int i = 0; i < windNodes.Count; i++)
    {
        windList.Add(windNodes[i].InnerXml); // gives 10 values
    }

    List<string> tHiList = new List<string>();
    XmlNodeList tHiNodes = xDoc.GetElementsByTagName("maxtemperature");
    for (int i = 0; i < tHiNodes.Count; i++)
    {
        tHiList.Add(tHiNodes[i].InnerXml); // gives 10 values
    }

    List<string> tLoList = new List<string>();
    XmlNodeList tLoNodes = xDoc.GetElementsByTagName("mintemperature");
    for (int i = 0; i < tLoNodes.Count; i++)
    {
        tLoList.Add(tLoNodes[i].InnerXml); // gives 10 values
    }
    www.stormtrack.co.uk - My Stormchasing website.

    #2
    What you want is XPath - http://www.w3schools.com/XPath/default.asp

    I'm a Java guy, so can't be of more specific help, but I found this article which seems to demonstrate how to do it in .NET - http://www.aspfree.com/c/a/.NET/Work...The-NET-Way/4/


      #3
      XPath ==

      Stay away from it...



        #4
        Originally posted by AtW
        XPath ==

        Stay away from it...
        For once I agree with the squirrelly one - it's evil

        You've come right out the other side of the forest of irony and ended up in the desert of wrong.



          #5
          XPath FTW

          Whatever you do, stay away from things like "InnerXml", which are derived from a broken model of treating XML as if it's text.

          Contrary to popular belief, XML isn't a text file format - it's a rather complex data model. It has the specific benefit of being easily serialised as text, but believing that XML is just a text file full of strings that look like the ones in the XML specification is the rapid route to epic fail.

          Microsoft introduced various techniques to make it easy for people to "do" XML as if XML was text, although this is one of the leakiest abstractions they have yet inflicted on the world (as you are now discovering) in that all it does is allow you to "do" XML in a way that doesn't work unless you already know how to do it, in which case you wouldn't do it that way.
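
To make the InnerXml point a bit more concrete, here is a minimal sketch. The sample element is made up for illustration (it is not taken from the feed); the only things it relies on are the standard InnerXml and InnerText properties of XmlNode. InnerXml hands back serialised markup, escapes and all, whereas InnerText hands back the character data you actually wanted.

```csharp
using System;
using System.Xml;

public static class Program
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // Hypothetical location name containing an escaped ampersand.
        doc.LoadXml("<location>Leeds &amp; Bradford</location>");

        XmlElement location = doc.DocumentElement;

        // Markup as it would be written to a file: the escape is still there.
        Console.WriteLine(location.InnerXml);   // Leeds &amp; Bradford

        // The text value of the element: the escape has been resolved.
        Console.WriteLine(location.InnerText);  // Leeds & Bradford
    }
}
```

With plain names such as "Holbeach" the two happen to look identical, which is probably why the difference is easy to miss until a name contains an ampersand or nested markup.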

          (Sorry if this sounds patronising; I don't mean it that way, but I really should wrap this up and get to bed.)

          Right, so your XML at http://www.stormtrack.co.uk/test.xml is well-formed. From here on in I'll be working with W3C standards - MS claim to support these, so this should help, if you abandon MS's proprietary nonsense.

          To start with, your XML isn't structured in a useful manner - it's a case of "If you want to go there, don't start from here."

          So, for example,
          Code:
          <topwindspeed>
                  <location>Wyton Royal Air Force Base</location>
                  <windspeed>15</windspeed>
                  <location>Sumburgh Cape</location>
                  <windspeed>13</windspeed>
          </topwindspeed>
          is of little value - it should be
          Code:
          <topwindspeed>
                  <location>Wyton Royal Air Force Base
                      <windspeed>15</windspeed>
                  </location>
                  <location>Sumburgh Cape
                      <windspeed>13</windspeed>
                  </location>
          </topwindspeed>
          although I would be more likely to go with
          Code:
          <topwindspeed>
                  <location>
                      <name>Wyton Royal Air Force Base</name>
                      <windspeed>15</windspeed>
                  </location>
                  <location>
                      <name>Sumburgh Cape</name>
                      <windspeed>13</windspeed>
                  </location>
          </topwindspeed>
          as this would allow me, from the top level <netweather> element, to then select the <topwindspeed> and suchlike elements, and easily navigate to the locations, which have names and windspeeds as child elements.

          XML follows a hierarchical data model, and you should always seek to present your data in an appropriate hierarchical structure - a datum should not be the sibling of its owner. In this case, the windspeed is a property of the location (as indeed is the location's name) and should be represented as such by making it a child element thereof.

          If you have control over the format of the XML, you should change it such that it represents the hierarchical structure of the data.

          If you can't control the structure of the source XML then, after telling the provider thereof that they're doing it wrong, you should be able to get by with... actually, scrub that. The best thing to do is to turn the XML into something usable; the following XSLT (1.0, with most belts-and-braces in situ, but watch out for whitespace in the original XML) will do that with your sample file:

          Code:
          <?xml version="1.0" encoding="UTF-8"?>
          <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
          
              <xsl:output method="xml" encoding="UTF-8" />
              
              <xsl:template match="/">
                  <xsl:apply-templates/>
              </xsl:template>
              
              <xsl:template match="netweather">
                  <netweather>
                      <xsl:apply-templates/>
                  </netweather>
              </xsl:template>
          
              <xsl:template match="topwindspeed">
                  <topwindspeed>
                      <xsl:apply-templates select="location"/>
                  </topwindspeed>
              </xsl:template>
              
              <xsl:template match="topmintemperature">
                  <topmintemperature>
                      <xsl:apply-templates select="location"/>
                  </topmintemperature>
              </xsl:template>
              
              <xsl:template match="topmaxtemperature">
                  <topmaxtemperature>
                      <xsl:apply-templates select="location"/>
                  </topmaxtemperature>
              </xsl:template>
              
              <xsl:template match="location">
                  <location>
                      <name><xsl:value-of select="."/></name>
                      <value><xsl:value-of select="following-sibling::node()/text()"/></value>
                  </location>
              </xsl:template>
          
          </xsl:stylesheet>
          giving the following output, if one adds the indent="yes" attribute to the xsl:output element:

          Code:
          <?xml version="1.0" encoding="UTF-8"?>
          <netweather>
              
             <topwindspeed>
                <location>
                   <name>Wyton Royal Air Force Base</name>
                   <value>15</value>
                </location>
                <location>
                   <name>Sumburgh Cape</name>
                   <value>13</value>
                </location>
                <location>
                   <name>Saint Mawgan</name>
                   <value>13</value>
                </location>
                <location>
                   <name>Holbeach</name>
                   <value>12</value>
                </location>
                <location>
                   <name>Leeds And Bradford</name>
                   <value>12</value>
                </location>
                <location>
                   <name>Leuchars</name>
                   <value>10</value>
                </location>
                <location>
                   <name>Tain Range</name>
                   <value>10</value>
                </location>
                <location>
                   <name>Cosford Royal Air Force Base</name>
                   <value>10</value>
                </location>
                <location>
                   <name>Glasgow Airport</name>
                   <value>9</value>
                </location>
                <location>
                   <name>Lakenheath Royal Air Force Base</name>
                   <value>8</value>
                </location>
             </topwindspeed>
              
             <topmintemperature>
                <location>
                   <name>Aberdeen / Dyce</name>
                   <value>2</value>
                </location>
                <location>
                   <name>Disforth</name>
                   <value>3</value>
                </location>
                <location>
                   <name>Tees-Side</name>
                   <value>3</value>
                </location>
                <location>
                   <name>Wick</name>
                   <value>3</value>
                </location>
                <location>
                   <name>Topcliffe Royal Air Force Base</name>
                   <value>3</value>
                </location>
                <location>
                   <name>Kirkwall Airport</name>
                   <value>3</value>
                </location>
                <location>
                   <name>Spadeadam</name>
                   <value>4</value>
                </location>
                <location>
                   <name>Leeming</name>
                   <value>4</value>
                </location>
                <location>
                   <name>Benbecula</name>
                   <value>4</value>
                </location>
                <location>
                   <name>Linton-On-Ouse</name>
                   <value>5</value>
                </location>
             </topmintemperature>
              
             <topmaxtemperature>
                <location>
                   <name>Wyton Royal Air Force Base</name>
                   <value>12</value>
                </location>
                <location>
                   <name>Cosford Royal Air Force Base</name>
                   <value>11</value>
                </location>
                <location>
                   <name>Manston, South East</name>
                   <value>10</value>
                </location>
                <location>
                   <name>Yeovilton</name>
                   <value>10</value>
                </location>
                <location>
                   <name>Shoreham Airport</name>
                   <value>9</value>
                </location>
                <location>
                   <name>Middle Wallop</name>
                   <value>9</value>
                </location>
                <location>
                   <name>Lydd Airport</name>
                   <value>9</value>
                </location>
                <location>
                   <name>London / Gatwick Airport</name>
                   <value>9</value>
                </location>
                <location>
                   <name>Eglinton / Londonderr</name>
                   <value>9</value>
                </location>
                <location>
                   <name>Scilly, Saint MaryS</name>
                   <value>9</value>
                </location>
             </topmaxtemperature>
          
          </netweather>
          With the document having that structure, it should be a lot easier to find your way around it. If you have existing files but can move to the new XML structure for the future, simply run your existing files through that XSLT and re-save; if you're stuck with the old XML structure, then pre-process the input file with that XSLT. Then select all the location elements into the list you originally wanted, use their child elements to get at the names and values, and get on with the rest of your life.
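
For what it's worth, a rough sketch of the pre-processing step in C# might look like the following. It assumes XslCompiledTransform from System.Xml.Xsl (available from .NET 2.0); the Restructurer class, its Transform method and the two constants are names invented for this example. To keep it short, the embedded stylesheet is a condensed stand-in for the full one above and the source is a two-location sample rather than the real feed.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class Restructurer
{
    // Runs the stylesheet over the source XML and returns the result as a DOM.
    public static XmlDocument Transform(string sourceXml, string stylesheetXml)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(stylesheetXml)))
        {
            xslt.Load(styleReader);
        }

        StringWriter output = new StringWriter();
        using (XmlReader sourceReader = XmlReader.Create(new StringReader(sourceXml)))
        {
            xslt.Transform(sourceReader, null, output);
        }

        XmlDocument result = new XmlDocument();
        result.LoadXml(output.ToString());
        return result;
    }
}

public static class Program
{
    // Condensed stand-in for the stylesheet in the post (assumption: same output shape).
    public const string Stylesheet =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>" +
        "<xsl:template match='netweather'>" +
        "<netweather><xsl:apply-templates select='*'/></netweather>" +
        "</xsl:template>" +
        "<xsl:template match='netweather/*'>" +
        "<xsl:copy><xsl:apply-templates select='location'/></xsl:copy>" +
        "</xsl:template>" +
        "<xsl:template match='location'>" +
        "<location><name><xsl:value-of select='.'/></name>" +
        "<value><xsl:value-of select='following-sibling::*[1]'/></value></location>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    // Cut-down sample in the same flat shape as the feed.
    public const string Source =
        "<netweather><topwindspeed>" +
        "<location>Wyton Royal Air Force Base</location><windspeed>15</windspeed>" +
        "<location>Sumburgh Cape</location><windspeed>13</windspeed>" +
        "</topwindspeed></netweather>";

    public static void Main()
    {
        XmlDocument restructured = Restructurer.Transform(Source, Stylesheet);

        // Each <location> now owns its <name> and <value> children.
        foreach (XmlNode location in restructured.SelectNodes("/netweather/topwindspeed/location"))
        {
            Console.WriteLine(location["name"].InnerText + ": " + location["value"].InnerText);
        }
    }
}
```

For the real feed you would presumably load the stylesheet from a file and create the source reader from the feed URL instead of from strings.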

          Oh, and those bits like following-sibling::node()/text()? That's where I began: XPath FTW!
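
If you would rather skip the transform altogether, the same XPath axis can be used straight from C# against the original flat layout. This is only a sketch: FeedReader and ReadSection are names made up for the example, the XML is a cut-down sample in the shape of the feed, and it assumes each location is immediately followed by the element holding its value.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class FeedReader
{
    // Returns (location, value) pairs for one section, e.g. "topwindspeed".
    public static List<KeyValuePair<string, string>> ReadSection(XmlDocument doc, string section)
    {
        List<KeyValuePair<string, string>> result = new List<KeyValuePair<string, string>>();

        // Only the <location> elements that sit inside the requested section.
        XmlNodeList locations = doc.SelectNodes("/netweather/" + section + "/location");
        foreach (XmlNode location in locations)
        {
            // First element sibling after this <location>, whatever it is called.
            XmlNode value = location.SelectSingleNode("following-sibling::*[1]");
            if (value != null)
            {
                result.Add(new KeyValuePair<string, string>(location.InnerText, value.InnerText));
            }
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // For the real feed, doc.Load(url) would replace this inline sample.
        doc.LoadXml(
            "<netweather>" +
            "<topwindspeed>" +
            "<location>Wyton Royal Air Force Base</location><windspeed>15</windspeed>" +
            "<location>Sumburgh Cape</location><windspeed>13</windspeed>" +
            "</topwindspeed>" +
            "<topmintemperature>" +
            "<location>Wick</location><mintemperature>3</mintemperature>" +
            "</topmintemperature>" +
            "</netweather>");

        foreach (KeyValuePair<string, string> pair in FeedReader.ReadSection(doc, "topwindspeed"))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value);
        }
    }
}
```

Calling ReadSection once per section ("topwindspeed", "topmintemperature", "topmaxtemperature") gives three separate lists of ten, which is what the original question was after.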



            #6
            Wow, many thanks for a full and concise reply - I have no control over the source XML feed, so I will do as you suggest and reformat it via an XSLT - I *think* I can then lift out the nodes as required.

            One other option is for me to just bung the raw XML into a SQL 2005 table and query out the info that I need from there - but I always find this approach a pain!


              #7
              Generally I prefer the XPath approach as well (although I have previously experienced quite a big performance problem [no doubt due to my query] when dealing with a large result set).

              If you are going down the code route in .NET, the XmlNode object has a PreviousSibling property which will get you the "location" node for each of the nodes in your GetElementsByTagName loops.

              The below works as expected.

              Code:
                          XmlDocument doc = new XmlDocument();
                          string xml = "<test><location>Home</location><next>123</next></test>";
                          doc.LoadXml(xml);
              
                          XmlNode node = doc.GetElementsByTagName("next")[0];
                          
                          Console.WriteLine(node.PreviousSibling.Name);
                          Console.WriteLine(node.PreviousSibling.InnerText);
              Last edited by Jaws; 24 December 2008, 11:56.
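
Applied to the shape of the feed, the same idea might look roughly like this. It is a sketch only: SiblingPairs and Pair are names invented for the example, the XML is a cut-down sample, and the loop that skips non-element siblings is a precaution in case the document was loaded with PreserveWhitespace switched on.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class SiblingPairs
{
    // Pairs every element named valueTag with the <location> just before it.
    public static List<string> Pair(XmlDocument doc, string valueTag)
    {
        List<string> pairs = new List<string>();
        foreach (XmlNode valueNode in doc.GetElementsByTagName(valueTag))
        {
            // Walk back to the nearest preceding element, skipping whitespace or comments.
            XmlNode previous = valueNode.PreviousSibling;
            while (previous != null && previous.NodeType != XmlNodeType.Element)
            {
                previous = previous.PreviousSibling;
            }

            if (previous != null && previous.Name == "location")
            {
                pairs.Add(previous.InnerText + " = " + valueNode.InnerText);
            }
        }
        return pairs;
    }
}

public static class Program
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<netweather><topwindspeed>" +
            "<location>Holbeach</location><windspeed>12</windspeed>" +
            "<location>Leuchars</location><windspeed>10</windspeed>" +
            "</topwindspeed></netweather>");

        foreach (string pair in SiblingPairs.Pair(doc, "windspeed"))
        {
            Console.WriteLine(pair);
        }
    }
}
```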



                #8
                Already PM'd this to wxman, but here's a possible solution.

                Feeling generous at Christmas, and as this is my first post, I thought I'd post a compilable solution.

                Basically a set of Info "records" is created, one per location, containing the windspeed, min and max temperature for that location.

                Each location element is checked to see if it has a windspeed, mintemperature or maxtemperature sibling, and this is used to populate the appropriate property of the Info "record".

                This can be compiled as a console application, but you will need to change the xml source file from c:\temp\xml.xml to http://www.stormtrack.co.uk/test.xml


                Madeoff

                =============== Code starts =============
                using System;
                using System.Collections.Generic;
                using System.Text;
                using System.Xml;

                namespace CUK
                {
                    class Program
                    {
                        // One record per location; fields keep "<none>" when the feed has no value.
                        class Info
                        {
                            public string Location = "";
                            public string Windspeed = "<none>";
                            public string MinTemp = "<none>";
                            public string MaxTemp = "<none>";

                            public Info(string loc)
                            {
                                Location = loc;
                            }
                        }

                        static void Main(string[] args)
                        {
                            XmlDocument doc = new XmlDocument();
                            doc.Load(@"C:\temp\xml.xml");

                            Dictionary<string, Info> locInfoMap = new Dictionary<string, Info>();

                            XmlNodeList nodeList = doc.SelectNodes("/netweather/*/location");
                            foreach (XmlElement locElem in nodeList)
                            {
                                // Ensure you have a map entry for this location
                                string loc = locElem.InnerText;
                                if (!locInfoMap.ContainsKey(loc)) locInfoMap.Add(loc, new Info(loc));

                                // Get location info entry
                                Info locInfo = locInfoMap[loc];

                                // Look for next sibling element named windspeed and set its value if found.
                                XmlElement windSpeedElem = locElem.SelectSingleNode("following-sibling::windspeed") as XmlElement;
                                if (windSpeedElem != null) locInfo.Windspeed = windSpeedElem.InnerText;

                                // Do as above for min/max temperature
                                XmlElement minTempElem = locElem.SelectSingleNode("following-sibling::mintemperature") as XmlElement;
                                if (minTempElem != null) locInfo.MinTemp = minTempElem.InnerText;

                                XmlElement maxTempElem = locElem.SelectSingleNode("following-sibling::maxtemperature") as XmlElement;
                                if (maxTempElem != null) locInfo.MaxTemp = maxTempElem.InnerText;
                            }

                            int iCount = 0;
                            foreach (string location in locInfoMap.Keys)
                            {
                                ++iCount;
                                Info locInfo = locInfoMap[location];
                                Console.WriteLine("Location ({0}): {1}; Windspeed: {2}; Min temp {3}; Max temp {4}",
                                    iCount, locInfo.Location, locInfo.Windspeed, locInfo.MinTemp, locInfo.MaxTemp);
                            }

                            Console.WriteLine("Total Locations: " + iCount);
                        }
                    }
                }



                  #9
                  Originally posted by AtW
                  XPath ==

                  Stay away from it...
                  Forgot to tell us why, like the person afterwards.



                    #10
                    Originally posted by jkoder
                    Forgot to tell us why, like the person afterwards.
                    Run a board search for Xpath topics - there should be a big one created by me about 2 years ago... yes, it is that

                    It was about XSL but XPath is closely related -

                    http://forums.contractoruk.com/gener...ighlight=xpath

                    I was doing a contract job at the time and it paid well but it would need to be x10 the rate for me to even consider touching it with a bargepole
                    Last edited by AtW; 25 December 2008, 14:28.
