However, I strongly disagree with any knocking of lambda functions. For a start, they have nothing to do with object-orientation; and for a finish, they are one of the most gloriously powerful constructs in the field of computation. Combine them (as is only natural) with currying, and they make it feasible to completely eradicate control structures, which are the source of the vast majority of bugs.
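The idea can be sketched in a few lines. This is Python rather than anything from the thread, and the names are invented: a curried `branch` builds branching behaviour out of plain function application and a data lookup, so the call site (and the helper itself) contains no `if` statement.

```python
# A curried "conditional": each argument is supplied one at a time, and
# the branch is selected by a dict lookup rather than an if/else.
branch = lambda predicate: lambda on_true: lambda on_false: (
    lambda value: {True: on_true, False: on_false}[predicate(value)](value)
)

# Build a classifier entirely by applying functions to functions.
classify = branch(lambda n: n % 2 == 0)(lambda n: "even")(lambda n: "odd")

print(classify(4))  # -> even
print(classify(7))  # -> odd
```

Whether this is more readable than an `if` is exactly the argument the thread is having; the point is only that the control structure has become an ordinary value you can pass around and compose.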
I too was going to post something similar in disgust at SY01; however, you worded it better than I did.
Don't knock what you don't understand
TM
Last edited by themistry; 19 February 2010, 13:35.
I love the smell of functions returning functions that return functions in the morning
"WTF does this do?"
A code review comment from someone reading one of my curried javascript constructions. Most people find these difficult to read, and unless they were raised on functional programming, I do understand why.
This is why code has gone backwards. How inelegant is that? An object-oriented nightmare. Not knocking lightng; you didn't write Linq.
That's why I prefer XPath queries. Far more readable. Linq does have its uses, but it's not the magic pill Microsoft would like us to believe.
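To illustrate the readability point, here is a minimal sketch using Python's `xml.etree.ElementTree`, whose `findall` supports a limited XPath subset (the document and element names are made up for the example):

```python
import xml.etree.ElementTree as ET

# Illustrative document, invented for this sketch.
doc = ET.fromstring("""
<contracts>
  <contract status="open"><client>Acme</client><rate>500</rate></contract>
  <contract status="closed"><client>Initech</client><rate>450</rate></contract>
  <contract status="open"><client>Globex</client><rate>550</rate></contract>
</contracts>
""")

# The XPath expression reads almost like a sentence:
# "every open contract's client".
open_clients = [c.text for c in doc.findall("./contract[@status='open']/client")]
print(open_clients)  # -> ['Acme', 'Globex']
```

The equivalent query-comprehension code tends to spread the same intent across filters, selectors and lambdas, which is the trade-off being debated here.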
Now XPath, mans tool.
XPath FTW!
It's always struck me as odd that MS, apparently responding to the fact that many programmers are confused by the set-theoretical aspects of XPath, offers Linq instead, which seems to be a bastardisation of the set-theoretical query language SQL.
I haven't used any of .Net's XML stuff in a few years, so I don't know exactly what it supports or if I'm describing something that has already been referred to, but a good approach for large XML sources that don't need to be available in their entirety is a hybrid SAX/DOM approach.
A SAX parser reads the stream of source data and fires off events for interesting moments such as the start and end of an element, attribute or text node; you subscribe to those events and thereby build a DOM that only contains those parts of the source data that are of interest to you. Then you can query that much-simplified DOM via XPath or Linq in whatever way you choose, without incurring the memory burden of parsing the entire source data into a DOM.
Of course you may be able to do everything in the SAX event handlers, which is possibly FTW, but probably more trouble than it's worth: a two-step process is almost certainly more maintainable, as well as easier to build.
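The hybrid approach described above can be sketched with Python's standard-library SAX support standing in for whatever .NET offers (the handler, element names and data are all invented): the SAX events build a mini-DOM containing only the interesting elements, which can then be queried normally.

```python
import xml.sax
import xml.etree.ElementTree as ET

class FilteredTreeBuilder(xml.sax.ContentHandler):
    """Subscribe to SAX events and keep only the elements we care about."""

    def __init__(self, interesting):
        super().__init__()
        self.interesting = interesting      # set of element names to keep
        self.root = ET.Element("filtered")  # the much-simplified DOM
        self.current = None

    def startElement(self, name, attrs):
        if name in self.interesting:
            self.current = ET.SubElement(self.root, name, dict(attrs))

    def characters(self, content):
        # May fire several times per text node, so accumulate.
        if self.current is not None:
            self.current.text = (self.current.text or "") + content

    def endElement(self, name):
        if name in self.interesting:
            self.current = None

source = ("<orders><order id='1'><sku>A</sku><note>x</note></order>"
          "<order id='2'><sku>B</sku></order></orders>")
handler = FilteredTreeBuilder({"sku"})
xml.sax.parseString(source.encode(), handler)

# Query the small DOM however you like; the rest was never materialised.
print([e.text for e in handler.root.findall("sku")])  # -> ['A', 'B']
```

This sketch flattens the hierarchy for brevity; a real version would keep a stack of open elements so nesting inside interesting subtrees is preserved.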
Probably off-topic, but this also comes in handy with non-XML data such as CSV: one can write a parser for that data that generates SAX events, such that a DOM can be constructed from a non-XML source.
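A hedged sketch of that trick in Python (the event names follow the SAX `ContentHandler` convention; the field names and data are invented): a CSV reader fires start/characters/end events, and a generic handler assembles a DOM from them exactly as it would for real XML.

```python
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_sax_events(text, handler):
    """Fire SAX-style events for each CSV row, so XML-oriented
    consumers can treat CSV as if it were markup."""
    handler.startElement("rows", {})
    for row in csv.DictReader(io.StringIO(text)):
        handler.startElement("row", {})
        for field, value in row.items():
            handler.startElement(field, {})
            handler.characters(value)
            handler.endElement(field)
        handler.endElement("row")
    handler.endElement("rows")

class DomBuilder:
    """Generic handler: builds an ElementTree from whatever events arrive."""

    def __init__(self):
        self.stack = []
        self.root = None

    def startElement(self, name, attrs):
        elem = ET.Element(name, attrs)
        if self.stack:
            self.stack[-1].append(elem)
        else:
            self.root = elem
        self.stack.append(elem)

    def characters(self, content):
        self.stack[-1].text = (self.stack[-1].text or "") + content

    def endElement(self, name):
        self.stack.pop()

builder = DomBuilder()
csv_to_sax_events("name,rate\nAcme,500\nGlobex,550\n", builder)
print([r.findtext("name") for r in builder.root.findall("row")])  # -> ['Acme', 'Globex']
```

The point is that `DomBuilder` never knows or cares that the events came from CSV rather than an XML parser.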
I remember a kerfuffle developing about this on certain mailing lists in the late 90s or early 00s when some purists argued that it was a corruption of the pure spirit of XML to suggest that such techniques should be used to allow non-XML data to be made use of via APIs designed for XML. My attitude was and is: FFS
EDIT: having perused the link above, it looks like XMLReader is roughly similar to SAX; I'm not sure I would have chosen MS's approach myself, as that example suggests that you have to execute decision-making code for every moment of potential interest rather than only subscribing to those events that are of actual interest, but nonetheless it appears to get around the memory problem and is presumably a widely-used technique.
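The pull-style model described here (the loop itself inspects each node as it arrives, rather than subscribing to events) can be sketched with Python's `ElementTree.iterparse`, which is a rough analogue of a forward-only reader; the document is invented for the example:

```python
import io
import xml.etree.ElementTree as ET

# A synthetic streamed document.
source = io.BytesIO(
    ("<log>"
     + "".join(f"<entry level='info'>msg {i}</entry>" for i in range(5))
     + "</log>").encode()
)

count = 0
for event, elem in ET.iterparse(source, events=("end",)):
    # Decision-making code runs for every node, pull-reader style.
    if elem.tag == "entry":
        count += 1
        elem.clear()  # discard the element once handled, bounding memory

print(count)  # -> 5
```

As the post says, this forces you to execute a check at every potentially interesting moment, but the whole document is never held in memory at once.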
Guess I'll probably stick to the Reader just in case.
Using the XmlReader also has its challenges. Off the top of my head (I may have this confused with something else), it's a forward-only reader, so XPaths that require lookups across axes, counts, etc. are not possible.
It's also harder to work with. I'd suggest sticking with your Linq direction, but run through some larger example files to ensure it's OK.
Most XML files I have worked with are sub-100 KB, so unless you are doing anything extraordinary, I'd suggest saving yourself the hassle and using Linq to XML.
LINQ is good but I tend to use the old .NET XML libraries to manipulate my XML. LINQ does enable developers to use the same methods to work with data whatever source it comes from but I think if developers are used to using XPATH, they'll tend to favour that over LINQ. One thing I like about XPATH is that you can avoid hard-coding queries; I also find it very readable.
I use LINQ to objects (business objects) a lot but I'm personally not that keen on LINQ to XML or LINQ to SQL.
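One way to get the "avoid hard-coding queries" benefit mentioned above is to keep the XPath expressions in a single named table (or an external config file) instead of scattering them through the code. A minimal Python sketch, with invented query names and data:

```python
import xml.etree.ElementTree as ET

# All queries live in one place; this dict could equally be loaded
# from a config file, so changing a query needs no code change.
QUERIES = {
    "open_clients": "./contract[@status='open']/client",
    "all_rates": "./contract/rate",
}

doc = ET.fromstring(
    "<contracts>"
    "<contract status='open'><client>Acme</client><rate>500</rate></contract>"
    "<contract status='closed'><client>Initech</client><rate>450</rate></contract>"
    "</contracts>"
)

def run_query(tree, name):
    """Look the expression up by name and evaluate it."""
    return [e.text for e in tree.findall(QUERIES[name])]

print(run_query(doc, "open_clients"))  # -> ['Acme']
```

Query-comprehension code, by contrast, is compiled into the program, so the equivalent change always means a rebuild.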
When you say monster XML docs, how big are we talking?
Quite small really, not much data.
The monster bit is the data itself, which is a load of separate interrelated hierarchies that I need to piece back together. This is no fun, so I thought I'd have a play with Linq to see if it can help in other areas.
Cheers for the heads up though - whilst the data is currently viewed as minimal, we were planning to test with a very large dataset to see if it copes, so I'll see if that causes issues.
Linq to XML doesn't stream the XML, meaning if your docs are 100 MB in size (the infoset is approx 10x larger in memory) and your system is attempting to open a few of these at once, out-of-memory exceptions are likely.
Similarly, if you are trying to process 100 XML docs that are only 1 MB each, that's roughly 1 GB in memory.
Just something to think about in case it affects you.