test please delete
This is a sticky topic.
-
Originally posted by NickFitz
I'll have to work out just where the problem lies first - it could be in WP, or it could be in TinyMCE, which WP uses as an editor.
In fact it could be both... it looks like something is resolving entity references when it shouldn't, so things like &lt;xsl:template&gt; get turned into <xsl:template>, and something else then tries to treat it as XHTML, but without being namespace-aware, so that gets turned into <xsl :template> (with a space between the namespace prefix and the colon) and then the closing </xsl:template> gets turned into </xsl>.
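Of course the fundamental problem is immediately apparent: somebody is labouring under the delusion that it's possible to parse XML (of which both XHTML and XSLT are dialects) using regular expressions.
It isn't.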
You may be able to get away with it when the XML is definitely constrained to one or several pre-specified dialects. For the general case you can't. Parsing XML means tracking arbitrarily deep nesting of elements, which needs at least a stack, and classic regular expressions can't do that: they are equivalent to finite automata.
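A minimal sketch of the nesting problem in Python, assuming the standard library's re and xml.etree.ElementTree modules. The sample documents and variable names are made up for illustration; they are not from the WP/TinyMCE code being discussed.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample documents, chosen only to show nesting and siblings.
nested = "<a><a>inner</a>tail</a>"
siblings = "<r><a>x</a><a>y</a></r>"

# A lazy pattern stops at the first closing tag it sees, so on nested input
# it pairs the OUTER <a> with the INNER </a>.
lazy_on_nested = re.search(r"<a>(.*?)</a>", nested).group(1)

# A greedy pattern runs to the last closing tag, so on sibling elements it
# swallows everything between the first <a> and the final </a>.
greedy_on_siblings = re.search(r"<a>(.*)</a>", siblings).group(1)

# A real parser keeps track of which element is currently open.
nested_root = ET.fromstring(nested)
inner_text = nested_root[0].text   # text of the inner <a>
inner_tail = nested_root[0].tail   # text after the inner </a>
sibling_texts = [el.text for el in ET.fromstring(siblings)]

print(repr(lazy_on_nested))      # '<a>inner'
print(repr(greedy_on_siblings))  # 'x</a><a>y'
print(inner_text, inner_tail, sibling_texts)
```

Neither the lazy nor the greedy pattern gets both inputs right, and no amount of tweaking fixes that for arbitrary depth, whereas the parser handles both without special cases.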
When the use of namespaces is thrown into the mix, the failure of regular expressions as a means for parsing XML becomes even more apparent.
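To illustrate the namespace point, here is a rough Python sketch. The three documents are invented for the example; the only real-world detail is the XSLT namespace URI. The helper name count_templates is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Three spellings of the same thing: the prefix is arbitrary, and may be
# absent altogether when a default namespace is declared.
doc_xsl = '<xsl:stylesheet xmlns:xsl="%s"><xsl:template match="/"/></xsl:stylesheet>' % XSL_NS
doc_t = '<t:stylesheet xmlns:t="%s"><t:template match="/"/></t:stylesheet>' % XSL_NS
doc_default = '<stylesheet xmlns="%s"><template match="/"/></stylesheet>' % XSL_NS

# A regex keyed on the literal prefix only recognises the first spelling.
prefix_pattern = re.compile(r"<xsl:template\b")
regex_hits = [bool(prefix_pattern.search(d)) for d in (doc_xsl, doc_t, doc_default)]

def count_templates(xml_text):
    """Count child elements whose expanded name is {XSL_NS}template."""
    root = ET.fromstring(xml_text)
    return len(root.findall("{%s}template" % XSL_NS))

parser_hits = [count_templates(d) for d in (doc_xsl, doc_t, doc_default)]

print(regex_hits)   # [True, False, False]
print(parser_hits)  # [1, 1, 1]
```

A namespace-aware parser compares the namespace URI plus local name, so all three documents look identical to it. The regex only ever sees the prefix, which is exactly the part that carries no meaning.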
It all comes down to the common misunderstanding that XML is a text-based format for data representation, leading to the conclusion that as regular expressions are good for dealing with strings they must be good for parsing XML.
Of course XML isn't a text-based format for data representation. It is merely a data representation model that may be easily serialised into a textual format, which is not the same thing at all.
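A small Python sketch of the "model versus serialisation" distinction, again using xml.etree.ElementTree. The sample strings and the model helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Three different strings: different quote characters, attribute order,
# and three ways of writing the "<" character in the content.
text_a = '<item id="1" kind="x">a &lt; b</item>'
text_b = "<item kind='x' id='1'><![CDATA[a < b]]></item>"
text_c = '<item id="1" kind="x">a &#60; b</item>'

def model(xml_text):
    """Reduce a document to (tag, attributes, text) as the parser sees it."""
    el = ET.fromstring(xml_text)
    return (el.tag, dict(el.attrib), el.text)

models = [model(t) for t in (text_a, text_b, text_c)]

print(len({text_a, text_b, text_c}))  # 3 distinct strings
print(models[0])                      # ('item', {'id': '1', 'kind': 'x'}, 'a < b')
print(models[0] == models[1] == models[2])
```

Anything that works on the string has to cope with every one of those spellings; anything that works on the parsed model never sees the difference.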
In fact, this is the very first point made in Henri Sivonen's seminal paper "HOWTO Avoid Being Called a Bozo When Producing XML".
The other points in that paper should also have been read and understood by the people responsible for the fail I am currently enduring.
(BTW, if you're ever invited to "bug bash" somebody else's code, it's always worth entering a few "astral plane" Unicode characters into a form - the fail is usually epic. The last time I did this, the entire database had to be rolled back to an earlier backup before we could continue looking for bugs.)
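For anyone wondering why "astral plane" characters (code points above U+FFFF) are such good test input, here is a short Python sketch. The choice of character is arbitrary, and this only shows the general hazard, not what actually went wrong in the bug bash described above.

```python
import struct

# U+1D11E MUSICAL SYMBOL G CLEF, picked arbitrarily as a test character.
ch = "\U0001D11E"

code_points = len(ch)                    # one character as far as Python is concerned
utf8_bytes = len(ch.encode("utf-8"))     # four bytes in UTF-8

# In UTF-16 the same character needs two 16-bit code units (a surrogate pair).
utf16 = ch.encode("utf-16-be")
utf16_units = struct.unpack(">%dH" % (len(utf16) // 2), utf16)

print(code_points, utf8_bytes)
print([hex(u) for u in utf16_units])     # ['0xd834', '0xdd1e']
```

Code that assumes one character is one UTF-16 unit, or that UTF-8 never needs more than three bytes per character (as MySQL's legacy utf8 character set does), tends to truncate, reject or corrupt input like this.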
-
Morning all
The decimation of the honeysuckle is predicted to continue today...
We need a mad axeman smiley...
Last edited by zeitghost; 13 April 2009, 08:21.
-
Originally posted by voodooflux
Morning
May your Easter eggs be plentiful. And large.
I knew there was something I'd forgotten...
-
-
I see someone has posted sympathy for the dead pirates in General.
Specifically.
Personally, the only good pirate is a dead pirate...
I laughed my head off when the lot that captured that oil tanker got themselves drowned with all the money... the biter bit & all that.
-
Is there anything in that Somalian tuliphole that's worth carpet bombing?
Might be cheaper in the long run & discourage the others.
-
We are off to Eltham Palace in a while. Trying to go to as many English Heritage places as possible.