Anyone any good at regexp?
Originally posted by original PM
Yeah I was a PM so have no technical knowledge at all... but when will it be completed by I have a box to tick?

If you have any say in the format of the incoming string, I'd try to steer them towards a CSV format, from which each field can be split using the built-in C# CSV parser or a third-party CSV parser.

You may even be able to specify the delimiter character with those. (The default will be commas, but you may prefer colons.)
Using a proper CSV parser avoids issues with quoted delimiters, e.g. "a,b","c","d", which actually comprises only the three fields "a,b" and "c" and "d". It even allows multi-line fields.
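To illustrate the quoted-delimiter point, here is a minimal sketch in JavaScript rather than C#. The helper name splitCsvLine is made up for this example, it only handles a single line, and it is no substitute for a real CSV parser:

```javascript
// Hypothetical helper: split one CSV line, honouring double-quoted fields.
// Assumes the delimiter is a single character and that a doubled quote ("")
// inside a quoted field stands for one literal quote.
function splitCsvLine(line, delimiter = ",") {
  const fields = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch; // delimiters are ordinary characters in here
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

console.log(splitCsvLine('"a,b","c","d"'));   // three fields
console.log('"a,b","c","d"'.split(","));      // naive split gives four pieces
```

Passing ":" as the second argument gives the colon-delimited variant.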
The string parsing, however you do it, and the database handling are two different issues. So I would try and keep those conceptually separate.
edit: I think CSV even supports two levels of delimiter. So you could parse something like "a=1,b=3,c=4" into a handy structure by specifying "," as the top-level delimiter and "=" as the next-level delimiter. That said, CSV means different things to different people, and probably not all parser implementations support multi-level parsing.
Last edited by OwlHoot; 30 July 2014, 15:11.
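If the parser to hand does not do multi-level splitting, the two-level idea is easy enough to sketch by hand. This is JavaScript, the name parseTwoLevel is invented for the example, and it assumes neither delimiter ever appears inside a value:

```javascript
// Hypothetical helper: split on an outer delimiter, then split each piece
// on the first occurrence of an inner delimiter to get key/value pairs.
function parseTwoLevel(input, outer = ",", inner = "=") {
  const result = {};
  for (const pair of input.split(outer)) {
    const idx = pair.indexOf(inner);
    if (idx === -1) continue; // skip pieces with no inner delimiter
    const key = pair.slice(0, idx).trim();
    const value = pair.slice(idx + inner.length).trim();
    result[key] = value;
  }
  return result;
}

console.log(parseTwoLevel("a=1,b=3,c=4"));
```

The values come back as strings, so any numeric conversion is left to the caller.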
Originally posted by OwlHoot
If you have any say in the format of the incoming string, I'd try and steer them into using a CSV format from which each field can be split by using the built-in C# CSV Parser or a 3rd party CSV parser.
You may even be able to specify the delimiter character with those. (The default will be commas, but you may prefer colons.)
Use of a proper CSV parser avoids issues with quoted delimiters, e.g. "a,b","c","d" which actually comprise only the three fields "a,b" and "c" and "d". It also even allows multi-line fields.
The string parsing, however you do it, and the database handling are two different issues. So I would try and keep those conceptually separate.

Tricky one to sell as ClientCo want regex.
I do take your point though.
I just use StackOverflow for Regex questions. Agree about them being a pig though.
Recently I've gotten into using regex-search in Visual C++... or even regex-find-and-replace. You can do pretty fancy stuff except MS use their own regex syntax (of course they do).
I'm not even sure it's possible with a regex.
Thisisalargestring: And here is some data up until the next space character Thisisanotherlargestring: And here is some more data
"And here is some data up until the next space character" is full of spaces, so how are you supposed to know that the last space is the one to stop at? Maybe because "Thisisanotherlargestring" is followed by a colon? But you don't want to include "Thisisanotherlargestring" in the match, so you need a look-ahead. Now it's getting beyond my knowledge of regexes.
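For what it's worth, the look-ahead version can be sketched in JavaScript roughly as below. The function name parseLabelledFields is made up, and the pattern assumes a label is a run of word characters followed by a colon and that no value contains a word followed by a colon, otherwise it will stop early:

```javascript
// Hypothetical helper: pull "Label: value" pairs out of a single line.
// The lazy (.*?) grows until the look-ahead sees either whitespace followed
// by the next "label:" or the end of the string. The look-ahead does not
// consume anything, so the next label is not included in the value.
function parseLabelledFields(line) {
  const pattern = /(\w+):\s*(.*?)(?=\s+\w+:|$)/g;
  const fields = {};
  for (const match of line.matchAll(pattern)) {
    fields[match[1]] = match[2];
  }
  return fields;
}

const sample =
  "Thisisalargestring: And here is some data up until the next space character " +
  "Thisisanotherlargestring: And here is some more data";
console.log(parseLabelledFields(sample));
```

It only copes with single-line input, because "." does not match a newline without the s flag.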
Originally posted by FiveTimes
Can't you just split on ':' ?

Exactly, even if that is just a first step to then using regexps for individual fields.
That's why I suggested using CSVs (which, despite standing for "comma-separated values", can just as well use colons as field delimiters).
Originally posted by OwlHoot
Exactly, even if that is just a first step to then using regexps for individual fields.
That's why I suggested using CSVs (which despite standing for "comma-separated values" can just as well use colons as field delimiters)

And then ClientCo say, "Oh, suity, we want to retrieve this piece of data and it looks like:"
*12345*
I'm not sure why everyone is so hung up on why I want to use regexp for parsing unstructured data.
Thanks for all your input, it is appreciated.
You still haven't said how you want to parse it. Do you want to extract all uppercase letters? All hexadecimal digits that are odd? All non-space characters?
In default of any clear specification, I shall make the assumption that FolderGUID, Outcome, and DataItem are field labels, the field value follows the label after a colon, and fields are delimited by a single space or the end of the line.
If that is the case then, in JavaScript (because I can test that in the browser, and I'm not firing up a C# compiler for a no-brainer like this) the following regular expression:
Code:
/FolderGUID:([^ ]*) Outcome:([^ ]*) DataItem:(.*$)/
will give the following array when presented with the input you specified (which is repeated as the complete match at element 0):
Code:
[
    "FolderGUID:67bfabff-ad78-4d30-918e-811dd2636f83 Outcome:Accept DataItem:SomeData",
    "67bfabff-ad78-4d30-918e-811dd2636f83",
    "Accept",
    "SomeData"
]
To generalise it into a function that takes the input as a (single line) string and returns an object with the extracted values as named properties:
Code:
function parseLineInWhatOneAssumesToBeTheRequiredWay(line) {
    var pieces = line.match(/FolderGUID:([^ ]*) Outcome:([^ ]*) DataItem:(.*$)/);
    if (pieces) {
        return {
            folderGUID: pieces[1],
            outcome: pieces[2],
            dataItem: pieces[3]
        };
    }
    return null;
}
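As a variation on the same idea, engines that support named capture groups (ES2018 and later) let the pattern carry the property names itself. This is only a sketch under the same assumptions about the input format. The name parseLine is invented here, and unlike the pattern above this one is anchored with ^ so the line must start with FolderGUID:

```javascript
// Hypothetical variant using named capture groups instead of numeric indexes.
// Assumes the three labels always appear in this order, separated by a
// single space, with DataItem running to the end of the line.
function parseLine(line) {
  const pattern =
    /^FolderGUID:(?<folderGUID>[^ ]*) Outcome:(?<outcome>[^ ]*) DataItem:(?<dataItem>.*)$/;
  const match = line.match(pattern);
  // Copy the groups into a plain object, or return null when nothing matched.
  return match ? { ...match.groups } : null;
}

const parsed = parseLine(
  "FolderGUID:67bfabff-ad78-4d30-918e-811dd2636f83 Outcome:Accept DataItem:SomeData"
);
console.log(parsed.folderGUID, parsed.outcome, parsed.dataItem);
```

A line that does not fit the assumed layout returns null, the same as the original function.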