
RegExp conundrum


    #11
    Have you ever actually read about .NET regex? http://www.regular-expressions.info/named.html

    The following worked for me:

    Code:
                Regex r = new Regex(@"^.*\s(?<Reference>0[a-zA-Z]{3}\d{6})|(?<Reference>[a-zA-Z]{3}\d{3,7})", RegexOptions.Multiline);         
                MatchCollection m = r.Matches("sdssd 0ABC123456" + Environment.NewLine + "asds ABC123456");
    VS.NET immediate window output:

    m[0].Groups["Reference"].Captures[0]
    {0ABC123456}
    [System.Text.RegularExpressions.Group]: {0ABC123456}
    Index: 6
    Length: 10
    Value: "0ABC123456"
    m[1].Groups["Reference"].Captures[0]
    {ABC123456}
    [System.Text.RegularExpressions.Group]: {ABC123456}
    Index: 23
    Length: 9
    Value: "ABC123456"
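
A note on why this pattern finds both references: the top-level | has the lowest precedence, so the ^.*\s prefix belongs only to the first alternative and the second alternative is unanchored. The sketch below is a hedged illustration, assuming both references sit on one line with a leading space; the class and method names are made up for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationScopeDemo
{
    // Pattern from the post above. The top-level | splits it into
    //   ^.*\s(?<Reference>0[a-zA-Z]{3}\d{6})
    // and
    //   (?<Reference>[a-zA-Z]{3}\d{3,7})
    public const string Pattern =
        @"^.*\s(?<Reference>0[a-zA-Z]{3}\d{6})|(?<Reference>[a-zA-Z]{3}\d{3,7})";

    // Returns the Reference group of every match, in order.
    public static string[] FindReferences(string input)
    {
        return Regex.Matches(input, Pattern, RegexOptions.Multiline)
                    .Cast<Match>()
                    .Select(m => m.Groups["Reference"].Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Assumed single-line input with a leading space.
        foreach (string reference in FindReferences(" 0ABC123456 ABC123456"))
        {
            Console.WriteLine(reference);
        }
    }
}
```

With that input it prints 0ABC123456 and then ABC123456. The flip side of the unanchored second alternative is that it can match inside a longer token: the input 0ABC123456 on its own (no whitespace in front) yields ABC123456.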
    Last edited by Jaws; 27 August 2009, 19:53. Reason: Code correction...



      #12
      In fact, the multiple groups are not even required. Your problem appears to be that the expression starts with ^, which anchors it to the start of the line / string. The rest of the pattern can therefore match only once.
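
To illustrate the point about the anchor, here is a rough sketch. It assumes the input is the single line 0ABC123456 ABC123456, uses a simplified form of the anchored pattern discussed in this thread, and swaps the ^.*\s prefix for a word boundary \b in the second pattern. That substitution is an assumption made for illustration, not something proposed in the thread, and the names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Anchored form: without RegexOptions.Multiline, ^ only matches at the
    // start of the string, so at most one match is possible.
    public const string Anchored =
        @"^.*\s(?<Reference>0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7})";

    // Hypothetical unanchored variant: \b replaces the ^.*\s prefix.
    public const string Unanchored =
        @"\b(?<Reference>0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7})\b";

    // Returns the Reference group of every match, in order.
    public static string[] FindReferences(string pattern, string input)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Groups["Reference"].Value)
                    .ToArray();
    }

    public static void Main()
    {
        const string input = "0ABC123456 ABC123456";
        Console.WriteLine(string.Join(", ", FindReferences(Anchored, input)));
        Console.WriteLine(string.Join(", ", FindReferences(Unanchored, input)));
    }
}
```

On that input the anchored pattern yields only ABC123456, while the unanchored one yields 0ABC123456 and ABC123456.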



        #13
        Originally posted by DimPrawn
        Clearly it doesn't as you are only getting one group back.

        The | is an or, which means one or the other. In this case it doesn't mean both.

        Can't you move on (or around the problem) rather than always banging your head against the wall day after day?
        Why do you always have to post something contrary? Is it living in Swindon? Not happy at your work? Try and ease up a bit mate, or you'll grow old and bitter.

        I might be getting my terminology wrong as I know bugger all about regex. I am reading up about it, and also have a solution for the client which I have implemented. The thing is, the regex SHOULD return two matches and doesn't.

        This is the code I am using:

        private void btnParse_Click(object sender, RoutedEventArgs e)
        {
            // Build the regex from the pattern text box and read the input once.
            Regex regEx = new Regex(txtRegExp.Text);
            string input = txtInputString.Text;
            const string separator = "***********************************************************************\n";

            txtCaptures.Text += separator;

            // Test against .Text: ToString() on a TextBox is not the same as its Text.
            if (regEx.IsMatch(input))
            {
                string[] groupNames = regEx.GetGroupNames();
                MatchCollection matches = regEx.Matches(input);

                foreach (Match m in matches)
                {
                    foreach (string s in groupNames)
                    {
                        Group g = m.Groups[s];
                        if (g.Success)
                        {
                            // Print each successful group as [name](value).
                            txtCaptures.Text += "[" + s + "](" + g.Value + ")\n";
                        }
                    }
                }
            }
            txtCaptures.Text += separator;
        }
        If I use the following string

        0ABC123456 ABC123456

        and the following Regex

        ^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7}))

        I get the following output

        ***********************************************************************
        [0](0ABC123456 ABC123456)
        [1](ABC123456)
        [Reference](ABC123456)
        ***********************************************************************
        So you see, one match in the [Reference] group, capture, match or whatever the correct flipping name is. One match. That is all.

        If however I use the following regex

        ^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}))

        I get the following output

        ***********************************************************************
        [0]( 0ABC123456)
        [1](0ABC123456)
        [Reference](0ABC123456)
        ***********************************************************************
        So both sides of the or actually match. And if we take DP's explanation, which is that a logical or only returns one result, then all well and good.

        However, in the first scenario, why did it return the second string and not the first? What gave the second pattern priority? It parses from left to right by default.
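
One way to see what gave the second reference priority is to capture what .* actually consumed. The engine does scan left to right, but a greedy .* first takes the whole line and then gives characters back from the right, so the first \s it finds while backing off is the last one in the line. A rough sketch, assuming the input has a leading space (as the [0] outputs elsewhere in this post suggest) and adding a hypothetical Prefix group that is not in the original pattern:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyPrefixDemo
{
    // Same shape as the pattern above, with .* wrapped in a hypothetical
    // Prefix group so that its final extent can be inspected.
    public const string Pattern =
        @"^(?<Prefix>.*)\s(?<Reference>(0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7}))";

    // Returns the first (and, because of ^, only) match.
    public static Match Run(string input)
    {
        return Regex.Match(input, Pattern);
    }

    public static void Main()
    {
        Match m = Run(" 0ABC123456 ABC123456");
        Console.WriteLine("Prefix    = [" + m.Groups["Prefix"].Value + "]");
        Console.WriteLine("Reference = [" + m.Groups["Reference"].Value + "]");
    }
}
```

Here Prefix ends up as [ 0ABC123456] and Reference as [ABC123456]: the first reference was swallowed by .* before the alternation was ever tried at that position.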

        So what happens if we try the one character pattern I mentioned before

        ^.*(?<Reference>(A|C))

        The output is

        ***********************************************************************
        [0]( 0ABC123456 ABC)
        [1](C)
        [Reference](C)
        ***********************************************************************
        So why is it ignoring the first A?

        If I make the search lazy

        ^.*?(?<Reference>(A|C))

        It returns

        ***********************************************************************
        [0]( 0A)
        [1](A)
        [Reference](A)
        ***********************************************************************
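
The greedy and lazy results can be reproduced side by side. This is only a sketch, assuming the same input with a leading space; the helper name is made up. It reports where in the input the Reference group landed for each quantifier.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyLazyDemo
{
    // Returns the Reference group of the first match for the given pattern.
    public static Group Reference(string pattern, string input)
    {
        return Regex.Match(input, pattern).Groups["Reference"];
    }

    public static void Main()
    {
        const string input = " 0ABC123456 ABC123456";

        // Greedy .* backs off from the end of the line, so it stops at the
        // last A or C. Lazy .*? grows from the start, so it stops at the first.
        Group greedy = Reference(@"^.*(?<Reference>(A|C))", input);
        Group lazy = Reference(@"^.*?(?<Reference>(A|C))", input);

        Console.WriteLine("greedy: " + greedy.Value + " at index " + greedy.Index);
        Console.WriteLine("lazy:   " + lazy.Value + " at index " + lazy.Index);
    }
}
```

It prints C at index 14 for the greedy pattern and A at index 2 for the lazy one, which lines up with the two outputs above.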
        Knock first as I might be balancing my chakras.

