Originally posted by DimPrawn
Why do you always have to post something contrary? Is it living in Swindon? Not happy at your work? Try and ease up a bit mate, you'll grow old and bitter.
I might be getting my terminology wrong as I know bugger all about regex. I am reading up about it, and also have a solution for the client which I have implemented. The thing is, the regex SHOULD return two matches and doesn't.
This is the code I am using
private void btnParse_Click(object sender, RoutedEventArgs e)
{
    Regex regEx = new Regex(txtRegExp.Text);
    // Read the input via .Text; ToString() on a TextBox is not guaranteed to return just its contents.
    string input = txtInputString.Text;
    txtCaptures.Text += "***********************************************************************\n";
    if (regEx.IsMatch(input))
    {
        string[] groupNames = regEx.GetGroupNames();
        MatchCollection matches = regEx.Matches(input);
        foreach (Match m in matches)
        {
            // Print every group (numbered and named) that took part in this match.
            foreach (string s in groupNames)
            {
                Group g = m.Groups[s];
                if (g.Success)
                {
                    txtCaptures.Text += "[" + s + "](" + g.Value + ")\n";
                }
            }
        }
    }
    txtCaptures.Text += "***********************************************************************\n";
}
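If I use the following input string (it starts with a space, as the [0] captures further down show)
 0ABC123456 ABC123456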
and the following Regex
^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7}))
I get the following output
***********************************************************************
[0]( 0ABC123456 ABC123456)
[1](ABC123456)
[Reference](ABC123456)
***********************************************************************
If however I use the following regex
^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}))
I get the following output
***********************************************************************
[0]( 0ABC123456)
[1](0ABC123456)
[Reference](0ABC123456)
***********************************************************************
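For anyone who wants to reproduce the two runs above without the WPF form, here is a minimal console sketch. The class name RegexCaptureDemo and the helper DumpGroups are made up for illustration, and the input is assumed to start with a space, as the [0] captures suggest. It uses the same GetGroupNames/Matches loop as the button handler but returns the lines instead of appending them to a TextBox.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RegexCaptureDemo
{
    // Hypothetical helper: same group-dumping loop as btnParse_Click,
    // but collecting the lines in a list so they can be printed or checked.
    public static List<string> DumpGroups(string pattern, string input)
    {
        var lines = new List<string>();
        var regex = new Regex(pattern);
        string[] groupNames = regex.GetGroupNames();

        foreach (Match m in regex.Matches(input))
        {
            foreach (string name in groupNames)
            {
                Group g = m.Groups[name];
                if (g.Success)
                {
                    lines.Add("[" + name + "](" + g.Value + ")");
                }
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Assumption: the test input begins with a single space.
        string input = " 0ABC123456 ABC123456";

        string both = @"^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7}))";
        string firstOnly = @"^.*\s(?<Reference>(0[a-zA-Z]{3}\d{6}))";

        foreach (string pattern in new[] { both, firstOnly })
        {
            Console.WriteLine(pattern);
            foreach (string line in DumpGroups(pattern, input))
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

Because both patterns start with ^, each one can produce at most one match on a single-line input, which is why only one block of groups comes back per run.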
However, in the first scenario, why did it return the second string and not the first? What gave the second pattern priority? It parses from left to right by default.
So what happens if we try the one character pattern I mentioned before
^.*(?<Reference>(A|C))
The output is
***********************************************************************
[0]( 0ABC123456 ABC)
[1](C)
[Reference](C)
***********************************************************************
If I make the search lazy
^.*?(?<Reference>(A|C))
It returns
***********************************************************************
[0]( 0A)
[1](A)
[Reference](A)
***********************************************************************
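The greedy versus lazy difference above can be checked in a few lines. This is only a sketch: GreedyVsLazyDemo, Reference and AllReferences are illustrative names, the input is assumed to start with a space, and the unanchored pattern in AllReferences is just one possible way to get both references back, not necessarily the solution implemented for the client.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Returns the "Reference" group of the first match, or an empty string if there is none.
    public static string Reference(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? m.Groups["Reference"].Value : string.Empty;
    }

    // Assumption: dropping the leading "^.*\s" lets Matches() walk the input
    // left to right and report every reference instead of a single anchored match.
    public static string[] AllReferences(string input)
    {
        var unanchored = new Regex(@"(?<Reference>0[a-zA-Z]{3}\d{6}|[a-zA-Z]{3}\d{3,7})");
        return unanchored.Matches(input)
                         .Cast<Match>()
                         .Select(m => m.Groups["Reference"].Value)
                         .ToArray();
    }

    public static void Main()
    {
        string input = " 0ABC123456 ABC123456"; // leading space assumed

        // Greedy .* grabs the whole line, then backs off to the LAST A or C.
        Console.WriteLine(Reference(@"^.*(?<Reference>(A|C))", input));   // C

        // Lazy .*? expands one character at a time and stops at the FIRST A or C.
        Console.WriteLine(Reference(@"^.*?(?<Reference>(A|C))", input));  // A

        // Without the anchored prefix both references are returned.
        Console.WriteLine(string.Join(", ", AllReferences(input)));       // 0ABC123456, ABC123456
    }
}
```

The same reasoning explains the first scenario: the greedy .* consumes the whole line, backs off to the last whitespace, and only then tries the alternation, so the second string wins even though the pattern is read left to right.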
