Why is Regex.Match not matching the same strings as Regex.Matches?

I made a find-and-replace dialog with a regex option. There is a button to test a regex, which highlights all matches, and a button to find individual matches. With some regular expressions both methods find the same matches. Other regular expressions yield no matches with Regex.Match, but behave as expected with the collection returned by Regex.Matches. I have tried different RegexOptions when constructing the Regex, but haven't found any option that makes it behave as desired.

The goal here is to be able to test a regex, with ButtonTestRegex, then to be able to select each match, with a Find or Replace button.

Public rtb As RichTextBox

Private Sub ButtonTestRegex_Click(sender As Object, e As EventArgs)
    rtb.Select(0, rtb.TextLength)
    rtb.SelectionColor = Color.Black

    Dim rgx As New Regex("(duplicate of )*([0-9]:+)*")

    Dim matches As MatchCollection = rgx.Matches(rtb.Text)
    For Each match As Match In matches
        rtb.Select(match.Index, match.Length)
        rtb.SelectionColor = Color.Red
    Next
End Sub

Private Sub ButtonFind_Click(ByVal sender As Object, ByVal e As EventArgs)
    rtb.Focus()
    rtb.SelectionStart = 0
    rtb.SelectionLength = 0
    Dim rgx = New Regex("(duplicate of )*([0-9]:+)*")
    Dim match As Match = rgx.Match(rtb.Text)
    If match.Value <> "" Then
        rtb.SelectionStart = match.Index
        rtb.SelectionLength = match.Length
    End If
End Sub

With a RichTextBox containing the following:

1:remainder

duplicate of 1:remainder

duplicate of duplicate of 1:remainder

The code above matches all of the text except "remainder" with ButtonTestRegex_Click() (as expected), but nothing is matched with ButtonFind_Click(). The code is being executed, and it does work with some regexes, e.g. [0-9].

This code sample is abbreviated for clarity. My question is, why does Regex.Match not match anything in this case but Regex.Matches does?

Jon Skeet

I suspect you have a space or something at the start of the text in your RichTextBox. If so, the behaviour makes total sense. Look at your regular expression:

(duplicate of )*([0-9]:+)*

That will match the empty string, because both groups are quantified with * and so may match zero times. So for example, if you find all the matches of that against "x", you'll find two empty matches: one before the x, and one after the x.
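As a minimal sketch of the "x" case, the snippet below lists the index and length of every match. The class and method names (EmptyMatchDemo, Describe) are made up for illustration; only Regex.Matches and the Match properties come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmptyMatchDemo
{
    // Hypothetical helper: returns "index:length" for every match of the pattern.
    public static string[] Describe(string pattern, string input)
    {
        MatchCollection matches = Regex.Matches(input, pattern);
        var result = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            result[i] = matches[i].Index + ":" + matches[i].Length;
        }
        return result;
    }

    static void Main()
    {
        // Prints "0:0, 1:0": an empty match before the x and another after it.
        Console.WriteLine(string.Join(", ", Describe("(duplicate of )*([0-9]:+)*", "x")));
    }
}
```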

When you call Match, that finds the first match - which it does successfully, but matches an empty string. When you call Matches, it will retrieve all the matches - and there are a lot of them. Here's a small C# console app to show them all, assuming a space at the start of the text:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        var regex = new Regex("(duplicate of )*([0-9]:+)*");
        var input = @" 1:remainder
duplicate of 1:remainder
duplicate of duplicate of 1:remainder";
        // Print the length of every match; most of them are empty (length 0).
        foreach (Match match in regex.Matches(input))
        {
            Console.WriteLine(match.Length);
        }
    }
}

The output of that starts like this:

0
2
0
0
0
0
0
0
0
0
0
0
0
15
0

... but there's a lot of output.
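To relate this back to Regex.Match, here is a small sketch that prints only the first match, with and without a leading space. The names FirstMatchDemo and First are illustrative, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

static class FirstMatchDemo
{
    // Hypothetical helper: returns "index:length" for the first match only,
    // which is all that Regex.Match reports.
    public static string First(string input)
    {
        Match m = Regex.Match(input, "(duplicate of )*([0-9]:+)*");
        return m.Index + ":" + m.Length;
    }

    static void Main()
    {
        Console.WriteLine(First("1:remainder"));  // 0:2, the match is "1:"
        Console.WriteLine(First(" 1:remainder")); // 0:0, an empty match before the space
    }
}
```

In the second case the match succeeds, but its Value is the empty string, so a check like match.Value <> "" treats it as "nothing found".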

It's not entirely clear what you were trying to match, but you probably want to make sure that empty strings don't match your regex.
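One possible way to do that, assuming the intent is "any number of 'duplicate of ' prefixes followed by at least one digit-and-colon group", is to change the final * to +. That is a guess at the intended pattern, and FindNext is a hypothetical helper sketching what a Find button could call:

```csharp
using System;
using System.Text.RegularExpressions;

static class NonEmptyMatchDemo
{
    // Assumption: at least one digit-and-colon group is required ("+" instead of "*"),
    // so this pattern can never match an empty string.
    static readonly Regex Pattern = new Regex("(duplicate of )*([0-9]:+)+");

    // Hypothetical helper for a "Find next" button: searches from startAt onwards.
    public static Match FindNext(string text, int startAt)
    {
        return Pattern.Match(text, startAt);
    }

    static void Main()
    {
        string text = " 1:remainder\nduplicate of 1:remainder\nduplicate of duplicate of 1:remainder";
        int position = 0;
        for (Match m = FindNext(text, position); m.Success; m = FindNext(text, position))
        {
            Console.WriteLine(m.Index + ":" + m.Length); // 1:2, then 13:15, then 38:28
            position = m.Index + m.Length;               // continue after the match just found
        }
    }
}
```

Because every match is now non-empty, the search position always advances, and Success becomes false once no further match exists.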

See more on this question at Stack Overflow