I have encountered some odd behaviour while implementing a GroupJoin with a custom IEqualityComparer.
The following code demonstrates the behaviour that is causing me problems:
List<String> inner = new List<string>() { "i1", "i2" };
List<String> outer = new List<string>() { "o1", "o2" };
var grouped = outer.GroupJoin(inner, i => i, o => o,
    (inKey, outCollection) => new { Key = inKey, List = outCollection },
    new EqualityComparer<string>((i, o) => i == o)).ToList();
From the docs on MSDN, I would expect the last parameter to be passed a series of inner keys and outer keys for comparison.
However, placing a breakpoint inside the Func shows that both i and o start with the letter i and are in fact both elements of the inner collection, so the grouped
object is always empty (I know the example will always be empty; it's just the smallest bit of code that demonstrates the problem).
Is there a way to GroupJoin objects with a custom comparator?
For completeness, this is the EqualityComparer that is being created in the GroupJoin argument list:
public class EqualityComparer<T> : IEqualityComparer<T>
{
    public EqualityComparer(Func<T, T, bool> cmp)
    {
        this.cmp = cmp;
    }

    public bool Equals(T x, T y)
    {
        return cmp(x, y);
    }

    public int GetHashCode(T obj)
    {
        // Always return 0 so that the function is called
        return 0;
    }

    public Func<T, T, bool> cmp { get; set; }
}
A GroupJoin operation first needs to build a lookup - basically from each projected key in inner to the elements of inner with that key. That's why you're being passed inner values. This happens lazily in terms of "when the first result is requested" but it will consume the whole of inner at this point.
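The lookup-building step can be looked at in isolation using ToLookup, which accepts the same kind of comparer. This is only a sketch: LoggingComparer and LookupDemo are hypothetical names introduced for illustration, and the framework's GroupJoin is not guaranteed to build its lookup in exactly this way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer that records every pair of keys it is asked to compare.
public class LoggingComparer : IEqualityComparer<string>
{
    public List<string> Log { get; } = new List<string>();

    public bool Equals(string x, string y)
    {
        Log.Add(x + "," + y);
        return x == y;
    }

    // Constant hash code, as in the question, so that Equals is consulted
    // for every key already stored in the lookup.
    public int GetHashCode(string obj)
    {
        return 0;
    }
}

public static class LookupDemo
{
    // Builds a lookup over inner only and returns the comparisons that were made.
    public static List<string> BuildLookupAndGetLog()
    {
        var inner = new List<string> { "i1", "i2" };
        var comparer = new LoggingComparer();

        // ToLookup executes immediately and consumes all of inner. No outer key
        // is involved yet, so every comparison is between two inner keys.
        ILookup<string, string> lookup = inner.ToLookup(i => i, comparer);

        return comparer.Log;
    }

    public static void Main()
    {
        foreach (string pair in BuildLookupAndGetLog())
        {
            Console.WriteLine("Compared " + pair);
        }
    }
}
```

Every pair logged here consists of two inner keys, which matches what the breakpoint in the question shows.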
Then, once the lookup has been built, outer is streamed, one element at a time. At this point, your custom equality comparer should be asked to compare inner keys with outer keys. And indeed, when I add logging to your comparer (which I've renamed to avoid collisions with the framework EqualityComparer<T> type) I see that:
using System;
using System.Linq;
using System.Collections.Generic;

public class Test
{
    public static void Main()
    {
        List<String> inner = new List<string>() { "i1", "i2" };
        List<String> outer = new List<string>() { "o1", "o2" };
        outer.GroupJoin(inner, i => i, o => o,
            (inKey, outCollection) => new { Key = inKey, List = outCollection },
            new CustomEqualityComparer<string>((i, o) => i == o)).ToList();
    }
}

public class CustomEqualityComparer<T> : IEqualityComparer<T>
{
    public CustomEqualityComparer(Func<T, T, bool> cmp)
    {
        this.cmp = cmp;
    }

    public bool Equals(T x, T y)
    {
        Console.WriteLine("Comparing {0} and {1}", x, y);
        return cmp(x, y);
    }

    public int GetHashCode(T obj)
    {
        // Always return 0 so that the function is called
        return 0;
    }

    public Func<T, T, bool> cmp { get; set; }
}
Output:
Comparing i1 and i2
Comparing i1 and i2
Comparing i1 and i2
Comparing i2 and o1
Comparing i1 and o1
Comparing i2 and o2
Comparing i1 and o2
Now that's not the only possible implementation of GroupJoin, but it's a fairly obvious one. See my Edulinq post on GroupJoin for more details.
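To make the two phases concrete, here is a minimal sketch of such an implementation. It is an illustration under stated assumptions, not the framework's actual code: SimpleGroupJoin and GroupJoinSketch are hypothetical names, and argument validation is omitted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    // Hypothetical, simplified stand-in for GroupJoin (no argument validation).
    public static IEnumerable<TResult> SimpleGroupJoin<TOuter, TInner, TKey, TResult>(
        this IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKeySelector,
        Func<TInner, TKey> innerKeySelector,
        Func<TOuter, IEnumerable<TInner>, TResult> resultSelector,
        IEqualityComparer<TKey> comparer)
    {
        // Phase 1: consume all of inner to build the lookup. Because this is an
        // iterator block, nothing runs until the first result is requested.
        ILookup<TKey, TInner> lookup = inner.ToLookup(innerKeySelector, comparer);

        // Phase 2: stream outer one element at a time. The lookup indexer
        // returns an empty sequence when no inner key matches.
        foreach (TOuter item in outer)
        {
            yield return resultSelector(item, lookup[outerKeySelector(item)]);
        }
    }

    public static void Main()
    {
        var inner = new List<string> { "apple", "avocado", "banana" };
        var outer = new List<string> { "A", "B", "C" };

        // Keys are compared case-insensitively, so "A" matches the "a" keys.
        var groups = outer.SimpleGroupJoin(
            inner,
            o => o,
            i => i.Substring(0, 1),
            (o, matches) => o + ": " + string.Join(", ", matches),
            StringComparer.OrdinalIgnoreCase);

        foreach (string line in groups)
        {
            Console.WriteLine(line);
        }
    }
}
```

The usage in Main also suggests an answer to the original question: a custom comparer does work with GroupJoin, provided it defines equality over the projected keys and returns equal hash codes for keys it treats as equal. StringComparer.OrdinalIgnoreCase is used here only as a convenient built-in example of such a comparer.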