Assume there is a class:
public class StopTime
{
    public TimeSpan? ArrivalTime { get; set; }
    public TimeSpan? DepartureTime { get; set; }
    public string StopID { get; set; }
    public int StopSequence { get; set; }
    public string TripID { get; set; }
}
I have to read data from a CSV file and map it to this class. The CSV file can have many records; in my case, about 500,000.
After I parse the CSV file and map the data to a list of StopTime, I want to filter the StopTimes by TripID in several different functions.
In my scenario there are about 8000 distinct TripIDs in the list of StopTime.
I have tried to create a Dictionary of a list using this code:
var TripIdStops = new Dictionary<string, List<StopTime>>();
foreach (var tripId in ListOfTripId)
{
    // Every iteration scans the whole StopTimes list for one trip ID.
    TripIdStops.Add(tripId, StopTimes.Where(x => x.TripID == tripId).ToList());
}
To create the dictionary, this loop has to filter the whole StopTimes list once per trip ID: 500,000 records scanned for each of the 8000 TripIDs.
This is very time-consuming. Is there a way to improve the performance?
It sounds like you want a lookup:
var stopTimesByTripId = StopTimes.ToLookup(st => st.TripID);
Or, to narrow it down by ListOfTripId first:
var tripIdSet = new HashSet<string>(ListOfTripId);
var stopTimesByTripId = StopTimes.Where(st => tripIdSet.Contains(st.TripID))
                                 .ToLookup(st => st.TripID);
In both cases you'll only need to iterate through StopTimes once.
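As a rough, self-contained sketch of the two variants above: the StopTimeIndex class, its ByTrip and Sample methods, and the sample trip and stop IDs are made up for illustration. Only StopTime and the LINQ calls (ToLookup, Where, HashSet.Contains) come from the discussion above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StopTime
{
    public TimeSpan? ArrivalTime { get; set; }
    public TimeSpan? DepartureTime { get; set; }
    public string StopID { get; set; }
    public int StopSequence { get; set; }
    public string TripID { get; set; }
}

// Hypothetical helper class, not part of the original question or answer.
public static class StopTimeIndex
{
    // Groups every stop time by TripID in a single pass over the input.
    public static ILookup<string, StopTime> ByTrip(IEnumerable<StopTime> stopTimes)
    {
        return stopTimes.ToLookup(st => st.TripID);
    }

    // Same grouping, but keeps only the trips listed in tripIds.
    // The HashSet makes each membership check a constant-time operation on average.
    public static ILookup<string, StopTime> ByTrip(
        IEnumerable<StopTime> stopTimes, IEnumerable<string> tripIds)
    {
        var tripIdSet = new HashSet<string>(tripIds);
        return stopTimes.Where(st => tripIdSet.Contains(st.TripID))
                        .ToLookup(st => st.TripID);
    }

    // Made-up sample data standing in for the parsed CSV records.
    public static List<StopTime> Sample()
    {
        return new List<StopTime>
        {
            new StopTime { TripID = "T1", StopID = "A", StopSequence = 1 },
            new StopTime { TripID = "T2", StopID = "A", StopSequence = 1 },
            new StopTime { TripID = "T1", StopID = "B", StopSequence = 2 },
        };
    }
}
```

With this in place, StopTimeIndex.ByTrip(stopTimes)["T1"] returns the stop times for trip T1 in their original order. Unlike a Dictionary, indexing a lookup with a key it does not contain returns an empty sequence rather than throwing. The original loop performs roughly 8000 × 500,000 = 4 billion comparisons, while either variant here reads each record once.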