I have a method that iterates over a list of objects and, for each item in the list, fetches data from an external API.
Sometimes this is very slow (naturally), so I'd like to add all my items to a Task list instead and run the fetches on multiple threads at the same time. Is this possible without rewriting everything to be async? Today I'm using WebClient and fetching everything synchronously.
I tried something like this at first:
public Main()
{
    List<Task<Model>> taskList = new List<Task<Model>>();
    foreach (var aThing in myThings)
    {
        taskList.Add(GetStuffAsync(aThing));
    }
    List<Model> list = Task.WhenAll(taskList.ToArray()).Result.ToList();
}

public async Task<Model> GetStuffAsync(object aThing)
{
    // Stuff that is all synchronous
    // ...
    return anObject;
}
Rather than using async here, you can just make GetStuff a normal synchronous method and use Task.Run to create new tasks (which will normally be run on multiple threads) so that your fetches occur in parallel. You could also consider using Parallel.ForEach and the like, which is effectively similar to your current Main code.
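As a rough sketch of the Task.Run approach: the Model class and the body of GetStuff below are hypothetical stand-ins (Thread.Sleep simulates the blocking WebClient call), so treat this as an illustration under those assumptions rather than a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical model type; the original post does not show its shape.
public class Model
{
    public string Value { get; set; }
}

public static class Program
{
    // Plain synchronous method. A real version would do the blocking
    // WebClient call here; Thread.Sleep only simulates the delay.
    public static Model GetStuff(object aThing)
    {
        Thread.Sleep(200);
        return new Model { Value = aThing.ToString() };
    }

    public static List<Model> FetchAll(IEnumerable<object> myThings)
    {
        // Task.Run queues each call on the thread pool, so the blocking
        // fetches can overlap instead of running one after another.
        Task<Model>[] tasks = myThings
            .Select(aThing => Task.Run(() => GetStuff(aThing)))
            .ToArray();

        // Blocks the caller until every task has finished. WhenAll returns
        // the results in the same order as the tasks were supplied.
        return Task.WhenAll(tasks).Result.ToList();
    }

    public static void Main()
    {
        var myThings = new object[] { "a", "b", "c" };
        List<Model> list = FetchAll(myThings);
        Console.WriteLine(string.Join(",", list.Select(m => m.Value)));
    }
}
```

How many of these actually run at once depends on the thread pool, so with a large list it may take a moment for the pool to ramp up.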
Your async approach will not do what you want at the moment, because async methods run synchronously at least as far as the first await expression. In your case you don't have any await expressions, so by the time GetStuffAsync returns, it has already done all the actual work: you have no parallelism.
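A small experiment can make this visible. The sketch below uses a simplified, hypothetical GetStuffAsync that returns the ID of the thread it ran on, with Thread.Sleep standing in for the synchronous fetch.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Program
{
    // There is no await in the body, so all of it runs on the caller's
    // thread before the Task is handed back. The compiler reports this
    // as warning CS1998, which is suppressed here on purpose.
#pragma warning disable 1998
    public static async Task<int> GetStuffAsync(int id)
    {
        Thread.Sleep(100); // simulated blocking work
        return Thread.CurrentThread.ManagedThreadId;
    }
#pragma warning restore 1998

    public static void Main()
    {
        int callerThread = Thread.CurrentThread.ManagedThreadId;
        Task<int> task = GetStuffAsync(1);

        // The task is already finished when the call returns...
        Console.WriteLine(task.IsCompleted);
        // ...and the work was done on the calling thread.
        Console.WriteLine(task.Result == callerThread);
    }
}
```

Both lines print True, which is why adding such tasks to a list and calling Task.WhenAll gains nothing: each call has finished before the next one starts.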
Of course, an alternative is to keep your current code but make GetStuffAsync genuinely asynchronous.
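For comparison, here is a minimal sketch of what a genuinely asynchronous version might look like. It swaps WebClient for HttpClient and assumes each item is simply a URL string; the Model class and the example URLs are placeholders, not anything from the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical model type; the original post does not show its shape.
public class Model
{
    public string Value { get; set; }
}

public static class Program
{
    public static async Task<Model> GetStuffAsync(HttpClient client, string url)
    {
        // The method returns to its caller at this await while the request
        // is in flight, so several requests can be outstanding at once.
        string body = await client.GetStringAsync(url);
        return new Model { Value = body };
    }

    public static async Task<List<Model>> FetchAllAsync(
        HttpClient client, IEnumerable<string> urls)
    {
        // ToList forces every request to start before any is awaited.
        List<Task<Model>> tasks = urls
            .Select(url => GetStuffAsync(client, url))
            .ToList();

        Model[] results = await Task.WhenAll(tasks);
        return results.ToList();
    }

    public static void Main()
    {
        // Placeholder URLs for illustration only.
        var urls = new[] { "https://example.com/", "https://example.org/" };
        using (var client = new HttpClient())
        {
            // Blocking here is acceptable in a console entry point; code that
            // is already async should await FetchAllAsync instead.
            List<Model> list = FetchAllAsync(client, urls).GetAwaiter().GetResult();
            Console.WriteLine(list.Count);
        }
    }
}
```

No extra threads are needed with this approach, because nothing blocks while the requests are pending. The cost is that every caller up the chain either awaits or blocks at some boundary.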
It's also worth bearing in mind that the HTTP stack in .NET has per-host connection pool limits - so if you're fetching multiple URLs from the same host, you may want to increase those as well, or you won't get much parallelism.
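As an illustration, on the classic .NET Framework stack that WebClient uses, the per-host limit is exposed through ServicePointManager.DefaultConnectionLimit, and the documented default for client applications is low (2). The value 10 below is an arbitrary example, and newer runtimes may route HttpClient through a different setting, so verify this against your target framework.

```csharp
using System;
using System.Net;

public static class Program
{
    public static void Main()
    {
        // Only affects ServicePoint objects created after this line, so set
        // it once at startup, before the first request is made.
        // The value 10 is an arbitrary illustration, not a recommendation.
        ServicePointManager.DefaultConnectionLimit = 10;

        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```

Raising the limit only helps if the remote server is willing to accept that many simultaneous connections, so it is worth checking the API's rate limits first.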