Friday, January 10, 2014

Parallel Processing with Defined Degree of Parallelism using Task Parallelism

Sometimes you want to accomplish a very simple task, but when it comes to programming it you end up thinking about deadlocks, thread synchronization, maximum thread counts, processing time and so on. I faced a similar situation: I wanted to process some items that were independent of one another, but I couldn't process more than 4 items at a time. I ended up using Task Parallelism, which makes this easy to write and easy to understand. The other way of doing this is with ThreadPool and ManualResetEvent signalling, which you could call a retrofit for this problem.

To illustrate the problem, let's assume you want to parse the HTML content of a list of URLs on a site, but you can't process more than 4 URLs at a time, to avoid overloading your web server and bandwidth. Here is how it can be easily achieved using Task Parallelism:


            // Requires: using System; using System.Collections.Generic; using System.Threading.Tasks;

            // Fetch the list of URLs to scan.
            List<string> urlsToScan = GetUrlsToScan();

            // Run a parallel foreach over the URLs, with at most 4 being processed at any one time.
            Parallel.ForEach(urlsToScan,
                new ParallelOptions
                {
                    MaxDegreeOfParallelism = 4
                },
                url =>
                {
                    Console.WriteLine("{0} - Processing url: {1}", DateTime.Now, url);
                    ScrapUrl(url);
                });
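
If you want to convince yourself that the limit is respected, here is a minimal sketch of the same loop as a complete console program. It is only an illustration under a few assumptions: ThrottledDemo, Run and FakeScrapUrl are made-up names, FakeScrapUrl just sleeps instead of downloading anything (it stands in for ScrapUrl), and the URLs are placeholders. The program counts how many items are being processed at the same moment and reports the highest value it saw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDemo
{
    // Hypothetical stand-in for ScrapUrl: sleeps instead of downloading anything.
    private static void FakeScrapUrl(string url)
    {
        Thread.Sleep(50);
    }

    // Runs the throttled loop and returns the highest number of URLs
    // that were being processed at the same moment.
    public static int Run(IEnumerable<string> urls, int maxParallel)
    {
        int inFlight = 0;            // URLs being processed right now
        int peak = 0;                // highest value inFlight has reached
        object gate = new object();  // protects both counters

        Parallel.ForEach(urls,
            new ParallelOptions { MaxDegreeOfParallelism = maxParallel },
            url =>
            {
                lock (gate)
                {
                    inFlight++;
                    if (inFlight > peak) peak = inFlight;
                }
                try
                {
                    FakeScrapUrl(url);
                }
                finally
                {
                    lock (gate) { inFlight--; }
                }
            });

        return peak;
    }

    public static void Main()
    {
        // Placeholder URLs, not real pages.
        List<string> urls = Enumerable.Range(1, 20)
            .Select(i => "http://example.com/page" + i)
            .ToList();

        int peak = Run(urls, 4);
        Console.WriteLine("Peak concurrent items: {0}", peak);
    }
}
```

The reported peak should never be higher than the value passed as MaxDegreeOfParallelism. It can be lower, because the setting is an upper bound on concurrent operations and the scheduler is free to use fewer.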

I hope this helps you solve similar problems easily. Happy coding!
