Process in Parallel to Go Faster in .NET

When a chore takes a long time to complete, it will probably go faster if you can divide the work.

For example, if you paint a room on your own, this job might take a couple of hours. But with a couple of friends, the same amount of work can be completed in an hour or less, depending on the number of friends that are helping and how hard everyone is working.

The same applies to programming, where these friends are called threads. When you want to work through a collection faster, a common solution is to divide the work among threads that run concurrently.

In the early days of .NET, spawning new threads was manual work and required some knowledge. If you take a look at the thread docs you can see that it takes some "orchestration code" to manage these threads.

Because we write this code on our own, there's also a chance that it contains bugs. It gets even more complex when you spawn multiple threads to achieve the best performance.

Luckily, .NET hides all of these implementation details for us behind the Task Parallel Library (TPL), introduced in .NET Framework 4. It's also safe to say that the chance of bugs within this code is far lower than in a custom implementation.

Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on elements in a source collection or array. In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently.

Test Case

In our case, we had 60,000 items that had to be migrated from one system to another. It took 30 minutes to process 1,000 items, which adds up to 30 hours for all 60 batches of 1,000 items in the collection.

That's a lot of time, especially if you know that it isn't a difficult task to migrate an item. The simplified version of the initial code looked like this. A simple iteration over a list, and within the loop, the migration of an item where we:

- retrieve the details of the item
- migrate the item
- save the item into system B
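The three steps above can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the original migration code: GetItemDetails and SaveToSystemB are hypothetical stand-ins that simulate the I/O wait with Thread.Sleep, and only the name MigrateToSystemB comes from the text.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class SequentialMigration
{
    // Hypothetical stand-in for reading an item from the source system (I/O wait).
    private static string GetItemDetails(int id)
    {
        Thread.Sleep(10);
        return $"item-{id}";
    }

    // The in-memory conversion itself is cheap.
    private static string MigrateToSystemB(string item) => item.ToUpperInvariant();

    // Hypothetical stand-in for writing the item to system B (I/O wait).
    private static void SaveToSystemB(string migrated, List<string> systemB)
    {
        Thread.Sleep(10);
        systemB.Add(migrated);
    }

    // Migrates the items one at a time: each item waits for the previous one to finish.
    public static List<string> MigrateAll(IEnumerable<int> ids)
    {
        var systemB = new List<string>();
        foreach (var id in ids)
        {
            var item = GetItemDetails(id);
            var migrated = MigrateToSystemB(item);
            SaveToSystemB(migrated, systemB);
        }
        return systemB;
    }
}
```

Calling SequentialMigration.MigrateAll(new[] { 1, 2, 3 }) spends almost all of its time inside the two simulated waits, which is the pattern described next.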

When we took a closer look at where we lost time, it was clear that a lot of it was spent waiting: waiting for the item to be retrieved and waiting for the item to be saved. The migration in MigrateToSystemB is fast and took only a couple of milliseconds.

Normally, I would suggest limiting the amount of I/O to make this process faster. This can be done by retrieving and saving the items in batches, instead of one by one. For this use case, doing that wasn't an option.

The only way left to make this migration faster was to add parallelism to the code. Instead of migrating item by item in sequential order, where the following item is migrated only after the previous one is done, we want to migrate multiple items at the same time. Each migration runs on its own worker thread, so the time one item spends waiting on I/O is used to make progress on other items.

The easiest way to add parallelism to the loop is to use Parallel.ForEach. Internally, the Parallel.ForEach method partitions the collection and schedules the resulting chunks of work as tasks on multiple threads; it does not create one task for each item in the collection.

The Parallel class provides library-based data parallel replacements for common operations such as for loops, for each loops, and execution of a set of statements.

By default, Parallel.ForEach uses as many threads as the underlying scheduler provides. To lower the impact on the system, we can use the MaxDegreeOfParallelism option. This property limits the number of concurrent tasks so that we don't starve the other work running in the application.

The MaxDegreeOfParallelism option can be set to a static value, or we can use the Environment.ProcessorCount property to make it dependent on the machine's resources. For this migration, we configured the MaxDegreeOfParallelism option to use at most 75% of the machine's processors.
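A minimal sketch of such a configuration, assuming that "75% of the machine" means 75% of Environment.ProcessorCount rounded up. MigrateItem is a hypothetical stand-in for the retrieve, migrate, and save steps, and the results go into a ConcurrentBag because several threads add to it at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelMigration
{
    // 75% of the logical processors, rounded up, and never less than 1.
    public static int DegreeOfParallelism() =>
        Math.Max(1, (int)Math.Ceiling(Environment.ProcessorCount * 0.75));

    // Hypothetical stand-in for retrieving, migrating, and saving one item.
    private static string MigrateItem(int id)
    {
        Thread.Sleep(10); // simulated I/O wait
        return $"migrated-{id}";
    }

    public static ConcurrentBag<string> MigrateAll(IEnumerable<int> ids)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = DegreeOfParallelism()
        };

        // ConcurrentBag is safe to add to from multiple threads.
        var systemB = new ConcurrentBag<string>();
        Parallel.ForEach(ids, options, id => systemB.Add(MigrateItem(id)));
        return systemB;
    }
}
```

The order in which items land in the bag is not defined, so the sketch only suits work where each item is independent of the others.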

In our case, this "refactor" means that it now takes only 40 seconds to process 1,000 items. For the whole collection of 60,000 items, it takes 40 minutes. From 30 hours to 40 minutes with just a few lines of extra code! Because the degree of parallelism depends on the number of processors, it takes 20% longer on my machine than on the server.

But it doesn't stop here.

PARALLEL LINQ (PLINQ)

While the Parallel solution works fine for my use case, .NET also has Parallel LINQ (PLINQ).

PLINQ brings parallelism to the well-known LINQ API. This ensures that the code remains readable while writing more complex business logic, where you need to order, filter, or transform the data.

If you're already familiar with LINQ, I've got good news for you: you'll immediately feel right at home.

A PLINQ query in many ways resembles a non-parallel LINQ to Objects query. PLINQ queries, just like sequential LINQ queries, operate on any in-memory IEnumerable or IEnumerable<T> data source, and have deferred execution, which means they do not begin executing until the query is enumerated. The primary difference is that PLINQ attempts to make full use of all the processors on the system. It does this by partitioning the data source into segments, and then executing the query on each segment on separate worker threads in parallel on multiple processors. In many cases, parallel execution means that the query runs significantly faster.

To process the collection in a parallel manner, we can use the AsParallel() extension method followed by any of the existing LINQ extension methods. The Parallel example rewritten with the PLINQ syntax looks like this, where we use the ForAll extension method to iterate over the items.
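A sketch of what such a PLINQ version could look like, under the same assumptions as before: MigrateItem is a hypothetical stand-in for the per-item work, and the degree of parallelism is assumed to be 75% of Environment.ProcessorCount.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class PlinqMigration
{
    // Hypothetical stand-in for retrieving, migrating, and saving one item.
    private static string MigrateItem(int id)
    {
        Thread.Sleep(10); // simulated I/O wait
        return $"migrated-{id}";
    }

    public static ConcurrentBag<string> MigrateAll(IEnumerable<int> ids)
    {
        // Assumption: 75% of the logical processors, at least 1.
        int degree = Math.Max(1, (int)Math.Ceiling(Environment.ProcessorCount * 0.75));
        var systemB = new ConcurrentBag<string>();

        ids.AsParallel()
           .WithDegreeOfParallelism(degree)
           .ForAll(id => systemB.Add(MigrateItem(id))); // runs the action on the worker threads

        return systemB;
    }
}
```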

Note that we can set the degree of parallelism with the WithDegreeOfParallelism extension method.

The Parallel solution and this PLINQ solution delivered the same performance gain.

Differences between Parallel and PLINQ

While performance isn't (in most cases) a reason to choose one of these two solutions over the other, there are subtle differences between the PLINQ and the Parallel methods. Both have a right to exist and solve different problems.

The distinct advantages of both are well-explained in "When To Use Parallel.ForEach and When to Use PLINQ".

The main differences to remember are:

The degree of parallelism

- With Parallel you set the maximum degree, which means that the actual degree is impacted by the available resources
- With PLINQ you set the degree, meaning that that's the actual number of threads that are used

The order of execution

The order in which the tasks are invoked within a Parallel iteration is not deterministic. In other words:

- use Parallel to execute independent tasks
- if the order is important, use PLINQ, because it can preserve the order of the source (by calling AsOrdered() on the query)
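As a small illustration of the ordering point: PLINQ keeps the source order in its output when the query opts in with AsOrdered(), while a plain AsParallel() query may yield the same elements in any order. The class and method names below are made up for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // AsOrdered() makes PLINQ return results in the order of the source sequence.
    public static int[] SquaresInSourceOrder(IEnumerable<int> source) =>
        source.AsParallel()
              .AsOrdered()
              .Select(n => n * n)
              .ToArray();

    // Without AsOrdered(), the same results may come back in any order.
    public static int[] SquaresInAnyOrder(IEnumerable<int> source) =>
        source.AsParallel()
              .Select(n => n * n)
              .ToArray();
}
```

Preserving order costs some extra bookkeeping, so it is worth asking for only when the consumer depends on it.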

Using the result

Parallel doesn't return a result. The output of Parallel is a ParallelLoopResult, which contains the completion information of the loop (for example, whether all tasks completed) but nothing more. When you need a return value from the processed stream, use PLINQ. Because the tasks run concurrently, we need a way to merge the results of all the tasks into one result object. To specify how the result of each task is merged back into the output, use the merge options.
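The contrast can be sketched like this. The names are illustrative, and ParallelMergeOptions.NotBuffered is just one of the available merge options, chosen here so that results are handed to the consumer as soon as they are produced.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ResultsDemo
{
    // Parallel.ForEach only reports how the loop ended, not any per-item output.
    public static bool RunToCompletion(IEnumerable<int> ids)
    {
        ParallelLoopResult result = Parallel.ForEach(ids, id => { /* side effects only */ });
        return result.IsCompleted;
    }

    // PLINQ projects each item and merges the partial results into one sequence.
    public static List<string> MigrateAndCollect(IEnumerable<int> ids)
    {
        var query = ids.AsParallel()
                       .WithMergeOptions(ParallelMergeOptions.NotBuffered)
                       .Select(id => $"migrated-{id}");

        // The query is deferred; enumerating it here starts the parallel work.
        var collected = new List<string>();
        foreach (var item in query)
        {
            collected.Add(item);
        }
        return collected;
    }
}
```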

Break early to stop processing

Parallel provides a way to exit early with ParallelLoopState.Stop() and ParallelLoopState.Break(). Both prevent more iterations from starting, with the difference that Stop ends the loop as soon as possible, while Break still runs the iterations that come before the current one. To stop a PLINQ iteration, a CancellationToken is used, but this doesn't guarantee that the following iterations are not started.
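Both mechanisms can be sketched as follows. The class, method names, and the choice of Parallel.For over a range of integers are assumptions made for the example; BreakAt records which iterations ran so the effect of Break is visible, and CancelQuery cancels the token from inside the query purely to keep the sketch short.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class EarlyExitDemo
{
    // Break(): iterations below the breaking index are still guaranteed to run.
    public static (long? LowestBreak, bool[] Ran) BreakAt(int count, int breakAt)
    {
        var ran = new bool[count];
        ParallelLoopResult result = Parallel.For(0, count, (i, state) =>
        {
            ran[i] = true; // each iteration writes only its own slot
            if (i == breakAt)
            {
                state.Break();
            }
        });
        return (result.LowestBreakIteration, ran);
    }

    // PLINQ: a cancelled token surfaces as an OperationCanceledException.
    public static bool CancelQuery(int count, int cancelAt)
    {
        using (var cts = new CancellationTokenSource())
        {
            try
            {
                Enumerable.Range(0, count)
                    .AsParallel()
                    .WithCancellation(cts.Token)
                    .ForAll(n =>
                    {
                        if (n == cancelAt)
                        {
                            cts.Cancel();
                        }
                    });
                return false; // the query finished without being cancelled
            }
            catch (OperationCanceledException)
            {
                return true;
            }
        }
    }
}
```

Iterations above the breaking index may or may not have run by the time the loop ends, which is why only the lower ones are worth relying on.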

DATAFLOW (TASK PARALLEL LIBRARY)

Besides the Parallel and PLINQ methods, there's a third library called Dataflow (Task Parallel Library). I encountered Dataflow for the first time while solving my performance issue.

The Task Parallel Library (TPL) provides dataflow components to help increase the robustness of concurrency-enabled applications. These dataflow components are collectively referred to as the TPL Dataflow Library. This dataflow model promotes actor-based programming by providing in-process message passing for coarse-grained dataflow and pipelining tasks. The dataflow components build on the types and scheduling infrastructure of the TPL and integrate with the C#, Visual Basic, and F# language support for asynchronous programming.
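To give an impression of the pipeline style, here is a small sketch of the same migration expressed as two linked blocks. It assumes the System.Threading.Tasks.Dataflow library is available (on .NET Framework it ships as a NuGet package), and the block layout, names, and simulated delay are illustrative choices rather than the setup used in the migration.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowMigration
{
    public static async Task<ConcurrentBag<string>> MigrateAllAsync(IEnumerable<int> ids, int maxParallel)
    {
        var systemB = new ConcurrentBag<string>();

        // Stage 1: retrieve and migrate, with several items in flight at once.
        var migrate = new TransformBlock<int, string>(
            async id =>
            {
                await Task.Delay(10); // simulated I/O wait
                return $"migrated-{id}";
            },
            new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = maxParallel });

        // Stage 2: save the migrated item.
        var save = new ActionBlock<string>(item => systemB.Add(item));

        // Completion flows from the first block to the second.
        migrate.LinkTo(save, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var id in ids)
        {
            migrate.Post(id);
        }
        migrate.Complete();   // no more input
        await save.Completion; // wait until everything has been saved
        return systemB;
    }
}
```

Each block has its own options, so the retrieve and save stages could be throttled independently of each other.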

The use case that inspired this post was one of the few times where I actually needed to use parallel programming. It's also the first time that it had such a big impact.

I like how easy .NET makes it to rewrite code that runs sequentially into code that runs in parallel. Because of it, we can focus on delivering business value without making the code difficult to write and read.

The learning curve isn't steep and it grows with the complexity of the use case:

- Use the Parallel.ForEach method for the simplest use case, where you just need to perform an action for each item in the collection
- Use the PLINQ methods when you need to do more, e.g. query the collection or stream the data
- Use the Dataflow methods when you want complete control over the processing pipeline

Does this mean that I'll sprinkle parallel programming all over the codebase? No, that's not what I'm recommending, because it comes with its own pitfalls and might even make things slower.

.NET Parallelism

Most of the parallel extensions in the .NET Framework were released with .NET 4, though I still find them drastically underused. The Visual Studio Async CTP (SP1 Refresh) later introduced the async and await keywords, and of course these made it into .NET Framework 4.5. I'm going to give you my own coverage of this feature, but it would be useless without leaving you comfortable with the concept of tasks. So the parts of the framework that I will cover in this article include:

- The Task Parallel Library
- The async and await keywords

Like the older programming models, you should use the new ones with care. Multithreaded programming still seems to be one of those things where once a programmer learns how it's done, they get carried away rather quickly.

In mobile programming, you should keep parallelism in the forefront of your thoughts. Why? Because mobile platforms allow immediate access to only one application at a time, so you should avoid monopolizing the device.

The Task Parallel Library

Perhaps the largest addition to the world of parallel programming in .NET is the Task Parallel Library. This library comprises the contents of the System.Threading.Tasks namespace. The two primary classes here are the Task class and the Parallel class. These two classes make up the core of the API you should use going forward to perform multithreaded programming, from here on referred to as parallel programming.

Task

The Task class is the centerpiece of the Task Parallel Library. A task represents an operation that is running or going to run. Using the Task class, you benefit from a fluent API that is easy to use and offers a lot of flexibility. Another benefit of the Task Parallel Library is that when it incorporates multithreading, it uses the thread pool. The thread pool manages thread usage for maximum throughput and scalability. Tasks don't necessarily have to execute on a separate thread, even though the Task class can be used in a multithreaded capacity. You can set tasks to execute on the current (or a specific) thread.
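A few of these points in code form, as a sketch with made-up names: Task.Run for thread pool work, ContinueWith for fluent chaining, and RunSynchronously as one way to execute a task on the calling thread (this assumes the default task scheduler is in use).

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class TaskBasics
{
    // Task.Run queues the work to the thread pool and returns a handle to it.
    public static Task<int> SumInBackground(int[] numbers) =>
        Task.Run(() => numbers.Sum());

    // Continuations chain fluently onto an existing task.
    public static Task<string> DescribeSum(int[] numbers) =>
        SumInBackground(numbers).ContinueWith(t => $"sum={t.Result}");

    // A task does not need a thread of its own: with the default scheduler,
    // RunSynchronously executes it on the calling thread.
    public static bool RunsOnCallingThread()
    {
        int callerId = Environment.CurrentManagedThreadId;
        var task = new Task<int>(() => Environment.CurrentManagedThreadId);
        task.RunSynchronously();
        return task.Result == callerId;
    }
}
```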

Going “Async”

The .NET Framework contains many things that we refer to as syntactic sugar. This term means that to accomplish something with the .NET Framework's base class library that requires a specific pattern, usually a complex one, we as developers are given the luxury of a simple code syntax. The compiler team at Microsoft didn't have to add new features and keywords to the IL compiler. They simply modified the language compilers to turn the simple syntax into the complex one when they generate the IL during compilation.
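To make the "sugar" idea concrete, the sketch below writes the same operation twice: once with async and await, and once with an explicit continuation. The continuation version is only a rough hand-written approximation of the pattern; the code the compiler actually generates is a state machine and looks different. LoadAsync is a hypothetical helper that simulates an asynchronous read.

```csharp
using System.Threading.Tasks;

public static class AsyncSugar
{
    // Hypothetical asynchronous read, simulated with a short delay.
    private static Task<string> LoadAsync(int id) =>
        Task.Delay(10).ContinueWith(_ => $"item-{id}");

    // The simple syntax: reads like sequential code.
    public static async Task<string> MigrateWithAwait(int id)
    {
        string item = await LoadAsync(id);
        return item.ToUpperInvariant();
    }

    // Roughly the same behavior, spelled out with an explicit continuation.
    public static Task<string> MigrateWithContinuation(int id) =>
        LoadAsync(id).ContinueWith(t => t.Result.ToUpperInvariant());
}
```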

Background vs. Foreground Threads

A thread is a thread is a thread. For CLR-managed threads, what makes a thread a so-called “background thread” is that the application does not wait for it at closing time. When an application is closed, it checks whether any foreground threads other than the primary one are still running and, if so, it pauses its shutdown until they finish their work. Background threads are simply killed. Thread pool threads are automatically background threads. To create a foreground thread, you must create a Thread yourself and start it with Thread.Start; such a thread is a foreground thread unless its IsBackground property is set to true.
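The distinction can be checked through the Thread.IsBackground property. The helper names in this small sketch are made up for illustration.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ThreadKinds
{
    // A thread created through the Thread class is a foreground thread by default.
    public static bool NewThreadIsBackground()
    {
        var thread = new Thread(() => { });
        bool isBackground = thread.IsBackground;
        thread.Start();
        thread.Join();
        return isBackground;
    }

    // Work scheduled with Task.Run executes on a thread pool thread,
    // and thread pool threads are background threads.
    public static bool PoolThreadIsBackground() =>
        Task.Run(() => Thread.CurrentThread.IsBackground).GetAwaiter().GetResult();
}
```

Setting IsBackground to true on a manually created thread before starting it turns it into a background thread that will not keep the process alive.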