Perilous Parallel.For

I’ve been playing a lot with the Task Parallel Library because I’m really excited about the new Task-Based Asynchronous Pattern. Someone on the VB forum asked a question that was more or less about parallelizing a task, and I endeavored to make an example that compared the performance of some approaches.

When I got to the one that used Parallel.For, I started getting seemingly random incorrect results. Here’s some code that reproduces what I was seeing:

using System.Threading.Tasks;
using System.Threading;

static class Class1
{
    private const int NumberOfValues = 100;

    public static void Main()
    {
        int[] values = new int[NumberOfValues];
        int expectedSum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = i;
            expectedSum += i;
        }

        for (int i = 0; i < 10; i++)
        {
            int sum = 0;
            Parallel.For(0, values.Length, (int value) =>
                {
                    sum += value;
                    Thread.Sleep(10);
                });
            System.Console.WriteLine("{0}/{1}", expectedSum, sum);
        }
    }

}

The frequency of incorrect results seemed to depend on the presence and length of the Sleep() call. This told me it was most definitely a problem with concurrent access to the sum variable. My first attempt to fix it involved making a class with a lock:

class Sum
{
    private readonly object LockObject = new object();
    private int _sum;

    // Named Value because C# does not allow a member to share
    // the name of its enclosing type.
    public int Value
    {
        get
        {
            lock(LockObject) { return _sum; }
        }
        set
        {
            lock(LockObject) { _sum = value; }
        }
    }
}

It didn’t help. This is where I actually started converting the project to C# and planning out a Stack Overflow post, but after a few more minutes of thought I figured out what was happening.

When sum += value executes, it looks like one atomic step, but it’s not. It has to execute in at least three steps: it must fetch the current value of sum, it must apply the + operator to that result, and then it must assign the result to sum. If another thread assigns a new value to sum after this thread has fetched it but before this thread stores its own result, the other thread’s update is overwritten and lost. The Sum class above didn’t solve this because its lock only protects the get and the set individually; nothing stops another thread from running between the two. To solve the problem, I had to hold a single lock across the get and the set together:

// ...snip

for (int i = 0; i < 10; i++)
{
    _sum = 0;
    Parallel.For(0, values.Length, (int value) =>
        {
            lock (LockObject)
            {
                _sum += value;
            }
            Thread.Sleep(10);
        });
    System.Console.WriteLine("{0}/{1}", expectedSum, _sum);
}

This works as expected (I later updated it to use Interlocked.Add()). But if you look closely, it seems to destroy the concurrency! The point of the code is to sum an array of values, but if each thread can only execute serially there’s no value to having threads. What gives?
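For reference, the Interlocked.Add() variant mentioned above might look something like the sketch below. The class and method names (InterlockedSumDemo, SumIndices) are made up for illustration and are not from my original project; the only real change from the lock version is that the atomic add replaces the lock statement.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class InterlockedSumDemo
{
    // Hypothetical helper: sums the loop indices 0..count-1 in parallel.
    public static int SumIndices(int count)
    {
        int sum = 0;
        Parallel.For(0, count, (int value) =>
            {
                // Fetch, add, and store happen as one atomic operation,
                // so no update can be lost between the read and the write.
                Interlocked.Add(ref sum, value);
            });
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumIndices(100)); // prints 4950
    }
}
```

This still funnels every addition through one shared variable, so it has the same "serial in disguise" shape as the lock; it just makes each trip through the bottleneck cheaper.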

I don’t believe summation is impossible to parallelize, because addition is associative and commutative: partial sums can be computed in any order and combined afterward. However, since every update to the shared sum must be atomic, adding directly into one variable can’t really happen in parallel. When I remove the Sleep() call, this approach is faster than a single-threaded approach but much slower than some other approaches I took using the thread pool; those let each thread tally its own sum, then added each thread’s individual sum at the end. I wish I could think of a way to do that elegantly with Parallel.For().
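One candidate for the per-thread tally described above is the Parallel.For<TLocal> overload that takes a localInit delegate, a body that threads a local value through each iteration, and a localFinally delegate that runs once per worker. This is only a sketch of how that might be wired up; the names LocalSumDemo and SumWithLocals are hypothetical, and I haven’t benchmarked it against the thread pool versions.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class LocalSumDemo
{
    // Hypothetical helper: each worker keeps a private subtotal and only
    // touches the shared total once, when it finishes its share of the loop.
    public static int SumWithLocals(int[] values)
    {
        int total = 0;
        Parallel.For<int>(0, values.Length,
            // localInit: every worker starts its subtotal at zero.
            () => 0,
            // body: add to the private subtotal; no synchronization needed.
            (int i, ParallelLoopState state, int localSum) => localSum + values[i],
            // localFinally: merge each worker's subtotal atomically.
            (int localSum) => { Interlocked.Add(ref total, localSum); });
        return total;
    }

    public static void Main()
    {
        int[] values = new int[100];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = i;
        }
        Console.WriteLine(SumWithLocals(values)); // prints 4950
    }
}
```

The shared variable is now contended once per worker instead of once per element, which is the same idea as the hand-rolled thread pool approach.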

With the Sleep() calls inserted, Parallel.For() is dramatically faster than any of the other approaches I tried; I assume this is because the Sleep() calls sit outside the lock, so the threads can sleep at the same time, and the TPL may be bringing in more worker threads while the existing ones sit idle. While the Sleep() call may seem useless, it could represent 10ms worth of work that I have to do once I’ve updated _sum; if I cache any values I need while inside the lock statement, I could still gain some benefit from working with the cached values outside of the lock statement.
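Here’s a rough sketch of that caching idea, assuming the follow-up work only needs the running total as it stood right after the update. SnapshotDemo, Run, and SlowWork are hypothetical names; SlowWork just sleeps as a stand-in for the real 10ms of work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class SnapshotDemo
{
    private static readonly object LockObject = new object();
    private static int _sum;

    public static int Run(int count)
    {
        _sum = 0;
        Parallel.For(0, count, (int value) =>
            {
                int snapshot;
                lock (LockObject)
                {
                    _sum += value;
                    // Copy what the slow step needs while the lock is held.
                    snapshot = _sum;
                }
                // Runs outside the lock, so it can overlap across threads.
                SlowWork(snapshot);
            });
        return _sum;
    }

    // Hypothetical placeholder for work that uses the cached value.
    private static void SlowWork(int runningTotal)
    {
        Thread.Sleep(10);
    }

    public static void Main()
    {
        Console.WriteLine(Run(100)); // prints 4950
    }
}
```

The lock is held only for the add and the copy, so the serialized portion stays tiny compared to the work that follows it.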

So if you’re using Parallel.For(), be mindful of whether any variables you touch are safe for concurrent access. If they aren’t, consider different approaches.

I guess I have to weigh in on Silverlight

I’m going to assume you’re familiar with the Microsoft stack and moderately familiar with the current furor over the shift in Silverlight’s strategy. If you’re not, this is going to be a boring post and you should move on.

I see a few camps right now. There are people heavily invested in SL that feel a bit upset. There are some people heavily invested in SL that feel this is a misunderstanding. There are some people that state the writing on the wall’s always been there. Let’s focus on the last group, because they have the only position that isn’t summarized by its description.

Silverlight was interpreted as many things at release; I’ve always felt like it was supposed to be a Flash competitor. Regardless, the message has been that Silverlight was there to fill in some gaps that standard HTML didn’t fill well. Indeed, there are some things it still does that HTML 5 and its related technologies don’t address well. Microsoft’s new opinion is that Silverlight was never intended to take the place of simple HTML 5 applications, and instead was only intended for web applications that required a level of integration with the system that could not be achieved with HTML, even the superhero HTML 5. The more I think about it, the more I can’t understand why I didn’t see this message shine through from the beginning. Perhaps it’s the fault of their evangelism of SL for all things web, but that’s a different discussion.

The thing I don’t like about the “Silverlight is rightfully being snubbed in favor of HTML 5” arguments is that many proponents ignore the fact that Silverlight is still a good option for those applications that need tight integration with the system. Too many arguments chastise developers for picking SL in the first place. Too many arguments celebrate the passing of SL (which isn’t dead yet) and can’t wait for the world in which they build their awesome HTML 5 applications. Let’s talk about the SL application I’m heavily invested in.

I have a hand in a part of LabVIEW Web UI Builder. LabVIEW is a graphical dataflow programming language primarily targeted at engineers. It has excellent support for analysis of large data sets, superb interoperability with many types of hardware, and implemented intuitive support for multicore parallelism before the MS stack even realized it was something worth working on. The Web UI Builder is an implementation of the programming language in Silverlight with stripped-down support; it’s not intended to replace LV but it’s going to be very handy for some remote configuration scenarios that are tricky to handle with plain old native code. You can put a web browser on pretty much anything, and since SL’s client-side that device can serve SL content without requiring much in the way of muscle. It’s a great fit.

I see no way this could have been implemented in HTML, CSS, and JS. We require the ability to let the user create a UI, use those UI elements to create a diagram that represents program code, compile that code into something, and run that. It’s nice to have access to the user’s hard drive for this stuff. In the ideal case, it’d be nice to be able to work with hardware connected to the system. Last I checked, that’s not really on anyone’s roadmap for HTML and related technologies. As long as this is the case, there will always be room for Silverlight.

Will Microsoft toss aside Silverlight like they did VB6 when .NET released? I’m not sure. The initial statement gave that impression, and the clarification seemed to skirt around the issue. I’d like to think it’s all an issue of poor planning on the part of MS. It’s possible that, to an insider, their statements seem to answer the question of continued Silverlight support. Unfortunately, they’re preaching to a crowd that’s seen numerous well-accepted technologies pushed aside to make room for the latest. I don’t know that the murmurs are going to die down until details about SL 5 start to slip out. April’s quite a long way off. I think most of the discussion would die down if MS made it immediately clear that WP7 and the shift to HTML 5 aren’t going to take resources away from developing Silverlight as an excellent no-brainer deployment for cross-platform applications that need system integration. Right now Java and Mono are the only real story for that, and neither puts money in Microsoft’s pocket.

I’ll be writing SL code as long as my managers feel it’s the right tool for the job. I’m not an independent developer able to change stacks on a whim, and I don’t like that most of the community seems to ignore those of us that have roots and can’t just switch jobs. I’m aware of the excellent work going on in the non-Microsoft world and I like to keep an eye on Ruby, Python, and a handful of other languages just in case. But I’m happy with my job, I like the people I work with, and I don’t have any problems with the Microsoft stack that are specific to Microsoft. I don’t aim to fix things that aren’t broken.