Defensive Threading

Designing and writing multi-threaded code is hard. Threading adds a debilitating amount of complexity to the Von Neumann architecture and, despite 50+ years of theory and implementation, the threading support in most programming languages is low level, the equivalent of malloc rather than GC managed memory.

If you’re a developer and you approach threading with trepidation then here are some experience-tested guidelines and techniques that I try to follow. The overall strategy is to reduce the surface area for bugs and to increase the maintainability of the code. These are not hard and fast rules. It’s important to be pragmatic and to adjust for the specific situation.

But first, please note that I’m no expert. I’m sharing what I’ve learned and I know I’m not done learning. If this post helps you, I want to know. If you think I’ve got something wrong, I want to know. Thank you.

 

Avoid sharing data.

Sharing data between threads is the most common source of threading bugs. By definition many of the nastiest threading bugs can only occur when data is shared. Safely sharing data requires synchronization either in the form of a lock or an atomic primitive. Getting the synchronization right is not always easy. Avoid sharing data and the threading bug potential goes way down. As a bonus, the code can potentially run faster because the possibility of synchronization contention is eliminated.

It may seem that without sharing data, threads are pretty useless. Not so. But this is a case where you need to be willing and able to trade space for speed and safety. Give a thread its own private copy of everything it may need, let the thread run to completion (‘wait’ or ‘join’ on the thread), and then collect the results of its computations.
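
Here’s a rough C# sketch of that shape using the Task library (the file names and the uppercasing are just placeholder work):

using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Each task gets its own private input and its own private result list.
// Nothing is shared while the tasks run, so no synchronization is needed.
string[] inputs = { "a.txt", "b.txt", "c.txt" };

Task<List<string>>[] tasks = inputs
    .Select(file => Task.Run(() =>
    {
        var localResult = new List<string>();            // private to this task
        foreach (var line in File.ReadLines(file))
            localResult.Add(line.ToUpperInvariant());
        return localResult;
    }))
    .ToArray();

Task.WaitAll(tasks);                                     // 'join' - let every task run to completion
var allResults = tasks.SelectMany(t => t.Result).ToList();  // collect the results afterwards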

Avoid sharing logging.

Logging is data. See above. Sometimes the messages generated on a thread need to be reported more or less immediately. But if a delay can be tolerated, a thread can have its own private log that gets collected after the thread has run to completion.
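
The same idea applied to logging, as a sketch (the messages are placeholders):

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var task = Task.Run(() =>
{
    var log = new List<string>();       // a private log - no synchronization needed
    log.Add("step 1 done");
    log.Add("step 2 done");
    return log;
});

task.Wait();                            // let the thread run to completion
foreach (var message in task.Result)    // then collect and report its messages
    Console.WriteLine(message);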

When you must share, prefer lighter weight synchronization mechanisms.

I didn’t understand lock-free threading until I understood that lock-free doesn’t mean no synchronization mechanisms at all.

Under Windows the Interlocked* APIs represent atomic operations implemented by specific processor instructions. Critical sections are implemented via the interlocked functions. The interlocked functions and critical sections on Windows, and their equivalents on other platforms, are generally the lightest-weight synchronization mechanisms.
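
In C#, for example, the same primitives surface through the Interlocked class. A sketch of the difference in weight for a simple shared counter (the class and member names are made up):

using System.Threading;

class Counters
{
    private long _requestCount;
    private readonly object _gate = new object();

    // Lighter weight: a single atomic increment, no lock object involved.
    public void IncrementAtomic() => Interlocked.Increment(ref _requestCount);

    // Heavier weight: a full monitor acquire and release just to add one.
    public void IncrementLocked()
    {
        lock (_gate) { _requestCount++; }
    }

    public long Read() => Interlocked.Read(ref _requestCount);
}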

Technically the interlocked functions are not locks, they are hardware implemented primitives. But colloquially developers will speak of ‘locks’ and mean the whole set of synchronization mechanisms, hence my confusion over lock-free threading.

Having said that I will now forgo rigor and refer to all synchronization mechanisms as ‘locks’ because it’s pithier.

Use the smallest number of locks possible.

Don’t treat locks like magic pixie dust and sprinkle them everywhere. Synchronization locks provide safety but at a cost in performance and a greater potential for bugs. Yes, it’s kind of paradoxical.

Hold locks for the shortest length of time possible.

A critical section, for example, allows only one thread to enter a section of code at a time. The longer the execution time of the section, the longer the window for other threads to boxcar up waiting for entry.

If there are values that can be computed before entering a critical section, do so. Only the statements that absolutely must be protected should be within the section.
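
For example (a sketch; the journal and the formatting work are placeholders):

using System;
using System.Collections.Generic;

class Journal
{
    private readonly object _gate = new object();
    private readonly List<string> _entries = new List<string>();

    public void Append(DateTime when, string text)
    {
        // Do the formatting work before entering the critical section...
        string formatted = when.ToString("O") + " " + text;

        // ...so the lock covers only the statement that must be protected.
        lock (_gate)
        {
            _entries.Add(formatted);
        }
    }
}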

Sometimes a value needs to be retrieved from a synchronized source, used as part of a computation, and a product of the computation needs to be stored to a synchronized destination. If the whole operation does not need to be atomic then, despite the desire to minimize locks, two independent synchronization locks could be better than one. Why? Because the source and destination are decoupled and the combined lock hold time is reduced because only the source read and the destination write are covered.
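
A sketch of the two-lock version (assuming the whole read-compute-write sequence does not need to be atomic; the queue, list, and squaring are placeholders):

using System.Collections.Generic;

class Pipeline
{
    private readonly object _sourceGate = new object();
    private readonly object _destinationGate = new object();
    private readonly Queue<int> _source = new Queue<int>();
    private readonly List<int> _destination = new List<int>();

    public void ProcessOne()
    {
        int item;
        lock (_sourceGate)                  // lock #1 covers only the source read
        {
            item = _source.Dequeue();       // (sketch: assumes the queue isn't empty)
        }

        int product = item * item;          // the computation itself runs under no lock

        lock (_destinationGate)             // lock #2 covers only the destination write
        {
            _destination.Add(product);
        }
    }
}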

Be DRY - Don’t Repeat Yourself.

Always a good policy, being DRY has special importance with threaded code. There should be only one expression or implementation of any given synchronized operation. Every thread should be executing the same code to ensure that the operation is performed consistently.

Design to ensure that every acquisition of a lock is balanced by a release of the lock.

Take advantage of the RAII pattern or the dispose pattern or whatever is appropriate to the language and platform. Don’t rely on the developer (even when the developer is yourself) to remember to explicitly add every release for every acquisition.
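
In C# the lock statement is one example of the pattern: it expands to Monitor.Enter/Monitor.Exit inside a try/finally, so the release always balances the acquisition even when the body throws. A sketch (the class is made up):

class Account
{
    private readonly object _gate = new object();
    private decimal _balance;

    public void Deposit(decimal amount)
    {
        // No explicit release to forget - the compiler emits it in a finally block.
        lock (_gate)
        {
            _balance += amount;
        }
    }
}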

Finish the threads you start.

Don’t fire and forget. Wait or join and clean up your threads and their resources.

Don’t let threads you created outlive the primary thread in the process. Some platforms have the unfortunate design of performing runtime set up and initialization, calling the program code, and then tearing down and de-initializing on the primary thread. Other threads that may still be running after the primary thread exits may fail when the runtime is pulled out from underneath them.

Don’t kill a thread. That will leave data in an indeterminate state. If appropriate, implement a way to signal your thread to finish.
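
One way to do that in C# is a cancellation flag the worker checks between units of work; a plain volatile bool or an event would serve just as well (the work loop here is a placeholder):

using System.Threading;

var cts = new CancellationTokenSource();

var worker = new Thread(() =>
{
    while (!cts.Token.IsCancellationRequested)
    {
        // ... do one unit of work, leaving data consistent at the end of each pass ...
        Thread.Sleep(100);
    }
    // The thread exits at a point of its own choosing, so state stays well defined.
});

worker.Start();
// ... later, when shutting down ...
cts.Cancel();     // signal, don't kill
worker.Join();    // and wait for it before the primary thread exits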

Don’t get focused on locking the data when it’s the operation that needs to be synchronized.

It’s easy to get fixated on the shared data, but oftentimes the design is better served by paying more attention to the shared operations.
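
For example, a check-then-update is an operation that has to be synchronized as a whole; locking the individual reads and writes of the field wouldn’t help. A sketch (the inventory is made up):

class Inventory
{
    private readonly object _gate = new object();
    private int _stock = 10;

    // The operation 'check availability and reserve' must be atomic. If only the
    // field accesses were locked, two threads could both see stock remaining and
    // both reserve the last item.
    public bool TryReserve(int quantity)
    {
        lock (_gate)
        {
            if (_stock < quantity) return false;
            _stock -= quantity;
            return true;
        }
    }
}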

Avoid nested locks.

Acquiring lock A and then lock B in one part of the code and elsewhere acquiring lock B and then lock A is asking for a deadlock. No-one sets out to write a deadlock. But sometimes the code isn’t exactly what was intended or a change is made and the impact of the change isn’t fully understood. If locks are never nested then the potential for this kind of deadlock just doesn’t exist.

If locks must be nested then always acquire the locks in the same order and always release the locks in the opposite order. And be DRY about it.
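
One way to stay DRY about it is to put the ordered acquisition in exactly one place; a sketch (the names are made up):

using System;

class TransferService
{
    private readonly object _lockA = new object();
    private readonly object _lockB = new object();

    // The single place where both locks are taken: always A, then B.
    // The nested 'lock' blocks release in the opposite order automatically.
    private void WithBothLocks(Action action)
    {
        lock (_lockA)
        {
            lock (_lockB)
            {
                action();
            }
        }
    }

    public void Transfer() => WithBothLocks(() => { /* ... work on both resources ... */ });
}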

Avoid method calls within a lock.

This may seem aggressively limiting but method implementations can change and introduce new issues like unwanted nested locks. Keeping method calls out of the scope of locks reduces coupling and the potential for deleterious side effects.

(I think of this one as following in the spirit of LoD.)
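
One way to follow the guideline is to copy what you need while holding the lock and make the call after releasing it. A sketch (the registry is made up; notify stands in for any method whose implementation you don’t control):

using System;
using System.Collections.Generic;

class Registry
{
    private readonly object _gate = new object();
    private readonly List<string> _names = new List<string>();

    public void Add(string name, Action<int> notify)
    {
        int countSnapshot;
        lock (_gate)
        {
            _names.Add(name);              // only the shared-state update is under the lock
            countSnapshot = _names.Count;  // copy out what the call will need
        }

        // The call happens outside the lock, so whatever it does internally
        // (including taking its own locks) can't nest inside ours.
        notify(countSnapshot);
    }
}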

Encapsulate intermediate values.

Avoid creating inconsistent state in shared data. Operations that create intermediate values shouldn’t be performed in situ on shared objects or data structures. Use local temporaries for intermediate values. Only update the shared data with the end products of an operation.
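
A sketch of building the new state in a local and publishing only the finished product (the price table is made up):

using System.Collections.Generic;

class PriceTable
{
    private readonly object _gate = new object();
    private Dictionary<string, decimal> _prices = new Dictionary<string, decimal>();

    public void Reload(IEnumerable<KeyValuePair<string, decimal>> source)
    {
        // Build the intermediate state in a private local; other threads
        // never see it half-built.
        var rebuilt = new Dictionary<string, decimal>();
        foreach (var pair in source)
            rebuilt[pair.Key] = pair.Value;

        // Only the end product touches the shared field.
        lock (_gate)
        {
            _prices = rebuilt;
        }
    }
}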

Be careful of using the double check lock pattern.

It’s a wonderful idea that’s so clever it’s broken in many languages and runtime platforms.

Where the DCLP is not inherently broken and can be used, it still needs to be implemented correctly. Mark the variable under test as ‘volatile’ (or the closest equivalent) so that the language compiler doesn’t optimize away one of the two checks. Don’t operate directly on the variable under test. Use a temporary to hold intermediate values and wait to assign to the variable under test until the new state is fully computed.

The DCLP is often used for one time or rare initializations.
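
For what it’s worth, here’s how the pattern looks in C#, where volatile is what keeps the two checks honest (a sketch; other languages need their own memory-ordering guarantees, and some can’t express this safely at all):

class ExpensiveResource
{
    private static readonly object _gate = new object();
    private static volatile ExpensiveResource _instance;   // volatile is essential here

    public static ExpensiveResource Instance
    {
        get
        {
            if (_instance == null)                // first check, no lock taken
            {
                lock (_gate)
                {
                    if (_instance == null)        // second check, under the lock
                    {
                        // Build in a temporary; assign the shared variable only
                        // when the new state is fully computed.
                        var created = new ExpensiveResource();
                        _instance = created;
                    }
                }
            }
            return _instance;
        }
    }

    private ExpensiveResource() { /* expensive one-time setup */ }
}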

Don’t lock the ‘this’ pointer.

If there are member functions of an object that need to synchronize, create a private member variable that is used as the lock target rather than locking on ‘this’. In addition to encapsulating the locking, using a private member guards against potential issues where a framework, library, or runtime may be performing synchronization by using the ‘this’ pointer.
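
In C# that looks something like this (a sketch):

class Cache
{
    // A private object whose only job is to be the lock target.
    private readonly object _syncRoot = new object();
    private int _hits;

    public void RecordHit()
    {
        lock (_syncRoot)   // not lock (this), so callers and frameworks can't contend on it
        {
            _hits++;
        }
    }
}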

Explain your code.

Leave comments for yourself and the next person.

Strive to make the design grokkable.

The original design and implementation can be bug free but if the design is hard to understand and follow, bugs can be introduced in maintenance cycles. Even if you are the only developer who may ever touch the code, remember that the details of the design that are so crystal clear in your mind today will get supplanted by tomorrow’s events and miscellany like … Squirrel!

Hacking SLN Files to Improve Visual Studio/Visual Source Safe Integration

There are some who think Microsoft Visual Source Safe (VSS) is great. There are some who think that it’s not great but it is pretty good. And there are some who think Source Safe is just good enough. I am not among any of those groups.

One pain point with Source Safe in a Microsoft tool chain is integration with Visual Studio. Historically Visual Studio has stored source control information directly in the solution and project files. This reveals itself to be a really terrible idea the first time a project or solution file is shared in Source Safe. Checkouts from within the IDE will use the source control bindings in the solution or project. If the solution or project file has been shared and not changed, the bindings will be pointing to the original location in Source Safe. Checking out through Visual Studio will check out the wrong file.

Yes, the solution and project files can be branched and the bindings updated. That’s sub-optimal. It means branching just to fix a tool problem and not because of any actual divergence in the code base. Any non-divergent changes to the solution or project must then be manually propagated to all the branched versions. Ugh.

The problem has not gone unnoticed at Microsoft. Over time and new releases of VStudio and VSS, improvements have been made.

My current place of employment is not using the latest and greatest. We’re on Visual Studio 2005 and Visual Source Safe 2005. The code base is in C#. I worked out a way to have shared solution and project files for non-web projects. Web projects in 2005 are different beasts and don’t follow the same rules. If you are using different versions of Visual Studio or Visual Source Safe your mileage will vary.

When getting files from VSS into a working directory, Visual Source Safe 2005 will create or update a mssccprj.scc file if any of the files present include a solution (.sln) or project file (for C#, .csproj). The mssccprj.scc file is a text file and it will contain the VSS bindings for the solution and project files that are in the same directory as the mssccprj.scc file.

In Visual Studio 2005, project files by default use the special ‘SAK’ string in the project file settings for VSS bindings. SAK indicates to Visual Studio that the bindings in the mssccprj.scc file should be used. The mssccprj.scc bindings are based on the directory retrieved from Source Safe. This means that shared project files just work. Yay.
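
For reference, the SAK markers live in the project file’s source control properties and look something like this (a typical VS 2005 C# project; the exact set of properties can vary):

<PropertyGroup>
  <SccProjectName>SAK</SccProjectName>
  <SccLocalPath>SAK</SccLocalPath>
  <SccAuxPath>SAK</SccAuxPath>
  <SccProvider>SAK</SccProvider>
</PropertyGroup>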

In 2005 the problem is with solution files.

More specifically the problem is with solution files that include project files that are not in the same directory as the solution file. Creating a MyHelloWorld project will create a MyHelloWorld.sln and a MyHelloWorld.csproj in a MyHelloWorld directory. The .sln will reference the .csproj by name only and both files will have bindings in the mssccprj.scc file in the same directory and that all works without issue. But create a blank solution and add multiple existing projects to it and Visual Studio 2005 reverts to storing specific VSS bindings for the projects in the solution file.

There’s a work-around but it doesn’t work for parent-relative paths. Any time a solution references a project where the path to the project file starts with ‘..’, Visual Studio will revert to storing a VSS binding in the solution file. Because of this limitation it becomes convenient to have a convention that solution files generally go in the root of the code base tree.

The goal is to get the VSS bindings out of the solution file and have Visual Studio rely on the mssccprj.scc file. I haven’t found a reliable way to do this from within the IDE with existing projects but the solution (.sln) files are just text files, so I hack them.

Here’s a snippet from a .sln file. In Source Safe, the solution file is in “$/CodeBase v1”. There are two projects: a class library and a Windows application. The SccProjectName<N> values are bindings to specific locations in the Source Safe repository. These are the bindings that need to be removed.


Global
    GlobalSection(SourceCodeControl) = preSolution
        SccNumberOfProjects = 3
        SccLocalPath0 = .
        SccProjectUniqueName1 = ClassLibrary1\\ClassLibrary1.csproj
        SccProjectName1 = \u0022$/CodeBase\u0020v1/ClassLibrary1\u0022,\u0020TAAAAAAA
        SccLocalPath1 = ClassLibrary1
        SccProjectUniqueName2 = WindowsApplication1\\WindowsApplication1.csproj
        SccProjectName2 = \u0022$/CodeBase\u0020v1/WindowsApplication1\u0022,\u0020WAAAAAAA
        SccLocalPath2 = WindowsApplication1
    EndGlobalSection

The .sln file can be edited to remove the SccProjectName<N> values but the SccLocalPath<N> must be updated to ‘.’ and a new property, SccProjectFilePathRelativizedFromConnection<N>, must be added with the old local path value and an appended directory separator.


Global
    GlobalSection(SourceCodeControl) = preSolution
        SccNumberOfProjects = 3
        SccLocalPath0 = .
        SccProjectUniqueName1 = ClassLibrary1\\ClassLibrary1.csproj
        SccLocalPath1 = .
        SccProjectFilePathRelativizedFromConnection1 = ClassLibrary1\\
        SccProjectUniqueName2 = WindowsApplication1\\WindowsApplication1.csproj
        SccLocalPath2 = .
        SccProjectFilePathRelativizedFromConnection2 = WindowsApplication1\\
    EndGlobalSection

Reference: Alin Constantin’s blog: The SAK source control strings in Visual Studio’s project files

Block Comment Quick Trick

Sometimes it’s useful to be able to quickly add or remove a section of code. Some languages have preprocessors that can help, but here’s a ‘stupid’ comment trick that achieves the same effect.

C-style syntax languages (e.g. C, C++, Java, JavaScript, C#) support single line comments with ‘//’ and block comments with a ‘/*’ and ‘*/’ pair. The trick is that single line comments can be used to comment out block comments.

So here’s a commented out code block:

/*

    var lorem = 'ipsum';

    ...

//*/

Note the ‘//’ for a single line comment in front of the block comment terminator. The block can be uncommented with a single keystroke: add a ‘/’ in front of the block start so the ‘/*’ becomes ‘//*’ and the code inside is live again.

//*

    var lorem = 'ipsum';

    ...

//*/