MSBuild Metadata doesn't support string instance methods

In MSBuild, string instance methods are directly supported for properties but not for item metadata values. This is a longstanding issue. Properties have property functions, which include string instance properties and methods: $(Prop.Length), for example, invokes the string.Length property on the value of the MSBuild property named Prop. Items have item functions that operate on the vector: @(Items->get_Length()) invokes the string.Length property on the Identity metadata value of each item in the ItemGroup reference. But using a metadata value other than Identity, or operating on a metadata reference instead of an ItemGroup reference, is not directly supported.

The workaround is to create a string from the metadata reference and then apply a property function to the string. Common approaches are to 'copy' the string or to use the ValueOrDefault property function.

An example of copying the string would be

$([System.String]::Copy('%(Filename)').Length)

An example of using the ValueOrDefault function would be

$([MSBuild]::ValueOrDefault('%(Filename)', '').Length)

Currently, the copy technique is considered more idiomatic and is preferred, and there is an optimization in the MSBuild internals that avoids actually creating a copy of the string.

An expanded example that contrasts Property functions, Item functions, and changing metadata values to strings and applying Property functions follows.

<!-- stringinstance.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!-- MSBuild Property (Scalar) -->
  <PropertyGroup>
    <Prop>FooBar</Prop>
  </PropertyGroup>

  <PropertyGroup>
    <!-- Examples of string instance methods as 'Property Functions' -->
    <PropLength>$(Prop.Length)</PropLength>
    <PropRemoveFirstChar>$(Prop.Substring(1))</PropRemoveFirstChar>
    <PropRemoveLastChar>$(Prop.Substring(0, $([MSBuild]::Subtract($(Prop.Length), 1))))</PropRemoveLastChar>
  </PropertyGroup>

  <!-- MSBuild Item (Vector) -->
  <ItemGroup>
    <Items Include="FooBar.txt;Quux.txt"/>
  </ItemGroup>

  <PropertyGroup>
    <!-- Examples of string instance methods as 'Item Functions' -->
    <ItemLengths>@(Items->get_Length())</ItemLengths>
    <ItemRemoveFirstChars>@(Items->Substring(1))</ItemRemoveFirstChars>
  </PropertyGroup>

  <ItemGroup>
    <!-- Examples of string instance methods on metadata values -->
    <Items2 Include="@(Items)">
      <FilenameLength>$([System.String]::Copy('%(Filename)').Length)</FilenameLength>
      <FilenameRemoveFirstChar>$([System.String]::Copy('%(Filename)').Substring(1))</FilenameRemoveFirstChar>
      <FilenameRemoveLastChar>$([System.String]::Copy('%(Filename)').Substring(0, $([MSBuild]::Subtract($([System.String]::Copy('%(Filename)').Length), 1))))</FilenameRemoveLastChar>
    </Items2>
  </ItemGroup>

  <Target Name="ShowResults">
    <Message Text="Property Functions"/>
    <Message Text="PropLength           = $(PropLength)"/>
    <Message Text="PropRemoveFirstChar  = $(PropRemoveFirstChar)"/>
    <Message Text="PropRemoveLastChar   = $(PropRemoveLastChar)"/>
    <Message Text="Item Functions"/>
    <Message Text="ItemLengths          = $(ItemLengths)"/>
    <Message Text="ItemRemoveFirstChars = $(ItemRemoveFirstChars)"/>
    <Message Text="Items2 with Metadata"/>
    <Message Text="@(Items2->'  Identity = %(Identity), FilenameLength = %(FilenameLength), FilenameRemoveFirstChar = %(FilenameRemoveFirstChar), FilenameRemoveLastChar = %(FilenameRemoveLastChar)', '%0d%0a')"/>
  </Target>
  <!--
    Output:
      ShowResults:
        Property Functions
        PropLength           = 6
        PropRemoveFirstChar  = ooBar
        PropRemoveLastChar   = FooBa
        Item Functions
        ItemLengths          = 6;4
        ItemRemoveFirstChars = ooBar;uux
        Items2 with Metadata
          Identity = FooBar.txt, FilenameLength = 6, FilenameRemoveFirstChar = ooBar, FilenameRemoveLastChar = FooBa
          Identity = Quux.txt, FilenameLength = 4, FilenameRemoveFirstChar = uux, FilenameRemoveLastChar = Quu
    -->

</Project>

Imagine a scenario with a set of files where some of the filenames use a special suffix.

<!-- suffixexample.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <SourceList Include="a.txt;b.txt;a-alt.txt;b-alt.txt"/>
  </ItemGroup>

  <PropertyGroup>
    <suffix>-alt</suffix>
  </PropertyGroup>

  <ItemGroup>
    <Files Include="@(SourceList)">
      <HasSuffix>$([System.String]::Copy('%(Filename)').EndsWith('$(suffix)'))</HasSuffix>
      <PrimaryName Condition="%(HasSuffix)">$([System.String]::Copy('%(Filename)').Substring(0, $([MSBuild]::Subtract($([System.String]::Copy('%(Filename)').Length), $(suffix.Length)))))</PrimaryName>
      <PrimaryName Condition="!%(HasSuffix)">%(Filename)</PrimaryName>
    </Files>
  </ItemGroup>

  <Target Name="ListFilesWithSuffix">
    <Message Text="@(Files->'%(Identity) PrimaryName = %(PrimaryName)','%0d%0a')" Condition="%(HasSuffix)" />
  </Target>
  <!--
    Output:
      ListFilesWithSuffix:
        a-alt.txt PrimaryName = a
        b-alt.txt PrimaryName = b
    -->

</Project>

The Files ItemGroup is created with metadata: a boolean indicating whether the special suffix is present, and the filename sans suffix (regardless of whether the suffix is present). The Target 'ListFilesWithSuffix' uses metadata batching to display the files that have the suffix.

MSBuild, About Import Guards

It is reasonable, for maintainability and for reusability, to parcel MSBuild code into multiple files and then, for a given project, to Import files as needed to support specific functionality.

MSBuild checks for duplicate imports (including 'self' imports). When a duplicate import is detected, MSBuild shows a warning (either MSB4011 'DuplicateImport' or MSB4210 'SelfImport') and blocks the import. Barring duplicate imports prevents circular references.

However, as the number of files and/or the number of maintainers increases, it becomes easier to introduce a duplicate import unexpectedly.
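As an illustration (the filenames here are hypothetical), a project that imports the same file twice will trigger the duplicate-import warning:

```xml
<!-- dup.proj (hypothetical): the second Import of shared.targets is a
     duplicate; MSBuild warns (MSB4011) and blocks the second import. -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Import Project="shared.targets" />
  <!-- In practice the duplicate usually arrives indirectly, through a chain
       of imports, rather than on the very next line. -->
  <Import Project="shared.targets" />

</Project>
```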

Import Guards

Custom MSBuild files can be written with "import guards", which are very much like old-school C header file include guards.

Within a given file, define a property that is unique to the file and that will only ever be defined by the given file.

A file named 'common.targets' might start with the following:

<!-- common.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-CommonTargets>defined</ImportGuard-CommonTargets>
  </PropertyGroup>

The name of the property, 'ImportGuard-CommonTargets', is derived from the filename. If filenames repeat in different folders (for example, in a chain of Directory.Build.props or Directory.Build.targets files), then a different convention will be needed to ensure that the property names are unique.
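One possible convention (purely illustrative; the paths and property names below are hypothetical) is to fold the folder path into the property name, so that two files with the same filename get distinct guards:

```xml
<!-- In src/Directory.Build.props (hypothetical path): -->
<PropertyGroup>
  <ImportGuard-Src-DirectoryBuildProps>defined</ImportGuard-Src-DirectoryBuildProps>
</PropertyGroup>

<!-- In src/lib/Directory.Build.props (hypothetical path): -->
<PropertyGroup>
  <ImportGuard-Src-Lib-DirectoryBuildProps>defined</ImportGuard-Src-Lib-DirectoryBuildProps>
</PropertyGroup>
```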

Any import of 'common.targets' should use a Condition to test that the unique property for 'common.targets' is not defined.

<Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

If the property has no value, then the file is imported.

If the property has any value, then the file is already imported and should not be imported again.

With the use of import guards, a file can explicitly import all of its dependencies regardless of imports that may exist in other files. In the following diagram, A.proj can be explicit about import dependencies on both Y.targets and Z.targets.

[Diagram: A.proj imports both Y.targets and Z.targets; Y.targets also imports Z.targets.]

Without import guards, the import of Z.targets by A.proj is a duplicate import that generates a warning. This is because Y.targets imports Z.targets first. The warning can be resolved by modifying A.proj to remove the import of Z.targets.

With import guards for Y.targets and Z.targets, there is no warning. Further, if Y.targets is changed in the future to remove the import of Z.targets, there is no associated code change to A.proj.

Visualizing the Import Order

The -preprocess command-line argument to msbuild.exe (see Switches) produces a single aggregated output in which all the imported files are inlined, e.g. msbuild A.proj -preprocess:out.xml. The project is not built when this switch is used.

A lighter-weight approach is to create an item group of the imported files and create a target to report the item group.

At the top of each file, add an ItemGroup that includes the current file:

<ItemGroup>
  <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
</ItemGroup>

A target that reports the ItemGroup would be:

<Target Name="ListFiles">
  <Message Text="Project:" />
  <Message Text="  $(MSBuildProjectFullPath)" />
  <Message Text="Files:" />
  <Message Text="  @(Diagnostic-FilesList, '%0d%0a')"/>
</Target>

The Diagnostic-FilesList will list the files being used in the order in which they were imported. This is far less complete than the -preprocess option but, unlike -preprocess, the project is built and other targets can be run along with the ListFiles target.

Example

Following is a full example with five files: A.proj, B.proj, Y.targets, Z.targets, and common.targets.

First is 'common.targets'. The common.targets file defines an import guard. It also adds itself to the Diagnostic-FilesList ItemGroup. common.targets defines the ListFiles target.

<!-- common.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-CommonTargets>defined</ImportGuard-CommonTargets>
  </PropertyGroup>
  <!--
    Example import:
    <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />
    -->
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Target Name="ListFiles">
    <Message Text="Project:" />
    <Message Text="  $(MSBuildProjectFullPath)" />
    <Message Text="Files:" />
    <Message Text="  @(Diagnostic-FilesList, '%0d%0a')"/>
  </Target>

</Project>

The 'Z.targets' file defines two targets: DoSomeWork and Customize. Z.targets has an import guard, adds itself to Diagnostic-FilesList, and imports common.targets.

<!-- Z.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-ZTargets>defined</ImportGuard-ZTargets>
  </PropertyGroup>
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Target Name="DoSomeWork" DependsOnTargets="Customize">
    <Message Text="Working." Condition="'$(Work)' == ''"/>
    <Message Text="$(Work)." Condition="'$(Work)' != ''"/>
  </Target>

  <Target Name="Customize" />

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

The 'Y.targets' file imports Z.targets and redefines the Customize target. Y.targets has an import guard, adds itself to Diagnostic-FilesList, and imports common.targets.

<!-- Y.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-YTargets>defined</ImportGuard-YTargets>
  </PropertyGroup>
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Customize">
    <PropertyGroup>
      <Work>Writing</Work>
    </PropertyGroup>
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

A.proj and B.proj are 'project' files that use the set of .targets files. A.proj and B.proj each adds itself to Diagnostic-FilesList and imports common.targets.

A.proj imports Y.targets which in turn imports Z.targets. A.proj has an import of Z.targets but the condition that tests the import guard will prevent the import.

<!-- A.proj -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Main">
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Y.targets" Condition=" '$(ImportGuard-YTargets)' == '' " />
  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Main" DependsOnTargets="Preamble;DoSomeWork" />
  <!--
    Output:
      Preamble:
        Project A
      DoSomeWork:
        Writing.
    -->

  <Target Name="Preamble">
    <Message Text="Project A" />
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

B.proj imports Z.targets. The customization that is done in Y.targets is not seen and is not used by B.proj.

<!-- B.proj -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Main">
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Main" DependsOnTargets="Preamble;DoSomeWork" />
  <!--
    Output:
      Preamble:
        Project B
      DoSomeWork:
        Working.
    -->

  <Target Name="Preamble">
    <Message Text="Project B" />
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

Running the command "msbuild A.proj /t:ListFiles" will produce output like the following. (I have omitted the full paths of the files.)

ListFiles:
  Project:
    A.proj
  Files:
    A.proj
    Y.targets
    Z.targets
    common.targets

The A.proj, B.proj, Y.targets, and Z.targets files all import common.targets. This means that the ListFiles target can be run against any of the files.

The command "msbuild Y.targets /t:ListFiles" will produce output like the following.

ListFiles:
  Project:
    Y.targets
  Files:
    Y.targets
    Z.targets
    common.targets

Summary

Import guards allow a set of MSBuild files to be more loosely coupled. Adding or removing an import in a given file doesn't need to ripple out and force changes to other files. Because every MSBuild file is a 'Project', every MSBuild file is invocable. With import guards, a file that is not normally an entry point can still be run successfully, because it can declare all of its required imports even when, in the normal case, those imports are skipped. Encapsulating and limiting changes aids maintenance, and using a 'library' file as the primary project file aids testing.

How the Use of CallTarget can be a Code Smell in MSBuild

CallTarget, because it vaguely resembles a subroutine call, is sometimes used to approximate a procedural approach in MSBuild. That is a problem: MSBuild is a declarative language, and Targets are not subroutines.

Example

To elaborate, imagine there are two processes to be implemented in MSBuild. Process A has steps A-1, A-2, and A-3. Process B has steps B-1, B-2, and B-3. However, steps A-2 and B-2 are actually identical. The two processes have a common step. It makes sense to factor the common step into its own target that both processes can use.

From a procedural frame of mind, an implementation might look like the following. The output shows that steps A-1, Common (aka A-2), and A-3 run for Process A, and likewise for Process B.

<!-- calltarget-example-01.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA">
    <Message Text="perform Step A-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step A-3"/>
  </Target>
  <!--
    Output:
      PerformProcessA:
        perform Step A-1
      Common:
        perform Common Step
      PerformProcessA:
        perform Step A-3
    -->

  <Target Name="PerformProcessB">
    <Message Text="perform Step B-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      PerformProcessB:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
  </Target>

</Project>

However, an implementation that is MSBuild-native would eschew the CallTarget task and might look like the following.

<!-- calltarget-example-02.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA" DependsOnTargets="StepA1;Common">
    <Message Text="perform Step A-3"/>
  </Target>
  <!--
    Output:
      StepA1:
        perform Step A-1
      Common:
        perform Common Step
      PerformProcessA:
        perform Step A-3
    -->

  <Target Name="PerformProcessB" DependsOnTargets="StepB1;Common">
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      StepB1:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
  </Target>

  <Target Name="StepA1">
    <Message Text="perform Step A-1"/>
  </Target>

  <Target Name="StepB1">
    <Message Text="perform Step B-1"/>
  </Target>

</Project>

Both implementations perform the steps; so what's the difference and why does the difference matter?

Scope

The difference is that CallTarget creates a new scope. The new scope is initialized with the state of Properties and Items as of the point where the Target containing the CallTarget task started; changes made to Properties and Items within the containing Target will not be seen.

Following is a revision of the example code that uses CallTarget. A Property has been added that is modified at different points.

<!-- calltarget-example-01a.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA">
    <Message Text="perform Step A-1"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in PerformProcessA</ExampleValue>
    </PropertyGroup>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step A-3"/>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
  </Target>
  <!--
    Output:
      PerformProcessA:
        perform Step A-1
        PerformProcessA: ExampleValue = Initialized in Project;Modified in PerformProcessA
      Common:
        perform Common Step
        Common: ExampleValue = Initialized in Project;Modified in Common
      PerformProcessA:
        perform Step A-3
        PerformProcessA: ExampleValue = Initialized in Project;Modified in PerformProcessA
    -->

  <Target Name="PerformProcessB">
    <Message Text="perform Step B-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      PerformProcessB:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in Common</ExampleValue>
    </PropertyGroup>
    <Message Text="Common: ExampleValue = $(ExampleValue)"/>
  </Target>

  <PropertyGroup>
    <ExampleValue>Initialized in Project</ExampleValue>
  </PropertyGroup>

</Project>

Note that the change to append ";Modified in PerformProcessA" to the Property is not visible in the CallTarget task's scope. Also note that the change to append ";Modified in Common" which is done within the CallTarget task's scope is not seen by the Target that contains the CallTarget task.

Next is a similar revision of the example code that doesn't use CallTarget.

<!-- calltarget-example-02a.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA" DependsOnTargets="StepA1;Common">
    <Message Text="perform Step A-3"/>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
  </Target>
  <!--
    Output:
      StepA1:
        perform Step A-1
        StepA1: ExampleValue = Initialized in Project;Modified in StepA1
      Common:
        perform Common Step
        Common: ExampleValue = Initialized in Project;Modified in StepA1;Modified in Common
      PerformProcessA:
        perform Step A-3
        PerformProcessA: ExampleValue = Initialized in Project;Modified in StepA1;Modified in Common
      -->

  <Target Name="PerformProcessB" DependsOnTargets="StepB1;Common">
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      StepB1:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in Common</ExampleValue>
    </PropertyGroup>
    <Message Text="Common: ExampleValue = $(ExampleValue)"/>
  </Target>

  <Target Name="StepA1">
    <Message Text="perform Step A-1"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in StepA1</ExampleValue>
    </PropertyGroup>
    <Message Text="StepA1: ExampleValue = $(ExampleValue)"/>
  </Target>

  <Target Name="StepB1">
    <Message Text="perform Step B-1"/>
  </Target>

  <PropertyGroup>
    <ExampleValue>Initialized in Project</ExampleValue>
  </PropertyGroup>

</Project>

Note that, with this code, the final value of the Property is the result of all of the append operations: "Initialized in Project;Modified in StepA1;Modified in Common".

Project

MSBuild files are XML and the root element of an MSBuild file is the Project element. The Project will contain everything defined in the MSBuild file and everything from all imported files.

A Target is not a function. Targets have no arguments and no return values. A Target is a unit of work performed on or with the data in the Project.

Summary

Defining the target build order with DependsOnTargets doesn't have the overhead of creating a new scope. CallTarget is not a substitute for an ordering attribute.

There are special cases where CallTarget is uniquely useful, but those cases are uncommon.
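One commonly cited case, sketched below with hypothetical target names, is running a target after the tasks in the current target: DependsOnTargets always runs its targets before the containing target's tasks, so it cannot express "run afterwards". (An AfterTargets attribute on the other target is often a cleaner alternative.)

```xml
<!-- Sketch: Cleanup must run after Deploy's tasks have executed.
     DependsOnTargets runs before a target's tasks, so CallTarget is used.
     (Target names are hypothetical.) -->
<Target Name="Deploy">
  <Message Text="deploying" />
  <CallTarget Targets="Cleanup" />
</Target>

<Target Name="Cleanup">
  <Message Text="cleaning up" />
</Target>
```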

Rampant use of CallTarget is an indication that the author of the code was working in the wrong paradigm.

MSBuild Batching, an Explanation

The MSBuild language is a declarative programming language with no control-flow constructs for looping. Instead, for processing collections, MSBuild provides a language feature named 'batching'. Unfortunately, batching is complex and its behavior can appear inscrutable. Some coders, rather than understand batching, create loop equivalents, and create other problems by doing so. Generally, forcing an imperative or procedural style into MSBuild tends to produce scripts that are less performant and less maintainable.

Understanding batching is essential to writing good MSBuild code. If you have found MSBuild batching to be confuzzling, and if reading the Microsoft MSBuild Batching documentation doesn't clear up all the mystery, then this article is an attempt to explain some of the 'gotchas'.

Properties, Items, and Metadata

MSBuild has properties and items which are scalar variables and collections, respectively. A scalar variable holds one value. A collection holds a set of values.

A member of an item collection has metadata. Metadata is a collection of key-value pairs. The metadata collection is never an empty set. There is always at least a key-value pair with a key of 'Identity'. Identity is the name by which the member was added to the collection.

Metadata is the foundation that batching is built upon.

Metadata Examples

References to properties are introduced with a '$', references to an item collection are introduced with a '@', and references to a metadata key are introduced with a '%'.

In the following code, note the difference in the output between @(Example) and %(Example.Identity).

<!-- batching-example-01.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

    <ItemGroup>
        <!-- Creates an Item collection named 'Example' and adds 'Item1' to Example. -->
        <Example Include="Item1" />
        <!-- Adds 'Item2' to Example. -->
        <Example Include="Item2" />
    </ItemGroup>

    <Target Name="DisplayExample">
        <Message Text="@(Example)" />
    </Target>
    <!--
    Output:
      DisplayExample:
        Item1;Item2
      -->

    <Target Name="DisplayExampleByIdentity">
        <Message Text="%(Example.Identity)" />
    </Target>
    <!--
    Output:
      DisplayExampleByIdentity:
        Item1
        Item2
      -->

</Project>

In the 'DisplayExample' target, the Message task is executed once and it displays the whole collection as a string.

In the 'DisplayExampleByIdentity' target, however, batching is being used, specifically task batching. The Message task is executed twice, once for each distinct value of the Identity metadata.

To see the batches more clearly, the next code example adds 'Color' metadata and the Message task batches on distinct values of Color.

<!-- batching-example-02.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
    </Example>
  </ItemGroup>

  <Target Name="DisplayExampleByColor">
    <Message Text="@(Example)" Condition=" '%(Color)' != '' " />
  </Target>
  <!--
    Output:
      DisplayExampleByColor:
        Item1;Item3
        Item2
      -->

  <Target Name="DisplayExampleByColorWithTransform">
    <Message Text="@(Example->'%(Identity) has %(Color)')" Condition=" '%(Color)' != '' " />
  </Target>
  <!--
    Output:
      DisplayExampleByColorWithTransform:
        Item1 has Blue;Item3 has Blue
        Item2 has Red
      -->

  <Target Name="DisplayExampleWithTransform">
    <Message Text="@(Example->'%(Identity) has %(Color)')" />
  </Target>
  <!--
    Output:
      DisplayExampleWithTransform:
        Item1 has Blue;Item2 has Red;Item3 has Blue
      -->

</Project>

Note that in the 'DisplayExampleByColor' target, the content of @(Example) has changed. It is not the whole collection; it is the subset that conforms to the current batch.

An expectation that @(<Name>) is always the complete collection is natural but incorrect. A better (but still simple) mental model is to consider @(<Name>) as always being a set derived from the collection. The set will be the complete collection when there is no batching and a subset of the collection when there is batching.

The 'DisplayExampleByColorWithTransform' target is the same batching operation as the 'DisplayExampleByColor' target. The only difference is that an item transform is used to show both the Identity and the Color. Using references to metadata inside a transform expression has no impact on batching. The last target, 'DisplayExampleWithTransform', demonstrates that, with only the transform expression, there is no batching.

Task Batching

MSBuild supports Target Batching and Task Batching. The examples so far have all been task batching.

For task batching to be in effect, there must be a task that uses a metadata reference, e.g. %(<name>). The following stipulations apply:

  • The child items of ItemGroup and PropertyGroup (i.e. definitions of Items and Properties) are treated as implicit tasks for task batching.
  • Property functions in a task are evaluated for metadata references.
  • Transform expressions are excluded from triggering batching.

That transform expressions are excluded is shown in the prior example code. The next code example shows task batching with Item and Property definitions.

<!-- batching-example-03.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
    </Example>
  </ItemGroup>

  <Target Name="DisplayResults">
    <ItemGroup>
      <Item1 Include="%(Example.Identity)" />
      <Item2 Include="%(Example.Color)" />
    </ItemGroup>
    <PropertyGroup>
      <Prop1>%(Example.Identity)</Prop1>
      <Prop2>%(Example.Color)</Prop2>
    </PropertyGroup>
    <Message Text="Item1 = @(Item1)" />
    <Message Text="Prop1 = $(Prop1)" />
    <Message Text="Item2 = @(Item2)" />
    <Message Text="Prop2 = $(Prop2)" />
  </Target>
  <!--
    Output:
      DisplayResults:
        Item1 = Item1;Item2;Item3
        Prop1 = Item3
        Item2 = Blue;Red
        Prop2 = Red
      -->

</Project>

A task-batched Property doesn't accumulate. Effectively, the property is re-defined with each batch in the batching execution; the final value will be the value from the last batch. The code in the example demonstrates this: Prop1 and Prop2 finish with the last values in Item1 and Item2, respectively.

The Property code in the example is for demonstration purposes only and is not practically useful. Pulling a single value from a collection (especially a collection with a large number of batches) into a property might be better accomplished as follows, assuming that 'Identity' is unique (which may not be true, depending on the data). There is still a batching execution that partitions the collection, but the Property is defined only once.

<PropertyGroup>
  <Prop3 Condition="'%(Identity)'=='Item3'">@(Example->Metadata('Color'))</Prop3>
</PropertyGroup>

Using a property function might look like the following:

<PropertyGroup>
  <Prop3 Condition="'%(Identity)'=='Item3'">$([System.IO.Path]::Combine($(SomePath),%(Example.Color)))</Prop3>
</PropertyGroup>

Qualified and Unqualified Metadata

A metadata reference can be qualified or unqualified.

The following shows a qualified reference. The name of the metadata is qualified with the name of the item collection.

<Message Text="%(Example.Color)" />

An unqualified reference uses just the name of the metadata and the relevant item collection is inferred.

<Message Text="@(Example)" Condition=" '%(Color)' != '' " />

If multiple collections are used, then the unqualified metadata reference is applied across all the collections.

<Message Text="@(Example1);@(Example2)" Condition=" '%(Color)' == 'Blue' " />

When using an unqualified metadata reference, regardless of whether one or many item collections are used, every member of each collection must have the metadata defined. If an item in one of the collections is missing the metadata, MSBuild will generate an error.
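A minimal sketch of the failure mode (item names are illustrative): Item2 below does not define Color, so batching on the unqualified %(Color) reference produces an error (MSB4096). Qualifying the reference, or defining Color on every item, resolves it.

```xml
<!-- Batching on unqualified %(Color) errors here (MSB4096) because Item2
     does not define the Color metadata. -->
<ItemGroup>
  <Example Include="Item1">
    <Color>Blue</Color>
  </Example>
  <Example Include="Item2" />
</ItemGroup>

<Target Name="WillError">
  <Message Text="@(Example)" Condition=" '%(Color)' != '' " />
</Target>
```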

Batching is partitioning item collections based on metadata values. Whether the specified metadata is qualified or not makes a difference in how the collections are partitioned when multiple collections are used.

<!-- batching-example-04.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <!-- Example1 -->
    <Example1 Include="Item1">
      <Color>Blue</Color>
    </Example1>
    <Example1 Include="Item2">
      <Color>Red</Color>
    </Example1>
    <!-- Example2 -->
    <Example2 Include="Item3">
      <Color>Blue</Color>
    </Example2>
  </ItemGroup>

  <Target Name="DisplayResults">
    <!-- Unqualified -->
    <ItemGroup>
      <Result0 Include="@(Example1);@(Example2)" Condition=" '%(Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result0->'%(Identity) has %(Color)')" />
    <!-- Qualified Example1 -->
    <ItemGroup>
      <Result1 Include="@(Example1);@(Example2)" Condition=" '%(Example1.Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result1->'%(Identity) has %(Color)')" />
    <!-- Qualified Example2 -->
    <ItemGroup>
      <Result2 Include="@(Example1);@(Example2)" Condition=" '%(Example2.Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result2->'%(Identity) has %(Color)')" />
  </Target>
  <!--
    Output:
      DisplayResults:
        Item1 has Blue;Item3 has Blue
        Item1 has Blue;Item3 has Blue
        Item1 has Blue;Item2 has Red;Item3 has Blue
    -->

</Project>

The 'DisplayResults' target is creating three new item collections.

Result0, with an unqualified reference to Color, is a collection of items from Example1 where Color is Blue and items from Example2 where Color is Blue.

Result1, with a reference to Color qualified to Example1, is a collection of items from Example1 where Color is Blue and all items from Example2.

Result2, with a reference to Color qualified to Example2, is a collection of all items from Example1 and items from Example2 where Color is Blue.

Target Batching

For Target batching to be in effect, there must be a metadata reference in one of the attributes of the Target element.

<!-- batching-example-05.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
      <Shape>Circle</Shape>
    </Example>
  </ItemGroup>

  <Target Name="DisplayTargetBatchByColor" Outputs="%(Example.Color)">
    <Message Text="MessageTask: @(Example->'%(Identity) has %(Color) %(Shape)')" />
  </Target>
  <!--
    Output:
      DisplayTargetBatchByColor:
        MessageTask: Item1 has Blue Square;Item3 has Blue Circle
      DisplayTargetBatchByColor:
        MessageTask: Item2 has Red Square
    -->

  <Target Name="DisplayTargetBatchAndTaskBatch" Outputs="%(Example.Color)">
    <Message Text="MessageTask: @(Example->'%(Identity) has %(Color) %(Shape)')" Condition=" '%(Shape)' != '' " />
  </Target>
  <!--
    Output:
      DisplayTargetBatchAndTaskBatch:
        MessageTask: Item1 has Blue Square
        MessageTask: Item3 has Blue Circle
      DisplayTargetBatchAndTaskBatch:
        MessageTask: Item2 has Red Square
    -->

</Project>

The 'DisplayTargetBatchByColor' target is executed once per Color batch, that is once for 'Blue' and once for 'Red'.

The 'DisplayTargetBatchAndTaskBatch' target is also executed once per Color batch but contains a Task that is batched per Shape.

Intersection of Two MSBuild Item Collections

The next example code shows two approaches for getting the intersection of two Item collections.

The first approach uses set algebra and can be used outside of a target.

The second approach uses task batching and was once described as a 'batching brainteaser'. It leverages the way that unqualified metadata references are handled.

<!-- batching-example-06.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <!-- Example1 -->
    <Example1 Include="Item1" />
    <Example1 Include="Item2" />
    <Example1 Include="Item4" />
    <!-- Example2 -->
    <Example2 Include="Item2" />
    <Example2 Include="Item3" />
    <Example2 Include="Item4" />
  </ItemGroup>

  <!-- Get the intersection of Example1 and Example2 without batching. -->
  <ItemGroup>
    <Intermediate Include="@(Example1)" Exclude="@(Example2)" />
    <!-- Intermediate has the items that are in Example1 and not in Example2. -->
    <Intersection Include="@(Example1)" Exclude="@(Intermediate)" />
    <!-- Intersection has the items that are in Example1 and not in Intermediate. -->
  </ItemGroup>

  <Target Name="DisplayIntersection">
    <Message Text="@(Intersection, '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayIntersection:
        Item2
        Item4
    -->

  <!-- Get the intersection of Example1 and Example2 using batching. -->
  <Target Name="DisplayIntersectionByBatching">
    <ItemGroup>
      <IntersectionByBatching Include="@(Example1)" Condition="'%(Identity)' != '' and '@(Example1)' == '@(Example2)'" />
    </ItemGroup>
    <Message Text="@(IntersectionByBatching, '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayIntersectionByBatching:
        Item2
        Item4
    -->

</Project>

The IntersectionByBatching line can also be written as:

<ItemGroup>
  <IntersectionByBatching Include="@(Example1)" Condition="'%(Identity)' != '' and '@(Example2)' != ''" />
</ItemGroup>

The result is the same.

Cartesian Product of Two MSBuild Item Collections

One more practical example is an approach for computing a cartesian product by using batching.

<!-- batching-example-07.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Rank Include="Ace;King;Queen;Jack;10;9;8;7;6;5;4;3;2" />
    <Suit Include="Clubs;Diamonds;Hearts;Spades" />
  </ItemGroup>

  <Target Name="DisplayCardDeck">
    <ItemGroup>
      <CardDeck Include="@(Rank)">
        <Suit>%(Suit.Identity)</Suit>
      </CardDeck>
    </ItemGroup>
    <Message Text="@(CardDeck->'%(Identity) of %(Suit)', '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayCardDeck:
        Ace of Clubs
        King of Clubs
        Queen of Clubs
        Jack of Clubs
        10 of Clubs
        9 of Clubs
        8 of Clubs
        ...
    -->

</Project>

References

References for this article include:

1 'confuzzling' is a portmanteau of 'confusing' and 'puzzling' coined by my daughter.

My Greatest Weakness?

The question “What is your greatest weakness?” is a job interview trope.

I don’t know my greatest weakness. I don’t know my greatest strength. I haven’t given thought to either superlative.

Whatever is my greatest weakness today, probably won’t be for long. I make mistakes. I have regrets. But I’m continually improving. Mistakes and failings are learning opportunities. I am better prepared the second time I encounter a given issue or situation.

Part of the territory of being a software engineer is to always be learning, adapting, and problem solving. In that mutual endeavor we should be generous. We should stand on the shoulders of others and be a shoulder for others to stand on.

Review: "HTML 4 & 5: The Complete Reference" By O'Reilly Media for iOS

The first thing to know is that this is not a book stuffed into an app; it is an application. The second thing to know is that the content is very good.

The content is a refresh of the O’Reilly HTML & XHTML Pocket Reference, Fourth Edition and benefits from having been honed over multiple editions. As a pocket reference it opts for concision over exhaustive depth, but it is good because it stays on point.

The content can be browsed or searched. Browsing includes a list of elements and a list of attributes. From the attribute list attribute entries link back to element entries for the elements that use the given attribute.

The search feature works across both elements and attributes but apparently there’s no partial-word matching or stemming. Searching on ‘sel’ finds no results. ‘select’ finds results that include the select element but not the selected attribute.

The app was developed with HTML5, CSS3, and jQTouch. PhoneGap was used to create an iOS executable. There are plans to bring the app to Android, so using PhoneGap probably seemed reasonable. But there are quirks present that probably result from not being developed natively for iOS. Overall I didn’t see performance issues on a first gen iPad or an iPhone 4, but scrolling and scrubbing in the elements and attributes lists was sometimes touchy. What seemed like a light flick would sometimes send the list flying to the end. Touching the scrub bar is not always recognized, so instead of scrubbing through alphabetically the list only scrolls. There’s no visual cue when the scrub bar has been activated.

Despite a few deficiencies it’s a huge win in convenience and utility to have this content in the form of an application. I’ve been working on updating an older site and this application has proved its worth as what it claims to be: a complete reference.

HTML 4 & 5: The Complete Reference is on the iTunes App Store.

Thanks to O’Reilly for providing the application for review.

Defensive Threading

Designing and writing multi-threaded code is hard. Threading adds a debilitating amount of complexity to the Von Neumann architecture and, despite 50+ years of theory and implementation, the threading support in most programming languages is low level, the equivalent of malloc rather than GC managed memory.

If you’re a developer and you approach threading with trepidation then here are some experience tested guidelines and techniques that I try to follow. The overall strategy is to reduce the surface area for bugs and to increase the maintainability of the code. These are not hard and fast rules. It’s important to be pragmatic and to adjust for the specific situation.

But first, please note that I’m no expert. I’m sharing what I’ve learned and I know I’m not done learning. If this post helps you, I want to know. If you think I’ve got something wrong, I want to know. Thank you.


Avoid sharing data.

Sharing data between threads is the most common source of threading bugs. By definition many of the nastiest threading bugs can only occur when data is shared. Safely sharing data requires synchronization either in the form of a lock or an atomic primitive. Getting the synchronization right is not always easy. Avoid sharing data and the threading bug potential goes way down. As a bonus, the code can potentially run faster because the possibility of synchronization contention is eliminated.

It may seem that without sharing data, threads are pretty useless. Not so. But this is a case where you need to be willing and able to trade space for speed and safety. Give a thread its own private copy of everything it may need, let the thread run to completion (‘wait’ or ‘join’ on the thread), and then collect the results of its computations.
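Here’s a minimal C++ sketch of that trade (the post is language-agnostic; the function name and chunked-input shape are mine, for illustration). Each worker owns a private copy of its input and a dedicated result slot, so no locks are needed, and the main thread joins every worker before collecting.

```cpp
#include <numeric>
#include <thread>
#include <vector>

// Each worker gets a private copy of its input chunk and writes only to
// its own result slot; the main thread joins before reading any results.
std::vector<long> sum_in_parallel(const std::vector<std::vector<int>>& chunks) {
    std::vector<long> results(chunks.size(), 0L);
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < chunks.size(); ++i) {
        // Init-capture copies the chunk into the lambda: no shared input.
        workers.emplace_back([chunk = chunks[i], &results, i] {
            results[i] = std::accumulate(chunk.begin(), chunk.end(), 0L);
        });
    }
    for (auto& w : workers) w.join();  // run to completion, then collect
    return results;
}
```

Because each thread touches a distinct element of `results`, there is no data race and no contention to pay for.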

Avoid sharing logging.

Logging is data. See above. Sometimes the messages generated on a thread need to be reported more or less immediately. But if a delay can be tolerated, a thread can have its own private log that gets collected after the thread has run to completion.
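Sketched in C++ (names illustrative), the private-log-per-thread pattern might look like this: each worker appends to its own vector of messages, and the logs are merged only after every join.

```cpp
#include <string>
#include <thread>
#include <vector>

// Each worker appends to its own log; logs are only read after join(),
// so no lock is needed around the logging calls.
std::vector<std::string> run_with_private_logs(int worker_count) {
    std::vector<std::vector<std::string>> logs(worker_count);
    std::vector<std::thread> workers;
    for (int i = 0; i < worker_count; ++i) {
        workers.emplace_back([&log = logs[i], i] {
            log.push_back("worker " + std::to_string(i) + " started");
            log.push_back("worker " + std::to_string(i) + " finished");
        });
    }
    for (auto& w : workers) w.join();
    std::vector<std::string> merged;  // collect in deterministic worker order
    for (auto& log : logs) merged.insert(merged.end(), log.begin(), log.end());
    return merged;
}
```

A side benefit: the merged log groups each worker’s messages together instead of interleaving them.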

When you must share, prefer lighter weight synchronization mechanisms.

I didn’t understand lock-free threading until I understood that lock-free doesn’t mean no synchronization mechanisms at all.

Under Windows the Interlocked* APIs represent atomic operations implemented by specific processor instructions. Critical sections are implemented via the interlocked functions. The interlocked functions and the critical section on Windows and the equivalent on other platforms are generally the lightest weight synchronization mechanisms.

Technically the interlocked functions are not locks, they are hardware implemented primitives. But colloquially developers will speak of ‘locks’ and mean the whole set of synchronization mechanisms, hence my confusion over lock-free threading.
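In portable C++, std::atomic plays the role of the Interlocked* APIs: a hardware-backed atomic read-modify-write with no lock object at all. An illustrative race-free counter (names are mine):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// fetch_add is the portable analogue of InterlockedIncrement: every
// increment lands, with no mutex, even under heavy contention.
int count_in_parallel(int threads, int increments_per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t)
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& w : workers) w.join();
    return counter.load();
}
```

With a plain `int` instead of `std::atomic<int>`, increments would be lost to interleaved read-modify-write sequences.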

Having said that I will now forgo rigor and refer to all synchronization mechanisms as ‘locks’ because it’s pithier.

Use the smallest number of locks possible.

Don’t treat locks like magic pixie dust and sprinkle them everywhere. Synchronization locks provide safety but at a cost in performance and a greater potential for bugs. Yes, it’s kind of paradoxical.

Hold locks for the shortest length of time possible.

A critical section, for example, allows only one thread to enter a section of code at a time. The longer the execution time of the section, the longer the window for other threads to pile up waiting for entry.

If there are values that can be computed before entering a critical section, do so. Only the statements that absolutely must be protected should be within the section.

Sometimes a value needs to be retrieved from a synchronized source, used as part of a computation, and a product of the computation needs to be stored to a synchronized destination. If the whole operation does not need to be atomic then, despite the desire to minimize locks, two independent synchronization locks could be better than one. Why? Because the source and destination are decoupled and the combined lock hold time is reduced because only the source read and the destination write are covered.
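A C++ sketch of that decoupled source/destination idea (the struct, names, and the doubling computation are hypothetical). Note the computation happens outside both locks, and the two locks are never held at the same time.

```cpp
#include <mutex>

// Source and destination are guarded by independent mutexes. Valid only
// when the read+compute+write sequence need not be atomic as a whole.
struct Pipeline {
    std::mutex source_lock, dest_lock;
    int source = 0, dest = 0;

    void transfer() {
        int value;
        {
            std::lock_guard<std::mutex> g(source_lock);  // lock 1: read only
            value = source;
        }
        int product = value * 2;  // compute outside any lock
        {
            std::lock_guard<std::mutex> g(dest_lock);    // lock 2: write only
            dest = product;
        }
    }
};
```

Readers of the source and writers of the destination never contend with each other, and each hold time covers a single assignment.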

Be DRY - Don’t Repeat Yourself.

Always a good policy, being DRY has special importance with threaded code. There should be only one expression or implementation of any given synchronized operation. Every thread should be executing the same code to ensure that the operation is performed consistently.

Design to ensure that every acquisition of a lock is balanced by a release of the lock.

Take advantage of the RAII pattern or the dispose pattern or whatever is appropriate to the language and platform. Don’t rely on the developer (even when the developer is yourself) to remember to explicitly add every release for every acquisition.
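In C++ the RAII pattern is std::lock_guard: the guard’s destructor releases the mutex on every exit path, including a thrown exception, so no release call can be forgotten. An illustrative sketch (the globals are mine):

```cpp
#include <mutex>
#include <stdexcept>

std::mutex value_lock;
int shared_value = 0;

// The guard releases value_lock whether we return normally or throw;
// there is no explicit release to forget on any code path.
void update(int v) {
    std::lock_guard<std::mutex> guard(value_lock);
    if (v < 0) throw std::invalid_argument("negative");
    shared_value = v;
}
```

If the exception path leaked the lock, the next call to `update` would deadlock; with the guard, it proceeds normally.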

Finish the threads you start.

Don’t fire and forget. Wait or join and clean up your threads and their resources.

Don’t let threads you created outlive the primary thread in the process. Some platforms have the unfortunate design of performing runtime set up and initialization, calling the program code, and then tearing down and de-initializing on the primary thread. Other threads that may still be running after the primary thread exits may fail when the runtime is pulled out from underneath them.

Don’t kill a thread. That will leave data in an undeterminable state. If appropriate implement a way to signal your thread to finish.
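A C++ sketch of signaling rather than killing (the busy loop stands in for real work; names are illustrative). The flag is atomic, the thread exits its loop cleanly, and the join guarantees its data is safe to read afterward.

```cpp
#include <atomic>
#include <thread>

// Instead of terminating the thread, flip a flag it polls; the worker
// exits its loop cleanly and is joined, leaving data in a known state.
int run_until_signaled() {
    std::atomic<bool> stop{false};
    int iterations = 0;
    std::thread worker([&] {
        while (!stop.load()) ++iterations;  // simulated work
    });
    stop.store(true);   // signal the thread to finish
    worker.join();      // only after join is 'iterations' safe to read
    return iterations;
}
```

In real code the worker would typically check the flag once per unit of work so in-progress items are always completed or not started, never half-done.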

Don’t get focused on locking the data when it’s the operation that needs to be synchronized.

It’s easy to get fixated on the shared data but often times the design is better served by paying more attention to the shared operations.

Avoid nested locks.

Acquiring lock A and then lock B in one part of the code and elsewhere acquiring lock B and then lock A is asking for a deadlock. No one sets out to write a deadlock. But sometimes the code isn’t exactly what was intended or a change is made and the impact of the change isn’t fully understood. If locks are never nested then the potential for this kind of deadlock just doesn’t exist.

If locks must be nested then always acquire the locks in the same order and always release them in the opposite order. And be DRY about it.
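When nesting is unavoidable, centralizing the multi-lock acquisition in one helper keeps it DRY. In C++17, std::scoped_lock additionally acquires its arguments with a deadlock-avoidance algorithm, so every call site gets a consistent ordering for free. An illustrative sketch (accounts and names are mine):

```cpp
#include <mutex>

std::mutex lock_a, lock_b;
int balance_a = 100, balance_b = 0;

// The single DRY place where both locks are taken together; scoped_lock
// acquires them deadlock-free and releases both on scope exit.
void move_funds(int amount) {
    std::scoped_lock guard(lock_a, lock_b);
    balance_a -= amount;
    balance_b += amount;
}
```

If some call sites took `lock_a` then `lock_b` and others the reverse by hand, a deadlock would be waiting to happen; routing every caller through one helper removes that possibility.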

Avoid method calls within a lock.

This may seem aggressively limiting but method implementations can change and introduce new issues like unwanted nested locks. Keeping method calls out of the scope of locks reduces coupling and the potential for deleterious side effects.

(I think of this one as following in the spirit of LoD.)

Encapsulate intermediate values.

Avoid creating inconsistent state in shared data. Operations that create intermediate values shouldn’t be performed in situ on shared objects or data structures. Use local temporaries for intermediate values. Only update the shared data with the end products of an operation.

Be careful of using the double check lock pattern.

It’s a wonderful idea that’s so clever it’s broken in many languages and runtime platforms.

Where the DCLP is not just inherently broken and can be used, it still needs to be implemented correctly. Mark the variable under test as ‘volatile’ (or the closest equivalent) so that the language compiler doesn’t optimize away one of the two checks. Don’t operate directly on the variable under test. Use a temporary to hold intermediate values and wait to assign to the variable under test until the new state is fully computed.

The DCLP is often used for one time or rare initializations.
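The ‘volatile’ advice maps to languages like C# and Java; in C++11 and later the equivalent guarantee comes from atomic loads and stores with acquire/release ordering. A sketch of a correct double-checked initialization (names illustrative; the singleton is deliberately never freed):

```cpp
#include <atomic>
#include <mutex>

struct Config { int value = 42; };  // hypothetical one-time-init object

std::atomic<Config*> instance{nullptr};
std::mutex init_lock;

Config* get_config() {
    Config* p = instance.load(std::memory_order_acquire);  // first check
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(init_lock);
        p = instance.load(std::memory_order_relaxed);      // second check
        if (p == nullptr) {
            p = new Config();  // build fully in a temporary first...
            instance.store(p, std::memory_order_release);  // ...then publish
        }
    }
    return p;
}
```

The release store pairs with the acquire load so no reader can observe the pointer before the object it points to is fully constructed, which is exactly the bug the naive DCLP suffers from.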

Don’t lock the ‘this’ pointer.

If there are member functions of an object that need to synchronize, create a private member variable that is used as the lock target rather than locking on ‘this’. In addition to encapsulating the locking, using a private member guards against potential issues where a framework, library, or runtime may be performing synchronization by using the ‘this’ pointer.

Explain your code.

Leave comments for yourself and the next person.

Strive to make the design grokkable.

The original design and implementation can be bug free but if the design is hard to understand and follow, bugs can be introduced in maintenance cycles. Even if you are the only developer who may ever touch the code, remember that the details of the design that are so crystal clear in your mind today will get supplanted by tomorrow’s events and miscellany like … Squirrel!

.Net Stopwatch Returns Negative Elapsed Time

The System.Diagnostics.Stopwatch class in the .Net runtime can be used to time operations. But there’s a problem in .Net 3.5 and earlier. The Elapsed* properties (Elapsed, ElapsedMilliseconds, ElapsedTicks) will sometimes return a negative value.

There’s a note in the documentation:

On a multiprocessor computer, it does not matter which processor the thread runs on. However, because of bugs in the BIOS or the Hardware Abstraction Layer (HAL), you can get different timing results on different processors. To specify processor affinity for a thread, use the ProcessThread.ProcessorAffinity method.

Some advocate setting processor affinity to eliminate the possibility of a negative elapsed time. But a common work-around is to simply use zero whenever the elapsed time is negative.

In recent work I used the Stopwatch class to time operations performed by Threads in a ThreadPool and I was getting false timings because of the negative elapsed time issue. I was leery of messing with the threads’ processor affinities because the ThreadPool is a shared resource. But I had concerns about the ‘convert to zero’ work-around.

Through a Microsoft contact I was able to confirm that the work-around of checking for ‘elapsed < 0’ is adequate. The negative time issue occurs only when measuring very small time periods. The issue is fixed in .Net 4.0, and the fix inside Stopwatch is exactly that check: if elapsed < 0, set elapsed to zero.

The Only Way To Improve Software Is To Change It

One place I worked had a major release of their enterprise software and following the release a bug appeared. A serious bug. The business was not happy. The Director of Software Engineering was under pressure. The bug had never been reported before and the system processed a large volume of activity every day and had done so since the last major release six months ago. The Director was confident the bug was introduced with the new release. Because of the sense of urgency he didn’t wait for a full analysis. He made the decision to roll back to the prior version. But he was wrong. Rolling back didn’t fix the issue. The bug was pre-existing. It was unfortunate timing that the first time conditions were right to trigger the bug was the day after a major release.

People sometimes refer to software as ‘hardening’ in Production. There’s a pervasive presumption that working Production software is generally safer and less risky than newly developed software. The idea is that the longer the software has been in use in Production without change the less likely it is to have unknown bugs. But hardening is a misnomer. Software doesn’t cure like concrete.

The assumption behind ‘hardening’ is that Production is the ultimate test. That the volume and variety of a Production environment is more comprehensive than any test regimen could be. But how good of a test is Production? What if 80% of typical Production activity exercises only 20% of the code? What happens when the unusual occurs? Many organizations test for load and scale issues but I don’t hear of organizations commonly measuring (or inferring) code coverage in Production. Could the unit tests and QA tests actually be more comprehensive than typical Production scenarios? Could ‘hardening’ represent an unsafe assumption?

A colleague pointed out there can also be an anthropomorphizing factor at work. Who do you trust more — the new recruit or the veteran? Even with known flaws the veteran can seem like a safer choice than the unknown recruit. “The devil you know is better than the devil you don’t.” But software releases are not people and you may not know the devil you know as well as you think you know it.

Program testing can be used to show the presence of bugs, but never to show their absence.

Edsger W. Dijkstra

I once wrote the Dijkstra quote on a whiteboard as part of a presentation. The technical folk in the room nodded in recognition and agreement. The non-technical manager became agitated and vehemently disagreed.

To manage risk an organization working with the supposition that ‘hardened’ Production software is safer, may place strict controls on Production releases. Because of the effort involved in a release, releasing quarterly may be considered a fast schedule. Because of the amount of time between releases, each release tends to be large which increases the perceived risk.

If the potential for regression issues can be minimized (say through automated unit tests), which becomes the greater risk: introducing new bugs with a new release, or not fixing latent bugs in the existing Production code? If the risk is in not releasing, then releases need to happen faster. Instead of quarterly what if releases were biweekly? The work would be chunked into smaller more frequent releases. Bugs which aren’t critical enough to require a special immediate fix could be fixed in production in 2 weeks instead of 3 months. The software could potentially adapt more quickly to changing needs and user feedback.

Build unit tests. Release often. The only way to improve software is to change it.

Hacking SLN Files to Improve Visual Studio/Visual Source Safe Integration

There are some who think Microsoft Visual Source Safe (VSS) is great. There are some who think that it’s not great but it is pretty good. And there are some who think Source Safe is just good enough. I am not among any of those groups.

One pain point with Source Safe in a Microsoft tool chain is integration with Visual Studio. Historically Visual Studio has stored source control information directly in the solution and project files. This reveals itself to be a really terrible idea the first time a project or solution file is shared in Source Safe. Checkouts from within the IDE will use the source control bindings in the solution or project. If the solution or project file has been shared and not changed, the bindings will point to the original location in Source Safe. Checking out through Visual Studio will check out the wrong file.

Yes, the solution and project files can be branched and the bindings updated. That’s sub-optimal. It means branching just to fix a tool problem and not because of any actual divergence in the code base. Any non-divergent changes to the solution or project must then be manually propagated to all the branched versions. Ugh.

The problem has not gone unnoticed at Microsoft. Over time and new releases of VStudio and VSS, improvements have been made.

My current employer is not using the latest and greatest. We’re on Visual Studio 2005 and Visual Source Safe 2005. The code base is in C#. I worked out a way to have shared solution and project files for non-web projects. Web projects in 2005 are different beasts and don’t follow the same rules. If you are using different versions of Visual Studio or Visual Source Safe your mileage will vary.

When getting files from VSS into a working directory, Visual Source Safe 2005 will create or update a mssccprj.scc file if any of the files present include a solution (.sln) or project file (for C#, .csproj). The mssccprj.scc file is a text file and it will contain the VSS bindings for the solution and project files that are in the same directory as the mssccprj.scc file.

In Visual Studio 2005, project files by default use the special ‘SAK’ string in the project file settings for VSS bindings. SAK indicates to Visual Studio that the bindings in the mssccprj.scc file should be used. The mssccprj.scc bindings are based on the directory retrieved from Source Safe. This means that shared project files just work. Yay.

In 2005 the problem is with solution files.

More specifically the problem is with solution files that include project files that are not in the same directory as the solution file. Creating a MyHelloWorld project will create a MyHelloWorld.sln and a MyHelloWorld.csproj in a MyHelloWorld directory. The .sln will reference the .csproj by name only and both files will have bindings in the mssccprj.scc file in the same directory and that all works without issue. But create a blank solution and add multiple existing projects to it and Visual Studio 2005 reverts to storing specific VSS bindings for the projects in the solution file.

There’s a work-around, but it doesn’t work for parent-relative paths. Any time a solution references a project where the path to the project file starts with ‘..’, Visual Studio will revert to storing a VSS binding in the solution file. Because of this limitation it becomes convenient to adopt a convention that solution files generally go in the root of the code base tree.

The goal is to get the VSS bindings out of the solution file and have Visual Studio rely on the mssccprj.scc file. I haven’t found a reliable way to do this from within the IDE with existing projects but the solution (.sln) files are just text files, so I hack them.

Here’s a snippet from a .sln file. In Source Safe, the solution file is in “$/CodeBase v1”. There are two projects: a class library and a windows application. The SccProjectName<N> values are bindings to specific locations in the source safe repository. These are the bindings that need to be removed.


Global
GlobalSection(SourceCodeControl) = preSolution
SccNumberOfProjects = 3
SccLocalPath0 = .
SccProjectUniqueName1 = ClassLibrary1\\ClassLibrary1.csproj
SccProjectName1 = \u0022$/CodeBase\u0020v1/ClassLibrary1\u0022,\u0020TAAAAAAA
SccLocalPath1 = ClassLibrary1
SccProjectUniqueName2 = WindowsApplication1\\WindowsApplication1.csproj
SccProjectName2 = \u0022$/CodeBase\u0020v1/WindowsApplication1\u0022,\u0020WAAAAAAA
SccLocalPath2 = WindowsApplication1
EndGlobalSection

The .sln file can be edited to remove the SccProjectName<N> values but the SccLocalPath<N> must be updated to ‘.’ and a new property, SccProjectFilePathRelativizedFromConnection<N>, must be added with the old local path value and an appended directory separator.


Global
GlobalSection(SourceCodeControl) = preSolution
SccNumberOfProjects = 3
SccLocalPath0 = .
SccProjectUniqueName1 = ClassLibrary1\\ClassLibrary1.csproj
SccLocalPath1 = .
SccProjectFilePathRelativizedFromConnection1 = ClassLibrary1\\
SccProjectUniqueName2 = WindowsApplication1\\WindowsApplication1.csproj
SccLocalPath2 = .
SccProjectFilePathRelativizedFromConnection2 = WindowsApplication1\\
EndGlobalSection

Reference: Alin Constantin’s blog: The SAK source control strings in Visual Studio’s project files

Professional Praise

Here’s a story:

I’m working as an in-house software engineer for Nameless Big Co creating software for internal use.

I’m at an all-hands meeting for my business unit group. A very important person in a nice expensive suit is at the podium. Apparently we’re honored to have him come and speak to us. What he has to say is engaging until he gets to a certain point.

He tells us he’s had a career in financial services IT and we’re the best and brightest IT organization he has ever worked with. I think of the inefficiencies and poor decisions we deal with every day. It’s normal stuff for a large organization and for a software development management chain heavy on MBAs. I don’t think we’re more clever than average.

Why is Mr. VIP laying on the superlatives? Is he out of touch? Is he measuring differently? Is he just trying to be a cheerleader? Is he marketing to us?

Striving to be the best and the brightest is incompatible with being uncritical enough to accept his hyperbole. I tune out. He’s pushing more noise than signal.

Here’s a second story:

The lead architect has moved some of my code from a particular project down into a core library so he could use it on another project. “You saved me a lot of time,” he tells me. “You did some good work on that project and I want to leverage it across the other projects.”

Here’s a guy whose technical chops I respect and he found my code useful. It’s a small thing but it made my day.

Peer praise is meaningful.