MSBuild Metadata doesn't support string instance methods

August 23, 2021 by Jonathan Dodds in Talking Code

In MSBuild, string instance methods are directly supported for properties but not for item metadata values. This is a longstanding issue. Properties have property functions, which include string instance properties and methods. $(Prop.Length), for example, will invoke the string.Length property of the string instance for the MSBuild Property named Prop. Items have item functions that operate on the vector. @(Items->get_Length()) will invoke the string.Length property of the string instance for the Identity metadata value of each item in the ItemGroup reference. But using a metadata value other than Identity, or operating on a metadata reference instead of an ItemGroup reference, is not directly supported.
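
As a rough illustration of the gap (the item and metadata names below are made up for this sketch), appending a member name inside a metadata reference does not call a string member. MSBuild reads %(Name1.Name2) as a qualified metadata reference, i.e. metadata Name2 on an item type named Name1.

<!-- sketch only: 'Items', 'Demo', and 'NotALength' are hypothetical names -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Items Include="FooBar.txt"/>
    <Demo Include="@(Items)">
      <!-- Not a call to string.Length: this is parsed as metadata 'Length' of an item type named 'Filename' -->
      <NotALength>%(Filename.Length)</NotALength>
    </Demo>
  </ItemGroup>

</Project>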

The workaround is to create a string from the metadata reference and then apply a property function to the string. Common approaches are to 'copy' the string or to use the ValueOrDefault function.

An example of copying the string would be

$([System.String]::Copy('%(Filename)').Length)

An example of using the ValueOrDefault function would be

$([MSBuild]::ValueOrDefault('%(Filename)', '').Length)

At this point in time, the copy technique is considered more idiomatic and is preferred. There is also an optimization in the MSBuild internals that avoids actually creating a copy of the string.

An expanded example that contrasts Property functions, Item functions, and changing metadata values to strings and applying Property functions follows.

<!-- stringinstance.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!-- MSBuild Property (Scalar) -->
  <PropertyGroup>
    <Prop>FooBar</Prop>
  </PropertyGroup>

  <PropertyGroup>
    <!-- Examples of string instance methods as 'Property Functions' -->
    <PropLength>$(Prop.Length)</PropLength>
    <PropRemoveFirstChar>$(Prop.Substring(1))</PropRemoveFirstChar>
    <PropRemoveLastChar>$(Prop.Substring(0, $([MSBuild]::Subtract($(Prop.Length), 1))))</PropRemoveLastChar>
  </PropertyGroup>

  <!-- MSBuild Item (Vector) -->
  <ItemGroup>
    <Items Include="FooBar.txt;Quux.txt"/>
  </ItemGroup>

  <PropertyGroup>
    <!-- Examples of string instance methods as 'Item Functions' -->
    <ItemLengths>@(Items->get_Length())</ItemLengths>
    <ItemRemoveFirstChars>@(Items->Substring(1))</ItemRemoveFirstChars>
  </PropertyGroup>

  <ItemGroup>
    <!-- Examples of string instance methods on metadata values -->
    <Items2 Include="@(Items)">
      <FilenameLength>$([System.String]::Copy('%(Filename)').Length)</FilenameLength>
      <FilenameRemoveFirstChar>$([System.String]::Copy('%(Filename)').Substring(1))</FilenameRemoveFirstChar>
      <FilenameRemoveLastChar>$([System.String]::Copy('%(Filename)').Substring(0, $([MSBuild]::Subtract($([System.String]::Copy('%(Filename)').Length), 1))))</FilenameRemoveLastChar>
    </Items2>
  </ItemGroup>

  <Target Name="ShowResults">
    <Message Text="Property Functions"/>
    <Message Text="PropLength           = $(PropLength)"/>
    <Message Text="PropRemoveFirstChar  = $(PropRemoveFirstChar)"/>
    <Message Text="PropRemoveLastChar   = $(PropRemoveLastChar)"/>
    <Message Text="Item Functions"/>
    <Message Text="ItemLengths          = $(ItemLengths)"/>
    <Message Text="ItemRemoveFirstChars = $(ItemRemoveFirstChars)"/>
    <Message Text="Items2 with Metadata"/>
    <Message Text="@(Items2->'  Identity = %(Identity), FilenameLength = %(FilenameLength), FilenameRemoveFirstChar = %(FilenameRemoveFirstChar), FilenameRemoveLastChar = %(FilenameRemoveLastChar)', '%0d%0a')"/>
  </Target>
  <!--
    Output:
      ShowResults:
        Property Functions
        PropLength           = 6
        PropRemoveFirstChar  = ooBar
        PropRemoveLastChar   = FooBa
        Item Functions
        ItemLengths          = 10;8
        ItemRemoveFirstChars = ooBar.txt;uux.txt
        Items2 with Metadata
          Identity = FooBar.txt, FilenameLength = 6, FilenameRemoveFirstChar = ooBar, FilenameRemoveLastChar = FooBa
          Identity = Quux.txt, FilenameLength = 4, FilenameRemoveFirstChar = uux, FilenameRemoveLastChar = Quu
    -->

</Project>
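
For comparison, here is a minimal sketch of the same kind of metadata computed with the ValueOrDefault function instead of String.Copy. The file name, the 'Items3' item, the 'FilenameUpper' metadata, and the 'ShowItems3' target are invented for this sketch; only the ValueOrDefault pattern itself comes from the text above.

<!-- stringinstance-valueordefault.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Items Include="FooBar.txt;Quux.txt"/>
  </ItemGroup>

  <ItemGroup>
    <!-- ValueOrDefault returns a string, so string instance members can be applied to the result -->
    <Items3 Include="@(Items)">
      <FilenameLength>$([MSBuild]::ValueOrDefault('%(Filename)', '').Length)</FilenameLength>
      <FilenameUpper>$([MSBuild]::ValueOrDefault('%(Filename)', '').ToUpperInvariant())</FilenameUpper>
    </Items3>
  </ItemGroup>

  <Target Name="ShowItems3">
    <Message Text="@(Items3->'%(Identity): FilenameLength = %(FilenameLength), FilenameUpper = %(FilenameUpper)', '%0d%0a')"/>
  </Target>

</Project>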

Imagine a scenario with a set of files where some of the filenames use a special suffix.

<!-- suffixexample.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <SourceList Include="a.txt;b.txt;a-alt.txt;b-alt.txt"/>
  </ItemGroup>

  <PropertyGroup>
    <suffix>-alt</suffix>
  </PropertyGroup>

  <ItemGroup>
    <Files Include="@(SourceList)">
      <HasSuffix>$([System.String]::Copy('%(Filename)').EndsWith('$(suffix)'))</HasSuffix>
      <PrimaryName Condition="%(HasSuffix)">$([System.String]::Copy('%(Filename)').Substring(0, $([MSBuild]::Subtract($([System.String]::Copy('%(Filename)').Length), $(suffix.Length)))))</PrimaryName>
      <PrimaryName Condition="!%(HasSuffix)">%(Filename)</PrimaryName>
    </Files>
  </ItemGroup>

  <Target Name="ListFilesWithSuffix">
    <Message Text="@(Files->'%(Identity) PrimaryName = %(PrimaryName)','%0d%0a')" Condition="%(HasSuffix)" />
  </Target>
  <!--
    Output:
      ListFilesWithSuffix:
        a-alt.txt PrimaryName = a
        b-alt.txt PrimaryName = b
    -->

</Project>

The Files ItemGroup is created with metadata that provides a boolean indicating if the special suffix is present or not and metadata for the filename sans suffix (regardless of whether the suffix is present). The Target 'ListFilesWithSuffix' uses metadata batching to display the files that have the suffix.

MSBuild, About Import Guards

August 15, 2021 by Jonathan Dodds in Talking Code

It is reasonable, for maintainability and for reusability, to parcel MSBuild code into multiple files and then, for a given project, to Import files as needed to support specific functionality.

MSBuild checks for duplicate imports (including 'self' imports). If a duplicate import is detected, MSBuild will show a warning (either MSB4011 'DuplicateImport' or MSB4210 'SelfImport') and block the import. Barring duplicate imports prevents circular references.

However, as the number of files and/or the number of maintainers increases, it becomes easier to introduce a duplicate import unintentionally.

Import Guards

Custom MSBuild files can be written with "import guards", which are very much like old school C header file include guards.

Within a given file, define a property that is unique to the file and that will only ever be defined by the given file.

A file named 'common.targets' might start with the following:

<!-- common.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-CommonTargets>defined</ImportGuard-CommonTargets>
  </PropertyGroup>

The name of the property, 'ImportGuard-CommonTargets', is derived from the filename. If there are names that repeat in different folders (for example, if there is a chain of Directory.Build.props or Directory.Build.targets files), then a different convention will be needed to ensure that the property names are unique.

Any import of 'common.targets' should use a Condition to test that the unique property for 'common.targets' is not defined.

<Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

If the property has no value, then the file is imported.

If the property has any value, then the file is already imported and should not be imported again.

With the use of import guards, a file can explicitly import all of its dependencies regardless of imports that may exist in other files. In the following diagram, A.proj can be explicit about import dependencies on both Y.targets and Z.targets.

Without import guards, the import of Z.targets by A.proj is a duplicate import that generates a warning. This is because Y.targets imports Z.targets first. The warning can be resolved by modifying A.proj to remove the import of Z.targets.

With import guards for Y.targets and Z.targets, there is no warning. Further, if Y.targets is changed in the future to remove the import of Z.targets, there is no associated code change to A.proj.
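
The unguarded case can be sketched as follows. This is an illustrative fragment, not one of the example files; it assumes that Y.targets contains its own unconditional import of Z.targets.

<!-- A.proj, without import guards (sketch) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <!-- Y.targets is assumed to import Z.targets itself -->
  <Import Project="Y.targets" />

  <!-- Duplicate: Z.targets has already been imported by way of Y.targets,
       so this import is expected to produce warning MSB4011 and be blocked -->
  <Import Project="Z.targets" />

</Project>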

Visualizing the Import Order

The -preprocess command line argument to msbuild.exe (see Switches) will produce output that is all the imported files inlined. The project is not built when this switch is used.

A lighter-weight approach is to create an item group of the imported files and create a target to report the item group.

In each file, add, at the top of the file, an ItemGroup with an include of the current file:

<ItemGroup>
  <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
</ItemGroup>

A target that reports the ItemGroup would be:

<Target Name="ListFiles">
  <Message Text="Project:" />
  <Message Text="  $(MSBuildProjectFullPath)" />
  <Message Text="Files:" />
  <Message Text="  @(Diagnostic-FilesList, '%0d%0a')"/>
</Target>

The Diagnostic-FilesList will be a list of the files being used, in the order that the files were imported. This is far less complete than the -preprocess option but, unlike -preprocess, the project is built and other targets can be run along with the ListFiles target.

Example

Following is a full example with five files: A.proj, B.proj, Y.targets, Z.targets, and common.targets.

First is 'common.targets'. The common.targets file defines an import guard. It also adds itself to the Diagnostic-FilesList ItemGroup. common.targets defines the ListFiles target.

<!-- common.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-CommonTargets>defined</ImportGuard-CommonTargets>
  </PropertyGroup>
  <!--
    Example import:
    <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />
    -->
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Target Name="ListFiles">
    <Message Text="Project:" />
    <Message Text="  $(MSBuildProjectFullPath)" />
    <Message Text="Files:" />
    <Message Text="  @(Diagnostic-FilesList, '%0d%0a')"/>
  </Target>

</Project>

The 'Z.targets' file defines two targets: DoSomeWork and Customize. Z.targets has an import guard, adds itself to Diagnostic-FilesList, and imports common.targets.

<!-- Z.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-ZTargets>defined</ImportGuard-ZTargets>
  </PropertyGroup>
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Target Name="DoSomeWork" DependsOnTargets="Customize">
    <Message Text="Working." Condition="'$(Work)' == ''"/>
    <Message Text="$(Work)." Condition="'$(Work)' != ''"/>
  </Target>

  <Target Name="Customize" />

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

The 'Y.targets' file imports Z.targets and redefines the Customize target. Y.targets has an import guard, adds itself to Diagnostic-FilesList, and imports common.targets.

<!-- Y.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ImportGuard-YTargets>defined</ImportGuard-YTargets>
  </PropertyGroup>
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Customize">
    <PropertyGroup>
      <Work>Writing</Work>
    </PropertyGroup>
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

A.proj and B.proj are 'project' files that use the set of .targets files. A.proj and B.proj each adds itself to Diagnostic-FilesList and imports common.targets.

A.proj imports Y.targets which in turn imports Z.targets. A.proj has an import of Z.targets but the condition that tests the import guard will prevent the import.

<!-- A.proj -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Main">
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Y.targets" Condition=" '$(ImportGuard-YTargets)' == '' " />
  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Main" DependsOnTargets="Preamble;DoSomeWork" />
  <!--
    Output:
      Preamble:
        Project A
      DoSomeWork:
        Writing.
    -->

  <Target Name="Preamble">
    <Message Text="Project A" />
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

B.proj imports Z.targets. The customization that is done in Y.targets is not seen and is not used by B.proj.

<!-- B.proj -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003" DefaultTargets="Main">
  <ItemGroup>
    <Diagnostic-FilesList Include="$(MSBuildThisFileFullPath)"/>
  </ItemGroup>

  <Import Project="Z.targets" Condition=" '$(ImportGuard-ZTargets)' == '' " />

  <Target Name="Main" DependsOnTargets="Preamble;DoSomeWork" />
  <!--
    Output:
      Preamble:
        Project B
      DoSomeWork:
        Working.
    -->

  <Target Name="Preamble">
    <Message Text="Project B" />
  </Target>

  <Import Project="common.targets" Condition=" '$(ImportGuard-CommonTargets)' == '' " />

</Project>

Running the command "msbuild A.proj /t:ListFiles" will produce output like the following. (I have omitted the full paths of the files.)

ListFiles:
  Project:
    A.proj
  Files:
    A.proj
    Y.targets
    Z.targets
    common.targets

The A.proj, B.proj, Y.targets, and Z.targets files all import common.targets. This means that the ListFiles target can be run against any of the files.

The command "msbuild Y.targets /t:ListFiles" will produce output like the following.

ListFiles:
  Project:
    Y.targets
  Files:
    Y.targets
    Z.targets
    common.targets

Summary

Import guards allow for a set of MSBuild files to be more loosely coupled. Adding or removing an import in a given file doesn't need to ripple out and necessitate changes to other files. Because every MSBuild file is a 'Project', every MSBuild file is invocable. With import guards, files that are not normally an entry point can still be successfully run because the required imports can be present even if in the normal case the imports are not performed. Encapsulating/limiting changes aids maintenance and using a 'library' file as the primary project file aids testing.

How the Use of CallTarget can be a Code Smell in MSBuild

August 08, 2021 by Jonathan Dodds in Talking Code

CallTarget, because it vaguely resembles a sub-routine call, is sometimes used to approximate a procedural approach in MSBuild — but this is a problem because MSBuild is a declarative language and Targets are not sub-routines.

Example

To elaborate, imagine there are two processes to be implemented in MSBuild. Process A has steps A-1, A-2, and A-3. Process B has steps B-1, B-2, and B-3. However, steps A-2 and B-2 are actually identical. The two processes have a common step. It makes sense to factor the common step into its own target that both processes can use.

From a procedural frame of mind, an implementation might look like the following. The output shows that steps A-1, Common (aka A-2), and A-3 are run for Process A. Likewise when running Process B.

<!-- calltarget-example-01.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA">
    <Message Text="perform Step A-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step A-3"/>
  </Target>
  <!--
    Output:
      PerformProcessA:
        perform Step A-1
      Common:
        perform Common Step
      PerformProcessA:
        perform Step A-3
    -->

  <Target Name="PerformProcessB">
    <Message Text="perform Step B-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      PerformProcessB:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
  </Target>

</Project>

However, an implementation that is MSBuild native would eschew the CallTarget task and might look like the following.

<!-- calltarget-example-02.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA" DependsOnTargets="StepA1;Common">
    <Message Text="perform Step A-3"/>
  </Target>
  <!--
    Output:
      StepA1:
        perform Step A-1
      Common:
        perform Common Step
      PerformProcessA:
        perform Step A-3
    -->

  <Target Name="PerformProcessB" DependsOnTargets="StepB1;Common">
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      StepB1:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
  </Target>

  <Target Name="StepA1">
    <Message Text="perform Step A-1"/>
  </Target>

  <Target Name="StepB1">
    <Message Text="perform Step B-1"/>
  </Target>

</Project>

Both implementations perform the steps; so what's the difference and why does the difference matter?

Scope

The difference is that CallTarget creates a new scope. The new scope is initialized with the state of Properties and Items at the point where the Target containing the CallTarget task was started; changes made to Properties and Items within the containing Target will not be seen by the called Target.

Following is a revision of the example code that uses CallTarget. A Property has been added that is modified at different points.

<!-- calltarget-example-01a.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA">
    <Message Text="perform Step A-1"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in PerformProcessA</ExampleValue>
    </PropertyGroup>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step A-3"/>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
  </Target>
  <!--
    Output:
      PerformProcessA:
        perform Step A-1
        PerformProcessA: ExampleValue = Initialized in Project;Modified in PerformProcessA
      Common:
        perform Common Step
        Common: ExampleValue = Initialized in Project;Modified in Common
      PerformProcessA:
        perform Step A-3
        PerformProcessA: ExampleValue = Initialized in Project;Modified in PerformProcessA
    -->

  <Target Name="PerformProcessB">
    <Message Text="perform Step B-1"/>
    <CallTarget Targets="Common"/>
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      PerformProcessB:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in Common</ExampleValue>
    </PropertyGroup>
    <Message Text="Common: ExampleValue = $(ExampleValue)"/>
  </Target>

  <PropertyGroup>
    <ExampleValue>Initialized in Project</ExampleValue>
  </PropertyGroup>

</Project>

Note that the change to append ";Modified in PerformProcessA" to the Property is not visible in the CallTarget task's scope. Also note that the change to append ";Modified in Common" which is done within the CallTarget task's scope is not seen by the Target that contains the CallTarget task.

Next is a similar revision of the example code that doesn't use CallTarget.

<!-- calltarget-example-02a.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <Target Name="PerformProcessA" DependsOnTargets="StepA1;Common">
    <Message Text="perform Step A-3"/>
    <Message Text="PerformProcessA: ExampleValue = $(ExampleValue)"/>
  </Target>
  <!--
    Output:
      StepA1:
        perform Step A-1
        StepA1: ExampleValue = Initialized in Project;Modified in StepA1
      Common:
        perform Common Step
        Common: ExampleValue = Initialized in Project;Modified in StepA1;Modified in Common
      PerformProcessA:
        perform Step A-3
        PerformProcessA: ExampleValue = Initialized in Project;Modified in StepA1;Modified in Common
      -->

  <Target Name="PerformProcessB" DependsOnTargets="StepB1;Common">
    <Message Text="perform Step B-3"/>
  </Target>
  <!--
    Output:
      StepB1:
        perform Step B-1
      Common:
        perform Common Step
      PerformProcessB:
        perform Step B-3
    -->

  <Target Name="Common">
    <Message Text="perform Common Step"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in Common</ExampleValue>
    </PropertyGroup>
    <Message Text="Common: ExampleValue = $(ExampleValue)"/>
  </Target>

  <Target Name="StepA1">
    <Message Text="perform Step A-1"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in StepA1</ExampleValue>
    </PropertyGroup>
    <Message Text="StepA1: ExampleValue = $(ExampleValue)"/>
  </Target>

  <Target Name="StepB1">
    <Message Text="perform Step B-1"/>
  </Target>

  <PropertyGroup>
    <ExampleValue>Initialized in Project</ExampleValue>
  </PropertyGroup>

</Project>

Note that, with this code, the final value of the Property is the result of all of the append operations: "Initialized in Project;Modified in StepA1;Modified in Common".

Project

MSBuild files are XML and the root element of an MSBuild file is the Project element. The Project will contain everything defined in the MSBuild file and everything from all imported files.

A Target is not a function. Targets have no arguments and no return values. A Target is a unit of work performed on or with the data in the Project.

Summary

Defining the target build order doesn't have the overhead of creating a new scope. CallTarget is not a substitute for an ordering attribute.
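
As a sketch of what using ordering attributes can look like, the following combines DependsOnTargets with AfterTargets. The file name and the 'Report' target are invented for illustration; the other targets mirror the earlier examples. Because no CallTarget is involved, all of the targets work against the same Project data.

<!-- ordering-sketch.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <PropertyGroup>
    <ExampleValue>Initialized in Project</ExampleValue>
  </PropertyGroup>

  <!-- StepA1 and Common are run, if they have not already run, before PerformProcessA -->
  <Target Name="PerformProcessA" DependsOnTargets="StepA1;Common">
    <Message Text="perform Step A-3"/>
  </Target>

  <!-- Report is ordered after PerformProcessA without PerformProcessA referring to it -->
  <Target Name="Report" AfterTargets="PerformProcessA">
    <Message Text="Report: ExampleValue = $(ExampleValue)"/>
  </Target>

  <Target Name="StepA1">
    <Message Text="perform Step A-1"/>
    <PropertyGroup>
      <ExampleValue>$(ExampleValue);Modified in StepA1</ExampleValue>
    </PropertyGroup>
  </Target>

  <Target Name="Common">
    <Message Text="perform Common Step"/>
  </Target>

</Project>

Running the PerformProcessA target should run StepA1, Common, PerformProcessA, and then Report, with Report seeing the change made in StepA1.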

There are special cases where CallTarget is uniquely useful but those cases are uncommon.

Rampant use of CallTarget is an indication that the author of the code was working in the wrong paradigm.

MSBuild Batching, an Explanation

August 01, 2021 by Jonathan Dodds in Talking Code

The MSBuild language is a declarative programming language and there are no control flow constructs for looping. Instead, for processing collections, MSBuild provides a language feature named 'batching'. Unfortunately, batching is complex and its behavior can appear inscrutable. Some coders, rather than understand batching, create equivalents for loops — and create other problems by doing so. Generally, forcing an imperative or procedural style into MSBuild tends to produce scripts that are less performant and less maintainable.

Understanding batching is essential to writing good MSBuild code. If you have found MSBuild batching to be confuzzling¹ and if reading the Microsoft MSBuild Batching documentation doesn't clear up all the mystery, then this article is an attempt to explain some of the 'gotchas'.

Properties, Items, and Metadata

MSBuild has properties and items which are scalar variables and collections, respectively. A scalar variable holds one value. A collection holds a set of values.

A member of an item collection has metadata. Metadata is a collection of key-value pairs. The metadata collection is never an empty set. There is always at least a key-value pair with a key of 'Identity'. Identity is the name by which the member was added to the collection.

Metadata is the foundation that batching is built upon.

Metadata Examples

References to properties are introduced with a '$', references to an item collection are introduced with a '@', and references to a metadata key are introduced with a '%'.

In the following code, note the difference in the output between @(Example) and %(Example.Identity).

<!-- batching-example-01.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

    <ItemGroup>
        <!-- Creates an Item collection named 'Example' and adds 'Item1' to Example. -->
        <Example Include="Item1" />
        <!-- Adds 'Item2' to Example. -->
        <Example Include="Item2" />
    </ItemGroup>

    <Target Name="DisplayExample">
        <Message Text="@(Example)" />
    </Target>
    <!--
    Output:
      DisplayExample:
        Item1;Item2
      -->

    <Target Name="DisplayExampleByIdentity">
        <Message Text="%(Example.Identity)" />
    </Target>
    <!--
    Output:
      DisplayExampleByIdentity:
        Item1
        Item2
      -->

</Project>

In the 'DisplayExample' target, the Message task is executed once and it displays the whole collection as a string.

In the 'DisplayExampleByIdentity' target, however, batching is being used, specifically task batching. The Message task is executed twice, once for each distinct value of the Identity metadata.

To see the batches more clearly, the next code example adds 'Color' metadata and the Message task batches on distinct values of Color.

<!-- batching-example-02.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
    </Example>
  </ItemGroup>

  <Target Name="DisplayExampleByColor">
    <Message Text="@(Example)" Condition=" '%(Color)' != '' " />
  </Target>
  <!--
    Output:
      DisplayExampleByColor:
        Item1;Item3
        Item2
      -->

  <Target Name="DisplayExampleByColorWithTransform">
    <Message Text="@(Example->'%(Identity) has %(Color)')" Condition=" '%(Color)' != '' " />
  </Target>
  <!--
    Output:
      DisplayExampleByColorWithTransform:
        Item1 has Blue;Item3 has Blue
        Item2 has Red
      -->

  <Target Name="DisplayExampleWithTransform">
    <Message Text="@(Example->'%(Identity) has %(Color)')" />
  </Target>
  <!--
    Output:
      DisplayExampleWithTransform:
        Item1 has Blue;Item2 has Red;Item3 has Blue
      -->

</Project>

Note that in the 'DisplayExampleByColor' target, the content of @(Example) has changed. It is not the whole collection; it is the subset that conforms to the current batch.

An expectation that @(<Name>) is always the complete collection is natural but is incorrect. A better (but still simple) mental model is to consider @(<Name>) as always a set derived from the collection. The set will be the complete collection when there is no batching and a subset of the collection when there is batching.

The 'DisplayExampleByColorWithTransform' target is the same batching operation as the 'DisplayExampleByColor' target. The only difference is that an item transform is used to show both the Identity and the Color. Using references to metadata inside a transform expression has no impact on batching. The last target, 'DisplayExampleWithTransform', demonstrates that, with only the transform expression, there is no batching.

Task Batching

MSBuild supports Target Batching and Task Batching. The examples so far have all been task batching.

For Task batching to be in effect, there must be a task that uses a metadata reference, e.g. %(<name>). The following stipulations apply:

The child items of ItemGroup and PropertyGroup (i.e. definitions of Items and Properties) are treated as implicit tasks for task batching.
Property functions in a task are evaluated for metadata references.
Transform expressions are excluded from triggering batching.

That transform expressions are excluded is shown in the prior example code. The next code example shows task batching with Item and Property definitions.

<!-- batching-example-03.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
    </Example>
  </ItemGroup>

  <Target Name="DisplayResults">
    <ItemGroup>
      <Item1 Include="%(Example.Identity)" />
      <Item2 Include="%(Example.Color)" />
    </ItemGroup>
    <PropertyGroup>
      <Prop1>%(Example.Identity)</Prop1>
      <Prop2>%(Example.Color)</Prop2>
    </PropertyGroup>
    <Message Text="Item1 = @(Item1)" />
    <Message Text="Prop1 = $(Prop1)" />
    <Message Text="Item2 = @(Item2)" />
    <Message Text="Prop2 = $(Prop2)" />
  </Target>
  <!--
    Output:
      DisplayResults:
        Item1 = Item1;Item2;Item3
        Prop1 = Item3
        Item2 = Blue;Red
        Prop2 = Red
      -->

</Project>

A task batched Property doesn't accumulate. Effectively the property is re-defined with each batch in the batching execution. The final value will be the last batch value. The code in the example demonstrates this. Prop1 and Prop2 finish with the last values that are in Item1 and Item2, respectively.

The code in the example for Properties is not practically useful and it is only for demonstration purposes. Pulling a single value from a collection (especially a collection with a large number of batches) into a property might be better accomplished as follows (assuming that 'Identity' is unique, which may not be true depending on the data). There is still a batching execution that partitions the collection, but the Property is defined once.

<PropertyGroup>
  <Prop3 Condition="'%(Identity)'=='Item3'">@(Example->Metadata('Color'))</Prop3>
</PropertyGroup>

Using a property function might look like the following:

<PropertyGroup>
  <Prop3 Condition="'%(Identity)'=='Item3'">$([System.IO.Path]::Combine($(SomePath),%(Example.Color)))</Prop3>
</PropertyGroup>
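
The stipulation that property functions in a task are evaluated for metadata references applies to ordinary tasks as well. The following is a minimal sketch; the file name and target name are made up, and the String.Copy pattern is used only as a convenient way to apply a string method to a metadata value. Because the property function contains a metadata reference, the Message task is expected to be batched, running once per distinct Identity.

<!-- batching-sketch-propertyfunction.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1" />
    <Example Include="Item2" />
  </ItemGroup>

  <Target Name="DisplayUpperCaseIdentity">
    <!-- The metadata reference inside the property function triggers task batching on Identity -->
    <Message Text="$([System.String]::Copy('%(Example.Identity)').ToUpperInvariant())" />
  </Target>

</Project>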

Qualified and Unqualified Metadata

A metadata reference can be qualified or unqualified.

The following shows a qualified reference. The name of the metadata is qualified with the name of the item collection.

<Message Text="%(Example.Color)" />

An unqualified reference uses just the name of the metadata and the relevant item collection is inferred.

<Message Text="@(Example)" Condition=" '%(Color)' != '' " />

If multiple collections are used, then the unqualified metadata reference is applied across all the collections.

<Message Text="@(Example1);@(Example2)" Condition=" '%(Color)' == 'Blue' " />

When using an unqualified metadata reference, regardless of whether one or many item collections are used, every member of each collection must have the metadata defined. If an item in one of the collections is missing the metadata, MSBuild will generate an error.
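
A minimal sketch of that requirement follows. The file name and target names are invented for illustration. The 'Unqualified' target is expected to fail because one item has no Color metadata, while the 'Qualified' target avoids the problem by naming the collection that is to be partitioned.

<!-- batching-sketch-missing-metadata.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example1 Include="Item1">
      <Color>Blue</Color>
    </Example1>
    <!-- Item2 does not define Color -->
    <Example2 Include="Item2" />
  </ItemGroup>

  <Target Name="Unqualified">
    <!-- Expected to produce an error: Item2 in Example2 has no value for Color -->
    <Message Text="@(Example1);@(Example2)" Condition=" '%(Color)' == 'Blue' " />
  </Target>

  <Target Name="Qualified">
    <!-- Only Example1 is partitioned by Color, so Example2 does not need the metadata -->
    <Message Text="@(Example1);@(Example2)" Condition=" '%(Example1.Color)' == 'Blue' " />
  </Target>

</Project>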

Batching is partitioning item collections based on metadata values. Whether the specified metadata is qualified or not makes a difference in how the collections are partitioned when multiple collections are used.

<!-- batching-example-04.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <!-- Example1 -->
    <Example1 Include="Item1">
      <Color>Blue</Color>
    </Example1>
    <Example1 Include="Item2">
      <Color>Red</Color>
    </Example1>
    <!-- Example2 -->
    <Example2 Include="Item3">
      <Color>Blue</Color>
    </Example2>
  </ItemGroup>

  <Target Name="DisplayResults">
    <!-- Unqualified -->
    <ItemGroup>
      <Result0 Include="@(Example1);@(Example2)" Condition=" '%(Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result0->'%(Identity) has %(Color)')" />
    <!-- Qualified Example1 -->
    <ItemGroup>
      <Result1 Include="@(Example1);@(Example2)" Condition=" '%(Example1.Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result1->'%(Identity) has %(Color)')" />
    <!-- Qualified Example2 -->
    <ItemGroup>
      <Result2 Include="@(Example1);@(Example2)" Condition=" '%(Example2.Color)' == 'Blue' " />
    </ItemGroup>
    <Message Text="@(Result2->'%(Identity) has %(Color)')" />
  </Target>
  <!--
    Output:
      DisplayResults:
        Item1 has Blue;Item3 has Blue
        Item1 has Blue;Item3 has Blue
        Item1 has Blue;Item2 has Red;Item3 has Blue
    -->

</Project>

The 'DisplayResults' target creates three new item collections.

Result0, with an unqualified reference to Color, is a collection of items from Example1 where Color is Blue and items from Example2 where Color is Blue. Both collections are partitioned by Color.

Result1, with a reference to Color qualified to Example1, is a collection of items from Example1 where Color is Blue and all items from Example2. Only Example1 is partitioned; Example2 is not batched, so it is included whole.

Result2, with a reference to Color qualified to Example2, is a collection of all items from Example1 and items from Example2 where Color is Blue. Here only Example2 is partitioned and Example1 is included whole.
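
One way to see the partitioning directly is to print the contents of each batch. The following is a sketch that reuses the Example1 and Example2 items from batching-example-04.targets; the file and target names are hypothetical.

<!-- display-batches-sketch.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example1 Include="Item1">
      <Color>Blue</Color>
    </Example1>
    <Example1 Include="Item2">
      <Color>Red</Color>
    </Example1>
    <Example2 Include="Item3">
      <Color>Blue</Color>
    </Example2>
  </ItemGroup>

  <Target Name="DisplayBatches">
    <!-- Unqualified: both collections are partitioned by Color. -->
    <Message Text="Unqualified Color=%(Color) Example1=[@(Example1)] Example2=[@(Example2)]" />
    <!-- Qualified to Example1: only Example1 is partitioned; Example2 appears whole in each batch. -->
    <Message Text="Qualified Color=%(Example1.Color) Example1=[@(Example1)] Example2=[@(Example2)]" />
  </Target>

</Project>

With these items, the unqualified message should show an empty Example2 in the Red batch, while the qualified message should show Item3 in both the Blue and the Red batch.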

Target Batching

For target batching to be in effect, there must be a metadata reference in one of the attributes of the Target element. The examples below use the Outputs attribute.

<!-- batching-example-05.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
      <Shape>Circle</Shape>
    </Example>
  </ItemGroup>

  <Target Name="DisplayTargetBatchByColor" Outputs="%(Example.Color)">
    <Message Text="MessageTask: @(Example->'%(Identity) has %(Color) %(Shape)')" />
  </Target>
  <!--
    Output:
      DisplayTargetBatchByColor:
        MessageTask: Item1 has Blue Square;Item3 has Blue Circle
      DisplayTargetBatchByColor:
        MessageTask: Item2 has Red Square
    -->

  <Target Name="DisplayTargetBatchAndTaskBatch" Outputs="%(Example.Color)">
    <Message Text="MessageTask: @(Example->'%(Identity) has %(Color) %(Shape)')" Condition=" '%(Shape)' != '' " />
  </Target>
  <!--
    Output:
      DisplayTargetBatchAndTaskBatch:
        MessageTask: Item1 has Blue Square
        MessageTask: Item3 has Blue Circle
      DisplayTargetBatchAndTaskBatch:
        MessageTask: Item2 has Red Square
    -->

</Project>

The 'DisplayTargetBatchByColor' target is executed once per Color batch, that is, once for 'Blue' and once for 'Red'.

The 'DisplayTargetBatchAndTaskBatch' target is also executed once per Color batch, but it contains a Message task that is itself batched by Shape within each Color batch.
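
For contrast, here is a sketch of the same Message task without target batching (the file and target names are hypothetical). With no metadata reference on the Target element, the target runs once and the task alone is batched by Shape across all three items.

<!-- task-batch-only-sketch.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example Include="Item1">
      <Color>Blue</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item2">
      <Color>Red</Color>
      <Shape>Square</Shape>
    </Example>
    <Example Include="Item3">
      <Color>Blue</Color>
      <Shape>Circle</Shape>
    </Example>
  </ItemGroup>

  <!-- No Outputs attribute, so the target itself is not batched. -->
  <Target Name="DisplayTaskBatchOnly">
    <Message Text="MessageTask: @(Example->'%(Identity) has %(Color) %(Shape)')" Condition=" '%(Shape)' != '' " />
  </Target>
  <!--
    Expected output:
      DisplayTaskBatchOnly:
        MessageTask: Item1 has Blue Square;Item2 has Red Square
        MessageTask: Item3 has Blue Circle
    -->

</Project>

Item1 and Item2 now share a batch because nothing has separated them by Color before the task is batched by Shape.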

Intersection of Two MSBuild Item Collections

The next example shows two approaches for getting the intersection of two item collections.

The first approach uses set algebra and can be used outside of a target.

The second approach uses task batching and was once described as a 'batching brainteaser'. It leverages the way that unqualified metadata references are handled.

<!-- batching-example-06.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <!-- Example1 -->
    <Example1 Include="Item1" />
    <Example1 Include="Item2" />
    <Example1 Include="Item4" />
    <!-- Example2 -->
    <Example2 Include="Item2" />
    <Example2 Include="Item3" />
    <Example2 Include="Item4" />
  </ItemGroup>

  <!-- Get the intersection of Example1 and Example2 without batching. -->
  <ItemGroup>
    <Intermediate Include="@(Example1)" Exclude="@(Example2)" />
    <!-- Intermediate has the items that are in Example1 and not in Example2. -->
    <Intersection Include="@(Example1)" Exclude="@(Intermediate)" />
    <!-- Intersection has the items that are in Example1 and not in Intermediate. -->
  </ItemGroup>

  <Target Name="DisplayIntersection">
    <Message Text="@(Intersection, '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayIntersection:
        Item2
        Item4
    -->

  <!-- Get the intersection of Example1 and Example2 using batching. -->
  <Target Name="DisplayIntersectionByBatching">
    <ItemGroup>
      <IntersectionByBatching Include="@(Example1)" Condition="'%(Identity)' != '' and '@(Example1)' == '@(Example2)'" />
    </ItemGroup>
    <Message Text="@(IntersectionByBatching, '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayIntersectionByBatching:
        Item2
        Item4
    -->

</Project>

The IntersectionByBatching line can also be written as:

<ItemGroup>
  <IntersectionByBatching Include="@(Example1)" Condition="'%(Identity)' != '' and '@(Example2)' != ''" />
</ItemGroup>

The result is the same.
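
To see why both conditions select the same items, it can help to print each Identity batch. This sketch reuses the Example1 and Example2 items from batching-example-06.targets; the file and target names are hypothetical.

<!-- identity-batches-sketch.targets (hypothetical file name) -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Example1 Include="Item1;Item2;Item4" />
    <Example2 Include="Item2;Item3;Item4" />
  </ItemGroup>

  <Target Name="DisplayIdentityBatches">
    <!-- The unqualified %(Identity) batches both collections, one batch per distinct Identity. -->
    <Message Text="Batch %(Identity): Example1=[@(Example1)] Example2=[@(Example2)]" />
  </Target>

</Project>

Each batch holds at most one item from each collection. In the Item1 batch Example2 is empty, so both conditions are false. In the Item3 batch Example1 is empty, so the first condition is false, and although the second condition is true, Include="@(Example1)" adds nothing. Only the Item2 and Item4 batches contribute an item.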

Cartesian Product of Two MSBuild Item Collections

One more practical example is an approach for computing a Cartesian product by using batching.

<!-- batching-example-07.targets -->
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">

  <ItemGroup>
    <Rank Include="Ace;King;Queen;Jack;10;9;8;7;6;5;4;3;2" />
    <Suit Include="Clubs;Diamonds;Hearts;Spades" />
  </ItemGroup>

  <Target Name="DisplayCardDeck">
    <ItemGroup>
      <CardDeck Include="@(Rank)">
        <Suit>%(Suit.Identity)</Suit>
      </CardDeck>
    </ItemGroup>
    <Message Text="@(CardDeck->'%(Identity) of %(Suit)', '%0d%0a')" />
  </Target>
  <!--
    Output:
      DisplayCardDeck:
        Ace of Clubs
        King of Clubs
        Queen of Clubs
        Jack of Clubs
        10 of Clubs
        9 of Clubs
        8 of Clubs
        ...
    -->

</Project>

¹ 'confuzzling' is a portmanteau of 'confusing' and 'puzzling' coined by my daughter.