Mutable objects can have their state changed at any point between creation and destruction. Immutable objects, on the other hand, don't allow you to change their state after they've been created. A string in .NET, for example, cannot be modified once it has been created, which makes it immutable.

When you attempt to alter a string instance, a new instance is created instead. There are several ways to improve string handling performance in your code, and this article discusses the key strategies for doing so in C#.
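As a minimal sketch of what "a new instance is created" means in practice, the following top-level program (the variable names are illustrative only) keeps a second reference to a string, "modifies" the first, and shows that the original object is left untouched.

```csharp
using System;

string original = "Hello";
string alias = original;   // both variables reference the same string object

original += ", World";     // builds a new string; the old object is not changed

Console.WriteLine(alias);                                   // Hello
Console.WriteLine(original);                                // Hello, World
Console.WriteLine(object.ReferenceEquals(alias, original)); // False
```

The `+=` operator only rebinds the variable `original` to the newly created string; anything still holding the old reference keeps seeing the old value.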

Prerequisites

If you want to work with the code examples discussed in this article, you need the following installed on your system:

  • Visual Studio 2022 or Visual Studio 2022 Preview
  • .NET 7.0
  • BenchmarkDotNet

If you don't already have Visual Studio 2022 Preview installed on your computer, you can download it from https://visualstudio.microsoft.com/vs/preview/.

What is a String in .NET?

A string in .NET comprises a sequence of characters stored using the BSTR rule: the character data is prefixed with its length and terminated by a null character. At a quick glance, a string in .NET has the following characteristics:

  • Is a reference type
  • Is immutable
  • Is thread-safe
  • Can contain embedded null characters
  • Overloads the == operator to compare values
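A few of these characteristics can be checked directly. The short sketch below (illustrative names, not part of the benchmark project) shows that two distinct string objects with the same characters compare equal with ==, and that a null character is just another character inside a string.

```csharp
using System;

string literal = "net";
string built = new string(new[] { 'n', 'e', 't' }); // a separate object with the same characters

Console.WriteLine(literal == built);                       // True: == compares contents
Console.WriteLine(object.ReferenceEquals(literal, built)); // False: two distinct objects

string withNull = "ab\0cd";                 // '\0' is an ordinary character inside a string
Console.WriteLine(withNull.Length);         // 5
Console.WriteLine(withNull.IndexOf('\0'));  // 2
```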

Performance Challenges of Strings in .NET

One of the most significant performance challenges with strings is their immutability. Because a string object cannot be changed once it has been created, every operation that appears to modify a string actually creates a new string object. In code that modifies strings frequently, this results in unnecessary object creation and GC (garbage collection) overhead.

That said, the immutability of strings also has several benefits. First, strings can be safely shared between threads without worrying about data corruption. Second, the runtime can safely share a single instance of a string among many consumers, for example through string interning. Finally, strings are easier to reason about because their values cannot change unexpectedly.
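To make the allocation cost described in this section concrete, here is a rough sketch that compares the bytes allocated by repeated concatenation with those allocated by a StringBuilder. It relies on GC.GetAllocatedBytesForCurrentThread, and the iteration count is an arbitrary assumption; the exact numbers will vary by runtime and platform, so treat it as an illustration rather than a benchmark.

```csharp
using System;
using System.Text;

const int Iterations = 1_000; // arbitrary count chosen for this sketch

long before = GC.GetAllocatedBytesForCurrentThread();
string text = string.Empty;
for (int i = 0; i < Iterations; i++)
{
    text += "x"; // allocates a new, longer string on every iteration
}
long concatBytes = GC.GetAllocatedBytesForCurrentThread() - before;

before = GC.GetAllocatedBytesForCurrentThread();
var builder = new StringBuilder();
for (int i = 0; i < Iterations; i++)
{
    builder.Append('x'); // writes into the builder's internal buffer
}
string built = builder.ToString();
long builderBytes = GC.GetAllocatedBytesForCurrentThread() - before;

Console.WriteLine($"Concatenation: {concatBytes:N0} bytes allocated");
Console.WriteLine($"StringBuilder: {builderBytes:N0} bytes allocated");
```

Both loops produce the same 1,000-character string; only the amount of intermediate garbage differs.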

Create a New Console Application Project in Visual Studio 2022 Preview

Let's create a console application project in Visual Studio 2022 Preview that you'll use for benchmarking performance. When you launch Visual Studio 2022 Preview, you'll see the Start window. You can choose Continue without code to launch the main screen of the Visual Studio 2022 Preview IDE.

To create a new Console Application project in Visual Studio 2022 Preview:

  1. Start Visual Studio 2022 Preview.
  2. In the Start window, click Create a new project.
  3. In the Create a new project window, select Console App, and click Next to move on.
  4. In the Configure your new project window, specify the project name as StringPerformanceDemo and the path where it should be created.
  5. If you want the solution file and project to be created in the same directory, you can optionally check the Place solution and project in the same directory checkbox. Click Next to move on.
  6. In the next screen, specify the target framework you would like to use for your console application.
  7. Click Create to complete the process.

You'll use this application in the subsequent sections of this article.

How Are Strings Represented in Memory?

As noted earlier, a string in .NET is a sequence of characters stored using the BSTR rule. While strings in C/C++ are represented in memory based on the PWSZ (Pointer to Wide-character String, Zero-terminated) rule, strings in C# are stored in memory based on the BSTR (Basic string or binary string) rule, which prefixes the character data with its length. Note that both PWSZ and BSTR strings are null-terminated.

Generally, a string consists of the following components:

  1. An 8-byte object header (on x86 systems), which consists of a SyncBlock and a type descriptor, each of which is 4 bytes.
  2. The length of the string, represented as an int32 field.
  3. In versions prior to .NET 4.0 only, the capacity of the character buffer, represented as an int32 field (m_arrayLength).
  4. The first character of the string, represented as a System.Char.
  5. A character buffer containing the remaining characters of the string, terminated by a null character.

The size of a string object on an x86 system is calculated as 14 + length x 2 bytes, and on an x64 system as 26 + length x 2 bytes. Hence, before alignment, an empty string object occupies 14 bytes on x86 systems and 26 bytes on x64 systems. Note that prior to .NET 4.0, a string object had an additional field, m_arrayLength, that represented its capacity: the total number of characters in the string plus one for the terminating null character. This field was removed in .NET 4.0.

An object on an x86 system is always a multiple of four bytes in size, with a minimum of 16 bytes. For .NET 4.0 and later on x86 systems, an empty string is therefore 4 + 4 + 4 + 2 = 14 bytes, rounded up to 16 bytes.
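The arithmetic above can be written down as a small worked example. The helper names below are hypothetical, the formulas are the ones quoted in this section, and only the x86 rounding rule (multiples of four bytes) is applied because that is the only alignment rule stated here.

```csharp
using System;

// Formulas quoted in the text; the helper names are hypothetical.
int RawSizeX86(int length) => 14 + length * 2;
int RawSizeX64(int length) => 26 + length * 2;

// Round up to the next multiple of four, as described for x86 objects.
int RoundUpToFour(int bytes) => (bytes + 3) / 4 * 4;

foreach (int length in new[] { 0, 1, 10 })
{
    Console.WriteLine(
        $"length {length}: x86 raw {RawSizeX86(length)} bytes, " +
        $"x86 rounded {RoundUpToFour(RawSizeX86(length))} bytes, " +
        $"x64 raw {RawSizeX64(length)} bytes");
}
```

For a 10-character string, for example, this gives 34 bytes on x86 before rounding, 36 bytes after rounding, and 46 bytes on x64 before any alignment.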

Getting Started with String Performance Testing

We'll now create some methods and benchmark their performance using BenchmarkDotNet (https://benchmarkdotnet.org/), a popular benchmarking tool for .NET. You can read about BenchmarkDotNet in more detail in my earlier article.

Create a class named StringPerformanceBenchmarks in a file having the same name with a “.cs” extension and write the following code in there:

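using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Order;

namespace StringPerformanceDemo;

[Orderer(SummaryOrderPolicy.SlowestToFastest)]
[RankColumn]
[MemoryDiagnoser]
public class StringPerformanceBenchmarks
{
    // Number of iterations used by each benchmark method
    const int COUNTER = 100;
    const string TEXT = "This is a sample string for testing purposes only.";
    // Write your benchmark methods here
}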

Install NuGet Package(s)

The next step is to install the required NuGet package(s). Right-click on the solution and then select Manage NuGet Packages for Solution…. Now search for the package named BenchmarkDotNet in the search box and install it. Alternatively, you can type the command shown below in the Package Manager Console:

PM> Install-Package BenchmarkDotNet

Concat Strings: String vs StringBuilder

In C#, strings are immutable, which means that whenever you appear to modify a string, you're actually building a new object. The StringBuilder class, in contrast, is a mutable type. When you alter a string, the CLR creates a new string object, and the old one becomes eligible for garbage collection once nothing references it anymore.

When you modify a string by using concatenation and append operations, a new object is created each time you write to the string, which incurs additional CPU and memory allocation overhead. A StringBuilder, on the other hand, allocates buffer space in memory and writes new characters directly into that buffer, so a single instance is reused and new memory is allocated only when the buffer has to grow.

Because StringBuilder instances are mutable, they can be modified without creating a new object every time. As a result, there is a significant performance benefit when you would otherwise be creating and discarding many string objects.

Here's a code snippet that shows two methods: one uses string concatenation to append strings and the other uses the StringBuilder Append method.

[Benchmark]
public void AppendStringsUsingStringClass()
{
    string str = string.Empty;
    for (int i = 0; i < COUNTER; i++)
    {
        // Each call allocates a brand-new string holding the combined text
        str = string.Concat(TEXT, str);
    }
}

[Benchmark]
public void AppendStringsUsingStringBuilder()
{
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < COUNTER; i++)
    {
        // Append writes into the builder's buffer and returns the same instance
        stringBuilder.Append(TEXT);
    }
}

Executing the Benchmarks

Write the following piece of code in the Program.cs file to run the benchmarks:

using BenchmarkDotNet.Running;
using StringPerformanceDemo;
class Program
{
    static void Main(string[] args)    
    {
        BenchmarkRunner.Run<StringPerformanceBenchmarks>();
    }   
}

To execute the benchmarks, you need to build the project in Release mode. Run the following command in the same folder as the project file:

dotnet run --project StringPerformanceDemo.csproj -c Release

Figure 1 shows the result of the execution of the benchmarks.

Figure 1: Benchmarking string vs StringBuilder performance when appending strings

Create StringBuilder Instances: With and Without Using StringBuilderPool

You can leverage a StringBuilderPool, a collection of reusable, ready-to-use StringBuilder instances. Reusing pooled instances avoids the overhead of creating many StringBuilder instances in your code. The DefaultObjectPoolProvider class and the CreateStringBuilderPool method used here come from the Microsoft.Extensions.ObjectPool NuGet package, so install that package first. The following code snippet illustrates two methods to create or acquire StringBuilder instances: one that uses a StringBuilderPool and one that doesn't.

[Benchmark]
public void CreateStringBuilderWithPool()
{
    // Requires: using Microsoft.Extensions.ObjectPool;
    var stringBuilderPool = new DefaultObjectPoolProvider().CreateStringBuilderPool();
    for (var i = 0; i < COUNTER; i++)
    {
        var stringBuilder = stringBuilderPool.Get();
        stringBuilder.Append(TEXT);
        stringBuilderPool.Return(stringBuilder);
    }
}

[Benchmark]
public void CreateStringBuilderWithoutPool()
{
    for (int i = 0; i < COUNTER; i++)   
    {
        var stringBuilder = new StringBuilder();      
        stringBuilder.Append(TEXT);   
    }
}

Figure 2 illustrates the benchmark results.

Figure 2: Benchmarking StringBuilderPool performance
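To illustrate the idea behind pooling without any external package, here is a deliberately simplified, hand-rolled pool built on ConcurrentBag. It is not the Microsoft.Extensions.ObjectPool implementation; the Rent, Return, and Greet helpers are hypothetical names used only in this sketch. It also shows the rent, use, and return pattern wrapped in try/finally so that a builder is handed back even if an exception is thrown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Text;

// A deliberately simplified pool: a thread-safe bag of idle builders.
var idleBuilders = new ConcurrentBag<StringBuilder>();

StringBuilder Rent() =>
    idleBuilders.TryTake(out StringBuilder? builder) ? builder : new StringBuilder();

void Return(StringBuilder builder)
{
    builder.Clear();           // reset the content so the next caller starts empty
    idleBuilders.Add(builder); // make the instance available for reuse
}

string Greet(string name)
{
    StringBuilder builder = Rent();
    try
    {
        return builder.Append("Hello, ").Append(name).Append('!').ToString();
    }
    finally
    {
        Return(builder); // always hand the builder back
    }
}

Console.WriteLine(Greet("Ada"));       // Hello, Ada!
Console.WriteLine(Greet("Linus"));     // Hello, Linus!
Console.WriteLine(idleBuilders.Count); // 1: the same builder served both calls
```

A production pool typically also caps the number of retained instances and discards builders whose capacity has grown too large, which this sketch leaves out.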

Extract Strings: String.Substring vs StringBuilder.Append vs Span

The next code example illustrates three methods: one that uses the Substring method of the String class to extract a string, one that uses the Append method of the StringBuilder class to extract a string, and one that uses Span to extract a substring from a given string.

[Benchmark]
public void ExtractStringUsingSubstring() 
{  
    StringBuilder stringBuilder = new StringBuilder();  
    for (int i = 0; i < COUNTER; i++) 
    {
        stringBuilder.Append(TEXT.Substring(0, 10));  
    }
}

[Benchmark] 
public void ExtractStringUsingAppend() 
{
    StringBuilder stringBuilder = new StringBuilder();   
    for (int i = 0; i < COUNTER; i++) 
    {
        stringBuilder.Append(TEXT, 0, 10);   
    }
}

[Benchmark]
public void ExtractStringUsingSpan()
{   
    for (int i = 0; i < COUNTER; i++)   
    {
        var data = TEXT.AsSpan().Slice(0, 10);   
    }
}

Figure 3 shows the benchmarking results of these three methods.

Figure 3: Benchmarking performance when extracting strings using String class, StringBuilder and Span
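As a sketch of where span-based extraction pays off, the example below parses the parts of a date out of a larger string without creating intermediate substrings. The sample input is hypothetical; int.Parse accepts a ReadOnlySpan<char> directly, so a string only needs to be materialized where an API actually requires one.

```csharp
using System;

const string Timestamp = "2023-07-15T10:30:00"; // hypothetical sample input

ReadOnlySpan<char> span = Timestamp.AsSpan();

// Slicing creates views over the original characters; no new strings are allocated.
int year = int.Parse(span.Slice(0, 4));
int month = int.Parse(span.Slice(5, 2));
int day = int.Parse(span.Slice(8, 2));

Console.WriteLine($"{year}-{month}-{day}"); // 2023-7-15

// Materialize a string only where one is actually needed.
string datePart = span.Slice(0, 10).ToString();
Console.WriteLine(datePart); // 2023-07-15
```

Keep in mind that a span is a stack-only type, so it cannot be stored in a field of a class or used across an await.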

Create StringBuilder: With and Without StringBuilderCache

StringBuilderCache is a per-thread cache of StringBuilder instances that exposes three static methods: Acquire, Release, and GetStringAndRelease. The Acquire method returns a StringBuilder instance, reusing the cached one when possible. The Release method places an instance back in the cache if its capacity falls within the maximum allowed size. The GetStringAndRelease method returns the built string to the caller and releases the StringBuilder instance to the cache in a single step.

You can take advantage of StringBuilderCache to reduce allocations when working with StringBuilder in C#. Note that StringBuilderCache is an internal class of the .NET base class library, so to compile the code below you need to include a copy of it in your project. Below is a code snippet that shows two methods: one uses StringBuilderCache and the other doesn't.

[Benchmark]
public void WithoutStringBuilderCache()
{   
    for (int i = 0; i < COUNTER; i++)   
    {
        var stringBuilder = new StringBuilder();
        stringBuilder.Append(TEXT);
        _ = stringBuilder.ToString();   
    }
}

[Benchmark]
public void WithStringBuilderCache()
{
    for (int i = 0; i < COUNTER; i++)    
    {
        var stringBuilder = StringBuilderCache.Acquire();
        stringBuilder.Append(TEXT);
        _ = StringBuilderCache.GetStringAndRelease(stringBuilder);    
    }
}

Figure 4 illustrates the benchmarking results of these two methods.

Figure 4: Benchmarking performance when creating StringBuilder instances with and without StringBuilderCache
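The following is a rough sketch of how a per-thread cache of this kind can work. It is not the runtime's StringBuilderCache source: it uses a ThreadLocal variable and local functions for brevity, and the MaxCachedCapacity value is an assumption chosen for illustration.

```csharp
using System;
using System.Text;
using System.Threading;

const int MaxCachedCapacity = 360; // assumed upper bound for builders worth caching

// One cached builder per thread, so no locking is required.
var cached = new ThreadLocal<StringBuilder?>();

StringBuilder Acquire(int capacity = 16)
{
    StringBuilder? builder = cached.Value;
    if (builder != null && capacity <= builder.Capacity)
    {
        cached.Value = null; // hand the instance out; the cache is empty until it is released
        builder.Clear();
        return builder;
    }
    return new StringBuilder(capacity);
}

string GetStringAndRelease(StringBuilder builder)
{
    string result = builder.ToString();
    if (builder.Capacity <= MaxCachedCapacity)
    {
        cached.Value = builder; // keep only reasonably small builders
    }
    return result;
}

StringBuilder first = Acquire();
first.Append("Hello");
Console.WriteLine(GetStringAndRelease(first));           // Hello

StringBuilder second = Acquire();
Console.WriteLine(object.ReferenceEquals(first, second)); // True: the cached instance was reused
Console.WriteLine(second.Length);                         // 0: it was cleared before reuse
```

Capping the cached capacity keeps one unusually large builder from staying alive on the thread indefinitely.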

If you know in advance roughly how large the final string will be, you can specify the initial capacity when constructing a StringBuilder instance. Doing so lets the builder allocate its buffer once instead of growing it repeatedly as you append, which can significantly reduce memory allocations.
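A minimal sketch of presizing, assuming the final length can be computed up front (the constants mirror the ones used in the benchmark class, but the names here are local to this example):

```csharp
using System;
using System.Text;

const string Text = "This is a sample string for testing purposes only.";
const int Count = 100;

// The final length is known up front, so reserve the whole buffer once.
int expectedLength = Text.Length * Count;
var presized = new StringBuilder(expectedLength);
var defaultSized = new StringBuilder(); // starts small and grows on demand

for (int i = 0; i < Count; i++)
{
    presized.Append(Text);
    defaultSized.Append(Text);
}

Console.WriteLine(presized.Length == expectedLength);              // True
Console.WriteLine(presized.ToString() == defaultSized.ToString()); // True
```

Both builders produce the same text; the presized one simply never has to grow its buffer along the way.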

String Interning

String interning involves using a hash table to store a single copy of each string in the string intern pool. The key is a string hash, and the value is a pointer to the real String object. Thus, interning guarantees that only one instance of that string is allocated memory, even if the string appears 100 times. When comparing strings, you only need to perform a reference comparison if they are interned.

The String.Intern method is used to obtain a reference to the interned copy of the specified string. This method looks in the "intern pool" for a string that is equal to the supplied string. If such a string exists, its intern pool reference is returned; otherwise, the supplied string is added to the pool and a reference to it is returned.
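The behavior of the intern pool can be observed with reference comparisons. In this short sketch, a string built at run time is a separate object from the equal literal, while the result of string.Intern is the very same instance as the literal.

```csharp
using System;
using System.Text;

string literal = "HelloWorld"; // literals are placed in the intern pool by the runtime
string built = new StringBuilder().Append("Hello").Append("World").ToString();

Console.WriteLine(literal == built);                       // True: same contents
Console.WriteLine(object.ReferenceEquals(literal, built)); // False: different objects

string interned = string.Intern(built); // returns the pooled instance
Console.WriteLine(object.ReferenceEquals(literal, interned)); // True
```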

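Although string interning is an interesting feature, it should be used sparingly in practice. Misusing it can adversely affect your application's performance, in part because interned strings stay in the pool and are unlikely to be released for as long as the runtime is running. You should consider it only if you plan to create many strings with the same content.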

The code snippet below shows how you can compare strings with and without interning.

[Benchmark]
public void CompareWithStringIntern()
{
    string str = string.Intern("HelloWorld"); // Returns the interned instance

    bool isEqual = false;

    for (int i = 0; i < COUNTER; i++)
    {
        var s1 = str; // Uses the interned string
        var s2 = str; // Uses the interned string

        isEqual = (s1 == s2); // Same reference, so the comparison returns immediately
    }
}

[Benchmark]
public void CompareWithoutStringIntern()
{
    string str = "HelloWorld"; // A literal, which the runtime interns automatically

    bool isEqual = false;

    for (int i = 0; i < COUNTER; i++)
    {
        var s1 = "World";
        var s2 = "Hello" + s1; // Built at run time, so the result is not interned

        isEqual = (str == s2); // Different references, so the characters are compared
    }
}

Figure 5 shows the benchmarking results of these two methods:

Figure 5: Benchmarking string interning performance

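As you can see, comparing interned strings is much faster than comparing strings that have not been interned. Notice also the difference in allocated bytes between the two approaches: the second method builds a new string through concatenation in every iteration, whereas the first reuses a single interned instance.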

Here's another example of string interning. The following two methods store strings in a list. One of them uses an interned string; the other doesn't.

[Benchmark]
public void StoreInternedStringsInList()
{
    string text = "Hello World";
    string temp = string.Intern(text); // Intern once, outside the loop
    var list = new List<string>();
    for (int i = 0; i < COUNTER; i++)
    {
        list.Add(temp); // Every entry refers to the same interned instance
    }
}

[Benchmark]
public void StoreStringsInList()
{  
    string text = "Hello";  
    var list = new List<string>();  
    for (int i = 0; i < COUNTER; i++)  
    {
        list.Add(text + " World");  
    }
}

Figure 6 illustrates the benchmark result of these two methods:

Figure 6: Benchmarking performance when storing interned and non-interned strings in a list
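To see why the interned list is cheaper to keep in memory, the sketch below counts how many distinct string objects each list actually references. It assumes .NET 5 or later for ReferenceEqualityComparer, and the DistinctInstances helper is a hypothetical name used only here.

```csharp
using System;
using System.Collections.Generic;

const int Count = 1_000; // arbitrary count chosen for this sketch
string prefix = "Hello";

var plain = new List<string>();
var interned = new List<string>();
for (int i = 0; i < Count; i++)
{
    string value = prefix + " World";   // runtime concatenation: a new object each time
    plain.Add(value);
    interned.Add(string.Intern(value)); // always the single pooled instance
}

// Count distinct objects by reference, not by content.
int DistinctInstances(List<string> items) =>
    new HashSet<string>(items, ReferenceEqualityComparer.Instance).Count;

Console.WriteLine(DistinctInstances(plain));    // 1000
Console.WriteLine(DistinctInstances(interned)); // 1
```

Note that interning here does not avoid the temporary allocation made by the concatenation itself; it only ensures that the list keeps a single long-lived instance while the temporaries become garbage.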

The Best Practices for Improving String Performance

Based on the performance numbers we gathered above from benchmarking string performance, here are the best practices for improving string performance at a glance:

  1. Use StringBuilder when you need to concatenate strings repeatedly
  2. Use Span<T> for extracting a portion of a string and for allocation-free access to contiguous memory
  3. Use string interning when you create many strings with identical content whose values won't change
  4. Use StringBuilderCache to reduce allocations when working with StringBuilder instances
  5. Use a pool of StringBuilder instances to reduce allocations

Conclusion

String handling is a critical part of any application, and .NET provides a variety of ways to improve string handling performance. You can take advantage of BenchmarkDotNet to benchmark a single method, a module, or a complete program and measure the performance of your code without compromising its functionality. Remember that benchmarking alone won't make your application faster: to enhance performance and scalability, you must also apply the best practices described above.