Using the string class is very common in everyday code. Understanding how string behaves is important for performance, especially when we perform an operation on it, such as appending another string. Before we get to the main discussion, let us recap some important points about strings that you may already know:
- string is a reference type that, in some respects, behaves like a value type.
- Being a reference type implies that the value of a string variable is NOT the actual data, but a pointer/reference to the actual data.
- From MSDN: Although string is a reference type, the equality operators (== and !=) are defined to compare the values of string objects, not references. This makes testing for string equality more intuitive.
In other words, because string is a reference type, we might expect == and != to compare the references/pointers to the actual data. They do not. They compare the actual data the two strings hold.
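To see the difference between comparing data and comparing references, here is a minimal sketch. The class name StringEqualityDemo and the helper MakeCopy are made up for this illustration. It assumes that building a string from a char array gives us a separate object with the same characters, and it uses object.ReferenceEquals to compare the references themselves.

```csharp
using System;

public static class StringEqualityDemo
{
    // Hypothetical helper: builds a fresh string object from the characters
    // of the given text, so we get a second object with the same content.
    public static string MakeCopy(string text)
    {
        return new string(text.ToCharArray());
    }

    public static void Main()
    {
        string a = MakeCopy("Hello");
        string b = MakeCopy("Hello");

        // == compares the characters of the two strings.
        Console.WriteLine(a == b);                       // True

        // ReferenceEquals compares the references, i.e. whether both
        // variables point to the same object.
        Console.WriteLine(object.ReferenceEquals(a, b)); // False
    }
}
```

So two different string objects with the same content are equal as far as == is concerned, even though their references differ.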
There is a term called immutable, which means the state of an object can't be changed after it has been created. string is an immutable type. Saying that string is immutable means that, once a string object is created, its content is never altered. If we try to change the value of a string by concatenation (using the + operator) or by assigning a new value to it, what actually happens is that a new string object is created and the variable is updated to hold a reference to it. It might seem that we have successfully altered the existing string. But behind the scenes, the variable now holds a new reference, which points to the newly created string.
Let's take an example to analyze this behavior. For this, we will simply create one string and assign it to another string variable. Then we will compare both the values of the two variables (which are references to the actual data in this case) and the actual data they point to. So we write the following code and run the application.
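A minimal sketch of such a test is given here. The variable names s1 and s2 and the value Hello follow the discussion in this post. The class name StringCopyDemo and the helper Describe are assumptions made for this illustration, and object.ReferenceEquals is one way of comparing the references.

```csharp
using System;

public static class StringCopyDemo
{
    // Hypothetical helper: reports whether two variables hold the same
    // reference and whether the strings have the same content.
    public static string Describe(string s1, string s2)
    {
        bool sameReference = object.ReferenceEquals(s1, s2);
        bool sameData = s1 == s2;
        return $"same reference: {sameReference}, same data: {sameData}";
    }

    public static void Main()
    {
        string s1 = "Hello";
        string s2 = s1; // copies the reference, not the characters

        Console.WriteLine(Describe(s1, s2));
        // same reference: True, same data: True
    }
}
```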
In the code above, when we copy one string to another, it actually copies the reference (the pointer to the actual data) to the second variable and NOT the actual data. So now both the variables s1 and s2 contain the same reference to the actual data, and comparing their values and comparing the actual data of the strings return true in both cases.
Now, in the next step, we change only s2 by appending a string value to it. This time, of course, the data is changed. But in this case, the value (pointer to the actual data) of s2 is modified to point to the newly generated string. Had it modified the existing string, i.e. Hello, then, string being a reference type, s1 would also have changed to Hello User and the comparison of s1 and s2 would have returned true. But this did not happen, because changing s2 resulted in the creation of a new string holding the new data. So not only does the data differ, but the pointers (references to the data) become different as well. Comparing the data and comparing the references these variables hold both return false.
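The append step can be sketched in the same way. Again, s1, s2 and the values Hello and Hello User come from the discussion, while the class name StringAppendDemo and the helper methods AppendUser and Describe are assumptions made for this illustration.

```csharp
using System;

public static class StringAppendDemo
{
    // Hypothetical helper: appends " User" to the given string and returns
    // the result. The original string object is not modified.
    public static string AppendUser(string text)
    {
        text += " User"; // creates a new string object
        return text;
    }

    // Hypothetical helper: reports whether two variables hold the same
    // reference and whether the strings have the same content.
    public static string Describe(string s1, string s2)
    {
        bool sameReference = object.ReferenceEquals(s1, s2);
        bool sameData = s1 == s2;
        return $"same reference: {sameReference}, same data: {sameData}";
    }

    public static void Main()
    {
        string s1 = "Hello";
        string s2 = s1;      // both variables hold the same reference
        s2 = AppendUser(s2); // s2 now refers to a new string

        Console.WriteLine(s1); // Hello
        Console.WriteLine(s2); // Hello User
        Console.WriteLine(Describe(s1, s2));
        // same reference: False, same data: False
    }
}
```

Note that s1 still prints Hello. The string it refers to was never touched by the append.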
Since neither the values of s1 and s2 nor the data they point to match any longer, the immutable behavior of string is evident. The whole concept can be represented diagrammatically as follows:
From the above discussion, it is clear that string operations, especially large or repeated string manipulations, should be done carefully, because every modification creates a new string object. As an alternative, we have the StringBuilder class, which is much more efficient than string for this kind of work. Its Append() method adds to the builder's own internal buffer rather than creating a new string instance on every call. Thus, in cases where we need to append to a large string or append many times, we should prefer the StringBuilder class over the string class.
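As a rough sketch of the difference, the two methods below build the same text, once with string concatenation and once with StringBuilder. The class name StringBuilderDemo and the method names JoinWithString and JoinWithBuilder are made up for this illustration. The sketch does not measure performance; it only shows the two styles side by side.

```csharp
using System;
using System.Text;

public static class StringBuilderDemo
{
    // Hypothetical example: every += produces a new string object.
    public static string JoinWithString(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i;
        }
        return result;
    }

    // Hypothetical example: Append writes into the builder's internal
    // buffer; a string is created only when ToString() is called.
    public static string JoinWithBuilder(int count)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinWithString(5));  // 01234
        Console.WriteLine(JoinWithBuilder(5)); // 01234

        // Append returns the same StringBuilder instance it was called on.
        StringBuilder sb = new StringBuilder("Hello");
        StringBuilder same = sb.Append(" User");
        Console.WriteLine(object.ReferenceEquals(sb, same)); // True
        Console.WriteLine(sb.ToString());                    // Hello User
    }
}
```

Both methods return the same text, but the StringBuilder version keeps working on one object instead of creating a new string in every iteration.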
Hope you enjoyed reading it!