Converting Strings to Numbers

Many new developers struggle with converting strings to numbers. It’s hard to make it intuitive for new developers; the “right” name for these methods only makes sense if you’re already familiar with the terminology programmers use for this process. This article strives to discuss the techniques for converting strings to numeric values in .NET. It is intended for beginners, but the information may be useful for more experienced developers. I’m using C# and VB .NET interchangeably; differences between the two will be discussed briefly as needed.

If you’re in a hurry and don’t want all the explanation, you can skip to the TryParsing section, which covers the technique that gives you the most power.

Terminology and Background

When I say “a numeric value”, I mean any of the .NET numeric types. For C# this includes byte, int, long, short, double, float, and others. For VB .NET this means Byte, Integer, Long, Short, Double, Single, and others. Those are all just language-specific names for the underlying .NET types, System.Byte, System.Int32, System.Int64, etc. I’m going to use the VB type names because alternating to be fair could get confusing.

Programmers call the process of converting a string to a numeric value parsing the string. It gets a special name because there are many considerations to take into account, and it can be much more complicated than you might think. The term for converting a numeric value to a string is formatting, but it’s a topic for another article.

Techniques

A VB Pitfall: Val()

If you’re used to VB6 you’re probably used to reaching for the Val() function to convert objects to numeric values. Lose this habit quickly. The best features of VB .NET remove some of the magic associated with VB6 programming; the tradeoff is when your code makes dangerous assumptions it is now more likely to fail. This may seem strange to you: why would you want your code to fail? It’s easier to fix bugs when you know that they are there. If your code does something wrong but doesn’t crash the program to tell you, then you may not notice something’s wrong until much later. This makes it harder to find the bug that’s causing the problem. The best-case scenario is bad code that fails immediately: then you know the line that’s the culprit.

You might think “4abc” is not a number, but Val() will return 4. “123 Mockingbird Lane” is an indicator the user put an address in, but Val() will return 123 as the person’s age. Whoops. The only appropriate way to use Val() is to perform tests before calling it to ensure the number matches your requirements; I’m not going to provide a code sample because there are better techniques. Don’t use Val().

Implicit Conversion

When a property or method expects an Integer but gets a String, many different things can happen. Some programming languages won’t compile if you try this. Some will compile, but fail at runtime. Some will compile and at runtime try to parse the string; of these languages some will fail if the string cannot be parsed and others will silently fail and return a default value like -1. The process of automatically converting the value to the expected type is called implicit conversion because it is done without an explicit command to perform the conversion.

C# will only perform implicit conversions if they are widening. This means it will only perform implicit conversions when no loss of data will occur. For example, C# will convert from Integer to Long because Long can represent every number Integer can represent (and more). However, Double does not implicitly convert to Integer because it would involve rounding off the decimal places. C# does not implicitly convert strings to integers because not all strings are integers, thus it is a narrowing conversion.
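Here’s a small C# sketch of the widening rule. The class and method names are my own, made up for illustration; the commented-out lines are the ones the compiler should reject.

```csharp
using System;

static class ImplicitConversionDemo
{
    // Widening: every int fits in a long, so no cast is needed.
    public static long Widen(int small)
    {
        long big = small;
        return big;
    }

    static void Main()
    {
        Console.WriteLine(Widen(42));

        // Narrowing conversions are rejected at compile time:
        // double fractional = 1.23;
        // int whole = fractional;   // error: no implicit conversion from double to int
        // string text = "42";
        // int number = text;        // error: no implicit conversion from string to int
    }
}
```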

VB .NET’s behavior is more complicated. If Option Strict is On, VB .NET behaves like C#. If Option Strict is Off (the default), it behaves more like VB6 and will perform implicit conversions. In this case, assigning a String to a numeric variable makes the compiler insert a conversion that attempts to parse the string at runtime. If the string can be parsed, all is well; if it can’t be parsed an exception will be thrown.

Casting

When you cast a value from one type to another, you are making an explicit conversion because your code makes it clear you would like to make the conversion; all of the remaining techniques I discuss are explicit conversions.

In C#, you cast a variable by using this syntax:

double originalValue = 1.23;
int value = (int)originalValue;

This line should be read as, “Cast the variable originalValue to an integer and assign the result to the variable value.” C# will not automatically convert Double to Integer, but in this case you make the explicit conversion so it is valid. However, C# will not cast from String to Integer. The language treats this cast as unsafe because it might fail at runtime, so it is not allowed. This is sort of odd because there are plenty of other unsafe casts that C# will make (like Integer to UInteger) without complaint, but this is how it behaves. I believe it is because some of the other numeric conversions are interpreted as, “Reinterpret the bits that make up this number as if they were this other number” and that is not easy to do for a string.
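To make the “unsafe casts” point concrete, here’s a hedged C# sketch. CastDemo and its method names are hypothetical; I wrap the second cast in unchecked so the behavior doesn’t depend on compiler settings.

```csharp
using System;

static class CastDemo
{
    // Casting double to int discards the fractional part (truncates toward zero).
    public static int Truncate(double value)
    {
        return (int)value;
    }

    // In an unchecked context, casting a negative int to uint keeps
    // the same 32 bits and reads them as an unsigned number.
    public static uint Reinterpret(int value)
    {
        return unchecked((uint)value);
    }

    static void Main()
    {
        Console.WriteLine(Truncate(1.99));   // 1
        Console.WriteLine(Truncate(-1.99));  // -1
        Console.WriteLine(Reinterpret(-1));  // 4294967295

        // int age = (int)"42";   // does not compile: cannot convert string to int
    }
}
```

Neither cast complains, even though both silently change the value. That’s the kind of cast I mean when I call the string restriction odd.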

In VB .NET, you have a few choices. The best choice is to use one of the conversion operators like CInt() or CDbl():

Dim age As Integer = CInt(txtAge.Text)

This will work so long as txtAge.Text represents a number, like “10”. If the text can’t be interpreted as a number (for example “ten” or an empty string), the conversion fails and throws an InvalidCastException. You also get little say over which number formats are accepted, which makes it less useful for professional applications, where you might want more control.

Conversion

The System.Convert class has many static (Shared) methods to convert values from one type to another type. This is kind of like casting. In general, you look for a method named ToXXXX(), where “XXXX” is the .NET name for your type. For example, to convert a string to a Long you use System.Convert.ToInt64().

Much like casting, this technique can only parse strings that contain simple numbers. I’ve never seen a real reason to use this class instead of casting or one of the other parsing techniques. There is one interesting use for it, though: the conversion methods for the integer types (Byte, Short, Integer, Long, anything that doesn’t support a decimal point) have an overload that takes the string and an Integer parameter that represents the base of the number the string represents. For example, let’s say you have the hexadecimal string “BE” and want to parse it to 190 as a Byte. Here’s how you could do it:

Dim value As Byte = Convert.ToByte("BE", 16)

Note the second parameter is only allowed to be 2 (binary), 8 (octal), 10 (decimal), or 16 (hexadecimal). Passing 10 seems kind of pointless; the version that doesn’t take a base parameter does the same thing.
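Here’s a short C# sketch of the same overload with the other bases. The class name and sample strings are mine; the last call shows what I’d expect to happen with an unsupported base.

```csharp
using System;

static class BaseConversionDemo
{
    static void Main()
    {
        byte fromHex = Convert.ToByte("BE", 16);      // hexadecimal
        int fromBinary = Convert.ToInt32("1010", 2);  // binary
        int fromOctal = Convert.ToInt32("17", 8);     // octal

        Console.WriteLine(fromHex);     // 190
        Console.WriteLine(fromBinary);  // 10
        Console.WriteLine(fromOctal);   // 15

        try
        {
            // Base 3 is not one of 2, 8, 10, or 16.
            Convert.ToInt32("12", 3);
        }
        catch (ArgumentException)
        {
            Console.WriteLine("Base 3 is not supported.");
        }
    }
}
```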

Parsing

Every numeric type has a Parse() method. The simplest form of this method takes the string you want to parse as a parameter and returns a number. If the string is not a simple number (similar to what I discussed above referring to casting), the Parse() method throws an exception. Here’s a C# example of parsing a text box’s value:

int age = int.Parse(txtAge.Text);

There are a few other overloads of the Parse() method that I will briefly discuss. The NumberStyles parameter can be used to configure Parse() to handle more complicated formatted numbers like “3,145,238” or the hexadecimal string “3FEA4”. You can combine the NumberStyles parameters to allow many kinds of numbers to be parsed. Here’s a VB .NET example that parses a text box’s value, allowing commas in the number and ignoring any leading or trailing whitespace (“   45,238   ” would be parsed as 45238).

' NumberStyles lives in System.Globalization, so that namespace must be imported.
Dim ns As NumberStyles = NumberStyles.AllowLeadingWhite Or NumberStyles.AllowTrailingWhite Or NumberStyles.AllowThousands
Dim countedSheep As Long = Long.Parse(txtCountedSheep.Text, ns)
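For the hexadecimal case mentioned above, here’s a hedged C# sketch. The class and method names are made up for illustration, and I pass CultureInfo.InvariantCulture so the comma is always the thousands separator no matter what machine runs it.

```csharp
using System;
using System.Globalization;

static class NumberStylesDemo
{
    // HexNumber allows hex digits plus leading and trailing whitespace.
    // It does not accept a "0x" prefix.
    public static int ParseHex(string text)
    {
        return int.Parse(text, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    }

    // Same flag combination as the VB .NET example, in C#.
    public static long ParseGrouped(string text)
    {
        NumberStyles styles = NumberStyles.AllowLeadingWhite
                            | NumberStyles.AllowTrailingWhite
                            | NumberStyles.AllowThousands;
        return long.Parse(text, styles, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(ParseHex("3FEA4") == 0x3FEA4);   // True
        Console.WriteLine(ParseGrouped("   45,238   "));   // 45238
        Console.WriteLine(ParseGrouped("3,145,238"));      // 3145238
    }
}
```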

The IFormatProvider parameter allows you to provide information about how numbers are formatted. In general, this is used to parse numbers that may be formatted for a culture that is not the same as your machine’s. For example, some cultures use “.” instead of “,” to separate thousands, so if you want to parse “32.000” as thirty-two thousand instead of thirty-two you would use this parameter to indicate that a different thousands separator is in use. You can read more about it in the documentation.
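Here’s a rough C# sketch of the “32.000” example. Rather than rely on a particular culture’s settings, I build a NumberFormatInfo by hand; the class name, method name, and separator choices are assumptions made for illustration.

```csharp
using System;
using System.Globalization;

static class FormatProviderDemo
{
    // Parses integers that use "." as the thousands separator.
    public static int ParseWithDotGroups(string text)
    {
        // NumberFormatInfo implements IFormatProvider.
        NumberFormatInfo format = new NumberFormatInfo
        {
            NumberGroupSeparator = ".",
            NumberDecimalSeparator = ","
        };
        return int.Parse(text, NumberStyles.AllowThousands, format);
    }

    static void Main()
    {
        // "." treated as a thousands separator.
        Console.WriteLine(ParseWithDotGroups("32.000"));   // 32000

        // With the invariant culture, "." is the decimal point instead.
        Console.WriteLine(double.Parse("32.000", CultureInfo.InvariantCulture) == 32.0);   // True
    }
}
```

In a real application you would more likely pass a CultureInfo for the user’s culture than build the format yourself.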

TryParsing

In .NET 2.0, each numeric type was given a TryParse() method. This looks difficult to use, but it’s the best way to parse a string. TryParse() takes the same parameters as Parse(), but returns a Boolean and takes an additional reference parameter (ByRef or out depending on your language) that is for the return value. This is easier to show than explain:

int age;
if (!int.TryParse(txtAge.Text, out age))
{
    MessageBox.Show("Please input a valid age.");
}

Dim age As Integer
If Not Integer.TryParse(txtAge.Text, age) Then
    MessageBox.Show("Please input a valid age.")
End If

I won’t discuss the extra parameters again; if you skipped here see the Parse() discussion above.
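For completeness, here’s a minimal C# sketch of the TryParse() overload that takes those extra parameters. TryParseCount is a hypothetical helper name, and the invariant culture is just my choice for the example.

```csharp
using System;
using System.Globalization;

static class TryParseStylesDemo
{
    // Returns true and sets count when text is a whole number, optionally
    // with thousands separators and surrounding whitespace. On failure it
    // returns false and count is set to 0.
    public static bool TryParseCount(string text, out long count)
    {
        NumberStyles styles = NumberStyles.AllowLeadingWhite
                            | NumberStyles.AllowTrailingWhite
                            | NumberStyles.AllowThousands;
        return long.TryParse(text, styles, CultureInfo.InvariantCulture, out count);
    }

    static void Main()
    {
        long count;
        Console.WriteLine(TryParseCount("  45,238  ", out count));    // True
        Console.WriteLine(count);                                     // 45238
        Console.WriteLine(TryParseCount("45,238 sheep", out count));  // False
    }
}
```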

TryParse() was introduced because throwing an exception can be expensive. If you are parsing many values in a loop and some of them are invalid, TryParse() will be much faster than Parse(). For example, let’s say you’ve read each line of a file into a String array and you want to create a Double array from it. Parse() would look like this:

double[] values = new double[lines.Length];
for (int index = 0; index < values.Length; index++)
{
    try
    {
        values[index] = double.Parse(lines[index]);
    }
    catch (FormatException)
    {
        Logger.Log("Could not convert line " + index.ToString() + " to a value.");
    }
}

TryParse() would look like this and be much faster, not to mention a little more elegant:

Dim values(lines.Length - 1) As Double
For index As Integer = 0 To values.Length - 1
    If Not Double.TryParse(lines(index), values(index)) Then
        Logger.Log("Could not convert line " & index.ToString() & " to a value.")
    End If
Next

Manual Parsing

Sometimes none of .NET’s built-in methods give you the ability to parse the kind of string you want to deal with. For example, you might want to parse “A” as 1, “B” as 2, and thus “ABC” as 123. When this is the case, you have to write the code to parse the string yourself. You have many tools at your disposal, including the String class’s built-in methods and regular expressions. I won’t go into detail because there are an infinite number of ways to format a number in a string and usually many different ways to parse the formatted string. It is a programmer’s job to understand how to use the language’s tools to accomplish her goals.
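As one possible take on the “ABC” example, here’s a C# sketch. The format is my own assumption: only the letters A through I are allowed, standing for the digits 1 through 9, and there is no letter for zero. LetterNumberParser and TryParseLetters are hypothetical names.

```csharp
using System;

static class LetterNumberParser
{
    // Follows the TryParse() pattern: returns false instead of throwing.
    public static bool TryParseLetters(string text, out int value)
    {
        value = 0;
        if (string.IsNullOrEmpty(text))
        {
            return false;
        }

        int result = 0;
        foreach (char c in text)
        {
            if (c < 'A' || c > 'I')
            {
                return false;   // not one of the letters this format allows
            }

            int digit = c - 'A' + 1;

            // Stop before result * 10 + digit would overflow an int.
            if (result > (int.MaxValue - digit) / 10)
            {
                return false;
            }

            result = result * 10 + digit;
        }

        value = result;
        return true;
    }

    static void Main()
    {
        int number;
        Console.WriteLine(TryParseLetters("ABC", out number));  // True
        Console.WriteLine(number);                              // 123
        Console.WriteLine(TryParseLetters("AB3", out number));  // False
    }
}
```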

Conclusion

Parsing strings is fundamental knowledge you will use in many applications you will write. Now you know how to reliably convert strings to numbers. You learned this process is called parsing the string. You learned some techniques that require little effort but can only handle simple strings and don’t remind you they can fail. You learned that the Parse() method gives you more power but the TryParse() version is better because it doesn’t throw exceptions and its Boolean return value forces you to think about what to do if it fails. You also learned that if these techniques don’t suit your needs, you have to write code that manually parses the string, and that there is no general algorithm for parsing all strings. If you read the whole article, you also learned an easy way to parse a string in hexadecimal, octal, or binary notation for a number as a bonus.
