timstall: Understanding the Number of Bytes in Integral Types

Sunday, March 13, 2005

Understanding the Number of Bytes in Integral Types

[This was originally posted at http://timstall.dotnetdevelopersjournal.com/understanding_the_number_of_bytes_in_integral_types.htm]

As I prepare to teach about simple types in C#, one of the questions people will ask is "Why are there so many different integer types?"

C# supports nine integral types: sbyte, byte, short, ushort, int, uint, long, ulong, and char. Ignoring char, the table below shows the range of each type and the .NET type that each keyword is an alias for:

Type	Alias for	Allowed Values
sbyte	System.SByte	Integer between -128 and 127.
byte	System.Byte	Integer between 0 and 255.
short	System.Int16	Integer between -32768 and 32767.
ushort	System.UInt16	Integer between 0 and 65535.
int	System.Int32	Integer between -2147483648 and 2147483647.
uint	System.UInt32	Integer between 0 and 4294967295.
long	System.Int64	Integer between -9223372036854775808 and 9223372036854775807.
ulong	System.UInt64	Integer between 0 and 18446744073709551615.
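
The ranges in the table can be read straight from the MinValue and MaxValue constants that each of these .NET types exposes. The following is a minimal sketch, assuming C# 9 or later (it uses top-level statements); the ranges array and its field names are illustrative only.

```csharp
using System;

// Each row: C# keyword, smallest value, largest value.
// decimal is used for the columns because it can hold every value
// from long.MinValue up to ulong.MaxValue without loss.
var ranges = new (string Name, decimal Min, decimal Max)[]
{
    ("sbyte",  sbyte.MinValue,  sbyte.MaxValue),
    ("byte",   byte.MinValue,   byte.MaxValue),
    ("short",  short.MinValue,  short.MaxValue),
    ("ushort", ushort.MinValue, ushort.MaxValue),
    ("int",    int.MinValue,    int.MaxValue),
    ("uint",   uint.MinValue,   uint.MaxValue),
    ("long",   long.MinValue,   long.MaxValue),
    ("ulong",  ulong.MinValue,  ulong.MaxValue),
};

foreach (var r in ranges)
{
    // Left-align the keyword, right-align the minimum so the columns line up.
    Console.WriteLine($"{r.Name,-6} {r.Min,21} to {r.Max}");
}
```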

The basic concept is that the larger the range of allowed values, the more memory the type requires to store that range. Range includes both magnitude and sign (+ or -). .NET therefore provides different types so that you can choose the smallest one that fits. For example, there is no need to spend the 8 bytes of a ulong when you only need to store the values 1 through 10.
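
To make the memory cost concrete, the sketch below prints the size of each type with the sizeof operator and compares the raw payload of one million values stored as byte versus ulong. It assumes C# 9 or later for top-level statements; DistinctValues is a hypothetical helper written for this example, and the array figures ignore the runtime's per-array overhead.

```csharp
using System;

// sizeof on the built-in integral types is a compile-time constant measured in bytes.
Console.WriteLine($"sbyte / byte  : {sizeof(sbyte)} byte");
Console.WriteLine($"short / ushort: {sizeof(short)} bytes");
Console.WriteLine($"int / uint    : {sizeof(int)} bytes");
Console.WriteLine($"long / ulong  : {sizeof(long)} bytes");

// Hypothetical helper: how many distinct values fit in a given number of bytes.
// Only meaningful for sizes below 8, because the count itself is held in a long.
long DistinctValues(int sizeInBytes) => 1L << (8 * sizeInBytes);

Console.WriteLine($"1 byte  holds {DistinctValues(sizeof(byte))} distinct values");
Console.WriteLine($"2 bytes hold  {DistinctValues(sizeof(short))} distinct values");
Console.WriteLine($"4 bytes hold  {DistinctValues(sizeof(int))} distinct values");

// Raw payload for one million elements of the smallest and largest type.
const int count = 1_000_000;
long payloadAsByte = (long)count * sizeof(byte);
long payloadAsUlong = (long)count * sizeof(ulong);
Console.WriteLine($"{count} values as byte : {payloadAsByte} bytes");
Console.WriteLine($"{count} values as ulong: {payloadAsUlong} bytes");
```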

Ultimately the types are based on binary storage. The value 25, for example, is stored as the binary digits 11001 (with leading zeros filling any remaining bits), or: (1*2^4) + (1*2^3) + (0*2^2) + (0*2^1) + (1*2^0), or, more simply: 16 + 8 + 0 + 0 + 1. The following Excel spreadsheet demonstrates these calculations.
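
The same place-value arithmetic can be sketched in C#. This example assumes C# 9 or later for top-level statements, and the variable names are illustrative. Note that in C# the ^ operator means XOR rather than exponentiation, so the code uses a left shift to compute each power of two.

```csharp
using System;

int value = 25;

// Convert.ToString(int, 2) returns the base-2 digits without leading zeros.
string bits = Convert.ToString(value, 2);

// Rebuild the value by summing each digit times its power of two.
int sum = 0;
for (int i = 0; i < bits.Length; i++)
{
    int digit = bits[i] - '0';            // '0' or '1' as a number
    int power = bits.Length - 1 - i;      // the leftmost digit has the highest power
    int term = digit * (1 << power);      // 1 << power is 2 to that power
    Console.WriteLine($"{digit} * 2^{power} = {term}");
    sum += term;
}

Console.WriteLine($"{bits} in binary is {sum} in decimal");
```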

Notice that the range for unsigned integers starts at 0. So byte (which has 8 bits) can store 2^8, or 256, distinct values, all of which are spent covering 0 through 255. In order to make the type signed, or capable of storing negative values, there are two approaches:

Reserve one of the bits to indicate the sign.
Shift the entire range so that it starts at a negative number instead of 0.

The second approach actually gives the larger range. For example, if sbyte used the first approach, it would spend 1 bit on the sign, leaving only 7 bits for the magnitude. This would give a range of 0 to 127 (note: 2^7 - 1 = 127) in the positive direction and 0 to 127 in the negative direction, for a total range of -127 to +127. However, both directions include 0 (as +0 and -0), so one of the 256 bit patterns is wasted. The datatype therefore gets a larger range by simply offsetting the positive range: -128 to +127 instead of just -127 to +127. (In practice, .NET stores signed integers in two's complement form, which produces exactly this -128 to +127 range for sbyte.)
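
The difference between the two approaches can be checked by decoding all 256 bit patterns of a single byte both ways. This is only a sketch, assuming C# 9 or later for top-level statements: DecodeSignMagnitude is a hypothetical decoder for the sign-bit approach (C# has no such type), while DecodeSByte relies on the real cast from byte to sbyte, which reinterprets the bits as two's complement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical decoder for the sign-bit approach:
// the top bit is the sign, the low 7 bits are the magnitude.
int DecodeSignMagnitude(byte pattern)
{
    int magnitude = pattern & 0x7F;
    return (pattern & 0x80) != 0 ? -magnitude : magnitude;
}

// What sbyte does with the same 8 bits: a two's complement reinterpretation.
int DecodeSByte(byte pattern) => unchecked((sbyte)pattern);

var signMagnitudeValues = new HashSet<int>();
var sbyteValues = new HashSet<int>();

// Decode every possible 8-bit pattern with both schemes.
for (int p = 0; p <= byte.MaxValue; p++)
{
    signMagnitudeValues.Add(DecodeSignMagnitude((byte)p));
    sbyteValues.Add(DecodeSByte((byte)p));
}

Console.WriteLine($"sign bit: {signMagnitudeValues.Count} distinct values, " +
                  $"{signMagnitudeValues.Min()} to {signMagnitudeValues.Max()}");
Console.WriteLine($"sbyte   : {sbyteValues.Count} distinct values, " +
                  $"{sbyteValues.Min()} to {sbyteValues.Max()}");
```

The sign-bit scheme decodes both 00000000 and 10000000 to zero, which is where its one missing value goes.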

Given the power of today's computers, along with the simplicity of most applications, many developers can get away with just using the default int type (System.Int32) for all their integer needs. However, it is still good to know what's going on behind the scenes because:

It is a common Computer Science principle that applies well beyond the C# language.
You may work on an application where it does matter.
It will help you understand other applications that use these types.
