A blog about SQL Server, SSIS, C# and whatever else I happen to be dealing with in my professional life.

Find ramblings

Tuesday, October 27, 2009

It depends on how you measure it

A friend ran into an issue where len() was trimming the trailing space
and not returning the real length. Ah, I said to myself, he must not
realize he's working with varchar datatype. I coded up a quick
example to show him how he was "doing it wrong(tm)"

DECLARE
    @a as varchar(10)
,   @b as char(10)
SELECT
    @a = 'hi'
,   @b = 'hi'
SELECT
    len(@a) AS len_a
,   len(@b) AS len_b

Results are

len_a       len_b
----------- -----------
2           2

Well, it must not be really making that a char since it's a variable
but if it were a table column, then I'd know for sure it's a
character. (Just go with me on the logic)

DECLARE
    @TABLE TABLE
(
    col1 varchar(10) NOT NULL
,   col2 char(10) NOT NULL
)
INSERT INTO @table
SELECT
    'hi'
,   'hi'
INSERT INTO @table
SELECT
    'bye       '
,   'bye       '
SELECT
    len(T.col1) AS len_col1
,   len(T.col2) AS len_col2
,   T.col1 + 'X' AS col1
,   T.col2 + 'X' AS col2
FROM
    @TABLE T

Results were

len_col1    len_col2    col1        col2
----------- ----------- ----------- -----------
2           2           hiX         hi        X
3           3           bye       X bye       X


Hmmm, that's decidedly not what I expected. Time to give up and
consult BOL as there must be something I'm missing. And of course,
plain as day in the article on len() "Returns the number of characters
of the specified string expression, excluding trailing blanks."
Equally important was the Note "To return the number of bytes used to
represent an expression, use the DATALENGTH function".

DECLARE
    @a as varchar(10)
,   @b as char(10)
SELECT
    @a = 'hi'
,   @b = 'hi'
SELECT
    len(@a) AS len_a
,   len(@b) AS len_b
SELECT
    datalength(@a) AS actual_len_a
,   datalength(@b) AS actual_len_b

DECLARE
    @TABLE TABLE
(
    col1 varchar(10) NOT NULL
,   col2 char(10) NOT NULL
)

INSERT INTO @table
SELECT
    'hi'
,   'hi'

INSERT INTO @table
SELECT
    'bye       '
,   'bye       '

SELECT
    len(T.col1) AS len_col1
,   len(T.col2) AS len_col2
,   T.col1 + 'X' AS col1
,   T.col2 + 'X' AS col2
,   datalength(T.col1) AS actual_len_col1
,   datalength(T.col2) AS actual_len_col
2FROM
    @TABLE T


Take away from this: use len() to find out how many characters would
be used for a varchar representation. Use datalength() to find out
how many bytes (not chars as this works with unicode types) a string
takes.
TODO: BOL indicates they work with binary datatypes, how does
that work?

Reference materials:
http://msdn.microsoft.com/en-us/library/ms190329.aspx and
http://msdn.microsoft.com/en-us/library/ms173486.aspx

No comments: