SQL Server Performance Forum – Threads Archive

Should columns ever be defined as bit datatypes?!!

Does anyone think there is any reason why I shouldn’t change all database BIT datatypes to tinyint datatypes? (If need be a check constraint could be added to the tinyint datatypes to ensure values of only 0 or 1 get entered. Although this really is not necessary as the languages used to write to the database are passing True or False values, represented by 0 and -1 that are converted to 0 and 1 respectively). tinyint: Integer data from 0 through 255. Storage size is 1 byte. Disadvantages of bit fields are the following:
Bit DataTypes in SQL Server cannot have indexes on them.
Aggregate functions cannot be created on bit datatypes fields.
You cannot use GROUP BY or HAVING on ntext,text,image, or bit columns unless they are in a function that returns a value having another data type.
You cannot have a bitwise datatype field included in a select list that also includes an aggregate function. Advantages of bit fields are the following:
Occupy less space. (Value can only be 0 or 1). Example of how they cannot be used. create table _a1
(
a int,
b bit
) insert into _a1 values (1, 1)
insert into _a1 values (2, 1)
insert into _a1 values (3, 1) select max(a), b from _a1 drop table _a1

Just my $0,02 cents…. I think this question is a rather logical one. Bit column represent more of a state than an attribute. Are you cleaning up after upgrading from Access to SQL Server?
What advantage would there be having an index on a column which can only contain 0 or 1?
Personally I don’t use bit columns, but rather tinyint as you’ve mentioned. The disadvantages outweight the advantages by far, IMHO.
Frank
http://www.insidesql.de
http://www.familienzirkus.de

in S2K, bit fields can be included in indexes, not sure about aggregates the reason you might want a bit field in an index is suppose you have a query with where clause: col1 = SARG1 AND ccol2 = SARG2 AND bitcol = 1 if the select returns many rows, you really want a covered index, that means having the bit field in the index after the key SARGs on space savings, if you only have 1 bit field, SQL cannot use the remaining 7 bits in that byte, as all other fields require byte boundaries,
i believe that SQL can pack multiple bit fields into the same byte

As mention already the only benefit space wise with bit is if you use really many of them. A single bit column will take up one byte, the same as a tinyint, but if you have 8 bit columns they will still fit on a byte. You also have the annoying issue with some programming languages converting the 0,1 to the strings "True" and "False" and some keep them as 0,1. And what happends if your requirements change? On/off is no longer enough but you need to add a "pause" state <img src=’/community/emoticons/emotion-1.gif’ alt=’

‘ /> Good point from joechang about "col1 = SARG1 AND ccol2 = SARG2 AND bitcol = 1" though. Hadn’t thought about that. /Argyle

For all who replied to my question. Thank-you for the input. I personlly cannot see any use of a bit column unless you want SQL to validate the values inserted as being either a 0 or a 1. Regards,
Rik

The main argument I have for bit fields is space savings. On large tables with multiple bit fields the space savings can be tremendous. Another reason is that in many programming languages the bit field will compare properly with a native boolean value from the language without a conversion. Of course there are ways of doing this without using an actual bit field, but the simplicity is nice. I am slightly confused by the example in the first post. The syntax for that query is not correct, as you would have to group by the second field, which would work. Other than that, I will concede that bit fields do kind of suck when you are in an environment that uses other database platforms, as bit (as well as date/time) fields often do not transfer properly. DB2 has been the biggest offender for us, although I am sure there are others that are just as bad.

Well, the disadvantages are mostly overcome-able.
On the few occaisions when you need to group by the bit field, you can do the minorest change to your query to do this create table _a1
(
a int,
b bit
) insert into _a1 values (1, 1)
insert into _a1 values (2, 1)
insert into _a1 values (3, 1) select max(a), convert(int,b) from _a1 group by convert(int,b) drop table _a1
And for the inability to index, its not a big deal. DB theory says that an index on a field that can contain only 2 values isnt overly effecient. I can remember a dbtheory course way back when (I’m thinking a decade or so) that used gender(M/F values) as an example of a field not to index (except under special circumstances). If I can find the supporting documentation I will. So bit data type usage is like most things in sql, its not always good, its not always bad, its one of those ‘it depends’ things. Chris

SQL 2000 allows grouping on bit fields…

Thank-you for all the responses. 1) Cannot group on bit fields in SQL Server 2000: e.g.
create table _a
(
a bit
)
insert into _a values (1)
select @@version
go
select max(a) from _a group by a
—————————————- RESULTES ——————————–
Microsoft SQL Server 2000 – 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
(1 row(s) affected) Server: Msg 409, Level 16, State 2, Line 1
The maximum aggregate operation cannot take a bit data type as an argument.
2) Can get around grouping issue on bit fields: e.g.
create table _a
(
a bit
)
insert into _a values (1)
select @@version
go
select max(convert(int, a)) from _a group by convert(int, a)
—————————————- RESULTES ——————————–
Microsoft SQL Server 2000 – 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Standard Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
(1 row(s) affected)
———–
1 (1 row(s) affected)
Watch out for changes in the execution plan.
3) If you convert a bit field datatype to a tinyint datatype, and values already exist in the bit field. values of 1 will be converted to 255?!!
4) There appears to be an argument for incorporating a bit field in a covering index, if the table is very large. (Although this is not possible – you would have to convert the bit field datatype to a tinyint first). Analysis on execution plans and times would have to be performed in order to check this.
5) Main argument for bit fields appears to be the fact that it only accepts values of 0 or 1, therefore providing a level of validation. Thanx for all the input.
Regards,
Rik

You state in your first point that you cannot group by a bit field, however the problem is you cannot perform the MAX or min aggregate functions on bit fields. Grouping works fine, e.g. create table _a
(
a bit,
b int
)
insert into _a values (1, 5)
insert into _a values (1, 3)
insert into _a values (1, 2)
insert into _a values (0, 4)
insert into _a values (0, 2)
select @@version
go
select max(b), a from _a group by a —————– RESULTS ——————–
Microsoft SQL Server 2000 – 8.00.760 (Intel X86)
Dec 17 2002 14:22:05
Copyright (c) 1988-2003 Microsoft Corporation
Enterprise Edition on Windows NT 5.0 (Build 2195: Service Pack 3)
(1 row(s) affected) a
———– —-
4 0
5 1 (2 row(s) affected)
As far as issue 3 is concerned, I did the following: ALTER TABLE _a ALTER COLUMN a INT
GO
SELECT *
FROM _a —————– RESULTS ——————–
a b
———– ———–
1 5
1 3
1 2
0 2
0 4 (5 row(s) affected) How did you get values of 255 for the bit fields?

I was not able to reproduce the problem with values changing from 1 to 255 when you convert a column from bit to tinyint. Regarding the post wondering why you would want to index a column with only two values, there are at least two good reasons: 1) the distribution of the two values may be very lopsided, e.g. 90% with one value. An index referencing only the bit/tinyint column might be ignored for the 90% case, but might be used for the 10% case. Same applies for indexes where the bit/tinyint column was the leading column in a composite index. 2) Avoiding bookmark lookups for joins and/or filtering. Example, I have a WHERE clause that references three columns. My table contains 50,000 rows that match the first two criteria but only 25,000 which match the last (the bit/tinyint column). With an index including the bit/tinyint column, I can build a covering index (as mentioned by a previous poster) to avoid 25,000 bookmark lookups. If the clustered index is 3 levels deep, I just saved about 75,000 i/o’s. Similar advantages obtain to joins where an inner table row is only interesting if the bit/tinyint column has a specific value (e.g. disabled = 0).

If you are performing logical / binary operations on the column it must be a bit data type. Binary Not (~) in particular will give you incorrect results if you are using any other data type than a bit.

]]>