`
kerlubasola
  • 浏览: 673332 次
文章分类
社区版块
存档分类
最新评论

What is the difference between BIT and TINYINT in MySQL

 
阅读更多
What is the difference between BIT and TINYINT in MySQL?


A TINYINT is an 8-bit integer value, a BIT field can store between 1 bit, BIT(1), and 64 bits, BIT(64). For a boolean values, BIT(1) is pretty common.


对于BIT类型,M表示每个值的bits数目,默认为1,最大为64bits。对于MySQL来说这是一个新
的类型,因为从MySQL5.0.3才开始真正实现(在之前实际上是TINYINT(1)),而且仅仅支持MyISAM
存储引擎,但是从MySQL5.0.5开始Memory,Innodb和NDB Cluster存储引擎也开始“支持”了。在
MyISAM中,BIT的存储空间很小,是真正的实现了通过bit来存储,但是在其他的一些存储引擎中就不一
样了,因为他们是转换为最小的INT类型存储的,所以占用的空间也没有节省,还不如直接使用INT类的
数据类型存放来得直观。




http://www.xaprb.com/blog/2006/04/11/bit-values-in-mysql/
Why you should not use BIT columns in MySQL?
MySQL implements the BIT data type differently in different versions, and the behavior is not what one might expect. In this article I’ll explain how MySQL’s behavior has changed over time, what strange things can happen as a result, and how to understand and work around display issues. I’ll tell you about a serious bug I’ve found, and discuss differences in the BIT data type between MySQL and Microsoft SQL Server.


History


MySQL has supported the BIT data type for a long time, but only as a synonym for TINYINT(1) until version 5.0.3. Once the column was created, MySQL no longer knew it had been created with BIT columns.


In version 5.0.3 a native BIT data type was introduced for MyISAM, and shortly thereafter for other storage engines as well. This type behaves very differently from TINYINT.


Changed behavior


The old data type behaved just like the small integer value it really was, with a range from -128 to 127. The new data type is a totally different animal. It’s not a single bit. It’s a fixed-width “bit-field value,” which can be from 1 to 64 bits wide. This means it doesn’t store a single BIT value; it’s something akin to the ENUM and SET types. The data seems to be stored as a BINARY value, even though the documentation lists it as a “numeric type,” in the same category as the other numeric types. The data isn’t treated the same as a numeric value in queries, however. Comparisons to numeric values don’t always work as expected.


This change in behavior means it’s not safe to use the BIT type in earlier versions and assume upgrades will go smoothly.


Display issues


The client libraries, including the command-line client and all the GUI clients I’ve seen, don’t seem to know how to handle BIT values. They don’t display them as a series of ones and zeroes. For instance, the following code even breaks the alignment of the command-line output!


mysql> create table test (i bit not null default 0);
mysql> insert into test (i) values (1), (1), (0), (0);
mysql> select * from test;
+---+
| i |
+---+
| |
| |
| |
| |
+---+
As I mentioned above, the data seems to be stored internally, and transmitted through the client libraries, as a BINARY value, which is actually a string type in MySQL. How it displays depends on the width of the column. For example, if the column is 32 bits wide, it is treated as CHAR(4). If it’s 8 bits wide, it is treated as CHAR(1):


create table test(ch bit(8));
mysql> insert into test values (b'01011010');
mysql> select * from test;
+------+
| ch |
+------+
| Z |
+------+
To display the value as an integer, it has to be cast to another type. One way to do this is add 0 to the value: select ch + 0 from test;. Another way is select cast(ch as unsigned) from test;


Display width seems to be related to value with BIT, in contrast to what the manual’s section on Overview of Numeric Types states: “Display width is unrelated to the storage size or range of values a type can contain.” It appears that a field of size M can only store M bits, so it’s the storage size, not the display size, that’s affected. As I mentioned, bit values don’t display as ones and zeroes anyway, so it makes no sense to say an 8-bit wide column “displays with a width of 8.” It doesn’t display 8 bits, it stores 8 bits. In fact, inserting b'100000000' into the table I defined above will store the value 255, demonstrating that the actual value has a maximum capacity of 8 bits. Any bits not set explicitly to 1 are set to 0, so values are left-padded with zeroes (the most significant bits are zeroes).


Bugs


I’ve discovered some very strange bugs with BIT columns in MySQL. The issue I noticed was a LEFT OUTER JOIN failing when it should have succeeded. I discovered a combination of factors could cause the bug to appear and vanish. For example, the join will succeed or fail depending on combinations of these factors:


the presence of an additional column, not involved in the query at all
the presence of additional rows which don’t match in the join
the order of columns in the table
the presence of an additional tautology in the join clause
I’ve filed a bug with MySQL about the issues I found, including a script which demonstrates several ways to trigger the bug.


Why use it?


Given the problems I’ve mentioned, I recommend avoiding it entirely. It provides nothing that’s not already possible with standard numeric types and adds a lot of confusion.


The only reason I personally would consider using this data type in MySQL is to document the actual usage of the values in it, as described in an earlier article on choosing column types. Unfortunately, I think it’s counterintuitive enough that I wouldn’t even choose to use it there, because I think a BIT column intuitively sounds like it ought to store a single bit.


Differences from Microsoft SQL Server


Microsoft SQL Server also provides a BIT data type. However, it’s completely different; it’s a single-bit column. Internally, it is stored as a single bit within an integer data type. As successive BIT columns are added to a table, SQL Server packs them together behind the scenes. This is equivalent to doing bitmask operations on a single column (my previous employer loved bitmask columns!), but it allows the bits to be named explicitly, avoiding the need to pass around named constants (or embed magic numbers) and deal with bitwise arithmetic.


Pros and cons of bitmask columns


Bitmask columns (an integer within which each bit is retrieved and set via bitwise arithmetic) can be extremely handy. They’re a very compact way to pack true/false values together for efficient storage. They can also facilitate certain types of queries; for example, “if any value is set” queries become simple. I’ve used them in ACLs stored in a database, for instance. Certain types of problems are just easy to solve with bitwise arithmetic, and for those problems, creating an integer column and declaring “bit 5 is whether the value is [something]” makes a lot of sense.


On the negative side, bitmask columns can be hard to use. For one thing, they’re hard to understand. Without the documentation that says which bit means what, they’re pretty much useless. SQL Server avoids this and the other problems I’ll name by treating each bit as a separate column and naming it, but that’s only if you use that facility, which my previous employer didn’t. Bitwise arithmetic can also be pretty tricky to write, and even harder to read.


Magic numbers in queries are just as meaningless as a column named bitcolumn1. Declaring and passing around constants to name the magic numbers is a nice thought, but it’s error-prone and it’s extra work. Creating a table to define the bits can be quasi-helpful as well, unless (as often happened at my previous employer) you can’t find the table, or the column is named such that you can’t tell which table defines the values for which column, or the table’s values don’t make any sense for bitwise arithmetic.


Bitmask columns are also not index-friendly, so querying against them isn’t optimal. Of course, any column with only two values is useless to an index anyway, so this is no worse, performance-wise, than storing the yes/no values in columns by themselves. Since there’s less data to examine, it can actually be more efficient.


Finally, there can be some subtleties about bitwise arithmetic, such as issues relating to signed and unsigned numbers.


The bottom line is, I think bitmask columns should be used sparingly, and only when the nature of the data and computations really makes them the obvious choice.
分享到:
评论

相关推荐

    Devart UniDAC v5.3.10 Full Source

    -Support for the Upper and Lower statements in TDADataSet.Filter is added -Support for the ftOraTimeStamp type in TVirtualTable is added -Bug with reopening TDADataSet after BreakExec is fixed -Bug ...

    Devart LinqConnect 3.1

    LinqConnect is a fast and easy-to-use ORM solution, developed ...The behaviour is changed: the TINYINT(1) data type is now used for the columns which correspond to the System.Boolean properties (MySQL)

    unidac5.3.8src

    Bug with processing tinyint and single parameters in SQL Server Compact Edition is fixed MySQL data provider Bug with connection establishing when Pooling is enabled is fixed for MySQL provider Bug ...

    Devart UniDAC v5.3.8 Source

    -Bug with processing tinyint and single parameters in SQL Server Compact Edition is fixed MySQL data provider -Bug with connection establishing when Pooling is enabled is fixed for MySQL provider -Bug...

    Devart_UniDAC_7.4.12_Professional_D7-D10.3_Rio_Full_Source_Code

    Bug with setting the data type of the parameter to BOOLEAN for all TINYINT system data types in the functions is fixed PostgreSQL data provider Bug with reading and writing of the floating point ...

    unidac74.12d26-pro.rar for delphi 10.3.1

    error in Android is fixed MySQL data provider Bug with setting the data type of the parameter to BOOLEAN for all TINYINT system data types in the functions is fixed PostgreSQL data provider Bug with ...

    mysql中TINYINT的取值范围

    在MySQL的数据类型中,Tinyint的取值范围是:带符号的范围是-128到127。无符号的范围是0到255(见官方《MySQL 5.1参考手册》http://dev.mysql.com/doc/refman/5.1/zh/column-types.html#numeric-types)。 Tinyint...

    mysql int和tinyint的区别.docx

    mysql int和tinyint的区别.docx

    Mysql数据库设计.pdf

    = 不等于 <=> NULL安全的等于(NULL-safe) ⼩于 > ⼤于 ⼩于等于 >= ⼤于等于 BETWEEN 存在与指定范围 IN 存在于指点集合 IS NULL 为NULL IS NOT NULL 不为NULL LIKE 通配符匹配 REGEXP或者RLIKE 正则表达式匹配 %...

    Mybatis Generator将tinyint映射成Integer的解决办法.pdf

    Mybatis Generator将tinyint映射成Integer的解决办法

    MYSQL安装包官方试用版

    MYSQL_TYPE_BIT BIT字段 MYSQL_TYPE_TIMESTAMP TIMESTAMP字段 MYSQL_TYPE_DATE DATE字段 MYSQL_TYPE_TIME TIME字段 MYSQL_TYPE_DATETIME DATETIME字段 MYSQL_TYPE_YEAR YEAR字段 MYSQL_TYPE_...

    mss2mysql数据库sqlserver迁移mysql

    使用mss2sql生产数据库定义,在用navicat做数据迁移...mss2mysql生产的表定义需要稍微改动下,如mssql datetime类型默认保存到毫秒,mysql需要定义datetime(3),bit 类型转成mysql要换成tinyint。还可能遇到字符编码问题

    mysql中int、bigint、smallint 和 tinyint的区别详细介绍

    最近使用mysql数据库的时候遇到了多种数字的类型,主要有int,bigint,smallint和tinyint;接下来将详细介绍以上三种类型的应用

    mybatis-generator自动生成工具,包含tinyint处理,自动获取数据库注释

    mybatis-generator自动生成工具,包含tinyint处理,自动获取数据库注释

    MySQL数据库:表的字段类型.pptx

    作为SQL标准的扩展,MySQL也支持整数类型TINYINT、MEDIUMINT和BIGINT。 表的字段类型 数值类型 表的字段类型 日期和时间类型 表示时间值的日期和时间类型为DATETIME、DATE、TIMESTAMP、TIME和YEAR。 每个时间类型有...

    新华字典MySQL数据

    新华字典MySQL数据 CREATE TABLE `xhzd_surnfu` ( `id` int(8) NOT NULL AUTO_INCREMENT COMMENT '主键自增id', `zi` varchar(6) NOT NULL DEFAULT '' COMMENT '汉字', `py` varchar(32) NOT NULL DEFAULT '' ...

    mysql数据库全部字段类型

    BIGINT 大整数,带符号的范围是-...BIT 位字段类型,范围为从1到64,默认为1 BLOB 最大长度为65,535(216–1)字节的BLOB列 BOOL 是TINYINT(1)的同义词。zero值被视为假。非zero值视为真 CHAR 固定长度字符串

    中国行政区mysql表及数据

    `type` tinyint(1) DEFAULT NULL COMMENT '1省,2市,3县', `del_flag` tinyint(1) DEFAULT '0' COMMENT '删除标识0正常,1删除', PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=820001 DEFAULT CHARSET=...

Global site tag (gtag.js) - Google Analytics