The main difference is just in performance and sorting accuracy.
utf8_general_ci is a bit faster than utf8_unicode_ci. utf8_unicode_ci on
the other hand is a bit more accurate for sorting.
The character set for statements that arrive from the client. The
session value of this variable is set using the character set
requested by the client when the client connects to the server. (Many
clients support a --default-character-set option to enable this
character set to be specified explicitly.) The global
value of the variable is used to set the session value in cases when
the client-requested value is unknown or not available, or the server
is configured to ignore client requests:
* The client is from a version of MySQL older than MySQL 4.1, and thus
does not request a character set.
* The client requests a character set not known to the server. For
example, a Japanese-enabled client requests sjis when connecting to a
server not configured with sjis support.
* mysqld was started with the --skip-character-set-client-handshake
option, which causes it to ignore client character set configuration.
This reproduces MySQL 4.0 behavior and is useful should you wish to
upgrade the server without upgrading all the clients.
Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
Additional character set and collation system variables are
involved in handling traffic for the connection between a client
and the server. Every client has connection-related character
set and collation system variables.
A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:
Clients can fine-tune the settings for these variables, or
depend on the defaults (in which case, you can skip the rest of
this section). If you do not use the defaults, you must change
the character settings for each connection to the
server.
Two statements affect the connection-related character set variables as a group:
The MySQL client programs
When a client connects to the server, it sends the name of the
character set that it wants to use. The server uses the name to
set the
With the mysql client, to use a character set different from the default, you could explicitly execute
Example: Suppose that
If you want the server to perform no conversion of result sets or error messages, set
For more information about character sets and error messages, see Section 10.1.6, “Character Set for Error
The character set for statements that arrive from the client. The
session value of this variable is set using the character set
requested by the client when the client connects to the server. (Many
clients support a --default-character-set option to enable this
character set to be specified explicitly.) The global
value of the variable is used to set the session value in cases when
the client-requested value is unknown or not available, or the server
is configured to ignore client requests:
* The client is from a version of MySQL older than MySQL 4.1, and thus
does not request a character set.
* The client requests a character set not known to the server. For
example, a Japanese-enabled client requests sjis when connecting to a
server not configured with sjis support.
* mysqld was started with the --skip-character-set-client-handshake
option, which causes it to ignore client character set configuration.
This reproduces MySQL 4.0 behavior and is useful should you wish to
upgrade the server without upgrading all the clients.
Several character set and collation system variables relate to a client's interaction with the server. Some of these have been mentioned in earlier sections:
-
The server character set and collation are the values of the
character_set_server
andcollation_server
system variables. -
The character set and collation of the default database are
the values of the
character_set_database
andcollation_database
system variables.
A “connection” is what you make when you connect to the server. The client sends SQL statements, such as queries, over the connection to the server. The server sends responses, such as result sets or error messages, over the connection back to the client. This leads to several questions about character set and collation handling for client connections, each of which can be answered in terms of system variables:
-
What character set is the statement in when it leaves the
client?
The server takes thecharacter_set_client
system variable to be the character set in which statements are sent by the client. -
What character set should the server translate a statement
to after receiving it?
For this, the server uses thecharacter_set_connection
andcollation_connection
system variables. It converts statements sent by the client fromcharacter_set_client
tocharacter_set_connection
(except for string literals that have an introducer such as_latin1
or_utf8
).collation_connection
is important for comparisons of literal strings. For comparisons of strings with column values,collation_connection
does not matter because columns have their own collation, which has a higher collation precedence. -
What character set should the server translate to before
shipping result sets back to the client?
Thecharacter_set_results
system variable indicates the character set in which the server returns query results to the client. This includes result data such as column values, and result metadata such as column names.
Two statements affect the connection-related character set variables as a group:
-
SET NAMES '
charset_name
' [COLLATE 'collation_name
']
SET NAMES
indicates what character set the client will use to send SQL statements to the server. Thus,SET NAMES 'cp1251'
tells the server, “future incoming messages from this client are in character setcp1251
.” It also specifies the character set that the server should use for sending results back to the client. (For example, it indicates what character set to use for column values if you use aSELECT
statement.)
ASET NAMES '
statement is equivalent to these three statements:charset_name
'
SET character_set_client =
Settingcharset_name
; SET character_set_results =charset_name
; SET character_set_connection =charset_name
;character_set_connection
tocharset_name
also implicitly setscollation_connection
to the default collation forcharset_name
. It is unnecessary to set that collation explicitly. To specify a particular collation, use the optionalCOLLATE
clause:
SET NAMES '
charset_name
' COLLATE 'collation_name
' -
SET CHARACTER SET
charset_name
SET CHARACTER SET
is similar toSET NAMES
but setscharacter_set_connection
andcollation_connection
tocharacter_set_database
andcollation_database
. ASET CHARACTER SET
statement is equivalent to these three statements:charset_name
SET character_set_client =
Settingcharset_name
; SET character_set_results =charset_name
; SET collation_connection = @@collation_database;collation_connection
also implicitly setscharacter_set_connection
to the character set associated with the collation (equivalent to executingSET character_set_connection = @@character_set_database
). It is unnecessary to setcharacter_set_connection
explicitly.
Note
ucs2
cannot be used as a client character
set, which means that it does not work for SET
NAMES
or SET CHARACTER SET
.
mysql
,
mysqladmin
, mysqlcheck
,
mysqlimport
, and mysqlshow
determine the default character set to use as follows:
-
In the absence of other information, the programs use the
compiled-in default character set, usually
latin1
. -
The programs support a
--default-character-set
option, which enables users to specify the character set explicitly to override whatever default the client otherwise determines.
character_set_client
,
character_set_results
, and
character_set_connection
system
variables. In effect, the server performs a SET
NAMES
operation using the character set name.
With the mysql client, to use a character set different from the default, you could explicitly execute
SET NAMES
every time you start up. To
accomplish the same result more easily, add the
--default-character-set
option
setting to your mysql command line or in your
option file. For example, the following option file setting
changes the three connection-related character set variables set
to koi8r
each time you invoke
mysql:
[mysql] default-character-set=koi8rIf you are using the mysql client with auto-reconnect enabled (which is not recommended), it is preferable to use the
charset
command rather
than SET NAMES
. For example:
mysql> charset utf8
Charset changed
The charset
command issues a SET
NAMES
statement, and also changes the default
character set that mysql uses when it
reconnects after the connection has dropped.
Example: Suppose that
column1
is defined as
CHAR(5) CHARACTER SET latin2
. If you do not
say SET NAMES
or SET CHARACTER
SET
, then for SELECT column1 FROM
t
, the server sends back all the values for
column1
using the character set that the
client specified when it connected. On the other hand, if you
say SET NAMES 'latin1'
or SET
CHARACTER SET latin1
before issuing the
SELECT
statement, the server
converts the latin2
values to
latin1
just before sending results back.
Conversion may be lossy if there are characters that are not in
both character sets.
If you want the server to perform no conversion of result sets or error messages, set
character_set_results
to
NULL
or binary
:
SET character_set_results = NULL;To see the values of the character set and collation system variables that apply to your connection, use these statements:
SHOW VARIABLES LIKE 'character_set%'; SHOW VARIABLES LIKE 'collation%';You must also consider the environment within which your MySQL applications execute. See Section 10.1.5, “Configuring the Character Set and Collation for Applications”.
For more information about character sets and error messages, see Section 10.1.6, “Character Set for Error
No comments:
Post a Comment