SQL Server - Relationship between collation and encoding


Does SQL Server enforce or control the encoding of varchar columns in any way? The documentation I've browsed does not make a clear distinction between collation (sort and compare rules) and encoding (byte representation of a given character).

I have a SQL Server instance with Modern_Spanish_CI_AS (including database, tables and columns), and I got the impression that this meant Windows-1252. The database is populated by an application that uses Windows-1252 as well. Recently, a misconfigured application that uses UTF-8 has been writing data for a while and, to my surprise, SQL Server happily accepts the complete Unicode catalogue. Not only that, the other clients I've tried appear to read the data correctly no matter which application the table belongs to.

When I cast to hex:

SELECT foo, CAST(foo AS VARBINARY(MAX)) AS hex ...;

... I see different encodings depending on which application the table belongs to:

  • First app:

    €Á    0x80c1 
  • Second app:

    €Á    0xac20c100 

... but the raw characters display properly.
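The two hex values above can be reproduced outside the database. The following is only a sketch in Python (not something the original post used): it encodes the same two characters with the `cp1252` and `utf-16-le` codecs. The match with `utf-16-le` suggests, but does not prove, that the second app's table uses a Unicode column type.

```python
text = "€Á"

# Windows-1252: one byte per character, matching the first app's value.
legacy_hex = text.encode("cp1252").hex()

# UTF-16 little-endian: two bytes per character for these code points,
# matching the second app's value.
unicode_hex = text.encode("utf-16-le").hex()

print(legacy_hex)   # 80c1
print(unicode_hex)  # ac20c100
```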

How can SQL clients know the source encoding?


Edit: if both apps write to the same table, I find this:

€Á      0x80c1
â‚¬Ã    0xe282acc381
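The five-byte value 0xe282acc381 is what UTF-8 produces for €Á. As a rough illustration (Python again, as an assumption about how a Windows-1252 reader would treat those bytes), decoding them with the wrong codec gives the kind of garbage seen in the second row. The byte 0x81 has no mapping in Python's `cp1252` codec, so the sketch uses `errors="replace"` to avoid an exception.

```python
raw = bytes.fromhex("e282acc381")

# Interpreted as UTF-8, the bytes are the intended two characters.
as_utf8 = raw.decode("utf-8")

# Interpreted as Windows-1252, each byte becomes its own character;
# the unmapped byte 0x81 is replaced with U+FFFD.
as_cp1252 = raw.decode("cp1252", errors="replace")

print(as_utf8)
print(as_cp1252)
```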

This is a guess, but it seems to be supported by testing and assorted documentation browsing. Special binary collations aside, SQL Server considers two types of string data:

  • Legacy (single-byte)
  • Unicode (multi-byte)

Legacy data is expected to be encoded in whatever code page the underlying Windows system is configured to use. Unicode is less of a problem, since the character repertoire is always the same. In either case, it is the driver used by the client that takes care of conversions (if any), and the usual driver configurations include a few options that reflect this fact (e.g. raw, ANSI, UTF-8). For these reasons, SQL Server does not have a setting or directive to choose the character set as other DBMSs have, and you only need to choose the collation in the usual meaning of the term (sort and compare rules).

As for how to distinguish between the two possible encodings, it depends on the column type:

  • char, varchar, text... imply ANSI
  • nchar, nvarchar, ntext... imply Unicode

If you use the incorrect encoding for a given column type, you'll get garbage such as â‚¬Ã.
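The column-type rule can be sketched as a lookup table. Everything named here (`CODEC_BY_TYPE`, `stored_bytes`, `read_back`) is hypothetical and only models the idea in Python. It assumes the legacy code page is Windows-1252, as in the question, and that Unicode columns hold UTF-16 little-endian. It is not how any real driver is implemented.

```python
# Hypothetical mapping from column type to a Python codec name.
# Assumption: the legacy ("ANSI") code page is Windows-1252.
CODEC_BY_TYPE = {
    "char": "cp1252", "varchar": "cp1252", "text": "cp1252",
    "nchar": "utf-16-le", "nvarchar": "utf-16-le", "ntext": "utf-16-le",
}

def stored_bytes(value: str, column_type: str) -> bytes:
    """Encode a string the way the column type implies."""
    return value.encode(CODEC_BY_TYPE[column_type])

def read_back(raw: bytes, column_type: str) -> str:
    """Decode bytes as if they came from a column of the given type."""
    return raw.decode(CODEC_BY_TYPE[column_type], errors="replace")

text = "€Á"

# Matching types on write and read: the text survives.
right = read_back(stored_bytes(text, "nvarchar"), "nvarchar")

# Mismatched types: UTF-16 bytes read as Windows-1252 come back as garbage.
wrong = read_back(stored_bytes(text, "nvarchar"), "varchar")

print(right == text)  # True
print(wrong == text)  # False
```

In this model the encoding is never stored next to the data. The reader only gets it right because the column type tells it which codec to apply.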
