SQL$HELP72.HLB  —  Supported Character Sets
    Oracle Rdb supports multiple character sets and lets you use more
    than one character set in a database. The supported character
    sets are:

    o  ARABIC

    o  BIG5

    o  DEC_HANYU

    o  DEC_HANZI

    o  DEC_KANJI

    o  DEC_KOREAN

    o  DEC_MCS

    o  DEC_SICGCC

    o  DEVANAGARI

    o  DOS_LATIN1

    o  DOS_LATINUS

    o  HANYU

    o  HANZI

    o  HEX

    o  ISOLATIN1

    o  ISOLATIN9

    o  ISOLATINARABIC

    o  ISOLATINCYRILLIC

    o  ISOLATINGREEK

    o  ISOLATINHEBREW

    o  KANJI

    o  KATAKANA

    o  KOREAN

    o  SHIFT_JIS

    o  TACTIS

    o  WIN_ARABIC

    o  WIN_CYRILLIC

    o  WIN_GREEK

    o  WIN_HEBREW

    o  WIN_LATIN1

    The various ways characters can be coded are:

    o  Single-octet

       A single-octet character set is entirely represented in one
       octet. ASCII is an example of a single-octet character set.
       Each ASCII character is represented in one octet.

    o  Multi-octet

       A multi-octet character set is, in general, entirely
       represented in one or more octets. Some character sets are
       fixed multi-octet character sets and some are mixed multi-
       octet characters.

       -  Fixed multi-octet

          A fixed multi-octet character set is represented by two
          or more fixed number of octets. Kanji is an example of a
          fixed multi-octet character set. Each Kanji character is
          represented in two octets.

       -  Mixed multi-octet

          A mixed multi-octet character set is represented by one or
          more mixed number of octets that allow the use of ASCII and
          a fixed multi-octet character set in the same string. DEC_
          KANJI is an example of a mixed multi-octet character set.
          The ASCII characters are represented in one octet, and the
          Kanji characters are represented in two octets.

1  –  Automatic Translation

    During operations on text data such as assignments of a literal
    to a text column or the comparison of two string variables,
    Oracle Rdb carries out character set compatibility checks to
    ensure that the operation is viable.

    Without automatic translation being enabled this checking is
    quite restrictive in that in most cases the two text objects must
    have identical character set before the operation is allowed.

    The automatic translation feature allows you to choose whether
    the character set checking should be restrictive or whether
    Rdb should attempt a character set translation, similar to
    that provided by the TRANSLATE function prior to assignments
    or comparisons.

    With automatic translation enabled you may easily carry out
    operations that previously required additional translations steps
    to be carried out explicitly.

    1. Carry out comparisons between columns that contain data
       encoded in different character sets that have common character
       subsets, for example, DEC_MCS and DEC_KANJI have ASCII in
       common.

    2. Use the same SQL code to access database data irrespective of
       the client's environment. For example, a user on a Japanese
       PC accessing a DEC_MCS column would have to add TRANSLATE
       statements to the SQL commands to convert the DEC_MCS to
       SHIFT_JIS before they could display it on their screen. With
       automatic translation enabled and a Display Character set
       specified, this would not be required.

    3. Enter data from a native interface without explicit
       translations. For example, users using SHIFT_JIS on a
       Japanenese PC may access and insert data into a DE_KANJI
       column in the database without explicit translations
       statements.

    You may enabled automatic translation by:

    1. Using a SET AUTOMATIC TRANSLATION statement

    2. Defining the SQL$AUTOMATIC_TRANSLATION logical name

    SQL$AUTOMATIC_TRANSLATION

    The logical name SQL$AUTOMATIC_TRANSLATION allows SQL users to
    specify that automatic translations should be enabled by default.

    The logical SQL$AUTOMATIC_TRANSLATION may be placed in any
    logical name table accessible to the client SQL process.

    If the logical name is set to either the string 'TRUE' or the
    value 'T' prior to invoking SQL, then automatic translation will
    be enabled by default, any other value will disable automatic
    translation within SQL.

2  –  Character Set HEX

    The character set HEX is comprised of two octet hexadecimal
    characters '00' through 'FF' and has the characteristic that
    the contents of data objects with this character set will not
    be automatically translated to the display character set when
    automatic translation has been enabled.

    It may be used in conjunction with the CAST and TRANSLATE
    functions to obtain the hexadecimal equivalence of text objects.

    Translation to the HEX character set will translate source data
    octet by octet into hexadecimal notation.

    Translation from the HEX character set will translate from
    hexadecimal notation to the destination character set.

    For example:

    SQL> show character sets
    Default character set is DOS_LATINUS
    National character set is DOS_LATINUS
    Identifier character set is DOS_LATINUS
    Literal character set is DOS_LATINUS
    Display character set is DOS_LATINUS

    Alias RDB$DBHANDLE:
     Identifier character set is DEC_MCS
     Default character set is DEC_MCS
     National character set is DEC_MCS
    SQL> show automatic translation
    Automatic translation: ON
    SQL> create tab latin (f1 char(4) char set win_latin1,
    cont> f2 char(4) char set dos_latinus);
    SQL> insert into latin value ('AÉÖ','AÉÖ');
    1 row inserted
    SQL> select f1, cast(f1 as char(8) char set hex),
    cont> f2, cast(f2 as char(8) char set hex) from latin;
     F1     F2
     AÉÖ 41C9D620   AÉÖ   41909920
    1 row selected
    SQL> select cast (_hex'9099' as char(2) ) from rdb$database;

    1 row selected
    SQL> select translate (_hex'9099' using rdb$dos_latinus )
    Cont> from rdb$database;

    1 row selected

    The previous example also shows automatic translations between
    the literals character set DOS_LATINUS and the field F2
    containing WIN_LATIN1, and the subsequent automatic translation
    from the F2 field back to the display character set.

    The hexadecimal display of the field contents shows that the
    actual data stored in the database is different for field f1
    and f2 even though the input literals and the output displayed
    appears identical.

3  –  Default Character Sets

    The default character set is the character set that SQL uses for
    the following elements:

    o  Database columns with a character data type that does not
       explicitly specify a character set

    o  Parameters that are not qualified by a character set

    You can specify the default character set at the session and
    database level. See the Oracle Rdb Introduction to SQL and Oracle
    Rdb Guide to Database Design and Definition for more detail about
    session and database character sets.

    You can specify the database default character set only when
    you create the database. You cannot change the database default
    character set after you have created the database.

    SQL uses DEC_MCS as the default character set, unless you have
    set the dialect to MIA or specified a default character set
    at the session level. You can override any default character
    set by specifying another default character set when creating a
    database.

    To specify the default character set, use one of the character
    set names listed in Supported Character Sets.

    The default character set does not affect the setting of the
    currency sign.

    When you compile SQL programs (either SQL module language or
    precompiled SQL), SQL uses the following to derive the default
    character set:

    o  The DEFAULT CHARACTER SET clause in the DECLARE ALIAS
       statement specifies the default character set of the alias
       at compile time. At run time, SQL uses the default character
       set of the attached database. At run time, you must ensure
       that the database default character set is identical to the
       default character set specified in the DECLARE ALIAS clause.

    o  The DEFAULT CHARACTER SET clause of the SQL module header or
       the DECLARE MODULE statement specifies the character set for
       parameters that are not qualified by a character set.

    o  In dynamic SQL, the SET DEFAULT CHARACTER SET statement
       specifies, at run time, the character set for parameters that
       are not qualified by a character set.

    o  The RDB$CHARACTER_SET logical name. However, the logical name
       is deprecated and will not be supported in a future release.

4  –  Display Character Set

    The display character set is the character set SQL uses for
    determining the character set that text will automatically be
    translated to before display in interactive SQL or for text being
    returned by SQL to a user program.

    You can specify the display character set only for a session
    or a module by using the SET DISPLAY CHARACTER SET statement or
    the DISPLAY CHARACTER SET clause of the SQL module header, the
    DECLARE MODULE statement, or the DECLARE ALIAS statement.

    The choice of display character set is limited to those character
    sets that include ASCII characters. Identifier Character Set
    identifies the subset of character sets that you can use to
    specify the display character set.

5  –  Identifier Character Set

    The identifier character set is the character set SQL uses for
    database object names, such as table names and column names.
    You can specify the identifier character set at the session and
    database level. The choice of identifier character set is limited
    to those character sets that include ASCII characters. This is
    necessary so that the object names for the Oracle Rdb system
    metadata, which is in ASCII, can be stored.

    You can specify the identifier character set for the database
    only when you create the database. You cannot alter the
    identifier character set of a database after creation.

    Following is a list of the character sets used for the identifier
    character set:

    o  ASCII

    o  AL24UTFFSS

    o  DEC_MCS

    o  DOS_LATIN1

    o  DOS_LATINUS

    o  DEVANAGARI

    o  DEC_SICGCC

    o  DEC_HANYU

    o  DEC_HANZI

    o  GB18030

    o  ISOLATINARABIC

    o  ISOLATINCYRILLIC

    o  ISOLATIN1

    o  ISOLATIN9

    o  ISOLATINGREEK

    o  ISOLATINHEBREW

    o  DEC_KANJI

    o  KATAKANA

    o  DEC_KOREAN

    o  SHIFT_JIS

    o  UTF8

    o  UNSPECIFIED

    o  TACTIS

    o  WIN_ARABIC

    o  WIN_GREEK

    o  WIN_CYRILLIC

    o  WIN_HEBREW

    When you compile SQL programs (either SQL module language or
    precompiled SQL), SQL uses the following to derive the identifier
    character set:

    o  The IDENTIFIER CHARACTER SET clause of the SQL module header
       or the DECLARE MODULE statement specifies the character set
       for parameters that are not qualified by a character set.

    o  In dynamic SQL, the SET IDENTIFIER CHARACTER SET statement
       specifies, at run time, the character set for parameters that
       are not qualified by a character set.

    o  The RDB$CHARACTER_SET logical name. However, the logical name
       is deprecated and will not be supported in a future release.

    SQL uses DEC_MCS as the identifier character set, unless you have
    set the dialect to MIA or specified an identifier character set
    at the session level. You can override any identifier character
    set by specifying another identifier character set when creating
    a database.

6  –  Literal Character Sets

    The literal character set is the character set SQL uses for
    unqualified character string literals.

    You can specify the literal character set only for a session
    or a module by using the SET LITERAL CHARACTER SET statement or
    the LITERAL CHARACTER SET clause of the SQL module header, the
    DECLARE MODULE statement, or the DECLARE ALIAS statement.

    When inserting data into a column, you must qualify the literal
    with the same character set with which you defined the column.

    For example, suppose that the literal character set of the
    module is DEC_MCS. If the column ENGLISH is defined as data type
    DEC_MCS, SQL returns an error when you execute the following
    statement:

    SQL> INSERT INTO COLOURS
    cont>    (ENGLISH)
    cont> VALUES
    cont>    (_DEC_KANJI'Black');
    %SQL-F-INCCSASS, Incompatible character set assignment between ENGLISH and
    <value expression>
    SQL>

7  –  National Character Set

    The national character set is a shorthand notation that you can
    use for a character set of your choice. SQL uses the national
    character set for the following elements:

    o  For all columns and domains with the data type NCHAR or NCHAR
       VARYING and for the NCHAR data type in a CAST function. For
       information about these data types, see the Data_Types HELP
       topic.

    o  For all parameters in SQL module language with the data type
       NCHAR or NCHAR VARYING.

    o  For all character string literals qualified by the national
       character set; that is, the literal is preceded by the letter
       N and a single quotation mark (for example, N'). For more
       information, see the Literals HELP topic.

    You can specify the national character set at the session and
    database level. See the Oracle Rdb Introduction to SQL and the
    Oracle Rdb Guide to Database Design and Definition for more
    detail about session and database character sets.

    You specify the national character set for a database when you
    create the database. You cannot alter the national character set
    of a database.

    SQL uses DEC_MCS as the national character set, unless you have
    set the dialect to MIA or specified a national character set at
    the session level. You can override any national character set
    by specifying another national character set when creating a
    database.

    When you compile SQL programs (either SQL module language or
    precompiled SQL), SQL uses the following to derive the national
    character set:

    o  The NATIONAL CHARACTER SET clause in the DECLARE ALIAS
       statement specifies the national character set of the alias
       at compile time. It controls the national character set for
       column and domain definitions and the NCHAR and NCHAR VARYING
       data types in a CAST function. At run time, SQL uses the
       national character set of the attached database for these
       elements.

    o  The NATIONAL CHARACTER SET clause of the SQL module header
       and the DECLARE MODULE statement specifies the character set
       for literals qualified by the national character set and for
       parameters defined with the data type NCHAR or NCHAR VARYING.

    o  In dynamic SQL, the SET NATIONAL CHARACTER SET statement
       specifies, at run time, the character set for columns with
       the data type NCHAR and NCHAR VARYING and for character string
       literals qualified by the national character set.

    o  The RDB$CHARACTER_SET logical name. However, the logical name
       is deprecated and will not be supported in a future release.

                                   NOTE

       Although SQL does not require that the national character
       set of the database and the module match, Oracle Rdb
       recommends that you define both with the same character
       set.

8  –  Character Set ISOLATIN9

 Oracle Rdb supports the ISOLATIN9 character set (as described by ISO
 8859-15).
    ISOLATIN9 is similar to ISOLATIN1 except for 8 codepoints.

    The following table compares ISOLATIN9 and ISOLATIN1.

    Table 1 ISOLATIN1/ISOLATIN9 Character Set Differences

               ISO Latin 1                    ISO Latin 9

    Code Unicode                  Unicode
    Pos  Pos                      Pos
    Hex  Hex   Name               Hex  Name

    A4   00A4  currency symbol    20AC euro sign
    A6   00A6  broken bar         0160 latin capital letter s with
                                       caron
    A8   00A8  diaeresis          0161 latin small letter s with
                                       caron
    B4   00B4  acute accent       017D latin capital letter z with
                                       caron
    B8   00B8  cedilla            017E latin small letter z with
                                       caron
    BC   00BC  vulgar fraction    0152 latin capital ligature oe
               one quarter
    BD   00BD  vulgar fraction    0153 latin small ligature oe
               one half
    BE   00BE  vulgar fraction    0178 latin capital letter y with
               three quarters          diaeresis

9  –  Oracle NLS Character Set Names

    Oracle Rdb supports the use of Oracle National Language Support
    (NLS) names as aliases for existing Oracle Rdb character sets,
    as summarized in the following table. You can use NLS alias names
    anywhere a character set name can be used.

    Table 2 Oracle NLS Character Set Names Supported as Aliases

    US7ASCII       ASCII
    WE8DEC         DEC_MCS
    WE8ISO8859P1   ISOLATIN1
    WE8ISO8859P1   ISOLATIN9
    CL8ISO8859P5   ISOLATINCYRILLIC
    AR8ISO8859P6   ISOLATINARABIC
    EL8ISO8859P7   ISOLATINGREEK
    IW8ISO8859P8   ISOLATINHEBREW
    TH8TISASCII    TACTIS
    JA16VMS        DEC_KANJI
    JA16SJIS       SHIFT_JIS
    KO16KSC5601    KOREAN
    ZHS16CGB231280 HANZI
    ZH16BIG5       BIG5
    JA16EUCFIXED   KANJI

10  –  Character Set UNSPECIFIED

    Oracle Rdb supports the use of the UNSPECIFIED character set.
    You can make comparisons and assignments between text objects
    (columns, literals, and so on) that have the UNSPECIFIED
    character set, and any other text object regardless of the
    character set of the other text object.

    The characteristics of the UNSPECIFIED character set are as
    follows:

    o  The character set ID is 32767.

    o  It can be used to specify any session or database character
       set, including the identifier character set.

    o  It is a single-octet character set (fixed).

    o  It applies casing (uppercase and lowercase) only to ASCII
       characters.

    o  It contains ASCII, as follows:

       -  The space character is the ASCII space character (0x20).

       -  The wildcard character is the ASCII underscore (0x5f).

       -  The string wildcard is the ASCII percent (0x25).

11  –  Logical Names for Character Sets

    You can define a logical name for a character set name. Doing
    so allows easy portability of applications across national
    boundaries. You can use this logical name or parameter anywhere
    you use a character set name in SQL. SQL translates the logical
    name or parameter at compile time for precompiled SQL and SQL
    module language, or at run time for dynamic SQL and interactive
    SQL.

    The logical name can begin with any of the following:

    o  RDBVMS$

    o  RDB$

    o  SQL$

       Oracle Rdb recommends that you begin logical names with RDB$.
Close Help