    LII. Multi-Byte String Functions

    Introduction

    In many languages every character can be expressed in a single byte. Other languages need more characters than one byte can encode, so multi-byte character encodings are used to express them. mbstring was developed to handle Japanese characters; however, many mbstring functions can also handle character encodings other than Japanese ones.

    A multi-byte character encoding represents a single character with consecutive bytes. Some character encodings use shift (escape) sequences to start and end multi-byte character strings. A multi-byte character string may therefore be corrupted when it is split or counted, unless a multi-byte-safe method is used. This module provides multi-byte-safe string functions and other utility functions such as conversion functions.
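    The difference between byte-oriented and multi-byte-safe functions can be seen in a small sketch. It assumes PHP 7 or later (for the "\u{...}" escapes) with mbstring loaded; the variable names are illustrative only.

```php
<?php
// "nihongo" (three kanji) written with UTF-8 escapes; each takes 3 bytes.
$text = "\u{65E5}\u{672C}\u{8A9E}";

$byteLength = strlen($text);              // counts bytes
$charLength = mb_strlen($text, 'UTF-8');  // counts characters

// Cutting at a byte offset splits the second character in half.
$byteCut = substr($text, 0, 4);
// Cutting at a character offset keeps whole characters.
$charCut = mb_substr($text, 0, 2, 'UTF-8');

$byteCutValid = mb_check_encoding($byteCut, 'UTF-8');
$charCutValid = mb_check_encoding($charCut, 'UTF-8');

echo "bytes: $byteLength, characters: $charLength", PHP_EOL;
echo 'byte cut valid: ', var_export($byteCutValid, true), PHP_EOL;
echo 'character cut valid: ', var_export($charCutValid, true), PHP_EOL;
```

    The byte-based cut leaves an incomplete sequence at the end of the string, which is the kind of corruption described above.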

    Since PHP is basically designed for ISO-8859-1, some multi-byte character encodings do not work well with PHP. It is therefore important to set mbstring.language to the appropriate language (i.e. "Japanese" for Japanese) and mbstring.internal_encoding to a character encoding that works with PHP.

    PHP4 Character Encoding Requirements

    • Per-byte encoding

    • Single-byte characters in the range 00h-7fh, which is compatible with ASCII

    • Multi-byte characters without 00h-7fh

    The following are examples of internal character encodings that work with PHP and ones that do NOT.

    Character encodings that work with PHP:
    ISO-8859-*, EUC-JP, UTF-8
    
    Character encodings that do NOT work with PHP:
    JIS, SJIS

    A character encoding that does not work with PHP may be converted with mbstring's HTTP input/output conversion feature.

    Note: SJIS should not be used as the internal encoding unless the reader is familiar with parsers/compilers, character encodings, and character encoding issues.

    Note: If you use databases with PHP, it is recommended that you use the same character encoding for both the database and the internal encoding, for ease of use and better performance.

    PostgreSQL supports a client character encoding that differs from the backend character encoding. See the PostgreSQL manual for details.

    Installation

    mbstring is an extension module. You must enable the module with the configure script. Refer to the Install section for details.

    The following configure options are related to the mbstring module.

    • --enable-mbstring=LANG: Enable mbstring functions. This option is required to use mbstring functions.

      As of PHP 4.3.0, the mbstring extension provides enhanced support for Simplified Chinese, Traditional Chinese, Korean, and Russian in addition to Japanese. To enable that feature, supply one of the following values for the LANG parameter: --enable-mbstring=cn for Simplified Chinese support, --enable-mbstring=tw for Traditional Chinese support, --enable-mbstring=kr for Korean support, --enable-mbstring=ru for Russian support, and --enable-mbstring=ja for Japanese support.

      --enable-mbstring=all is a convenient way to enable all the supported languages listed above.

      Note: Japanese language support is also enabled by --enable-mbstring without any options, for the sake of backwards compatibility.

    • --enable-mbstr-enc-trans: Enable HTTP input character encoding conversion using the mbstring conversion engine. If this feature is enabled, the HTTP input character encoding may be converted to mbstring.internal_encoding automatically.

      Note: As of PHP 4.3.0, the option --enable-mbstr-enc-trans is eliminated and replaced with the mbstring.encoding_translation directive. HTTP input character encoding conversion is enabled when this directive is set to On (the default is Off).

    • --enable-mbregex: Enable regular expression functions with multibyte character support.
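    Whether a given PHP build was configured with these options can be checked from a script. The following is a minimal sketch, assuming PHP 7 or later; mbstring_status() is a hypothetical helper name, not part of the extension.

```php
<?php
// Sketch: report whether mbstring (and its optional regex support) is available.
function mbstring_status(): array
{
    return [
        'mbstring' => extension_loaded('mbstring'),
        // mb_ereg() only exists when multibyte regex support was built in.
        'mbregex'  => function_exists('mb_ereg'),
    ];
}

foreach (mbstring_status() as $feature => $enabled) {
    echo $feature, ': ', $enabled ? 'enabled' : 'missing', PHP_EOL;
}
```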

    Runtime Configuration

    The behaviour of these functions is affected by settings in php.ini.

    Table 1. Multi-Byte String configuration options

    Name                             Default   Changeable
    mbstring.language                NULL      PHP_INI_ALL
    mbstring.detect_order            NULL      PHP_INI_ALL
    mbstring.http_input              NULL      PHP_INI_ALL
    mbstring.http_output             NULL      PHP_INI_ALL
    mbstring.internal_encoding       NULL      PHP_INI_ALL
    mbstring.script_encoding         NULL      PHP_INI_ALL
    mbstring.substitute_character    NULL      PHP_INI_ALL
    mbstring.func_overload           "0"       PHP_INI_SYSTEM
    mbstring.encoding_translation    "0"       PHP_INI_ALL

    For further details and definition of the PHP_INI_* constants see ini_set().

    Here is a short explanation of the configuration directives.

    • mbstring.language defines the default language used in mbstring. Note that this option also sets mbstring.internal_encoding, so mbstring.internal_encoding should be placed after mbstring.language in php.ini.

    • mbstring.encoding_translation enables HTTP input character encoding detection and translation into the internal character encoding.

    • mbstring.internal_encoding defines the default internal character encoding.

    • mbstring.http_input defines the default HTTP input character encoding.

    • mbstring.http_output defines the default HTTP output character encoding.

    • mbstring.detect_order defines the default character code detection order. See also mb_detect_order().

    • mbstring.substitute_character defines the character to substitute for invalid character encoding.

    • mbstring.func_overload overloads (replaces) single-byte functions with mbstring functions. mail(), ereg(), etc. are overloaded by mb_send_mail(), mb_ereg(), etc. Possible values are 0, 1, 2, 4 or a combination of them. For example, 7 overloads everything. 0: No overload, 1: Overload mail() function, 2: Overload str*() functions, 4: Overload ereg*() functions.

    Web browsers are supposed to submit a form in the same character encoding as the page that contains it. However, browsers may not do so. See mb_http_input() to detect the character encoding used by browsers.

    If enctype is set to multipart/form-data in HTML forms, mbstring does not convert the character encoding of POST data. The user must convert it in the script if conversion is needed.

    Although browsers are smart enough to detect the character encoding of HTML, it is better to set charset in the HTTP header. Change default_charset according to the character encoding.

    Example 1. php.ini setting example

    ; Set default language
    mbstring.language        = Neutral; Set default language to Neutral(UTF-8) (default)
    mbstring.language        = English; Set default language to English 
    mbstring.language        = Japanese; Set default language to Japanese
    
    ;; Set default internal encoding
    ;; Note: Make sure to use a character encoding that works with PHP
    mbstring.internal_encoding    = UTF-8  ; Set internal encoding to UTF-8
    
    ;; HTTP input encoding translation is enabled.
    mbstring.encoding_translation = On
    
    ;; Set default HTTP input character encoding
    ;; Note: Script cannot change http_input setting.
    mbstring.http_input           = pass    ; No conversion. 
    mbstring.http_input           = auto    ; Set HTTP input to auto
                                    ; "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS"
    mbstring.http_input           = SJIS    ; Set HTTP input to SJIS
    mbstring.http_input           = UTF-8,SJIS,EUC-JP ; Specify order
    
    ;; Set default HTTP output character encoding 
    mbstring.http_output          = pass    ; No conversion
    mbstring.http_output          = UTF-8   ; Set HTTP output encoding to UTF-8
    
    ;; Set default character encoding detection order
    mbstring.detect_order         = auto    ; Set detect order to auto
    mbstring.detect_order         = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order
    
    ;; Set default substitute character
    mbstring.substitute_character = 12307   ; Specify Unicode value
    mbstring.substitute_character = none    ; Do not print character
    mbstring.substitute_character = long    ; Long Example: U+3000,JIS+7E7E
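    Several of these directives have Set/Get functions that can be called from a script instead. The following is a minimal sketch, assuming PHP 7 or later with mbstring loaded; the $settings array is only there to display the results, and the HTTP input setting is left out because a script cannot change it.

```php
<?php
// Runtime counterparts of a few php.ini directives from the example above.
mb_language('Japanese');                 // mbstring.language
mb_internal_encoding('UTF-8');           // mbstring.internal_encoding
mb_detect_order(['ASCII', 'JIS', 'UTF-8', 'EUC-JP', 'SJIS']); // mbstring.detect_order
mb_substitute_character('none');         // mbstring.substitute_character

// Called without arguments, the same functions report the current value.
$settings = [
    'language'             => mb_language(),
    'internal_encoding'    => mb_internal_encoding(),
    'detect_order'         => mb_detect_order(),
    'substitute_character' => mb_substitute_character(),
];
print_r($settings);
```

    The language is set first because, as noted for php.ini, the internal encoding should be configured after the language.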

    Example 2. php.ini setting for EUC-JP users

    ;; Disable Output Buffering
    output_buffering      = Off
    
    ;; Set HTTP header charset
    default_charset       = EUC-JP    
    
    ;; Set default language to Japanese
    mbstring.language = Japanese
    
    ;; HTTP input encoding translation is enabled.
    mbstring.encoding_translation = On
    
    ;; Set HTTP input encoding conversion to auto
    mbstring.http_input   = auto 
    
    ;; Convert HTTP output to EUC-JP
    mbstring.http_output  = EUC-JP    
    
    ;; Set internal encoding to EUC-JP
    mbstring.internal_encoding = EUC-JP    
    
    ;; Do not print invalid characters
    mbstring.substitute_character = none

    Example 3. php.ini setting for SJIS users

    ;; Enable Output Buffering
    output_buffering     = On
    
    ;; Set mb_output_handler to enable output conversion
    output_handler       = mb_output_handler
    
    ;; Set HTTP header charset
    default_charset      = Shift_JIS
    
    ;; Set default language to Japanese
    mbstring.language = Japanese
    
    ;; Set HTTP input encoding conversion to auto
    mbstring.http_input  = auto 
    
    ;; Convert to SJIS
    mbstring.http_output = SJIS    
    
    ;; Set internal encoding to EUC-JP
    mbstring.internal_encoding = EUC-JP    
    
    ;; Do not print invalid characters
    mbstring.substitute_character = none

    Resource Types

    This extension has no resource types defined.

    Predefined Constants

    The constants below are defined by this extension, and will only be available when the extension has either been compiled into PHP or dynamically loaded at runtime.

    MB_OVERLOAD_MAIL (integer)

    MB_OVERLOAD_STRING (integer)

    MB_OVERLOAD_REGEX (integer)

    HTTP Input and Output

    HTTP input/output character encoding conversion may also convert binary data. Users are expected to control character encoding conversion themselves if binary data is used for HTTP input/output.

    Note: For PHP 4.3.2 or earlier, if enctype for an HTML form is set to multipart/form-data, mbstring does not convert the character encoding of POST data. In that case, strings need to be converted to the internal character encoding in the script.

    Note: Since PHP 4.3.3, if enctype for an HTML form is set to multipart/form-data and mbstring.encoding_translation is set to On in php.ini, POST variables and uploaded file names will be converted to the internal character encoding. However, characters specified in the 'name' attribute of HTML form fields will not be converted.

    • HTTP Input

      There is no way to control HTTP input character conversion from a PHP script. To disable HTTP input character conversion, it has to be done in php.ini.

      Example 4. Disable HTTP input conversion in php.ini

      ;; Disable HTTP Input conversion
      mbstring.http_input = pass
      ;; Disable HTTP Input conversion (PHP 4.3.0 or higher)
      mbstring.encoding_translation = Off

      When using PHP as an Apache module, it is possible to override PHP ini settings per Virtual Host in httpd.conf or per directory with .htaccess. Refer to the Configuration section and the Apache Manual for details.

    • HTTP Output

      There are several ways to enable output character encoding conversion. One is to use php.ini; another is to use ob_start() with mb_output_handler() as the ob_start callback function.

      Note: For PHP3-i18n users, mbstring's output conversion differs from PHP3-i18n. Character encoding is converted using the output buffer.

    Example 5. php.ini setting example

    ;; Enable output character encoding conversion for all PHP pages
    
    ;; Enable Output Buffering
    output_buffering    = On
    
    ;; Set mb_output_handler to enable output conversion
    output_handler      = mb_output_handler

    Example 6. Script example

    <?php
    
    // Enable output character encoding conversion only for this page
    
    // Set HTTP output character encoding to SJIS
    mb_http_output('SJIS');
    
    // Start buffering and specify "mb_output_handler" as
    // callback function
    ob_start('mb_output_handler');
    
    ?>

    Supported Character Encodings

    Currently, the following character encodings are supported by the mbstring module. A character encoding may be specified for the encoding parameter of mbstring functions.

    The following character encodings are supported in this PHP extension:

    UCS-4, UCS-4BE, UCS-4LE, UCS-2, UCS-2BE, UCS-2LE, UTF-32, UTF-32BE, UTF-32LE, UTF-16, UTF-16BE, UTF-16LE, UTF-8, UTF-7, ASCII, EUC-JP, SJIS, eucJP-win, SJIS-win, ISO-2022-JP, JIS, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10, ISO-8859-13, ISO-8859-14, ISO-8859-15, byte2be, byte2le, byte4be, byte4le, BASE64, 7bit, 8bit and UTF7-IMAP.

    As of PHP 4.3.0, support for the following character encodings is added experimentally: EUC-CN, CP936, HZ, EUC-TW, CP950, BIG-5, EUC-KR, UHC (CP949), ISO-2022-KR, Windows-1251 (CP1251), Windows-1252 (CP1252), CP866, KOI8-R.

    A php.ini entry that accepts an encoding name also accepts "auto" and "pass". mbstring functions that accept an encoding name also accept "auto".

    If "pass" is set, no character encoding conversion is performed.

    If "auto" is set, it is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS".

    See also mb_detect_order()

    Note: "Supported character encoding" does not mean that it works as an internal character encoding.
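    The encoding names above are what conversion functions such as mb_convert_encoding() take as arguments. The following is a minimal sketch, assuming PHP 7 or later with mbstring loaded; the sample text and variable names are illustrative.

```php
<?php
// Two kanji ("nihon") as a UTF-8 string.
$utf8 = "\u{65E5}\u{672C}";

// mb_convert_encoding(string, to_encoding, from_encoding)
$eucjp = mb_convert_encoding($utf8, 'EUC-JP', 'UTF-8');
$sjis  = mb_convert_encoding($utf8, 'SJIS', 'UTF-8');

// Converting back should restore the original bytes.
$roundTrip = mb_convert_encoding($sjis, 'UTF-8', 'SJIS');

echo 'UTF-8:  ', bin2hex($utf8), PHP_EOL;
echo 'EUC-JP: ', bin2hex($eucjp), PHP_EOL;
echo 'SJIS:   ', bin2hex($sjis), PHP_EOL;
echo 'round trip ok: ', var_export($roundTrip === $utf8, true), PHP_EOL;
```

    The same two characters take three bytes each in UTF-8 but two bytes each in EUC-JP and SJIS, which is why byte counts cannot be compared across encodings.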

    Overloading PHP string functions with multibyte string functions

    Because almost all PHP applications are written for languages that use single-byte character encodings, handling multibyte strings, including Japanese, is difficult. Almost all PHP string functions, such as substr(), do not support multibyte strings.

    The multibyte extension (mbstring) provides counterparts of some PHP string functions with multibyte support (e.g. mb_substr() for substr()).

    The multibyte extension (mbstring) also supports 'function overloading' to add multibyte string functionality without code modification. With function overloading, some PHP string functions are overloaded by their multibyte counterparts. For example, mb_substr() is called instead of substr() if function overloading is enabled. Function overloading makes it easy to port an application that supports only single-byte encodings to a multibyte application.

    mbstring.func_overload in php.ini should be set to a positive value to use function overloading. The value specifies the categories of functions to overload: set 1 to enable mail function overloading, 2 for string functions, and 4 for regular expression functions. For example, if it is set to 7, the mail, string, and regex functions are all overloaded. The list of overloaded functions is shown below.
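    Because the categories are combined as bit flags, a given mbstring.func_overload value can be decoded with bitwise AND. The following is a minimal sketch, assuming PHP 7 or later; the OVERLOAD_* constants and overload_categories() are hypothetical names chosen for illustration, not part of the extension.

```php
<?php
// Hypothetical names for the three category bits described above.
const OVERLOAD_MAIL   = 1; // mail()
const OVERLOAD_STRING = 2; // str*() functions
const OVERLOAD_REGEX  = 4; // ereg*() functions

// Return the categories enabled by a given mbstring.func_overload value.
function overload_categories(int $value): array
{
    $names = [
        OVERLOAD_MAIL   => 'mail',
        OVERLOAD_STRING => 'string',
        OVERLOAD_REGEX  => 'regex',
    ];
    $enabled = [];
    foreach ($names as $bit => $name) {
        if (($value & $bit) !== 0) {
            $enabled[] = $name;
        }
    }
    return $enabled;
}

foreach ([0, 2, 5, 7] as $value) {
    echo $value, ': ', implode(', ', overload_categories($value)) ?: 'no overload', PHP_EOL;
}
```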

    Basics of Japanese multi-byte characters

    Most Japanese characters need more than 1 byte per character. In addition, several character encoding schemes are used in a Japanese environment: EUC-JP, Shift_JIS (SJIS) and ISO-2022-JP (JIS). As Unicode becomes popular, UTF-8 is used as well. To develop Web applications for a Japanese environment, it is important to use the right character encoding for the task at hand, whether that is HTTP input/output, an RDBMS, or e-mail.

    • Storage for a character can be up to six bytes

    • A multi-byte character is usually twice the width of a single-byte character. Wider characters are called "zen-kaku", meaning full width; narrower characters are called "han-kaku", meaning half width. "zen-kaku" characters are usually fixed width.

    • Some character encodings define shift (escape) sequences for entering and exiting multi-byte character strings.

    • ISO-2022-JP must be used for SMTP/NNTP.

    • "i-mode" web sites are supposed to use SJIS.
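    The "zen-kaku" and "han-kaku" widths described above are what mb_strwidth() measures, and mb_convert_kana() converts between the two forms. The following is a minimal sketch, assuming PHP 7 or later with mbstring loaded; the sample strings are illustrative.

```php
<?php
$kanji = "\u{65E5}\u{672C}\u{8A9E}"; // three zen-kaku characters
$ascii = 'abc';                       // three han-kaku characters

// mb_strwidth() counts han-kaku as 1 and zen-kaku as 2.
$kanjiWidth = mb_strwidth($kanji, 'UTF-8');
$asciiWidth = mb_strwidth($ascii, 'UTF-8');

// Option "A" converts han-kaku alphanumerics to their zen-kaku forms.
$wide = mb_convert_kana('ABC', 'A', 'UTF-8');

echo "kanji width: $kanjiWidth, ascii width: $asciiWidth", PHP_EOL;
echo 'zen-kaku ABC width: ', mb_strwidth($wide, 'UTF-8'), PHP_EOL;
```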

    References

    Multi-byte character encodings and their related issues are very complex. It is impossible to cover them in sufficient detail here. Please refer to the following URLs and other resources for further reading.

    • Unicode/UTF/UCS/etc

      http://www.unicode.org/

    • Japanese/Korean/Chinese character information

      ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf

    Table of Contents
    mb_convert_case -- Perform case folding on a string
    mb_convert_encoding -- Convert character encoding
    mb_convert_kana -- Convert "kana" one from another ("zen-kaku", "han-kaku" and more)
    mb_convert_variables -- Convert character code in variable(s)
    mb_decode_mimeheader -- Decode string in MIME header field
    mb_decode_numericentity -- Decode HTML numeric string reference to character
    mb_detect_encoding -- Detect character encoding
    mb_detect_order -- Set/Get character encoding detection order
    mb_encode_mimeheader -- Encode string for MIME header
    mb_encode_numericentity -- Encode character to HTML numeric string reference
    mb_ereg_match -- Regular expression match for multibyte string
    mb_ereg_replace -- Replace regular expression with multibyte support
    mb_ereg_search_getpos -- Returns start point for next regular expression match
    mb_ereg_search_getregs -- Retrieve the result from the last multibyte regular expression match
    mb_ereg_search_init -- Setup string and regular expression for multibyte regular expression match
    mb_ereg_search_pos -- Return position and length of matched part of multibyte regular expression for predefined multibyte string
    mb_ereg_search_regs -- Returns the matched part of multibyte regular expression
    mb_ereg_search_setpos -- Set start point of next regular expression match
    mb_ereg_search -- Multibyte regular expression match for predefined multibyte string
    mb_ereg -- Regular expression match with multibyte support
    mb_eregi_replace -- Replace regular expression with multibyte support ignoring case
    mb_eregi -- Regular expression match ignoring case with multibyte support
    mb_get_info -- Get internal settings of mbstring
    mb_http_input -- Detect HTTP input character encoding
    mb_http_output -- Set/Get HTTP output character encoding
    mb_internal_encoding -- Set/Get internal character encoding
    mb_language -- Set/Get current language
    mb_output_handler -- Callback function converts character encoding in output buffer
    mb_parse_str -- Parse GET/POST/COOKIE data and set global variable
    mb_preferred_mime_name -- Get MIME charset string
    mb_regex_encoding -- Returns current encoding for multibyte regex as string
    mb_regex_set_options -- Set/Get the default options for mbregex functions
    mb_send_mail -- Send encoded mail.
    mb_split -- Split multibyte string using regular expression
    mb_strcut -- Get part of string
    mb_strimwidth -- Get truncated string with specified width
    mb_strlen -- Get string length
    mb_strpos -- Find position of first occurrence of string in a string
    mb_strrpos -- Find position of last occurrence of a string in a string
    mb_strtolower -- Make a string lowercase
    mb_strtoupper -- Make a string uppercase
    mb_strwidth -- Return width of string
    mb_substitute_character -- Set/Get substitution character
    mb_substr_count -- Count the number of substring occurrences
    mb_substr -- Get part of string
     
       


