mb_strcut
(PHP 4 >= 4.0.6, PHP 5, PHP 7, PHP 8)
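mb_strcut — Get part of string
Description
mb_strcut() extracts a substring from a string, similarly to mb_substr(), but it operates on bytes rather than characters. If the cut position falls between two bytes of a multibyte character, the cut is performed starting from the first byte of that character. This is also what distinguishes it from substr(), which simply cuts the string between bytes and can therefore produce a malformed byte sequence.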
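As a minimal sketch of the difference described above, the following compares the three functions on a short UTF-8 string. The sample string and the variable names are illustrative only, and the bytes are written as escapes so that the example does not depend on the encoding of the source file.

```php
<?php
// "a", then "ñ" (two bytes in UTF-8: C3 B1), then "b": 4 bytes, 3 characters.
$s = "a\xC3\xB1b";

// Byte-based, but the end offset is moved back to a character boundary,
// so the half-covered "ñ" is dropped.
$cut = mb_strcut($s, 0, 2, "UTF-8");

// Character-based: the first two characters.
$sub = mb_substr($s, 0, 2, "UTF-8");

// Plain byte cut: ends in the middle of "ñ".
$raw = substr($s, 0, 2);

echo bin2hex($cut), "\n"; // 61
echo bin2hex($sub), "\n"; // 61c3b1
echo bin2hex($raw), "\n"; // 61c3

var_dump(mb_check_encoding($cut, "UTF-8")); // bool(true)
var_dump(mb_check_encoding($raw, "UTF-8")); // bool(false)
```

Only the substr() result contains a dangling lead byte; the mb_strcut() result is shorter than requested but remains valid UTF-8.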
Parameters
string
-
The string being cut.
start
-
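If start is non-negative, the returned string starts at byte position start in string, counting from zero. For instance, in the string 'abcdef', the character at byte position 0 is 'a', the character at byte position 2 is 'c', and so forth. If start is negative, the returned string starts start bytes from the end of string. However, if the magnitude of a negative start is greater than the length of the string, the returned portion starts from the beginning of string.
length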
-
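Length in bytes. If omitted or NULL is passed, all bytes up to the end of the string are extracted. If length is negative, the returned string ends length bytes from the end of string. However, if the magnitude of a negative length is greater than the number of bytes after the start position, an empty string is returned.
encoding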
-
The encoding parameter is the character encoding. If it is omitted or null, the internal character encoding is used.
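The rules for start and length can be sketched with the 'abcdef' string used above. This is only an illustration; because the string is pure ASCII, every byte is also a character, so the results do not depend on the encoding, which is left at its default here.

```php
<?php
$s = 'abcdef';

// Non-negative start: counted from the beginning, starting at 0.
$fromTwo = mb_strcut($s, 2);        // "cdef"

// Negative start: counted from the end of the string.
$lastTwo = mb_strcut($s, -2);       // "ef"

// Negative start larger than the string: begins at the start of the string.
$tooFarBack = mb_strcut($s, -10);   // "abcdef"

// Positive length: number of bytes to extract.
$threeBytes = mb_strcut($s, 1, 3);  // "bcd"

// Negative length: stop that many bytes before the end.
$dropTail = mb_strcut($s, 1, -2);   // "bcd"

// Negative length larger than what remains after start: empty string.
$nothing = mb_strcut($s, 4, -5);    // ""

var_dump($fromTwo, $lastTwo, $tooFarBack, $threeBytes, $dropTail, $nothing);
```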
Return Values
mb_strcut() returns the portion of string specified by the start and length parameters.
Changelog
Version | Description |
---|---|
8.0.0 | encoding is now nullable. |
User Contributed Notes 4 notes
Here is an example with UTF8 characters, to see how the start and length arguments are working:
// utf8_encode() assumes ISO-8859-1 input and is deprecated as of PHP 8.2.
$str_utf8 = utf8_encode("Déjà_vu");
$str_utf8_0 = mb_strcut($str_utf8, 0, 4, "UTF-8"); // Déj
$str_utf8_1 = mb_strcut($str_utf8, 1, 4, "UTF-8"); // éj
$str_utf8_2 = mb_strcut($str_utf8, 2, 4, "UTF-8"); // éj
$str_utf8_3 = mb_strcut($str_utf8, 3, 4, "UTF-8"); // jà_
$str_utf8_4 = mb_strcut($str_utf8, 4, 4, "UTF-8"); // à_v
The string includes two special characters, "é" and "à", each encoded internally with two bytes.
Note that a multibyte character is removed rather than kept in half at the end of the output.
Note also that the result is the same for a cut 1,4 and a cut 2,4 with this string.
What the manual and the first commenter are trying to say is that mb_strcut uses byte offsets, as opposed to mb_substr which uses character offsets.
Both mb_strcut and mb_substr appear to treat negative and out-of-range offsets and lengths in basically the same way as substr. An exception is that if start is too large, an empty string will be returned rather than FALSE. Testing indicates that mb_strcut first works out start and end byte offsets, then moves each offset left to the nearest character boundary.
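One practical consequence of byte offsets is that mb_strcut can trim a string to a byte budget (for example, a fixed-size storage field) without splitting a character. The sketch below assumes UTF-8 input; truncate_to_bytes() is a hypothetical helper written for this illustration, not part of mbstring.

```php
<?php
// Hypothetical helper (not part of mbstring): keep at most $maxBytes bytes
// of $text while never cutting a multibyte character in half.
function truncate_to_bytes(string $text, int $maxBytes, string $encoding = 'UTF-8'): string
{
    return mb_strcut($text, 0, $maxBytes, $encoding);
}

// "Déjà_vu" in UTF-8: 9 bytes, because "é" and "à" take two bytes each.
$text = "D\xC3\xA9j\xC3\xA0_vu";

// A 5-byte budget would end in the middle of "à", so that character is dropped.
$short = truncate_to_bytes($text, 5);

echo $short, "\n";         // Déj
echo strlen($short), "\n"; // 4
```

The result may be shorter than the budget, but it never exceeds it.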
This was driving me crazy, because mb_strcut() kept returning an empty string. The $length parameter seems to have a max value of 2^31-1 (2147483647).
Works:
<?php
# output: Полуустав
echo mb_strcut('Полуустав', 0, pow(2,31)-1);
?>
Doesn't work:
<?php
# nothing is output
echo mb_strcut('Полуустав', 0, pow(2,31));
?>
My PHP_INT_MAX value is much larger than 2^31-1, so I'm not sure why larger values for $length don't work. :(
<?php
# output: 9223372036854775807
echo PHP_INT_MAX;
?>
Difference between mb_strcut and mb_substr
Example:
mb_strcut('I_ROHA', 1, 2) returns 'I_'. The string is treated as a byte stream.
mb_substr('I_ROHA', 1, 2) returns 'ROHA'. The string is treated as a character stream.
# 'I_', 'RO', and 'HA' each stand for one multibyte character.
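The same comparison can be sketched with real two-byte characters. The Cyrillic sample below is an illustrative stand-in for the 'I_', 'RO', and 'HA' placeholders and is written as byte escapes so that it does not depend on the encoding of the source file.

```php
<?php
// "Пол" in UTF-8: three characters, two bytes each (D0 9F, D0 BE, D0 BB).
$s = "\xD0\x9F\xD0\xBE\xD0\xBB";

// Byte offset 1 is inside the first character, so the start moves back to 0;
// two bytes from there cover exactly the first character.
$cut = mb_strcut($s, 1, 2, 'UTF-8');

// Character offset 1, two characters: the second and third characters.
$sub = mb_substr($s, 1, 2, 'UTF-8');

echo bin2hex($cut), "\n"; // d09f
echo bin2hex($sub), "\n"; // d0bed0bb
```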