mb_ereg

(PHP 4 >= 4.2.0, PHP 5, PHP 7)

mb_ereg — Regular expression match with multibyte support

Description

int mb_ereg ( string $pattern , string $string [, array &$regs ] )

Executes the regular expression match with multibyte support.

Parameters

pattern

The search pattern.

string

The search string.

regs

If matches are found for parenthesized substrings of pattern and the function is called with the third argument regs, the matches will be stored in the elements of the array regs. If no matches are found, regs is set to an empty array.

$regs[1] will contain the substring which starts at the first left parenthesis; $regs[2] will contain the substring starting at the second, and so on. $regs[0] will contain a copy of the complete string matched.
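
As an illustration of how regs is filled, the following minimal sketch matches a date with three parenthesized groups. The input string and pattern are hypothetical and chosen only for this example.

```php
<?php
// Hypothetical input string, chosen only for illustration.
$subject = 'Released on 2017-03-11.';

// Three parenthesized groups: year, month, day.
$pattern = '([0-9]{4})-([0-9]{2})-([0-9]{2})';

// Compare strictly against false so that any successful return value counts as a match.
if (mb_ereg($pattern, $subject, $regs) !== false) {
    echo 'Whole match: ', $regs[0], "\n"; // the complete matched string
    echo 'Year: ', $regs[1], "\n";        // first group
    echo 'Month: ', $regs[2], "\n";       // second group
    echo 'Day: ', $regs[3], "\n";         // third group
}
```

Here $regs[0] holds the whole date, and $regs[1] to $regs[3] hold the groups in the order of their opening parentheses.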

Return Values

Returns the byte length of the matched string if a match for pattern was found in string, or FALSE if no matches were found or an error occurred.

If the optional parameter regs was not passed or the length of the matched string is 0, this function returns 1.
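
Because the value returned on success varies (1 or a byte length as described here, and, to the best of our knowledge, TRUE as of PHP 8.0), a cautious way to test for a match is a strict comparison against FALSE. A minimal sketch, with a hypothetical subject string:

```php
<?php
// Hypothetical subject string, chosen only for illustration.
$subject = 'PHP: Hypertext Preprocessor';

$hit  = mb_ereg('Hyper[a-z]+', $subject); // pattern occurs in the subject
$miss = mb_ereg('[0-9]+', $subject);      // no digits in the subject

// Strict comparison: only FALSE means "no match or error".
$hitFound  = ($hit !== false);
$missFound = ($miss !== false);

var_dump($hitFound, $missFound);
```

This check does not depend on the exact success value, so it behaves the same whether or not regs is passed.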

Changelog

Version	Description
7.1.0	mb_ereg() will now set `regs` to an empty array, if nothing matched. Formerly, `regs` was not modified in that case.
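
The 7.1.0 change can be observed with a sketch like the one below, which assumes PHP 7.1.0 or later. The variable contents are hypothetical.

```php
<?php
// Pre-fill the array so that a reset is visible.
$regs = ['stale value'];

// The subject contains no digits, so nothing matches.
$found = mb_ereg('[0-9]+', 'no digits here', $regs);

var_dump($found); // FALSE, because nothing matched
var_dump($regs);  // an empty array on PHP 7.1.0 and later
```

On earlier versions the array would be left as it was, so code that must run on both should check the return value rather than the contents of regs.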

Notes

Note:
The internal encoding or the character encoding specified by mb_regex_encoding() will be used as the character encoding for this function.
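
A minimal sketch of why the regex encoding matters, assuming the script file itself is saved as UTF-8: with the encoding set to UTF-8, "." matches one character rather than one byte.

```php
<?php
// Assumption: this source file is saved in UTF-8.
mb_regex_encoding('UTF-8');

$text = '日本語'; // three characters, nine bytes in UTF-8

// "^.{3}$" requires exactly three characters from start to end.
$matched = mb_ereg('^.{3}$', $text, $regs);

if ($matched !== false) {
    echo 'Characters: ', mb_strlen($regs[0], 'UTF-8'), "\n";
    echo 'Bytes: ', strlen($regs[0]), "\n";
}
```

Setting the encoding explicitly avoids depending on the configured default.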

See Also

mb_regex_encoding() - Set/Get character encoding for multibyte regex
mb_eregi() - Regular expression match ignoring case with multibyte support

User Contributed Notes

Anonymous 11-Mar-2017 10:14


The old link to the Oniguruma regex syntax no longer works; here is a working one:

https://github.com/geoffgarside/oniguruma/blob/master/Syntax.txt

06-May-2015 10:42


mb_ereg() does not seem to support named subpatterns.

preg_match() seems to be a substitute only when the text is UTF-8 encoded.

<?php
$text = 'multi_byte_string';
$pattern = '.*(?<name>string).*';        // "?P" causes "mbregex compile err" in PHP 5.3.5

if(mb_ereg($pattern, $text, $matches)){
    echo '<pre>'.print_r($matches, true).'</pre>';
}else{
    echo 'no match';
}
?>

This code ignores "?<name>" in $pattern and displays the following:

Array
(
    [0] => multi_byte_string
    [1] => string
)

Replacing the $pattern assignment and the mb_ereg() call with

$pattern = '/.*(?<name>string).*/u';
if(preg_match($pattern, $text, $matches)){

displays the following (in UTF-8 encoding):

Array
(
    [0] => multi_byte_string
    [name] => string
    [1] => string
)
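
The behavior reported above was observed on PHP 5.3.5. Whether regs also receives a "name" key may depend on the PHP version; that is an assumption and has not been verified here. A cautious sketch reads the capture by name when it is present and by position otherwise:

```php
<?php
mb_regex_encoding('UTF-8');

$text = 'multi_byte_string';
$pattern = '.*(?<name>string).*';

$captured = null;
if (mb_ereg($pattern, $text, $regs) !== false) {
    // Prefer the named key if this PHP version provides it;
    // otherwise fall back to the numeric index of the same group.
    $captured = isset($regs['name']) ? $regs['name'] : $regs[1];
    echo $captured, "\n";
}
```

Relying on the numeric index keeps the code independent of named-capture support.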

Riikka K 14-Jul-2014 01:23


While hardly mentioned anywhere, it may be useful to note that mb_ereg uses the Oniguruma library internally. The syntax for the default mode (Ruby) is described here:

http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt

pressler at hotmail dot de 20-Nov-2012 07:50


Note that mb_ereg() does not support the \uFFFF Unicode syntax; use \x{FFFF} instead:

<?php
$text = 'Peter is a boy.'; // english
$text = '???? ?? ???.'; // arabic
//$text = '???? ??? ???.'; // hebrew

mb_regex_encoding('UTF-8');

if(mb_ereg('[\x{0600}-\x{06FF}]', $text)) // arabic range
//if(mb_ereg('[\x{0590}-\x{05FF}]', $text)) // hebrew range
{
    echo "Text has some arabic/hebrew characters.";
}
else
{
    echo "Text doesn't have arabic/hebrew characters.";
}
?>
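
A self-contained sketch of the same \x{...} range checks, assuming the script file is saved as UTF-8. The sample strings are hypothetical stand-ins, not the text from the note above.

```php
<?php
// Assumption: this source file is saved in UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical sample strings, one per script.
$samples = [
    'english' => 'Peter is a boy.',
    'arabic'  => 'مرحبا',
    'hebrew'  => 'שלום',
];

$results = [];
foreach ($samples as $label => $text) {
    $results[$label] = [
        // Single-quoted patterns pass \x{...} through to the regex engine unchanged.
        'arabic' => mb_ereg('[\x{0600}-\x{06FF}]', $text) !== false,
        'hebrew' => mb_ereg('[\x{0590}-\x{05FF}]', $text) !== false,
    ];
}

print_r($results);
```

Each sample is tested against both ranges, so the output shows which script, if any, was detected.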

Jon 11-Apr-2009 01:22


Hebrew regex tested on PHP 5, Ubuntu 8.04.

Seems to work fine without the mb_regex_encoding lines (commented out).

Didn't seem to work with \uxxxx (also commented out).

<?php
// Excerpt, presumably from a class method: $this->current_line holds the line being tested.
echo "Line ";
//mb_regex_encoding("ISO-8859-8");
//if(mb_ereg(".*([\u05d0-\u05ea]).*", $this->current_line))
if(mb_ereg(".*([א-ת]).*", $this->current_line))
{
    echo "has";
}
else
{
    echo "doesn't have";
}
echo " Hebrew characters.<br>";
//mb_regex_encoding("UTF-8");
?>