PHP: PCRE is not compiled with PCRE_UTF8 support

Hello,

I installed Sun Web Server 7.0 with the PHP plugin and DokuWiki on Solaris 9 SPARC.

When accessing DokuWiki I get:

Warning: preg_replace() [function.preg-replace]: Compilation failed: this version of PCRE is not compiled with PCRE_UTF8 support at offset 0 in /3beg/doku/htdocs/wiki/inc/utf8.php on line 421

Compiling GNU pcre on the server doesn't change the situation:

# ./configure --enable-utf8 --enable-unicode-properties --disable-shared

Where should I look to resolve this error?

Can someone help me determine whether the root cause is PHP, PCRE, or DokuWiki?

-- Nick

By der_nikia at 2007-11-27 5:46:56
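A freshly built PCRE can be checked for UTF-8 support before looking at PHP itself. The sketch below classifies the report printed by `pcretest -C`, the test program that ships with the PCRE sources. It assumes the report contains either "UTF-8 support" or "No UTF-8 support". The function name `pcre_utf8_status` is made up for this illustration, and the demo feeds it canned text rather than a real pcretest run.

```shell
#!/bin/sh
# Hypothetical helper: classify a `pcretest -C` report read from standard input.
# Assumption: the report contains "UTF-8 support" or "No UTF-8 support".
pcre_utf8_status() {
    report=$(cat)
    case $report in
        *"No UTF-8 support"*) echo "utf8: no" ;;
        *"UTF-8 support"*)    echo "utf8: yes" ;;
        *)                    echo "utf8: unknown" ;;
    esac
}

# Typical use from the PCRE build directory (not run here):
#   ./pcretest -C | pcre_utf8_status

# Self-contained demo with canned text instead of a real pcretest run.
printf 'Compiled with\n  UTF-8 support\n' | pcre_utf8_status
printf 'Compiled with\n  No UTF-8 support\n' | pcre_utf8_status
```

If the new build reports UTF-8 support and PHP still raises the warning, the library PHP loads at runtime is probably not the one that was just built.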
# 1
Can you post the phpinfo() output? If PHP was compiled with the --with-pcre-regex option, it uses the PCRE library from the specified location. Try replacing that library. Otherwise, you would have to recompile PHP without the --with-pcre-regex option.
Seema.Aa at 2007-7-12 15:30:43 > top of Java-index,Web & Directory Servers,Web Servers...
# 2

phpinfo gives:

System SunOS voyager 5.9 Generic_118558-17 sun4u

Build Date Jan 29 2007 23:12:49

Configure Command './configure' '--prefix=/java/re/phppack/5.2.0/nightly/ws/b01-2007-01-29/solaris-sparc/dist/5.2.0/SunOS5.8_OPT.OBJ/php'
'--bindir=/java/re/phppack/5.2.0/nightly/ws/b01-2007-01-29/solaris-sparc/dist/5.2.0/SunOS5.8_OPT.OBJ/php/bin/'
'--libdir=/java/re/phppack/5.2.0/nightly/ws/b01-2007-01-29/solaris-sparc/dist/5.2.0/SunOS5.8_OPT.OBJ/php/lib/'
'--libexecdir=/java/re/phppack/5.2.0/nightly/ws/b01-2007-01-29/solaris-sparc/dist/5.2.0/SunOS5.8_OPT.OBJ/php/libexec/'
'--disable-static' '--enable-shared' '--disable-cli' '--disable-cgi' '--with-pic'
'--with-nsapi=/tmp/webserver7-spi.26277' '--enable-force-cgi-redirect' '--disable-rpath' '--enable-safe-mode' '--enable-ftp' '--enable-sockets' '--enable-memory-limit' '--enable-inline-optimization' '--enable-zlib'
'--enable-soap' '--with-dba' '--enable-sysvmsg' '--enable-sysvsem'
'--enable-sysvshm' '--enable-sqlite-utf8' '--enable-zend-multibyte'
'--enable-bcmath' '--enable-exif' '--enable-magic-quotes' '--enable-wddx'
'--enable-mbstring' '--enable-mbstr-enc-trans' '--enable-mbregex'
'--enable-gd-native-ttf'
'--with-pcre-regex=/h/iws-files/s/b/c/pcre/6.7/SunOS5.8_OPT.OBJ'
'--with-iconv-dir=/h/iws-files/s/b/c/libiconv/1.11/SunOS5.8_OPT.OBJ'
'--with-libxml-dir=/h/iws-files/s/b/c/libxml2/2.6.27/SunOS5.8_OPT.OBJ'
'--with-zlib=/h/iws-files/s/b/c/zlib/1.2.3/SunOS5.8_OPT.OBJ'
'--with-mysql=/h/iws-files/s/b/c/mysql/5.0.27/SunOS5.8_OPT.OBJ'
'--with-mysqli=/h/iws-files/s/b/c/mysql/5.0.27/SunOS5.8_OPT.OBJ/bin/mysql_config'
'--with-pgsql=/h/iws-files/s/b/c/postgresql/8.1.5/SunOS5.8_OPT.OBJ'

...

So PHP has been compiled with

'--with-pcre-regex=/h/iws-files/s/b/c/pcre/6.7/SunOS5.8_OPT.OBJ'

Hmm, what now?

What do you mean by "...then it would use pcre module from the specified location..."?

der_nikia at 2007-7-12 15:30:43
# 3
phpinfo also shows:

pcre

PCRE (Perl Compatible Regular Expressions) Support enabled

PCRE Library Version 5.0 13-Sep-2004
der_nikia at 2007-7-12 15:30:44
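The two phpinfo excerpts disagree: the configure command points at a PCRE 6.7 directory, while the pcre section reports version 5.0. A minimal shell sketch of that comparison follows. The function names are hypothetical, and reading the version out of the path component after "/pcre/" is an assumption that happens to fit this build's directory layout.

```shell
#!/bin/sh
# Hypothetical helpers: compare the PCRE version embedded in the
# --with-pcre-regex path with the version phpinfo reports at runtime.

# Print the path component that follows "/pcre/" (for example "6.7").
configured_pcre_version() {
    printf '%s\n' "$1" | sed -n 's|.*/pcre/\([0-9][0-9.]*\)/.*|\1|p'
}

# Print the leading version number of a "PCRE Library Version" value.
runtime_pcre_version() {
    printf '%s\n' "$1" | sed -n 's|^\([0-9][0-9.]*\).*|\1|p'
}

# Report whether the two versions agree.
compare_pcre_versions() {
    cfg=$(configured_pcre_version "$1")
    run=$(runtime_pcre_version "$2")
    if [ "$cfg" = "$run" ]; then
        echo "match: $cfg"
    else
        echo "mismatch: configured $cfg, loaded $run"
    fi
}

# Values copied from the phpinfo output in this thread.
compare_pcre_versions \
    "--with-pcre-regex=/h/iws-files/s/b/c/pcre/6.7/SunOS5.8_OPT.OBJ" \
    "5.0 13-Sep-2004"
```

A mismatch like this suggests that a different shared library is being loaded at runtime than the one PHP was configured against.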
# 4
The PCRE library is compiled with UTF-8 support, but not with multibyte extensions. If you are seeing this on Windows, that might explain it.
chilidevelopera at 2007-7-12 15:30:44
# 5

Ah, you clearly mentioned that you are running on Solaris SPARC. My mistake.

By the way, I wrote a simple program, ran it with the php binary, and got the expected response.

Regarding the validity of a UTF-8 string when using the /u pattern modifier, some things to be aware of:

1. If the pattern itself contains an invalid UTF-8 character, you get an error (as the PHP manual notes: "UTF-8 validity of the pattern is checked since PHP 4.3.5").

2. When the subject string contains invalid UTF-8 sequences or codepoints, the result is basically a "quiet death" for the preg_* functions: nothing is matched, and there is no indication that the string is invalid UTF-8.

3. PCRE regards five- and six-octet UTF-8 character sequences as valid (both in patterns and in the subject string), but these are not supported in Unicode (see section 5.9, "Character Encoding", of the "Secure Programming for Linux and Unix HOWTO", which can be found at http://www.tldp.org/ and other places).

4. For an example algorithm in PHP that tests the validity of a UTF-8 string (and discards five- and six-octet sequences), head to: http://hsivonen.iki.fi/php-utf8/

The following script should give you an idea of what works and what doesn't:

<?php

// Sample byte strings: valid and invalid UTF-8 sequences of each length.
$examples = array(
    'Valid ASCII' => "a",
    'Valid 2 Octet Sequence' => "\xc3\xb1",
    'Invalid 2 Octet Sequence' => "\xc3\x28",
    'Invalid Sequence Identifier' => "\xa0\xa1",
    'Valid 3 Octet Sequence' => "\xe2\x82\xa1",
    'Invalid 3 Octet Sequence (in 2nd Octet)' => "\xe2\x28\xa1",
    'Invalid 3 Octet Sequence (in 3rd Octet)' => "\xe2\x82\x28",
    'Valid 4 Octet Sequence' => "\xf0\x90\x8c\xbc",
    'Invalid 4 Octet Sequence (in 2nd Octet)' => "\xf0\x28\x8c\xbc",
    'Invalid 4 Octet Sequence (in 3rd Octet)' => "\xf0\x90\x28\xbc",
    // Only the 4th octet is broken here, as the label says.
    'Invalid 4 Octet Sequence (in 4th Octet)' => "\xf0\x90\x8c\x28",
    'Valid 5 Octet Sequence (but not Unicode!)' => "\xf8\xa1\xa1\xa1\xa1",
    'Valid 6 Octet Sequence (but not Unicode!)' => "\xfc\xa1\xa1\xa1\xa1\xa1",
);

// 1. Use each sample as the pattern itself; PHP warns when it is rejected.
echo "++ Invalid UTF-8 in pattern\n";
foreach ($examples as $name => $str) {
    echo "$name\n";
    preg_match("/" . $str . "/u", 'Testing');
}

// 2. Match the five-octet sequence against each sample used as the subject.
echo "++ preg_match() examples\n";
foreach ($examples as $name => $str) {
    preg_match("/\xf8\xa1\xa1\xa1\xa1/u", $str, $ar);
    echo "$name: ";
    if (count($ar) == 0) {
        echo "Matched nothing!\n";
    } else {
        echo "Matched {$ar[0]}\n";
    }
}

// 3. Count how many characters '/./u' finds in each sample.
echo "++ preg_match_all() examples\n";
foreach ($examples as $name => $str) {
    preg_match_all('/./u', $str, $ar);
    echo "$name: ";
    // Guard in case $ar[0] is not set after a failed match.
    $num_utf8_chars = isset($ar[0]) ? count($ar[0]) : 0;
    if ($num_utf8_chars == 0) {
        echo "Matched nothing!\n";
    } else {
        echo "Matched $num_utf8_chars character\n";
    }
}

?>

chilidevelopera at 2007-7-12 15:30:44
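Because an invalid subject string makes the preg_* functions fail silently, it can help to validate input before matching. The sketch below does this outside PHP with the standard `iconv` utility, which is a swapped-in technique rather than anything PCRE does internally. It assumes that `iconv` exits with a non-zero status when it meets an invalid input sequence. The function name `is_valid_utf8` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if standard input is valid UTF-8.
# Assumption: iconv returns a non-zero exit status on invalid input.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# \303\261 is the valid two-octet sequence 0xC3 0xB1 from the PHP script.
if printf '\303\261' | is_valid_utf8; then
    echo "c3 b1: valid"
else
    echo "c3 b1: invalid"
fi

# \303\050 is the invalid two-octet sequence 0xC3 0x28 from the PHP script.
if printf '\303\050' | is_valid_utf8; then
    echo "c3 28: valid"
else
    echo "c3 28: invalid"
fi
```

How a given iconv implementation treats five- and six-octet sequences is not covered here and would need to be checked separately.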
# 6

The PCRE library bundled with the PHP add-on is version 6.7. In your case, however, version 5.0 of the library is being used, and I think that one is compiled without UTF-8 support. The Web Server's lib directory also contains a PCRE library, and it looks like that copy is being picked up.

That means you must have configured the Web Server to use PHP as an NSAPI plugin. If that is the case, try modifying the Web Server instance's bin/startserv script as follows (assuming the Web Server runs in 32-bit mode):

.....

# Add instance-specific information to LD_LIBRARY_PATH for Solaris and Linux

LD_LIBRARY_PATH="<web-server-install-dir>/plugins/php:${SERVER_LIB_PATH}:${SERVER_JVM_LIBPATH}:${LD_LIBRARY_PATH}"; export LD_LIBRARY_PATH

.....

Restart the instance after the modification and check if this resolves the issue.

Seema.Aa at 2007-7-12 15:30:44
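The reason the order of entries in LD_LIBRARY_PATH matters can be shown with a simplified model: directories are searched left to right, and the first one that contains the requested file wins. The sketch below only imitates that lookup. The real runtime linker consults other sources as well. The function name `first_lib_in_path`, the file name `libpcre.so.0`, and the throwaway directories are all assumptions for the demo.

```shell
#!/bin/sh
# Simplified model of a colon-separated library search path:
# print the first directory entry that contains the requested file.
first_lib_in_path() {
    lib=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -f "$dir/$lib" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$lib"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Two throwaway directories stand in for the server's lib directory
# and the PHP plugin directory (both layouts are made up).
demo=$(mktemp -d)
mkdir -p "$demo/lib" "$demo/plugins/php"
: > "$demo/lib/libpcre.so.0"
: > "$demo/plugins/php/libpcre.so.0"

# Server lib directory first: the server's copy is found.
first_lib_in_path libpcre.so.0 "$demo/lib:$demo/plugins/php"
# Plugin directory first, as in the modified startserv: PHP's copy is found.
first_lib_in_path libpcre.so.0 "$demo/plugins/php:$demo/lib"

rm -rf "$demo"
```

On a live Solaris system, `pldd` with the server's process ID should list the shared libraries the process actually loaded, which is a more direct check than this model.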
# 7
Yes, I'm using PHP as an NSAPI plugin.

@Seema.A Modifying the startup script and adding the plugins/php path changed the picture: now it works! Thanks a lot!

-- Nick
der_nikia at 2007-7-12 15:30:44