Static Public Member Functions
static	muteErrorHandler ()

static	unsafeIconv ($in, $out, $text)

static	iconv ($in, $out, $text, $max_chunk_size=8000)

static	cleanUTF8 ($str, $force_php=false)

static	unichr ($code)

static	iconvAvailable ()

static	convertToUTF8 ($str, $config, $context)

static	convertFromUTF8 ($str, $config, $context)

static	convertToASCIIDumbLossless ($str)

static	testIconvTruncateBug ()

static	testEncodingSupportsASCII ($encoding, $bypass=false)

Data Fields
const	ICONV_OK = 0

const	ICONV_TRUNCATES = 1

const	ICONV_UNUSABLE = 2

Private Member Functions
	__construct ()

Detailed Description

A UTF-8 specific character encoder that handles cleaning and transforming.

Note: All functions in this class should be static.

Constructor & Destructor Documentation

◆ __construct()

__construct ( )

private

The constructor throws a fatal error if you attempt to instantiate the class.

Member Function Documentation

◆ cleanUTF8()

static cleanUTF8	(	$str,
		$force_php = `false`
	)

static

Cleans a UTF-8 string for well-formedness and SGML validity.

It will parse according to UTF-8 and return a valid UTF-8 string, with non-SGML codepoints excluded.

Specifically, it will permit: \x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}

Source: https://www.w3.org/TR/REC-xml/#NT-Char

Arguably this function should be modernized to the HTML5 set of allowed characters (https://www.w3.org/TR/html5/syntax.html#preprocessing-the-input-stream), which both expands and restricts the set of allowed characters.

Parameters

string	$str	The string to clean
bool	$force_php

Returns: string

Note: Just for reference, the non-SGML code points are 0 to 31 and 127 to 159, inclusive. However, we allow code points 9, 10 and 13, which are the tab, line feed and carriage return respectively. Code points 128 and above map to multibyte UTF-8 representations.

Note: Fallback code adapted from utf8ToUnicode by Henri Sivonen (hsivonen@iki.fi) at http://iki.fi/hsivonen/php-utf8/ under the LGPL license. Notes on what changed are inside, but in general, the original code transformed UTF-8 text into an array of integer Unicode codepoints. Understandably, transforming that back to a string would be somewhat expensive, so the function was modified to operate directly on the string. However, this discourages code reuse, and the logic enumerated here would be useful for any function that needs to be able to understand UTF-8 characters. As of right now, only smart lossless character encoding converters would need that, and I'm probably not going to implement them.
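The permitted character set listed above can be expressed as a PCRE character class. The following is a minimal sketch, not the code in Encoder.php: the names `SKETCH_ALLOWED`, `sketch_is_clean_utf8` and `sketch_strip_disallowed` are hypothetical, and the stripping helper assumes its input is already well-formed UTF-8.

```php
<?php
// Hypothetical helpers for illustration; not part of HTMLPurifier_Encoder.

// The permitted code points from the description, as a PCRE class body.
const SKETCH_ALLOWED = '\x{9}\x{A}\x{D}\x{20}-\x{7E}\x{A0}-\x{D7FF}'
    . '\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}';

// True when $str is well-formed UTF-8 and contains only permitted code points.
function sketch_is_clean_utf8(string $str): bool
{
    // With the /u modifier, preg_match() returns false on malformed UTF-8,
    // so only a result of 1 means the string is clean.
    return preg_match('/^[' . SKETCH_ALLOWED . ']*$/Du', $str) === 1;
}

// Removes disallowed code points. Assumption: $wellFormed is valid UTF-8.
function sketch_strip_disallowed(string $wellFormed): string
{
    $out = preg_replace('/[^' . SKETCH_ALLOWED . ']/u', '', $wellFormed);
    // preg_replace() returns null if the input was not valid UTF-8 after all.
    return $out === null ? '' : $out;
}
```

A check like this only covers input that is already valid; malformed byte sequences still need a byte-level parser such as the fallback described in the note above.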

◆ convertFromUTF8()

static convertFromUTF8	(	$str,
		$config,
		$context
	)

static

Converts a string from UTF-8 based on configuration.

Parameters

string	$str	The string to convert
HTMLPurifier_Config	$config
HTMLPurifier_Context	$context

Returns: string

Note: Currently, this is a lossy conversion, with inexpressible characters being omitted.

◆ convertToASCIIDumbLossless()

static convertToASCIIDumbLossless ( $str )

static

Lossless (character-wise) conversion of HTML to ASCII

Parameters

string $str UTF-8 string to be converted to ASCII

Returns: string ASCII-encoded string with non-ASCII characters converted to entities

Warning: Adapted from MediaWiki, claiming fair use: this is a common algorithm. If you disagree with this license fudgery, implement it yourself.

Note: Uses decimal numeric entities since they are best supported.

Note: This is a DUMB function: it has no concept of keeping character entities that the projected character encoding can allow. We could possibly implement a smart version, but that would require it to also know which Unicode codepoints the charset supports (not an easy task).

Note: Similar in part to cleanUTF8(), except that it assumes $str is well-formed UTF-8.
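The character-wise conversion described above can be sketched as a single pass over the bytes of the string. This is an illustration under stated assumptions rather than the library's code: `sketch_ascii_entities` is a hypothetical name, and, like the documented function, it assumes the input is well-formed UTF-8.

```php
<?php
// Hypothetical helper for illustration; not part of HTMLPurifier_Encoder.
// Assumption: $utf8 is well-formed UTF-8.
function sketch_ascii_entities(string $utf8): string
{
    $result = '';
    $len = strlen($utf8);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($utf8[$i]);
        if ($byte < 0x80) {
            // Plain ASCII passes through unchanged.
            $result .= $utf8[$i];
            continue;
        }
        // The lead byte tells us how many continuation bytes follow.
        if (($byte & 0xE0) === 0xC0) {
            $extra = 1;
            $code = $byte & 0x1F;
        } elseif (($byte & 0xF0) === 0xE0) {
            $extra = 2;
            $code = $byte & 0x0F;
        } else {
            $extra = 3;
            $code = $byte & 0x07;
        }
        // Each continuation byte contributes its low six bits.
        for ($j = 0; $j < $extra; $j++) {
            $code = ($code << 6) | (ord($utf8[++$i]) & 0x3F);
        }
        // Emit a decimal numeric entity for the decoded code point.
        $result .= '&#' . $code . ';';
    }
    return $result;
}
```

Because every non-ASCII character becomes an entity regardless of the target encoding, the output is valid in any ASCII-compatible charset, at the cost of longer markup.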

◆ convertToUTF8()

static convertToUTF8	(	$str,
		$config,
		$context
	)

static

Convert a string to UTF-8 based on configuration.

Parameters

string	$str	The string to convert
HTMLPurifier_Config	$config
HTMLPurifier_Context	$context

Returns: string

◆ iconv()

static iconv	(	$in,
		$out,
		$text,
		$max_chunk_size = `8000`
	)

static

iconv wrapper which mutes errors and works around bugs.

Parameters

string	$in	Input encoding
string	$out	Output encoding
string	$text	The text to convert
int	$max_chunk_size

Returns: string

◆ iconvAvailable()

static iconvAvailable ( )

static

Returns: bool

◆ muteErrorHandler()

static muteErrorHandler ( )

static

Error-handler that mutes errors, alternative to shut-up operator.
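Muting errors with a handler instead of the `@` operator can be sketched as follows. This is a minimal illustration, not the library's code: `sketch_mute_error_handler` and `sketch_call_muted` are hypothetical names, and the sketch relies on PHP treating an error as handled when the handler returns true.

```php
<?php
// Hypothetical helpers for illustration; not part of HTMLPurifier_Encoder.

// A do-nothing handler: returning true tells PHP the error was handled.
function sketch_mute_error_handler(int $errno, string $errstr): bool
{
    return true;
}

// Runs $fn with warnings and notices muted, then restores the old handler.
function sketch_call_muted(callable $fn, ...$args)
{
    set_error_handler('sketch_mute_error_handler');
    try {
        return $fn(...$args);
    } finally {
        // Always undo the handler change, even if $fn throws.
        restore_error_handler();
    }
}

// Usage: the warning raised inside the callback is swallowed.
$value = sketch_call_muted(function () {
    trigger_error('noisy warning', E_USER_WARNING);
    return 42;
});
```

A handler installed this way does not intercept fatal errors, so it only suits calls that fail with warnings or notices.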

◆ testEncodingSupportsASCII()

static testEncodingSupportsASCII	(	$encoding,
		$bypass = `false`
	)

static

This expensive function tests whether or not a given character encoding supports ASCII. 7/8-bit encodings like Shift_JIS will fail this test, and require special processing. Variable width encodings shouldn't ever fail.

Parameters

string	$encoding	Encoding name to test, as per iconv format
bool	$bypass	Whether or not to bypass the precompiled arrays.

Returns: Array mapping UTF-8 characters to their corresponding ASCII characters, which can be used to "undo" any overzealous iconv action.

◆ testIconvTruncateBug()

static testIconvTruncateBug ( )

static

glibc iconv has a known bug where it doesn't handle the magic //IGNORE stanza correctly. In particular, rather than ignore characters, it will return an EILSEQ after consuming some number of characters, and expect you to restart iconv as if it were an E2BIG. Old versions of PHP did not respect the errno, and returned the fragment, so as a result you would see iconv mysteriously truncating output. We can work around this by manually chopping our input into segments of about 8000 characters, as long as PHP ignores the error code. If PHP starts paying attention to the error code, iconv becomes unusable.

Returns: int Error code indicating severity of bug.
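The workaround of chopping input into segments can be sketched without calling iconv at all. The sketch below only shows the splitting step, under the assumption that the input is well-formed UTF-8 and that a chunk must never end in the middle of a multibyte character; `sketch_utf8_chunks` is a hypothetical name, and its size limit is counted in bytes.

```php
<?php
// Hypothetical helper for illustration; not part of HTMLPurifier_Encoder.
// Splits $text into pieces of at most $maxChunkSize bytes without cutting
// a multibyte UTF-8 character in half.
function sketch_utf8_chunks(string $text, int $maxChunkSize = 8000): array
{
    if ($maxChunkSize < 4) {
        throw new InvalidArgumentException('Chunk size must be at least 4 bytes.');
    }
    $chunks = [];
    $len = strlen($text);
    $pos = 0;
    while ($pos < $len) {
        $end = min($pos + $maxChunkSize, $len);
        // If the next chunk would start on a continuation byte (10xxxxxx),
        // move the cut back to the start of that character.
        while ($end < $len && $end > $pos && (ord($text[$end]) & 0xC0) === 0x80) {
            $end--;
        }
        if ($end === $pos) {
            // Malformed input: fall back to a hard cut to guarantee progress.
            $end = min($pos + $maxChunkSize, $len);
        }
        $chunks[] = substr($text, $pos, $end - $pos);
        $pos = $end;
    }
    return $chunks;
}
```

Each chunk could then be converted separately and the results concatenated, which keeps any truncation confined to a single segment.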

◆ unichr()

static unichr ( $code )

static

Translates a Unicode codepoint into its corresponding UTF-8 character.

Note: Based on Feyd's function at http://forums.devnetwork.net/viewtopic.php?p=191404#191404, which is in the public domain.

Note: While we're going to do code point parsing anyway, a good optimization would be to refuse to translate code points that are non-SGML characters. However, this could lead to duplication.

Note: This is very similar to the unichr function in maintenance/generate-entity-file.php (although this one is superior, due to its sanity checks).
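Translating a code point into UTF-8 bytes can be sketched as below. This is not the function from Encoder.php: `sketch_unichr` is a hypothetical name, and returning an empty string for out-of-range values and surrogates is an assumption standing in for the sanity checks the note mentions.

```php
<?php
// Hypothetical helper for illustration; not part of HTMLPurifier_Encoder.
function sketch_unichr(int $code): string
{
    // Assumed sanity checks: reject values outside Unicode and surrogates.
    if ($code < 0 || $code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
        return '';
    }
    if ($code < 0x80) {
        // One byte: 0xxxxxxx
        return chr($code);
    }
    if ($code < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($code >> 6))
            . chr(0x80 | ($code & 0x3F));
    }
    if ($code < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($code >> 12))
            . chr(0x80 | (($code >> 6) & 0x3F))
            . chr(0x80 | ($code & 0x3F));
    }
    // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($code >> 18))
        . chr(0x80 | (($code >> 12) & 0x3F))
        . chr(0x80 | (($code >> 6) & 0x3F))
        . chr(0x80 | ($code & 0x3F));
}
```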

◆ unsafeIconv()

static unsafeIconv	(	$in,
		$out,
		$text
	)

static

iconv wrapper which mutes errors, but doesn't work around bugs.

Parameters

string	$in	Input encoding
string	$out	Output encoding
string	$text	The text to convert

Returns: string

Field Documentation

◆ ICONV_OK

const ICONV_OK = 0

No bugs detected in iconv.

◆ ICONV_TRUNCATES

const ICONV_TRUNCATES = 1

Iconv truncates output when converting from UTF-8 to another character set with //IGNORE and a non-encodable character is found.

◆ ICONV_UNUSABLE

const ICONV_UNUSABLE = 2

Iconv does not support //IGNORE, making it unusable for transcoding purposes.

The documentation for this class was generated from the following file:

XoopsCore25-2.5.11-Beta1/htdocs/xoops_lib/modules/protector/library/HTMLPurifier/Encoder.php
