Modules

Gleez_InputFilter

Creates and returns an XSS-safe version of a string, or an empty string if the input is not valid UTF-8.

Note: by design, this class does not do any permission checking.

package: Gleez
category: Input Filter
author: Sandeep Sangamreddi - Gleez
copyright: © 2012 Gleez Technologies
license: http://gleezcms.org/license

Class declared in MODPATH/gleez/classes/gleez/inputfilter.php on line 14.

Properties

protected $_config

protected string $_text

The text

protected $allowed_protocols

Protocols that are ok for use in URIs.

protected $allowed_tags

Allowed elements.

protected $benchmark

Methods

public __construct( string $text [, int $format = integer 1 , array $filter = NULL ] ) (defined in Gleez_InputFilter)

Creates a new input filter object and initializes its settings from the inputfilter configuration.

Parameters

  • string $text required - Text string to filter html
  • int $format = integer 1 - Format id
  • array $filter = NULL - Filter definition; if it contains settings.allowed_html, that list replaces the configured allowed tags

Return Values

  • void

Source Code

public function __construct($text, $format = 1, $filter = NULL)
{
        // Be sure to only profile if it's enabled
        if (Kohana::$profiling === TRUE)
        {
                // Start a new benchmark
                $this->benchmark = Profiler::start('Gleez Filter', __FUNCTION__);
        }

        // Load the configuration for this type
        $config = Kohana::$config->load('inputfilter');

        if($config->allowed_protocols AND is_array($config->allowed_protocols))
        {
                $this->allowed_protocols = $config->allowed_protocols;
        }

        if($config->allowed_tags AND is_array($config->allowed_tags))
        {
                $this->allowed_tags = $config->allowed_tags;
        }

        if ( ! array_key_exists($format, $config->formats))
        {
                // Make sure a valid format id exists, if not set default format id
                $format = (int) $config->get('default_format', 1);
        }

        if(isset($filter['settings']['allowed_html']))
        {
                $this->allowed_tags = preg_split('/\s+|<|>/', $filter['settings']['allowed_html'], -1, PREG_SPLIT_NO_EMPTY);
        }

        $this->_text   = $text;
        $this->_config = $config;

        Kohana::$log->add(Log::DEBUG, 'Input Filter Library initialized');
}
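
The allowed_html override in the constructor splits a settings string on whitespace and angle brackets. A minimal sketch of that single step, runnable outside the framework; the function name sketch_parse_allowed_html and the sample input are assumptions for illustration and are not part of Gleez:

```php
<?php
// Hypothetical helper (not part of Gleez): turns a setting such as
// "<a> <em> <strong>" into a plain list of tag names, using the same
// preg_split() call as the constructor above.
function sketch_parse_allowed_html(string $allowed_html): array
{
    // Split on runs of whitespace, "<" and ">", and drop empty pieces.
    return preg_split('/\s+|<|>/', $allowed_html, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(sketch_parse_allowed_html('<a> <em> <strong>'));
```

Because empty pieces are dropped, the same call accepts both "<a> <em>" and "a em" style settings.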

public __destruct( ) (defined in Gleez_InputFilter)

Source Code

public function __destruct()
{
        if (isset($this->benchmark))
        {
                // Stop the benchmark
                Profiler::stop($this->benchmark);
        }
}

public __toString( ) (defined in Gleez_InputFilter)

Magic method __toString()

Return Values

  • string

Source Code

public function __toString()
{
	return (string) $this->render();
}

public static callback( ) (defined in Gleez_InputFilter)

Source Code

public static function callback($callback, $text, $format, $filter)
{
        $args = func_get_args();
        array_shift($args);

        if (is_string($callback) AND strpos($callback, '::') !== FALSE)
        {
                // Make the static callback into an array
                $callback = explode('::', $callback, 2);
        }

        if ( $callback AND is_callable($callback))
        {
                try
                {
                    return  call_user_func_array($callback, $args);
                }
                catch (Exception $e)
                {
                        Kohana::$log->add(Log::ERROR, __('Filter callback :class for :filter',
                                                array(':class' => $e->getMessage(), ':filter' => $filter['name'])));
                        return $text;
                }
        }

        return $text;
}
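
The callback dispatch above can be tried in isolation. The following is a rough sketch under stated assumptions: sketch_invoke_callback and the SketchFilters class are hypothetical names, logging is omitted, and only the "Class::method" normalization and the fall-back-to-original-text behaviour are reproduced:

```php
<?php
// Hypothetical filter class used only for this illustration.
class SketchFilters
{
    public static function shout(string $text): string
    {
        return strtoupper($text);
    }
}

// Hypothetical stand-in for callback(): runs $callback on $text and
// returns the original text if the callback is unusable or throws.
function sketch_invoke_callback($callback, string $text, ...$extra)
{
    if (is_string($callback) && strpos($callback, '::') !== false) {
        // Turn "Class::method" into array('Class', 'method')
        $callback = explode('::', $callback, 2);
    }

    if ($callback && is_callable($callback)) {
        try {
            return call_user_func_array($callback, array_merge([$text], $extra));
        } catch (Exception $e) {
            // The real method logs the failure; this sketch just falls back.
            return $text;
        }
    }

    return $text;
}

echo sketch_invoke_callback('SketchFilters::shout', 'hello'), "\n";
echo sketch_invoke_callback('No_Such_Class::method', 'hello'), "\n";
```

An unknown class or method is not an error here: is_callable() returns FALSE and the text passes through unchanged.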

public static factory( string $text [, int $format = integer 1 , array $filter = NULL ] ) (defined in Gleez_InputFilter)

Creates and returns a new InputFilter object for $text.

Parameters

  • string $text required - Text string to filter html
  • int $format = integer 1 - Format id
  • array $filter = NULL - Array of allowed tags

Return Values

  • InputFilter - A filter object; rendering it (or casting it to a string) yields an XSS safe version of $text, or an empty string if $text is not valid UTF-8.

Source Code

public static function factory($text, $format = 1, $filter = NULL)
{
               return new InputFilter($text, $format, $filter);
}

public filter_xss( string $string ) (defined in Gleez_InputFilter)

Filters an HTML string to prevent cross-site-scripting (XSS) vulnerabilities.

Based on kses by Ulf Harnhammar, see http://sourceforge.net/projects/kses. For examples of various XSS attacks, see: http://ha.ckers.org/xss.html.

This code does four things:

  • Removes characters and constructs that can trick browsers.
  • Makes sure all HTML entities are well-formed.
  • Makes sure all HTML tags and attributes are well-formed.
  • Makes sure no HTML tags contain URLs with a disallowed protocol (e.g. javascript:).

Parameters

  • string $string required - Input string.

Return Values

  • string - An XSS safe version of $string, or an empty string if $string is not valid UTF-8.

Source Code

public function filter_xss( $string )
{
               // Only operate on valid UTF-8 strings. This is necessary to prevent cross
               // site scripting issues on Internet Explorer 6.
               if (!self::valid_utf8($string))
               {
                       return '';
               }
       
               // Remove NULL characters (ignored by some browsers)
               $string = str_replace(chr(0), '', $string);
               // Remove Netscape 4 JS entities
               $string = preg_replace('%&\s*\{[^}]*(\}\s*;?|$)%', '', $string);
       
               // Defuse all HTML entities
               $string = str_replace('&', '&amp;', $string);
               // Change back only well-formed entities in our whitelist
               // Decimal numeric entities
               $string = preg_replace('/&amp;#([0-9]+;)/', '&#\1', $string);
               // Hexadecimal numeric entities
               $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
               // Named entities
               $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);
       
               return preg_replace_callback('%(
			<(?=[^a-zA-Z!/])  # a lone <
			|                 # or
			<!--.*?-->        # a comment
			|                 # or
			<[^>]*(>|$)       # a string that starts with a <, up until the > or the end of the string
			|                 # or
			>                 # just a >
			)%x', array($this, 'xss_split'), $string);
               
}
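
The entity handling in filter_xss() works in two passes: every ampersand is escaped first, then only well-formed entities are restored. A small sketch of just that part, using the same patterns as the source above; the function name sketch_normalize_entities is an assumption for illustration:

```php
<?php
// Hypothetical helper (not part of Gleez) isolating the entity steps
// of filter_xss().
function sketch_normalize_entities(string $string): string
{
    // Defuse all HTML entities
    $string = str_replace('&', '&amp;', $string);
    // Restore well-formed decimal numeric entities
    $string = preg_replace('/&amp;#([0-9]+;)/', '&#\1', $string);
    // Restore well-formed hexadecimal numeric entities
    $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\1', $string);
    // Restore well-formed named entities
    $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\1', $string);

    return $string;
}

echo sketch_normalize_entities('AT&T &lt;b&gt; &#60; &bogus'), "\n";
```

A bare ampersand or an entity without its terminating semicolon stays escaped, while existing well-formed entities are not double-escaped.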

public static filters( ) (defined in Gleez_InputFilter)

Returns all available filters

Source Code

public static function filters()
{
	$filters =  new stdClass;
	Module::event('filter_info', $filters);
	
	return $filters->list;
}

public static formats( ) (defined in Gleez_InputFilter)

Returns all available formats

Source Code

public static function formats()
{
	$config = Kohana::$config->load('inputfilter');

	$formats = array();
	foreach($config->formats as $id => $format)
	{
		$formats[$id] = $format['name'];
	}

	return $formats;
}
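
For illustration, the id-to-name mapping built by formats() can be reproduced with a plain array in place of the Kohana config object. The format names and ids below are invented sample data, and sketch_format_names is a hypothetical function name:

```php
<?php
// Hypothetical stand-in for the 'formats' entry of the inputfilter
// config; the real values depend on the installation.
$sample_formats = [
    1 => ['name' => 'Filtered HTML'],
    2 => ['name' => 'Full HTML'],
];

// Same loop as formats(), taking the formats array as an argument.
function sketch_format_names(array $config_formats): array
{
    $formats = [];
    foreach ($config_formats as $id => $format) {
        $formats[$id] = $format['name'];
    }

    return $formats;
}

print_r(sketch_format_names($sample_formats));
```

The result keeps the original format ids as keys, which makes it suitable for building a select list.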

public render( ) (defined in Gleez_InputFilter)

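Renders the filtered text. The magic method __toString() is only invoked in string contexts such as echo/print, so this method provides an explicit way to obtain the result.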

Return Values

  • string

Source Code

public function render()
{
	return (string) $this->filter_xss($this->_text);
}
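
The pairing of render() with __toString() can be sketched with a stand-in class. SketchText is hypothetical and uses htmlspecialchars() as a trivial placeholder transformation; it does not perform the XSS filtering described on this page:

```php
<?php
// Hypothetical stand-in class (not Gleez_InputFilter).
class SketchText
{
    protected $_text;

    public function __construct(string $text)
    {
        $this->_text = $text;
    }

    // Explicit method: usable anywhere a string value is needed.
    public function render(): string
    {
        return htmlspecialchars($this->_text, ENT_QUOTES, 'UTF-8');
    }

    // Implicit conversion: used by echo/print and string casts.
    public function __toString(): string
    {
        return $this->render();
    }
}

$text = new SketchText('<b>hi</b>');
echo $text, "\n";                    // goes through __toString()
echo strlen($text->render()), "\n";  // explicit call
```

Both paths produce the same string; render() simply makes the conversion explicit.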

public static valid_utf8( string $string ) (defined in Gleez_InputFilter)

Checks whether a string is valid UTF-8.

All functions designed to filter input should use valid_utf8() (drupal_validate_utf8 in the Drupal code this text originates from) to ensure they operate on valid UTF-8 strings to prevent bypass of the filter.

When text containing an invalid UTF-8 lead byte (0xC0 - 0xFF) is presented as UTF-8 to Internet Explorer 6, the program may misinterpret subsequent bytes. When these subsequent bytes are HTML control characters such as quotes or angle brackets, parts of the text that were deemed safe by filters end up in locations that are potentially unsafe. For example, an onerror attribute that is outside of a tag, and thus deemed safe by a filter, can be interpreted by the browser as if it were inside the tag.

The function does not return FALSE for strings containing character codes above U+10FFFF, even though these are prohibited by RFC 3629.

Parameters

  • string $string required - The text to check.

Return Values

  • boolean - TRUE if the text is valid UTF-8, FALSE if not.

Source Code

public static function valid_utf8( $string )
{
        if (strlen($string) == 0)
        {
                return TRUE;
        }
        // With the PCRE_UTF8 modifier 'u', preg_match() fails silently on strings
        // containing invalid UTF-8 byte sequences. It does not reject character
        // codes above U+10FFFF (represented by 4 or more octets), though.
        return (preg_match('/^./us', $string) == 1);
}
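
A standalone sketch of the same check, for experimenting outside the class. The function name sketch_valid_utf8 and the sample byte strings are assumptions for illustration; the regular expression is the one used in the source above:

```php
<?php
// Hypothetical standalone copy of the check (not part of Gleez).
function sketch_valid_utf8(string $string): bool
{
    if (strlen($string) === 0) {
        return true;
    }

    // With the 'u' modifier, preg_match() returns FALSE instead of 1
    // when the subject contains an invalid UTF-8 byte sequence.
    return preg_match('/^./us', $string) === 1;
}

var_dump(sketch_valid_utf8("caf\xC3\xA9")); // well-formed two-byte sequence
var_dump(sketch_valid_utf8("\xC3\x28"));    // lead byte without a continuation byte
```

The pattern itself only needs to match one character; the validation is a side effect of PCRE checking the whole subject when the 'u' modifier is set.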

protected xss_split( ) (defined in Gleez_InputFilter)

Source Code

protected function xss_split($m)
{
       
        $allowed_html = array_flip($this->allowed_tags);
        
        $string = $m[1];
        
        if (substr($string, 0, 1) != '<')
        {
                // We matched a lone ">" character
                return '&gt;';
        }
        elseif (strlen($string) == 1)
        {
                // We matched a lone "<" character
                return '&lt;';
        }
        
        if (!preg_match('%^<\s*(/\s*)?([a-zA-Z0-9]+)([^>]*)>?|(<!--.*?-->)$%', $string, $matches))
        {
                // Seriously malformed
                return '';
        }
    
        $slash = trim($matches[1]);
        $elem =& $matches[2];
        $attrlist =& $matches[3];
        $comment =& $matches[4];
        
        if ($comment)
        {
                $elem = '!--';
        }
        
        if (!isset($allowed_html[strtolower($elem)]))
        {
                // Disallowed HTML element
                return '';
        }
        
        if ($comment)
        {
                return $comment;
        }
        
        if ($slash != '')
        {
                return "</$elem>";
        }
        
        // Is there a closing XHTML slash at the end of the attributes?
        $attrlist    = preg_replace('%(\s?)/\s*$%', '\1', $attrlist, -1, $count);
        $xhtml_slash = $count ? ' /' : '';
        
        // Clean up attributes
        $attr2 = implode(' ', $this->xss_attributes($attrlist));
        $attr2 = preg_replace('/[<>]/', '', $attr2);
        $attr2 = strlen($attr2) ? ' ' . $attr2 : '';
        
        return "<$elem$attr2$xhtml_slash>";
}
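
The per-tag decision made by xss_split() can be sketched in a reduced form. This is a simplification under stated assumptions: sketch_filter_tag is a hypothetical name, comments are not handled, and every attribute is dropped rather than cleaned, because the xss_attributes() helper is not shown on this page:

```php
<?php
// Hypothetical, reduced version of the tag check (not part of Gleez).
// Returns the tag without attributes if its element is allowed,
// or an empty string otherwise.
function sketch_filter_tag(string $tag, array $allowed_tags): string
{
    $allowed_html = array_flip($allowed_tags);

    if (!preg_match('%^<\s*(/\s*)?([a-zA-Z0-9]+)([^>]*)>?$%', $tag, $matches)) {
        // Seriously malformed
        return '';
    }

    $slash    = trim($matches[1]);
    $elem     = $matches[2];
    $attrlist = $matches[3];

    if (!isset($allowed_html[strtolower($elem)])) {
        // Disallowed HTML element
        return '';
    }

    if ($slash !== '') {
        return "</$elem>";
    }

    // Is there a closing XHTML slash at the end of the attributes?
    preg_replace('%(\s?)/\s*$%', '\1', $attrlist, -1, $count);
    $xhtml_slash = $count ? ' /' : '';

    return "<$elem$xhtml_slash>";
}

$allowed = ['em', 'br'];
echo sketch_filter_tag('<em onclick="x">', $allowed), "\n";
echo sketch_filter_tag('<br />', $allowed), "\n";
echo sketch_filter_tag('<script>', $allowed), "\n";
```

Dropping all attributes is stricter than the real method, which keeps the attributes that pass its own cleanup step.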

private decode_entities( string $text ) (defined in Gleez_InputFilter)

Decodes all HTML entities (including numerical ones) to regular UTF-8 bytes.

Double-escaped entities will only be decoded once ("&amp;lt;" becomes "&lt;", not "<"). Be careful when using this function, as decode_entities can revert previous sanitization efforts (&lt;script&gt; will become <script>).
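
The decode-once behaviour can be observed with PHP's built-in html_entity_decode(), used here as a stand-in because the body of decode_entities() is not shown on this page. The wrapper name sketch_decode_once is an assumption for illustration:

```php
<?php
// Hypothetical wrapper (not the Gleez implementation): a single
// decoding pass using PHP's built-in html_entity_decode().
function sketch_decode_once(string $text): string
{
    return html_entity_decode($text, ENT_QUOTES, 'UTF-8');
}

// A double-escaped entity loses only one level of escaping.
echo sketch_decode_once('&amp;lt;'), "\n";

// A single-escaped tag becomes live markup again, which is why
// decoding must never be the last step before output.
echo sketch_decode_once('&lt;script&gt;'), "\n";
```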