cleanse_string()

Parameters

Column                  Type    Default    Description
$string
$preserve_separators
$preserve_hyphen                false
$is_html                        false

Location

include/search_functions.php lines 2252 to 2301

Definition

 
function cleanse_string($string,$preserve_separators,$preserve_hyphen=false,$is_html=false)
    {
    # Removes characters from a string prior to keyword splitting, for example full stops
    # Also makes the string lower case ready for indexing.
    global $config_separators;
    $separators=$config_separators;

    // Replace some HTML entities with empty space
    // Most of them should already be in $config_separators
    // but others, like &shy; don't have an actual character that we can copy and paste
    // to $config_separators
    $string = htmlentities($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    $string = str_replace('&nbsp;', ' ', $string);
    $string = str_replace('&shy;', ' ', $string);
    $string = str_replace('&lsquo;', ' ', $string);
    $string = str_replace('&rsquo;', ' ', $string);
    $string = str_replace('&ldquo;', ' ', $string);
    $string = str_replace('&rdquo;', ' ', $string);
    $string = str_replace('&ndash;', ' ', $string);

    // Revert the htmlentities as otherwise we lose ability to identify certain text e.g. diacritics
    $string = html_entity_decode($string,ENT_QUOTES,'UTF-8');

    if (
        $preserve_hyphen
        && (substr($string,0,1) == "-" || strpos($string," -") !== false) /*support minus as first character for simple NOT searches */
        && strpos($string," - ") == false
        ) {
            # Preserve hyphen - used when NOT indexing so we know which keywords to omit from the search.
            $separators=array_diff($separators,array("-")); # Remove hyphen from separator array.
        }
    if (substr($string,0,1)=="!" && strpos(substr($string,1),"!")===false)
            {
            // If we have the exclamation mark configured as a config separator but we are doing a special search we don't want to remove it
            $separators=array_diff($separators,array("!"));
            }

    if ($preserve_separators)
            {
            return mb_strtolower(trim_spaces(str_replace($separators," ",$string)),'UTF-8');
            }
    else
            {
            # Also strip out the separators used when specifying multiple field/keyword pairs (comma and colon)
            $s=$separators;
            $s[]=",";
            $s[]=":";
            return mb_strtolower(trim_spaces(str_replace($s," ",$string)),'UTF-8');
            }
    }
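
The entity round-trip near the top of the function can be hard to picture, so here is a minimal standalone sketch of that one step. It uses only PHP built-ins. The sample input string and the variable names $entities, $encoded and $decoded are made up for illustration and do not come from the source file. The separator handling, trim_spaces() and lower-casing are deliberately left out.

```php
<?php
// Illustrative input (an assumption, not from the source file):
// "café" + non-breaking space + curly-quoted "menu" + space + "co" + soft hyphen + "operate"
$string = "caf\u{00E9}\u{00A0}\u{201C}menu\u{201D} co\u{00AD}operate";

// 1. Encode, so characters with named entities become &eacute;, &nbsp;, &ldquo;, &shy; ...
$encoded = htmlentities($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// 2. Replace the selected entities with a plain space.
//    str_replace() accepts an array of search strings.
$entities = array('&nbsp;', '&shy;', '&lsquo;', '&rsquo;', '&ldquo;', '&rdquo;', '&ndash;');
$encoded = str_replace($entities, ' ', $encoded);

// 3. Decode whatever is left, so diacritics such as é are real characters again.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

// The é survives; the non-breaking space, curly quotes and soft hyphen are now spaces.
var_dump($decoded === "caf\u{00E9}  menu  co operate");
```

The runs of two spaces left behind in this sketch are not collapsed here; in cleanse_string() the result is additionally passed through trim_spaces() and mb_strtolower() before being returned.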

This article was last updated 19th March 2024 04:35 Europe/London time based on the source file dated 15th March 2024 17:00 Europe/London time.