cleanse_string()

Parameters

Column                  Type    Default    Description
$string
$preserve_separators
$preserve_hyphen                false
$is_html                        false
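
As an illustration of how $preserve_separators and $preserve_hyphen change which characters get replaced with spaces, here is a minimal self-contained sketch of the selection rules from the definition below. The helper name sketch_separators() and the sample separator lists are assumptions made for this example; in ResourceSpace the list comes from the global $config_separators.

```php
<?php
// Hypothetical helper (not part of ResourceSpace): returns the list of
// separators that would be replaced with spaces for a given input.
function sketch_separators(string $string, array $separators, bool $preserve_separators, bool $preserve_hyphen = false): array
{
    if ($preserve_hyphen) {
        // Keep "-" when it starts the string or directly precedes a word
        // (NOT searches), but not when it appears as a spaced " - ".
        $has_minus = substr($string, 0, 1) == "-" || strpos($string, " -") !== false;
        if ($has_minus && strpos($string, " - ") == false) {
            $separators = array_diff($separators, array("-"));
        }
    }

    // Keep a single leading "!" (special searches).
    if (substr($string, 0, 1) == "!" && strpos(substr($string, 1), "!") === false) {
        $separators = array_diff($separators, array("!"));
    }

    // Unless separators are preserved, comma and colon are stripped as well.
    if (!$preserve_separators) {
        $separators[] = ",";
        $separators[] = ":";
    }

    return array_values($separators);
}

// Sample separator list is an assumption for this example.
print_r(sketch_separators("-cat dog", array("-", ".", "!"), true, true));
```

With these sample inputs the hyphen is dropped from the list, so "-cat" would keep its leading minus, while a spaced hyphen such as "cat - dog" leaves the list unchanged.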

Location

include/search_functions.php lines 2240 to 2289

Definition

 
function cleanse_string($string,$preserve_separators,$preserve_hyphen=false,$is_html=false)
    {
    # Removes characters from a string prior to keyword splitting, for example full stops
    # Also makes the string lower case ready for indexing.
    global $config_separators;
    $separators=$config_separators;

    // Replace some HTML entities with empty space
    // Most of them should already be in $config_separators
    // but others, like &shy; don't have an actual character that we can copy and paste
    // to $config_separators
    $string = htmlentities($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    $string = str_replace('&nbsp;', ' ', $string);
    $string = str_replace('&shy;', ' ', $string);
    $string = str_replace('&lsquo;', ' ', $string);
    $string = str_replace('&rsquo;', ' ', $string);
    $string = str_replace('&ldquo;', ' ', $string);
    $string = str_replace('&rdquo;', ' ', $string);
    $string = str_replace('&ndash;', ' ', $string);

    // Revert the htmlentities as otherwise we lose ability to identify certain text e.g. diacritics
    $string = html_entity_decode($string,ENT_QUOTES,'UTF-8');
    
    if ($preserve_hyphen)
        {
        # Preserve hyphen - used when NOT indexing so we know which keywords to omit from the search.
        if ((substr($string,0,1)=="-" /*support minus as first character for simple NOT searches */ || strpos($string," -")!==false) && strpos($string," - ")==false)
            {
            $separators=array_diff($separators,array("-")); # Remove hyphen from separator array.
            }
        }
    if (substr($string,0,1)=="!" && strpos(substr($string,1),"!")===false)
        {
        // If we have the exclamation mark configured as a config separator but we are doing a special search we don't want to remove it
        $separators=array_diff($separators,array("!"));
        }

    if ($preserve_separators)
        {
        return mb_strtolower(trim_spaces(str_replace($separators," ",$string)),'UTF-8');
        }
    else
        {
        # Also strip out the separators used when specifying multiple field/keyword pairs (comma and colon)
        $s=$separators;
        $s[]=",";
        $s[]=":";
        return mb_strtolower(trim_spaces(str_replace($s," ",$string)),'UTF-8');
        }
    }
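
The string transformation in the definition above can be tried outside ResourceSpace with a small self-contained sketch. It shows the entity round trip (encode, blank selected entities, decode) followed by separator replacement and lower-casing. All sketch_* names are hypothetical, the separator list is passed in rather than read from $config_separators, and sketch_trim_spaces() is only an assumed stand-in for ResourceSpace's trim_spaces(), taken here to collapse whitespace runs and trim the ends.

```php
<?php
// Encode to HTML entities, blank the ones that have no convenient literal
// character, then decode again so diacritics survive unchanged.
function sketch_blank_entities(string $string): string
{
    $string = htmlentities($string, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
    $entities = array('&nbsp;', '&shy;', '&lsquo;', '&rsquo;', '&ldquo;', '&rdquo;', '&ndash;');
    $string = str_replace($entities, ' ', $string);
    return html_entity_decode($string, ENT_QUOTES, 'UTF-8');
}

// Assumed stand-in for trim_spaces(): collapse whitespace runs, trim ends.
function sketch_trim_spaces(string $text): string
{
    return trim(preg_replace('/\s+/u', ' ', $text));
}

// Hypothetical end-to-end sketch: blank entities, replace separators with
// spaces, tidy whitespace, and lower-case using multibyte-aware functions.
function sketch_cleanse(string $string, array $separators): string
{
    $string = sketch_blank_entities($string);
    $string = str_replace($separators, ' ', $string);
    return mb_strtolower(sketch_trim_spaces($string), 'UTF-8');
}

// The en dash and the full stop both become spaces; the accent is kept.
echo sketch_cleanse("Caf\u{00E9}.Noir\u{2013}2023", array(".")), "\n"; // café noir 2023
```

Encoding first means characters such as the soft hyphen can be matched by name (&shy;) even though they are awkward to type into a separator list, and decoding afterwards restores everything that was not blanked.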

This article was last updated 6th December 2023 11:35 Europe/London time based on the source file dated 13th November 2023 11:05 Europe/London time.