A Code First Approach to Kentico Smart Search Settings: Stopping Ambiguous Errors

Learn a code-first approach using attributes to configure search indexes within Kentico.

Kentico Xperience provides an integration with Azure Cognitive Search that allows you to store website content and data within Azure Search indexes. Once configured, the data is populated and kept up to date when changes are made. This allows developers to fully utilize the power of Azure to implement search solutions on websites.

When configuring the search indexes, you must set the search settings for every field on every page type. If you have a large site with many page types, this can be a daunting task. It becomes even more problematic when multiple page types contain the same field names. Every search setting must be correct, or either indexing or querying will fail. There is no way to know whether your settings are correct until you rebuild the indexes or, once the indexes have been built, submit a query. Troubleshooting this is very time-consuming, and the problem is exacerbated when you have many page types.

Screen capture of indexing error message

To solve this problem, I’ve chosen a code-first approach using attributes to decorate the fields we want indexed and reflection to get the settings out of the code. For simplicity, we’ll set the settings using a single-run scheduled task. You just need to remember to run this if the page type fields change.

Creating The Attributes

Creating a custom attribute in C# is pretty straightforward. It is just another class, but it inherits from Attribute. Each property on the attribute we create also needs a DisplayName attribute. Our property names differ from the names that Kentico uses in the settings, so DisplayName maps each property to the name Kentico expects.

using System;
using System.ComponentModel;

namespace BizStream.Xperience.SearchIndex
{
    public class SmartSearchAttribute : Attribute
    {
        [DisplayName( "azurecontent" )]
        public bool IsAzureContent { get; set; }
        [DisplayName( "azurefacetable" )]
        public bool IsAzureFacetable { get; set; }
        [DisplayName( "azurefilterable" )]
        public bool IsAzureFilterable { get; set; }
        [DisplayName( "azureretrievable" )]
        public bool IsAzureRetrievable { get; set; }
        [DisplayName( "azuresearchable" )]
        public bool IsAzureSearchable { get; set; }
        [DisplayName( "azuresortable" )]
        public bool IsAzureSortable { get; set; }
        [DisplayName( "content" )]
        public bool IsLocalContent { get; set; }
        [DisplayName( "searchable" )]
        public bool IsLocalSearchable { get; set; }
        [DisplayName( "tokenized" )]
        public bool IsLocalTokenized { get; set; }
        [DisplayName( "updatetrigger" )]
        public bool IsUpdateTrigger { get; set; }
        [DisplayName( "fieldname" )]
        public string FieldName { get; set; }
    }
}

Each property corresponds to a check box on the search settings tab within the page type. Also included are properties for local page indexes.
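
To make the name mapping concrete, here is a minimal sketch of reading the DisplayName values back with reflection. DemoSearchAttribute and DisplayNameDemo are hypothetical, trimmed-down stand-ins and are not part of the project; only standard .NET reflection calls are used.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Reflection;

// Hypothetical, trimmed-down stand-in for SmartSearchAttribute (two flags only).
[AttributeUsage( AttributeTargets.Property )]
public class DemoSearchAttribute : Attribute
{
    [DisplayName( "azurecontent" )]
    public bool IsAzureContent { get; set; }

    [DisplayName( "azureretrievable" )]
    public bool IsAzureRetrievable { get; set; }
}

public static class DisplayNameDemo
{
    // Maps each settings key (taken from DisplayName) to the value on the attribute instance.
    public static Dictionary<string, object> ReadSettings( DemoSearchAttribute attribute )
    {
        // DeclaredOnly skips members inherited from Attribute, such as TypeId.
        return typeof( DemoSearchAttribute )
            .GetProperties( BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly )
            .ToDictionary(
                p => p.GetCustomAttribute<DisplayNameAttribute>().DisplayName,
                p => p.GetValue( attribute ) );
    }
}
```

Calling DisplayNameDemo.ReadSettings( new DemoSearchAttribute { IsAzureContent = true } ) yields a dictionary keyed by "azurecontent" and "azureretrievable", which is the shape of data the service later writes into the search settings.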

Creating the Search-Enabled Interface

By using an interface, we can specify which page types should have search enabled. This allows us to turn indexing on and off in code rather than on the search configuration screens. Because the search-enabled classes all implement the same interface, we can use reflection to identify those page types easily.

It is a very simple interface containing just the ClassName property, which many of the page type classes might already have.

namespace BizStream.Xperience.SearchIndex
{
    public interface ISearchEnabled
    {
        string ClassName { get; set; }
    }
}

Creating the Search Index Service

The first step in creating our service is defining its interface.

using CMS.DataEngine;

namespace BizStream.Xperience.SearchIndex
{
    public interface ISearchIndexService
    {
        void RebuildClassSearchSettings( DataClassInfo classItem );
    }
}

This step isn’t 100% necessary since we are not using dependency injection with our scheduled task, but I’ve added it for code consistency. We can now create a search index service class and implement the interface.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Reflection;
using CMS.DataEngine;

namespace BizStream.Xperience.SearchIndex
{
    public class SearchIndexService : ISearchIndexService
    {
        /// <summary>
        /// Factory Method to retrieve an instance of the class.
        /// </summary>
        /// <param name="args">Class types to search for and include.</param>
        public static SearchIndexService Instance( params Type[] args )
        {
            return new SearchIndexService( args );
        }

        public SearchIndexService( params Type[] args )
        {
            SearchableTypes = GetSearchable( args );
        }

        private IEnumerable<ISearchEnabled> SearchableTypes { get; set; }

        /// <summary>
        /// Gets a list of the classes that implement the ISearchEnabled Interface.
        /// </summary>
        /// <param name="args">Class types to identify assemblies to include.</param>
        /// <returns>List of search enabled page types.</returns>
        private static IEnumerable<ISearchEnabled> GetSearchable( params Type[] args )
        {
            var searchable = new List<ISearchEnabled>();

            // for each type we create an instance to access the default values of the classnames
            foreach( var typeArgs in args )
            {
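                // only concrete classes can be instantiated, so skip interfaces (including ISearchEnabled itself) and abstract classes.
                var types = typeArgs.Assembly.GetLoadableTypes()
                    .Where( w => w.IsClass && !w.IsAbstract && typeof( ISearchEnabled ).IsAssignableFrom( w ) ).ToList();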
                searchable.AddRange( types.Select( isSearchableType => ( ISearchEnabled )Activator.CreateInstance( isSearchableType ) ) );
            }

            return searchable;
        }

        /// <summary>
        /// Rebuilds the search settings for a DataClassInfo of a Document Type.
        /// </summary>
        /// <param name="classItem">DataClassInfo to be rebuilt.</param>
        public virtual void RebuildClassSearchSettings( DataClassInfo classItem )
        {
            var classType = SearchableTypes.FirstOrDefault( w => w.ClassName.Equals( classItem.ClassName, StringComparison.OrdinalIgnoreCase ) );

            // if classType is null then this page type is not searchable, so we need to make sure it has search disabled.
            if( classType == null )
            {
                if( !classItem.ClassSearchEnabled )
                {
                    return;
                }

                classItem.ClassSearchEnabled = false;
                classItem.Update();
                return;
            }

            // we don't want pages without urls in the search index.
            if( !classItem.ClassHasURL )
            {
                // disable search if it is currently enabled, then stop either way.
                if( classItem.ClassSearchEnabled )
                {
                    classItem.ClassSearchEnabled = false;
                    classItem.Update();
                }

                return;
            }

            // gets all the public properties of the matching POCO class.
            var classProperties = classType.GetType().GetProperties();

            // iterate through the page type's field search settings and apply the attribute values.
            foreach( SearchSettingsInfo classItemClassSearchSettingsInfo in classItem.ClassSearchSettingsInfos )
            {
                ReloadPropertySearchSettings( classProperties, classItemClassSearchSettingsInfo );
            }

            // saves the search settings back to the page type.
            var info = classItem.ClassSearchSettingsInfos.GetData();
            classItem.ClassSearchSettings = info;
            classItem.ClassSearchSettingsInfos.LoadData( info );
            classItem.ClassSearchEnabled = true;
            DataClassInfoProvider.SetDataClassInfo( classItem );
        }

        private void ReloadPropertySearchSettings(
            IEnumerable<PropertyInfo> classProperties,
            SearchSettingsInfo classItemClassSearchSettingsInfo )
        {
            // make sure the class has a property that matches the current field setting.
            var property = classProperties.FirstOrDefault( w => w.Name.Equals( classItemClassSearchSettingsInfo.Name, StringComparison.OrdinalIgnoreCase ) );
            if( property == null )
            {
                return;
            }

            var attribute = property.GetCustomAttribute<SmartSearchAttribute>() ?? new SmartSearchAttribute();
            var attributeProperties = typeof( SmartSearchAttribute )
                .GetProperties( BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly )
                .ToList();
            attributeProperties.ForEach( attributeProperty =>
            {
                var attributePropertyDisplayName =
                    attributeProperty.GetCustomAttribute<DisplayNameAttribute>().DisplayName;
                var propertyValue = attributeProperty.GetValue( attribute );

                // if the property is a bool then we set a flag in the settings. Otherwise we set a value.
                if( propertyValue is bool value )
                {
                    classItemClassSearchSettingsInfo.SetFlag( attributePropertyDisplayName, value );
                    return;
                }

                if( propertyValue != null )
                {
                    classItemClassSearchSettingsInfo.SetValue( attributePropertyDisplayName, propertyValue );
                }
            } );
        }
    }
}

With reflection, an assembly can contain types that depend on referenced assemblies that cannot be loaded. If we try to get the types of such an assembly, a ReflectionTypeLoadException is thrown. We need to handle this particular exception properly. We are not interested in types that rely on those third-party assemblies, so we ignore them and keep the types that did load.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace BizStream.Xperience.SearchIndex
{
    public static class AssemblyExtensions
    {
        public static IEnumerable<Type> GetLoadableTypes( this Assembly assembly )
        {
            if( assembly == null )
            {
                throw new ArgumentNullException( nameof( assembly ) );
            }

            try
            {
                return assembly.GetTypes();
            }
            catch( ReflectionTypeLoadException e )
            {
                return e.Types.Where( t => t != null );
            }
        }
    }
}
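
As a rough, self-contained sketch of how this helper combines with the interface lookup, the following uses hypothetical stand-ins (IDemoSearchEnabled, DemoArticle, TypeDiscoveryDemo) rather than the real project types. It assumes every implementing class has a public parameterless constructor, since Activator.CreateInstance needs one here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker interface and page type, standing in for ISearchEnabled and Article.
public interface IDemoSearchEnabled
{
    string ClassName { get; set; }
}

public class DemoArticle : IDemoSearchEnabled
{
    public string ClassName { get; set; } = "Demo.Article";
}

public static class TypeDiscoveryDemo
{
    // Returns the types that loaded, even when some types in the assembly failed to load.
    public static IEnumerable<Type> GetLoadableTypes( Assembly assembly )
    {
        try
        {
            return assembly.GetTypes();
        }
        catch( ReflectionTypeLoadException e )
        {
            // Types holds null for every type that could not be loaded.
            return e.Types.Where( t => t != null );
        }
    }

    // Creates one instance of every concrete class that implements the marker interface.
    public static List<IDemoSearchEnabled> FindSearchEnabled( Type typeInTargetAssembly )
    {
        return GetLoadableTypes( typeInTargetAssembly.Assembly )
            .Where( t => t.IsClass && !t.IsAbstract && typeof( IDemoSearchEnabled ).IsAssignableFrom( t ) )
            .Select( t => ( IDemoSearchEnabled )Activator.CreateInstance( t ) )
            .ToList();
    }
}
```

Filtering on IsClass and !IsAbstract matters because IsAssignableFrom also matches the interface itself, which cannot be instantiated.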

Scheduled Task to Rebuild the Settings

As mentioned above, the simplest way to execute this code from the Admin UI is to create a scheduled task. It is a simple scheduled task that creates an instance of the service we created and executes it against a list of page type DataClassInfos.

using System.Linq;
using BizStream.Xperience.SearchIndex;
using CMS;
using CMS.DataEngine;
using CMS.Scheduler;

[assembly: RegisterCustomClass( nameof( RebuildSearchSettingsTask ), typeof( RebuildSearchSettingsTask ) )]

namespace BizStream.Xperience.SearchIndex
{
    public class RebuildSearchSettingsTask : ITask
    {
        // pass the typeof one of the decorated page type poco classes. This allows the service to load the assemblies of those types and get all the page types.
        private readonly ISearchIndexService searchIndexService = SearchIndexService.Instance( typeof( Base ) );

        public string Execute( TaskInfo task )
        {
            var classItems = DataClassInfoProvider.GetClasses()
                .WhereTrue( nameof( DataClassInfo.ClassIsDocumentType ) )
                .ToList();

            foreach( DataClassInfo classItem in classItems )
            {
                searchIndexService.RebuildClassSearchSettings( classItem );
            }

            return string.Empty;
        }
    }
}

Attribute Usage

To actually use the attributes, we need classes that correspond to the page types. You could probably use Kentico’s page type code generator and decorate the generated properties with the attributes. Still, any time you re-generate the code, you need to remember to restore the attributes in your generated code. The approach I have found works best is to create POCOs (Plain Old CLR Objects) based on the page types. The downside is that you must keep them updated when you add or remove fields on the page types.

We can create a Base class with all the common properties that our entity classes inherit. Make sure you do not add attributes to built-in Page Type properties such as NodeID and NodeAliasPath.

namespace BizStream.Xperience.SearchIndex
{
    public class Base
    {
        public int NodeID { get; set; }
        [SmartSearch( IsAzureContent = true, IsAzureRetrievable = true, IsUpdateTrigger = true, FieldName = "PageTitle" )]
        public string Title { get; set; }
        public string NodeAliasPath { get; set; }

    }
}

With our base class created, we can create our entity POCO.

using System;

namespace BizStream.Xperience.SearchIndex
{

    public class Article : Base, ISearchEnabled
    {
        [SmartSearch( IsAzureRetrievable = true, IsUpdateTrigger = true, IsAzureFilterable = true, IsAzureSortable = true )]
        public DateTime? PublishDate { get; set; }
        [SmartSearch( IsAzureContent = true, IsUpdateTrigger = true )]
        public string Content { get; set; }
        [SmartSearch( IsAzureRetrievable = true, IsUpdateTrigger = true )]
        public string ThumbnailImage { get; set; }
        public string FootNotes { get; set; }
        public string ClassName { get; set; } = "BizStream.ArticleNode";
    }
}

Any properties that are not decorated will have all the checkboxes unchecked.
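
The reason is the fallback to a default attribute instance when none is present, whose bool properties all default to false. A minimal sketch of that behavior, using hypothetical names (DemoFlagAttribute, DemoPage, DefaultAttributeDemo) instead of the real types:

```csharp
using System;
using System.Reflection;

// Hypothetical single-flag attribute standing in for SmartSearchAttribute.
[AttributeUsage( AttributeTargets.Property )]
public class DemoFlagAttribute : Attribute
{
    public bool IsSearchable { get; set; }
}

// Hypothetical POCO with one decorated and one undecorated property.
public class DemoPage
{
    [DemoFlag( IsSearchable = true )]
    public string Title { get; set; }

    public string FootNotes { get; set; }
}

public static class DefaultAttributeDemo
{
    // Returns the flag for a property, treating a missing attribute as "all flags off".
    public static bool IsSearchable( string propertyName )
    {
        var property = typeof( DemoPage ).GetProperty( propertyName );

        // GetCustomAttribute returns null for undecorated properties, so fall back to defaults.
        var attribute = property.GetCustomAttribute<DemoFlagAttribute>() ?? new DemoFlagAttribute();
        return attribute.IsSearchable;
    }
}
```

Here DefaultAttributeDemo.IsSearchable( "Title" ) returns true, while DefaultAttributeDemo.IsSearchable( "FootNotes" ) returns false because the default attribute leaves the flag unset.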

With our attributes added, we can now build our admin project and add our scheduled task.

Scheduled task form

Final Thoughts

Setting up and maintaining search indexes can be a very frustrating task, especially when you are dealing with many page types with the same fields that all need to be included in your indexes. Using a code-first approach is just one way to make this process a little easier. Attributes are a way to keep your settings contained in version control, but there are a few caveats. Since the settings are now in version control, any changes you want to make to your indexes must be made in code. You must also keep your POCOs up to date when adding any fields you want included in your indexes. I find this inconvenience is outweighed by the time saved compared with managing your search settings manually.

About the Author

Tim Stauffer

Completely self-taught and a Jack of all trades, Tim’s the man when it comes to making things happen with websites and software. Given enough time, he can figure anything out. It makes him feel all warm and fuzzy inside when he makes something and others use it to make their lives better. We like his big heart. Tim enjoys “experimenting with food,” and is just a bit addicted to World War II movies.
