How to Boost Kentico Smart Search Results Relevance Score

By Tim Stauffer on March 02, 2017

What is Score Boosting?

Score boosting moves a search result up or down in the ranking based on an algorithm, much the same way Google uses an algorithm to determine which results are more relevant. Boost factors let you change the importance of different elements of a document. For example, suppose a user searches for a keyword that is quite common on your website: it appears in the content of many documents but only occasionally shows up in a title. By default only the content is ranked, which doesn’t reflect the real relevance of each document. A document with the keyword in the title is likely to be more important than documents that contain the same keyword only in the content. With score boosting we can increase the rank, or weight, of that document so it appears at the top.

Under the hood, Kentico uses the .NET port of the Apache Lucene engine to index documents for its Smart Search. If you have ever added a Smart Search filter, you have probably noticed the syntax can be kind of weird. The documentation Kentico provides covers only the most basic functionality and barely hints at the engine’s true potential. One of the coolest undocumented features is score boosting. By using the search condition on the Smart Search Results webpart, we can alter the relevancy score of the returned results.

The Lucene engine includes three different ways to use boosting:
  1. Document level: index-time boosting
  2. Field level: index-time boosting
  3. Query level: query-time boosting
The real difference is when the boosting is applied. With document-level and field-level boosting, the boost is baked into the index, which improves performance and storage efficiency. Query-level boosting is more flexible and can achieve the same results as index-time boosting, but it is not as economical. Since creating custom indexes is beyond the scope of this article and similar results can be achieved at query time, I’m going to focus on query-level boosting.

Real world examples

Promote Featured Content to the Top of Results

Sometimes there is a need to specifically highlight content because it has some type of special significance. Does an article need to be featured or do you have sponsored content? We can configure the boost to increase any document that has a featured checkbox set. 
  1. Prepare the page type to include a featured bit (boolean) field. Even though this is just a bit, make it required so the search doesn’t get confused by null values.
  2. Modify the search fields to use a custom search name for the field so it is common across page types. This lets us use the field with multiple page types without modifying their structure.
  3. Add a check for isfeatured to the search condition of the Smart Search results. Make sure to omit the + or - from the beginning of the condition: either prefix would make the condition mandatory, requiring a match (+) or excluding matches (-), instead of only affecting the score. By default the boost is 1x. To increase the weight, raise the boost factor; to boost only a little, use a decimal value, and be sure to add a 0 before the decimal point (ex. ^0.5). In my example I am boosting isfeatured by a factor of 3 using isfeatured:(true)^3 in my expression. You can add multiple conditions to the search expression depending on your needs (ex. isfeatured:(true)^2 issponsored:(true)^3).
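The shape of these conditions can be sketched in a few lines of Python. This is only an illustration of the string being assembled, not Kentico or Lucene API code; the helper names boost_clause and search_condition are made up for this example, and in a real site the resulting text would simply be typed into the search condition property of the webpart.

```python
def boost_clause(field: str, value: str, boost: float = 1.0) -> str:
    """Build an optional Lucene clause such as 'isfeatured:(true)^3'.

    Hypothetical helper for illustration; not part of Kentico or Lucene.
    """
    if boost <= 0:
        raise ValueError("boost must be positive")
    clause = f"{field}:({value})"
    # A boost of 1 is the default, so the ^ suffix can be left off.
    if boost == 1:
        return clause
    # :g drops trailing zeros but keeps the leading 0 (0.5 -> '0.5', 3.0 -> '3').
    return f"{clause}^{boost:g}"


def search_condition(*clauses: str) -> str:
    """Join clauses with spaces; no leading + or - keeps every clause optional."""
    return " ".join(clauses)


condition = search_condition(
    boost_clause("isfeatured", "true", 2),
    boost_clause("issponsored", "true", 3),
)
print(condition)  # isfeatured:(true)^2 issponsored:(true)^3
```

Because neither clause starts with + or -, a document that matches neither one can still appear in the results; it just won’t receive the extra score.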

Results

As you can see in the image that includes featured boosting, the relevancy score of the first result is significantly higher than in the image without it. Because this can be such a dramatic increase, I would show users that the result is featured content and not display the relevancy score.



All Fields Aren’t Created Equal.

When someone searches for a keyword and it matches the title of the page, you would expect that item to have more relevance than if the keyword appeared somewhere in the content. This is especially true if you have a lot of similar content. Much as we boosted featured content, we can boost the score for keywords on a field-by-field basis.
  1. Since we most likely already have the field created, we’ll head right to the index. Any field you want to boost must have the searchable box checked. This allows the field to be used in the search condition of the search results webpart. Because “title” is not a standard content field like _content is, we need to select a common custom search name so we can apply the condition to it. This custom search name also needs to be used in every page type that you want to include in your index and boost by title.
  2. The search condition syntax is very similar to the featured boost; however, we need to include the search keyword in the condition: (title:{% QueryString["searchtext"] #%}). In this instance we are simply giving the title a 1x boost. Because this is user input and we don’t know what users will search for, I would suggest running the query string through a Regex to strip characters that would break the search. This can be done by creating a custom macro. There is no risk of injection, but removing those characters provides a better experience for your users.
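As a rough sketch of the kind of Regex clean-up suggested in step 2, here is a Python version. In Kentico this logic would live in a custom macro written in .NET, so treat this purely as an illustration of the pattern; the function names are hypothetical. The character list follows the special characters of the classic Lucene query syntax (the forward slash is only special in later Lucene versions, but stripping it is harmless).

```python
import re

# Characters with special meaning in the classic Lucene query syntax:
# + - & | ! ( ) { } [ ] ^ " ~ * ? : \ /
_LUCENE_SPECIAL = re.compile(r'[+\-&|!(){}\[\]^"~*?:\\/]')


def clean_search_text(raw: str) -> str:
    """Replace Lucene operator characters with spaces, then collapse whitespace."""
    without_operators = _LUCENE_SPECIAL.sub(" ", raw)
    return " ".join(without_operators.split())


def title_boost(raw: str, boost: float = 1.0) -> str:
    """Build the optional title clause, or '' when nothing searchable remains."""
    cleaned = clean_search_text(raw)
    if not cleaned:
        return ""
    clause = f"title:({cleaned})"
    return clause if boost == 1 else f"{clause}^{boost:g}"


print(title_boost('kentico (smart) "search"^2'))  # title:(kentico smart search 2)
```

This sketch replaces the characters with spaces rather than escaping them, which keeps the words searchable. It does not handle the uppercase keywords AND, OR, and NOT, which Lucene also treats as operators.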

Results

Titles that contain the search term will float to the top. Below are examples of the results with and without boosting enabled.



Newer is Better, Right? 

If you have a lot of content spread across many years, one thing you can do is sort by date and have the new stuff come first. This works, but it ignores the relevance of your search results. With boosting you can retain the scoring while also giving newer content some much-needed exposure. In my example I’m going to boost the content that has been created within the past year by a factor of 0.5.
  1. Because the DocumentCreatedWhen field is already included in the index, we don’t have to add any special column names and can skip to adding the search condition. So, simply add our boost: (DocumentCreatedWhen:[20160727035748 TO 20170727035748])^0.5. Since we are working with dates, we need to convert the Kentico dates into a Lucene-friendly format. You can use the ToSearchDateTime macro to accomplish this. The resolved macro will output something like this: (DocumentCreatedWhen:[20160224110652 TO 20170224110652]).
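To make the date format concrete, here is a small Python sketch that builds the same range clause. It assumes the yyyyMMddHHmmss shape seen in the resolved example above; it is not the ToSearchDateTime macro itself, and the function names are invented for illustration.

```python
from datetime import datetime


def to_search_datetime(value: datetime) -> str:
    """Format a datetime as yyyyMMddHHmmss, the shape shown in the article."""
    return value.strftime("%Y%m%d%H%M%S")


def one_year_before(value: datetime) -> datetime:
    """Same moment one year earlier; Feb 29 falls back to Feb 28."""
    try:
        return value.replace(year=value.year - 1)
    except ValueError:
        return value.replace(year=value.year - 1, day=28)


def recency_boost(now: datetime, boost: float = 0.5) -> str:
    """Build an optional range clause covering the year that ends at 'now'."""
    start = to_search_datetime(one_year_before(now))
    end = to_search_datetime(now)
    return f"(DocumentCreatedWhen:[{start} TO {end}])^{boost:g}"


print(recency_boost(datetime(2017, 2, 24, 11, 6, 52)))
# (DocumentCreatedWhen:[20160224110652 TO 20170224110652])^0.5
```

The square brackets make the range inclusive at both ends, and the fixed-width format means the values sort correctly as plain text, which is what lets a range query work on them.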

Results

I’ve highlighted the dates in the results below. This is not really a great example, but it does demonstrate that the results created within the last year float to the top.



What’s next?

I’ve covered just a few of the possibilities for boosting your Kentico Smart Search results. Boosting is not very well documented, but it is extremely easy to implement. As a next step, you could implement some of the boosting at the index level by creating custom indexes. This will optimize your indexes, resulting in faster, more efficient queries.
 


About the author

Completely self-taught and a Jack of all trades, Tim’s the man when it comes to making things happen with websites and software. Given enough time, he can figure anything out. It makes him feel all warm and fuzzy inside when he makes something and others use it to make their lives better. We like his big heart. Tim enjoys “experimenting with food,” and is just a bit addicted to World War II movies.
