I recently worked on a project that paired Kentico Kontent with Gatsby, and it was one of the most enjoyable projects I have worked on. Kentico Kontent is a solid cloud content management platform, and matching it with Gatsby to deliver a static content site was a smooth and fun experience. One thing the combination lacks out of the box, though, is smart search. We use Azure Search extensively here at BizStream, so when we were tasked with including it in our static site, we had to pave our own way to implement it.
Implementing the Plugin
Since Gatsby is a static site generator, the best way to create and populate a search index is at build time. That means hooking into the build lifecycle using Gatsby's Node API. We implemented this as a custom local plugin, since plugins use the exact same lifecycle methods and keep the code self-contained in one location in the codebase. Our resulting local plugin lives at <project root>/plugins/gatsby-plugin-azure-search.
Gatsby itself uses Node and React to generate the static content of the website, and the resulting pages use React to drive the functionality, so we needed either to write our own JavaScript library for talking to Azure Search or to use an existing one. We went with node-azure-search (published on npm as azure-search), an existing library for connecting to Azure Search. At the time of writing, it has not been updated in a few years, but we found it includes more than enough for creating, populating, and querying a search index. To use the library, navigate to the plugin folder, create a separate npm project with the command npm init, and then add the library to this separate project with the command npm install --save azure-search.
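From the project root, that setup amounts to:

cd plugins/gatsby-plugin-azure-search
npm init
npm install --save azure-search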
While building the static site, Gatsby calls the Node API methods defined in gatsby-node.js, so create a new file with that name inside the plugin folder as well. Then add the name of the local plugin to the plugins list in <project root>/gatsby-config.js so that Gatsby loads and executes the plugin while building the site.
module.exports = {
  ...
  plugins: [
    ...
    'gatsby-plugin-azure-search'
  ],
  ...
};
After all that, your project should look similar to ours; trimmed down to the relevant files, the layout is:
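<project root>
├── gatsby-config.js
└── plugins
    └── gatsby-plugin-azure-search
        ├── gatsby-node.js
        ├── node_modules
        └── package.json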
Creating and Populating the Index
We chose to hook into the createPages method, since it is called once all of the data nodes have been created by Gatsby. In this method, we create the index, query Kontent’s content, transform each result into the format defined by our index, and then update all the documents in the index with the new data.
Creating the Index
To create or update the index, we first require the azure-search library and define the configuration options and the index structure as plain JavaScript variables and objects. Note the array in corsOptions.allowedOrigins listing all the origins that are allowed to make requests to the index. This is important: if the array is empty, no site can query the index from the browser. For now, add http://localhost:8000, the default local Gatsby development address.
const azureSearch = require('azure-search');

const adminKey = '<my azure search admin key>';
const serviceName = 'my-search-service';

const indexConfig = {
  name: 'my-search-index',
  fields: [
    {
      name: 'id',
      type: 'Edm.String',
      key: true,
      filterable: true
    },
    {
      name: 'content',
      type: 'Edm.String',
      filterable: false,
      retrievable: false,
      searchable: true,
      sortable: false
    },
    {
      name: 'title',
      type: 'Edm.String',
      filterable: false,
      searchable: true,
      sortable: false
    },
    {
      name: 'url',
      type: 'Edm.String',
      filterable: true,
      searchable: true
    }
  ],
  corsOptions: {
    allowedOrigins: ['http://localhost:8000']
  }
};
We can use these to create an Azure client instance and actually create or update the index. This happens inside the createPages method, which can return a promise, letting us use JavaScript's async and await keywords to simplify the code.
exports.createPages = async ({ graphql }) => {
  // Initialize an instance of the Azure client
  const client = azureSearch({
    url: `https://${serviceName}.search.windows.net`,
    key: adminKey
  });

  // Create/update the index
  try {
    // Does the index already exist?
    await client.getIndex(indexConfig.name);

    // Index does exist, so update it
    await client.updateIndex(indexConfig.name, indexConfig);
  } catch {
    // Index does not exist, so create it
    await client.createIndex(indexConfig);
  }

  ...
}
Next, we need to get the desired data for the index. In the index configuration above, we specified that we will have ID, Title, Content, and URL fields in the index, so we will query for those fields from Kontent using GraphQL syntax.
...

// Retrieve Kontent data nodes using GraphQL
const result = await graphql(`
  query {
    allKenticoCloudItemMySearchableContent {
      nodes {
        elements {
          url {
            value
          }
          body {
            value
          }
          title {
            value
          }
        }
        system {
          id
        }
      }
    }
  }
`);

...
At this point we have the data from Kontent, but the structure GraphQL returns does not match the structure of the documents in our Azure Search index. The next step, then, is to transform the GraphQL results into objects with the correct shape. We can do this easily with JavaScript's Array map method, passing it a function that maps each GraphQL node to a search document object. We also use destructuring assignment in the mapping function to pull out the values more easily.
...

// Map the GraphQL nodes to search documents
const searchDocuments =
  result.data.allKenticoCloudItemMySearchableContent.nodes.map(node => {
    const { system, elements: { url, body, title } } = node;

    return {
      id: system.id,
      content: body.value,
      title: title.value,
      url: url.value
    };
  });

...
Now we have all of our searchable data from Kontent in a format expected by our search index. The last step is to send these documents up to Azure to be indexed.
...
// Add the search documents to the index
await client.addDocuments(indexConfig.name, searchDocuments);
...
With all those pieces in place, we have an index filled with searchable content from Kontent. For anyone skimming, or anyone on a deadline who just wants code to drop into a project due at the end of the day, here is our gatsby-node.js file in its entirety, with some simple logging to output at build time.
const azureSearch = require('azure-search');

const adminKey = '<my azure search admin key>';
const serviceName = 'my-search-service';

const indexConfig = {
  name: 'my-search-index',
  fields: [
    {
      name: 'id',
      type: 'Edm.String',
      key: true,
      filterable: true
    },
    {
      name: 'content',
      type: 'Edm.String',
      filterable: false,
      retrievable: false,
      searchable: true,
      sortable: false
    },
    {
      name: 'title',
      type: 'Edm.String',
      filterable: false,
      searchable: true,
      sortable: false
    },
    {
      name: 'url',
      type: 'Edm.String',
      filterable: true,
      searchable: true
    }
  ],
  corsOptions: {
    allowedOrigins: ['http://localhost:8000']
  }
};

exports.createPages = async ({ graphql }) => {
  console.log('Starting Azure Search index creation and population');

  // Initialize an instance of the Azure client
  const client = azureSearch({
    url: `https://${serviceName}.search.windows.net`,
    key: adminKey
  });

  // Create/update the index
  console.log('Creating/updating the index');
  try {
    // Does the index already exist?
    await client.getIndex(indexConfig.name);

    // Index does exist, so update it
    await client.updateIndex(indexConfig.name, indexConfig);
  } catch {
    // Index does not exist, so create it
    await client.createIndex(indexConfig);
  }

  // Retrieve Kontent data nodes using GraphQL
  console.log('Retrieving raw nodes for the index');
  const result = await graphql(`
    query {
      allKenticoCloudItemMySearchableContent {
        nodes {
          elements {
            url {
              value
            }
            body {
              value
            }
            title {
              value
            }
          }
          system {
            id
          }
        }
      }
    }
  `);

  // Map the GraphQL nodes to search documents
  console.log('Converting raw nodes to documents');
  const searchDocuments = result.data.allKenticoCloudItemMySearchableContent.nodes.map(node => {
    const { system, elements: { url, body, title } } = node;

    return {
      id: system.id,
      content: body.value,
      title: title.value,
      url: url.value
    };
  });

  // Add the search documents to the index
  console.log('Adding/updating documents in the index');
  await client.addDocuments(indexConfig.name, searchDocuments);

  console.log('Azure index created and populated');
};
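As a quick sanity check that the build actually populated the index, the same azure-search library can query it. This is only a sketch, separate from the plugin itself; the search term is an arbitrary example, and anything running in the browser should use an Azure Search query key rather than the admin key:

const azureSearch = require('azure-search');

const client = azureSearch({
  url: 'https://my-search-service.search.windows.net',
  key: '<my azure search admin key>'
});

// Search the index for an example term and log the matching documents
client
  .search('my-search-index', { search: 'kontent' })
  .then(results => console.log(results))
  .catch(err => console.error(err));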
Obviously, this plugin and the code within it have room for improvement. The individual bite-sized pieces should be moved into their own functions for maintainability. The index configuration could be read from a JSON file on disk, and the administration key should live somewhere outside the code so that it does not get checked into source control. The plugin could also be made generic, with lifecycle methods of its own that other developers can override and extend for their own indexes. This should at least provide a base from which to build new and fun tools for Azure Search in Gatsby.
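As a starting point for that last point about the administration key, here is a minimal sketch using environment variables; the variable names are made up for illustration, and the dotenv package is just one common way to load them from a git-ignored .env file during local development:

// Hypothetical variable names; set them in the shell or in a .env file
// that is excluded from source control and loaded by dotenv
require('dotenv').config();

const adminKey = process.env.AZURE_SEARCH_ADMIN_KEY;
const serviceName = process.env.AZURE_SEARCH_SERVICE_NAME;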