Coveo for Sitecore: Custom Inbound Filters

One of the things I noticed right away was that Coveo was indexing some Sitecore content items I didn't want in the index. They are legit pages, but ones that shouldn't surface in search results: search pages, user profiles, logins, error pages, etc. One of the first extensions I added to my solution (Coveo for Sitecore 4.1) was a HideFromIndexInboundFilter.

Steps:
  • Added a new checkbox field called ‘HideFromIndex’ to my base template.
  • Went through the various content pages within Sitecore and checked ‘HideFromIndex’ on the ones to exclude. NOTE: if every item of a particular template should be hidden, set the checkbox on that template's standard values instead.
  • Created a new class, HideFromIndexInboundFilter, that extends AbstractCoveoInboundFilterProcessor. (I’ll explain the class below.)
  • Updated Coveo.SearchProvider.Custom.config to add the new filter processor to the coveoInboundFilterPipeline.
  • Rebuilt the Coveo indexes.

HideFromIndexInboundFilter
The main method to override is Process. This is where you run your logic to decide whether Coveo should index a Sitecore item. One thing to note: if you are running multiple filters, check that a prior filter hasn’t already excluded the item before you set the flag. On my first pass I was confused as to why items were still being indexed. I had left this check out, so the last filter in the pipeline could flip an already-excluded item back to included.



using Coveo.Framework.CNL;
using Coveo.Framework.Items;
using Coveo.Framework.Log;
using Coveo.SearchProvider.InboundFilters;
using Coveo.SearchProvider.Pipelines;
using Sitecore.Collections;
using Sitecore.ContentSearch.Pipelines;
using Sitecore.ContentSearch.Pipelines.IndexingFilters;
using Sitecore.Data.Fields;
using Sitecore.Data.Items;
using Sitecore.Pipelines;
using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Runtime.CompilerServices;

namespace YourDomain.Search.Coveo.Filters.CoveoInboundFilters
{
    public class HideFromIndexInboundFilter: AbstractCoveoInboundFilterProcessor
    {
        private readonly static ILogger s_Logger;

        static HideFromIndexInboundFilter()
        {
            HideFromIndexInboundFilter.s_Logger = CoveoLogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);
        }

        public HideFromIndexInboundFilter()
        {
        }

        public override void Process(CoveoInboundFilterPipelineArgs p_Args)
        {
            Precondition.NotNull(p_Args, () => () => p_Args); 
            HideFromIndexInboundFilter.s_Logger.TraceEntering("Process");
            if (p_Args.IndexableToIndex != null && base.ShouldExecute(p_Args))
            {
                if (!p_Args.IsExcluded) // make sure another process hasn't already flagged it
                {
                    Item currentItemBeingProcessed = p_Args.IndexableToIndex.Item.SitecoreItem;
                    Field hideFromIndexField = currentItemBeingProcessed.Fields["HideFromIndex"];
                    if (hideFromIndexField != null)
                    {
                        // Sitecore stores a checked checkbox field as the raw value "1"
                        string hideFromIndexValue = hideFromIndexField.Value;
                        //s_Logger.Info(string.Format("HideFromIndexInboundFilter: ProcessItem: hideFromIndexValue: {0}", hideFromIndexValue));
                        if (!string.IsNullOrEmpty(hideFromIndexValue) && hideFromIndexValue.Equals("1", StringComparison.OrdinalIgnoreCase))
                        {
                            s_Logger.Debug(string.Format("HideFromIndexInboundFilter: ProcessItem: HIDING: {0}", currentItemBeingProcessed.Paths.FullPath));
                            p_Args.IsExcluded = true;
                        }
                    }
                }
            }
            HideFromIndexInboundFilter.s_Logger.TraceExiting("Process");
        }
    }
}


Coveo.SearchProvider.Custom.config
#1 rule: always patch your custom configurations to avoid upgrade headaches in the future. So make sure you are editing the .Custom.config version! Within the configuration file, you simply add your inbound filter processor to the coveoInboundFilterPipeline:


<coveoInboundFilterPipeline>
  <processor type="Coveo.SearchProvider.InboundFilters.ApplySitecoreInboundFilterProcessor, Coveo.SearchProviderBase">

  </processor>
  <!-- SF: This Filter restricts content checked as Hide from Index from being added to index -->
  <processor type="YourDomain.Search.Coveo.Filters.CoveoInboundFilters.HideFromIndexInboundFilter, YourDLL">

  </processor>
</coveoInboundFilterPipeline>
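
For context, here is a rough sketch of how the same registration might look as a standalone Sitecore patch file, in case you prefer to keep it out of Coveo.SearchProvider.Custom.config altogether. This is a sketch under assumptions, not something from my solution: it assumes coveoInboundFilterPipeline sits under sitecore/pipelines in your version, and that the file loads after the Coveo config files (Sitecore applies include files in alphabetical order). YourDLL is the same placeholder as above, and the type string in patch:after has to match your Coveo config exactly.

```xml
<!-- Hypothetical standalone patch file; place it so it loads after the Coveo configs -->
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <!-- Assumption: the pipeline lives under sitecore/pipelines in your Coveo version -->
      <coveoInboundFilterPipeline>
        <!-- patch:after positions the custom filter right after the stock Sitecore inbound filter -->
        <processor type="YourDomain.Search.Coveo.Filters.CoveoInboundFilters.HideFromIndexInboundFilter, YourDLL"
                   patch:after="processor[@type='Coveo.SearchProvider.InboundFilters.ApplySitecoreInboundFilterProcessor, Coveo.SearchProviderBase']" />
      </coveoInboundFilterPipeline>
    </pipelines>
  </sitecore>
</configuration>
```

Either way, /sitecore/admin/showconfig.aspx shows the merged configuration, so you can confirm that the processor landed in the pipeline and in the order you expected.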


UPDATE: After implementing this code, I noticed that items already in the Coveo index were not being removed when I checked the box (unless I did a full rebuild). The reason is that the inbound filter prevents the index from updating that item. Here is a small workaround for when you need to quickly remove content from the search index and don't want to do a full rebuild:

1) Add a publishing restriction
2) Publish item (which removes it from index)
3) Check the ‘Hide from Index’ checkbox
4) Remove the publishing restriction
5) Republish page
