Sometimes knowing what doesn’t work is as useful as knowing what does. In that vein, here’s how I spent my journey home…
A post on TechNet asked how to deal with long-running search crawls that impact users when they overrun into business hours. In large SharePoint environments that shouldn’t really happen, but it’s a fairly common concern in smaller shops.
Ideally you’ll tune your search so that it always completes in time, but that doesn’t always work. For those edge cases there are two options:
- Pause a specific (or all) crawl(s) during working hours.
- Reduce the impact of the crawls during working hours.
Pausing a crawl is easy, and it’s been covered very well by other people, so I won’t repeat it here.
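For completeness, pausing looks roughly like this. This is an untested sketch: the content source name "Local SharePoint sites" is the out-of-the-box default and may differ in your farm, and it assumes the `PauseCrawl()`/`ResumeCrawl()` methods on the ContentSource object.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication
$cs  = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
         -Identity "Local SharePoint sites"   # assumed default name

# Pause the running crawl at the start of business hours...
$cs.PauseCrawl()

# ...and resume it in the evening (e.g. from a scheduled task).
$cs.ResumeCrawl()
```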
I wanted to drop the performance of the crawl so that it could still keep going without impacting the end users.
The first step was to find out how to create a crawl rule to reduce the impact of the search:
$shRule = Get-SPEnterpriseSearchSiteHitRule -Identity "SharePoint"

#Cripple search
$shRule.HitRate = 1000
$shRule.Behavior = 'DelayBetweenRequests'
$shRule.Update()

#Revive search
$shRule.HitRate = 8
$shRule.Behavior = 'SimultaneousRequests'
$shRule.Update()
It turns out that in the API a crawl rule is known as a hit rule. Hit rules have two important properties: the rate and the behaviour. With ‘SimultaneousRequests’ the rate is how many documents are requested at once; with ‘DelayBetweenRequests’ it’s the number of seconds to wait between requests.
The script above was enough to let me create a rule and set it to either run at a normal pace or with a 16-minute delay between requests. And it worked!
Well, it created a rule and that rule worked. Sadly I’d forgotten that the crawler rules are only checked when you start a crawl. If you start a crawl and then punch the delay between items up to 1000, it won’t make a blind bit of difference.
It turns out that even pausing the crawl doesn’t make the search engine re-check the crawl rate.
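One workaround does follow from that finding: if the rule is only read at crawl start, then stop the crawl, throttle the rule, and start a fresh crawl rather than pausing. A sketch, untested, assuming the default "Local SharePoint sites" content source name and that abandoning the in-flight crawl’s progress is acceptable:

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$ssa = Get-SPEnterpriseSearchServiceApplication
$cs  = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa `
         -Identity "Local SharePoint sites"   # assumed default name

# Stop the running crawl outright (loses its progress).
$cs.StopCrawl()

# Throttle: one request every 1000 seconds, as in the script above.
$shRule = Get-SPEnterpriseSearchSiteHitRule -Identity "SharePoint"
$shRule.HitRate  = 1000
$shRule.Behavior = 'DelayBetweenRequests'
$shRule.Update()

# Start again; the crawler re-reads the hit rule on startup.
$cs.StartIncrementalCrawl()
```

The obvious cost is that stopping and restarting throws away however far the crawl had got, which may be worse than letting it finish.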
So, a failure. The only thing I can think of is using reflection to find out what is happening in the background and then doing something deeply unsupported to modify the values in flight. Maybe another time.