I was browsing TechNet, as you do, when I found this comment on Search best practices:
We recommend that you test the crawling and querying functionality of the server farm after you make configuration changes or apply updates
http://technet.microsoft.com/en-us/library/cc850696(v=office.14).aspx
This chimed with me: a client I worked with had failed to do this and paid the price when their production search service went down for a day.
The article continues:
An easy way to do this is to create a temporary content source that is used only for this purpose. To test, we recommend that you crawl ten items — for example .txt files on a file share — and then perform search queries for those files. Make sure that the test items are currently not in the index. It is helpful if they contain unique words that will be displayed at the top of the search results page when queried. After the test is complete, we recommend that you delete the content source that you created for this test. Doing this removes the items that you crawled from the index and those test items will not appear in search results after you are finished testing
To put that in bullet point format:
- Check that the test content isn't already in the index
- Create some test content to search
- Create a new content source
- Crawl the test content
- Search for the test content
- Check that the test content is there
- Remove the test content by blowing away the content source
- Confirm it’s no longer there
It’s a good test script, and with a bit of thought it breaks down into some simple, repeatable tasks:
- Run a search and test the results
- Create some files
- Create a new content source
- Crawl the test content
- Run a search and test the results
- Remove the test content by blowing away the content source
- Run a search and test the results
The common step is to run a search and test the results, and we do it three times. Of course the expected results will, hopefully, differ depending on when we run that test, but we can manage that.
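Sketched out with the functions we'll build in the rest of this post (this is just the shape of the thing, not the finished script, and the variables are placeholders), the flow looks something like this:

#Rough outline only - the functions are defined further down the post
$results = $testPairs | Check-TestPairValue -searchPageURL $searchURL    #expect: nothing found
Create-SPTestFiles -filesToCreate $testPairs -folderPath $fileSharePath
$testCS = Ensure-TestContentSourceExists -sa $sa -contentSourceName "TestCS" -filePath $fileSharePath
Run-TestCrawl -contentSource $testCS
$results = $testPairs | Check-TestPairValue -searchPageURL $searchURL    #expect: everything found
$testCS.Delete()                                                         #blow away the content source
$results = $testPairs | Check-TestPairValue -searchPageURL $searchURL    #expect: nothing found again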
Let’s start with the big one: running a search.
Function Check-TestPairValue ()
{
    <#
    .DESCRIPTION
    Takes a pipeline bound collection of test values and search terms and searches for
    them using the searchPageURL. Returns either 'Present' or 'Not Found' depending on
    the result. Not currently production grade
    #>
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$testPair,
        [Parameter(Mandatory=$true)]$searchPageURL
    )
    BEGIN
    {
        #Create the IE window in the begin block so if the input is pipelined we don't have to re-open it each time.
        $ie = New-Object -com "InternetExplorer.Application"
        $ie.visible = $true
    }
    PROCESS
    {
        #Get the test value and the search term from the pair
        $testValue = $testPair[0]
        $searchTerm = $testPair[1]

        #Open the navigation page
        $ie.navigate($searchPageURL)

        #Wait for the page to finish loading
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Page loaded"

        #Get the search box
        $searchTextBoxID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_InputKeywords"
        $document = $ie.Document
        $searchBox = $document.getElementByID($searchTextBoxID)

        #Enter the search terms
        $searchBox.innerText = $searchTerm
        Write-Verbose "Searching for: $searchTerm - Expected result: $testValue"

        #Get the search button
        $searchButtonID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_go"

        #Run the search
        $btn = $document.getElementByID($searchButtonID)
        $btn.click()

        #Wait for the results to be loaded
        while ($ie.locationurl -eq $searchPageURL)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Left the page, waiting for results page to load"

        #Wait for the results page to load
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Results page loaded"

        #Once loaded check that the results are correct
        $document = $ie.document

        #Check that the search term results contain the test results:
        $firstSearchTermID = "SRB_g_9acbfa38_98a6_4be5_b860_65ed452b3b09_1_Title"
        $firstSearchResult = $document.getElementByID($firstSearchTermID)
        $result = ""

        #Test that the title of the file matches the search result
        If ($firstSearchResult.innerHTML -match $testValue)
        {
            $result = "Present"
        }
        else
        {
            $result = "Not Found"
        }
        Write-Verbose "Test $result"

        #Create a new PS Object for our result and let PowerShell pass it out.
        New-Object PSObject -Property @{
            TestCase = $searchTerm
            Result   = $result
        }
    }
    END
    {
        #Close the IE window after us
        $ie.Quit()
    }
}
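For example, you can pipe (file name, search term) pairs straight into it; the test pairs and search page URL here are the same ones used in the full script further down, so adjust them to your environment:

#Example usage - values match the ones used later in this post
$testPairs = @(
    ("FileA.txt","Miliband loves pie"),
    ("FileB.txt","Osbourne despises soup")
)
$results = $testPairs | Check-TestPairValue -searchPageURL "http://sharepoint/sites/search/Pages/default.aspx" -Verbose
$results | ? {$_.Result -ne "Present"}    #anything returned here was not found by search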
To be honest, that’s the only tricky bit in the process. From here on in, it’s plumbing.
We create some test files:
Function Create-SPTestFiles ()
{
    [CmdletBinding()]
    Param(
        $filesToCreate,
        [string]$folderPath
    )
    If (!(Test-Path $folderPath))
    {
        #Folder doesn't exist.
        Write-Verbose "Folder does not exist - attempting to create"
        New-Item $folderPath -type directory
    }

    #If the files don't exist, create them
    Foreach ($file in $filesToCreate)
    {
        $fileName = $file[0]
        $filePath = $folderPath + "\" + $fileName
        If (Test-Path $filePath)
        {
            Write-Verbose "File $fileName already exists. Skipping"
            Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Test content already present."
        }
        else
        {
            Write-Verbose "Creating $fileName"
            $file[1] >> $filePath
        }
    }
    Write-Verbose "All files created"
}
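Called with a couple of the (file name, content) pairs from the test data and the file share used later in the script, it looks like this:

#Example usage - the file share path matches the one in the full script below
$files = @(
    ("FileA.txt","Miliband loves pie"),
    ("FileB.txt","Osbourne despises soup")
)
Create-SPTestFiles -filesToCreate $files -folderPath "\\spintdev\TestFolder" -Verbose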
Then we create a content source (this function isn’t perfect here, but I’m stealing it from another script):
Function Ensure-TestContentSourceExists ()
{
    [CmdletBinding()]
    Param(
        $sa,
        [string]$contentSourceName,
        [string]$filePath
    )
    $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    if ($testCS)
    {
        Write-Verbose "Content Source $contentSourceName already exists"
    }
    else
    {
        Write-Verbose "Content Source $contentSourceName does not exist, creating"
        New-SPEnterpriseSearchCrawlContentSource -SearchApplication $sa -Type file -name $contentSourceName -StartAddresses $filePath | Out-Null
        $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    }
    #Output the content source
    $testCS
}
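It gets used along these lines; the search service application name and paths are the ones from the full script below, so swap in your own:

#Example usage - names and paths as per the full script below
$sa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"
$testCS = Ensure-TestContentSourceExists -sa $sa -contentSourceName "TestSearchContentType" -filePath "\\spintdev\TestFolder" -Verbose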
Next we run the crawl and wait for it to finish:
Function Run-TestCrawl ()
{
    [CmdletBinding()]
    Param ($contentSource)

    #Run a crawl for that content source
    $contentSource.StartFullCrawl()

    #Set a flag to allow us to abort if the duration is excessive
    $stillNotStupidDuration = $true
    $startTime = Get-Date
    $crawlTimeout = 5
    $crawlInitalTime = 2

    Write-Verbose "Starting crawl. Waiting for 2 minutes (Default SharePoint minimum search duration)"
    Sleep -Seconds 120

    #Wait for it to finish
    while ($contentSource.CrawlStatus -ne "Idle" -AND $stillNotStupidDuration -eq $true)
    {
        Write-Verbose "Crawl still running at $timeDifference, waiting 10 seconds"
        Sleep -Seconds 10
        $timeDifference = (Get-Date) - $startTime
        if ($timeDifference.Minutes -gt $crawlTimeout)
        {
            $stillNotStupidDuration = $false
        }
    }
    Write-Verbose "Crawl complete"
}
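With the content source returned by the previous step in hand, kicking off the crawl is a one-liner:

#Example usage - $testCS is the content source returned by Ensure-TestContentSourceExists
Run-TestCrawl -contentSource $testCS -Verbose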
Then we’re back to searching and cleaning up. Easy.
Of course there’s a little more plumbing needed to stick it all together, so here’s the fully functioning script:
Param (
    #Name of the search service application to test
    $searchAppName = "Search Service Application",

    #Path to the shared folder
    #NOTE: THIS HAS TO BE SET UP MANUALLY BEFORE RUNNING THE SCRIPT (it can be scripted but I haven't)
    $fileSharePath = "\\spintdev\TestFolder",

    #The search page
    $searchSiteURL = "http://sharepoint/sites/search/Pages/default.aspx",

    #Folder the report will be written to
    $reportFolder = "C:\AutomatedTest",

    #Flag to set or reject verbose output
    $printVerbose = $false
)

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

Function Process-ASTPassFail ()
{
    <#Internal helper function. Will be used for reporting#>
    Param(
        $collectionThatShuldBeEmpty,
        $failText,
        $passText
    )
    if ($collectionThatShuldBeEmpty -ne $null)
    {
        Write-Warning $failText
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 102 -EntryType Error -Message $failText
        $thisTestText = $failText + "`n"
    }
    else
    {
        $sucsessText = $passText
        Write-Host $sucsessText
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 102 -EntryType Information -Message $passText
        $thisTestText = $passText + "`n"
    }
    $thisTestText
}

Function Create-ASTFiles ()
{
    <#Creates some test files for us to search later#>
    [CmdletBinding()]
    Param(
        $filesToCreate,
        [string]$folderPath
    )
    If (!(Test-Path $folderPath))
    {
        #Folder doesn't exist.
        Write-Verbose "Folder does not exist - attempting to create"
        New-Item $folderPath -type directory
    }

    #If the files don't exist, create them
    Foreach ($file in $filesToCreate)
    {
        $fileName = $file[0]
        $filePath = $folderPath + "\" + $fileName
        If (Test-Path $filePath)
        {
            Write-Verbose "File $fileName already exists. Skipping"
            Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Test content already present."
        }
        else
        {
            Write-Verbose "Creating $fileName"
            $file[1] >> $filePath
        }
    }
    Write-Verbose "All files created"
}

Function Test-ContentSourceCountAcceptable()
{
    [CmdletBinding()]
    Param($searchServiceApplication)

    #Check the maximum number of content sources allowed
    #http://technet.microsoft.com/en-us/library/cc262787(v=office.14).aspx
    $maxContentSources = 50
    $ContentSources = $searchServiceApplication | Get-SPEnterpriseSearchCrawlContentSource

    #Lazy way to check if there is only one item (note, also works for none)
    if ($ContentSources.Count -ne $null)
    {
        $CTSourceCount = $ContentSources.Count
    }
    else
    {
        #Note that this might be wrong if there are no content sources. Not a problem here but it's not a rigorous number
        $CTSourceCount = 1
    }

    #If we have reached the limit, return false so the caller can stop
    if ($CTSourceCount -ge $maxContentSources)
    {
        #Throw error and let slip the dogs of war
        Write-Verbose "Warning content source count is higher than Microsoft boundaries"
        $false
    }
    else
    {
        #If we're under the MS limit then return true
        $true
    }
}

Function Ensure-ASTContentSourceExists ()
{
    <#Checks if the content source already exists. A pre-existing one is deleted, flagged in the event log and recreated so we always crawl a fresh source#>
    [CmdletBinding()]
    Param(
        $sa,
        [string]$contentSourceName,
        [string]$filePath
    )
    $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    if ($testCS)
    {
        Write-Verbose "Content Source $contentSourceName already exists. Deleting it."
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 100 -EntryType Warning -Message "Unable to create a Content Source as one already exists"
        $testCS.Delete()
    }
    Write-Verbose "Creating Content Source $contentSourceName"
    New-SPEnterpriseSearchCrawlContentSource -SearchApplication $sa -Type file -name $contentSourceName -StartAddresses $filePath | Out-Null
    $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}

    #Output the content source
    $testCS
}

Function Run-ASTCrawl ()
{
    <#
    .SYNOPSIS
    Runs a crawl for a content source and waits until it is complete.
    .DESCRIPTION
    Runs a crawl for a content source and waits for it to complete, features an abort option that will exit the function if the crawl takes too long.
    #>
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$contentSource,
        [Parameter(Mandatory=$false,ValueFromPipeline=$false)]$crawlTimeOut = 5
    )
    #Run a crawl for that content source
    $contentSource.StartFullCrawl()

    #Note the start time (could be replaced with a stopwatch)
    $startTime = Get-Date

    #Initial pause during which there is no point checking for the crawl to be complete
    $crawlInitalTime = 120

    #Set a flag to allow us to abort if the duration is excessive
    $stillNotStupidDuration = $true

    Write-Verbose "Starting crawl. Waiting for $crawlInitalTime seconds (default SharePoint minimum crawl duration)"
    Sleep -Seconds $crawlInitalTime

    #Wait for it to finish
    while ($contentSource.CrawlStatus -ne "Idle" -AND $stillNotStupidDuration -eq $true)
    {
        $timeDifference = (Get-Date) - $startTime
        Write-Verbose "Crawl still running at $timeDifference, waiting 10 seconds"
        Sleep -Seconds 10
        if ($timeDifference.Minutes -gt $crawlTimeOut)
        {
            $stillNotStupidDuration = $false
        }
    }
    if ($stillNotStupidDuration)
    {
        Write-Verbose "Crawl complete"
    }
    else
    {
        Write-Warning "No longer waiting for process to complete. Search not finished, results will be unpredictable"
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Crawler took longer than the timeout value of $crawlTimeOut minutes so the function exited early."
    }
}

Function Check-ASTPairValue ()
{
    <#
    .SYNOPSIS
    Tests that a search term returns a file with the appropriate name
    .DESCRIPTION
    Takes a pipeline bound pair of test values and search terms and searches for them using the searchPageURL page.
    Returns either 'Present' or 'Not Found' depending on the result.
    .EXAMPLE
    $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL
    #>
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$testPair,
        [Parameter(Mandatory=$true)]$searchPageURL
    )
    BEGIN
    {
        #Create the IE window in the begin block so if the input is pipelined we don't have to re-open it each time.
        $ie = New-Object -com "InternetExplorer.Application"
        $ie.visible = $true
    }
    PROCESS
    {
        #Get the test value and the search term from the pair
        $testValue = $testPair[0]
        $searchTerm = $testPair[1]

        #Open the navigation page
        $ie.navigate($searchPageURL)

        #Wait for the page to finish loading
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Page loaded"

        #Get the search box
        $searchTextBoxID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_InputKeywords"
        $document = $ie.Document
        $searchBox = $document.getElementByID($searchTextBoxID)

        #Enter the search terms
        $searchBox.innerText = $searchTerm
        Write-Verbose "Searching for: $searchTerm - Expected result: $testValue"

        #Get the search button
        $searchButtonID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_go"

        #Run the search
        $btn = $document.getElementByID($searchButtonID)
        $btn.click()

        #Wait for the results to be loaded
        while ($ie.locationurl -eq $searchPageURL)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Left the page, waiting for results page to load"

        #Wait for the results page to load
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Results page loaded"

        #Once loaded check that the results are correct
        $document = $ie.document

        #Check that the search results contain the test value:
        $firstSearchTermID = "SRB_g_9acbfa38_98a6_4be5_b860_65ed452b3b09_1_Title"
        $firstSearchResult = $document.getElementByID($firstSearchTermID)
        $result = ""

        #Test that the title of the first result matches the expected file name
        If ($firstSearchResult.innerHTML -match $testValue)
        {
            $result = "Present"
        }
        else
        {
            $result = "Not Found"
        }
        Write-Verbose "Test $result"

        #Create a new PS Object for our result and let PowerShell pass it out.
        New-Object PSObject -Property @{
            TestCase = $searchTerm
            Result   = $result
        }
    }
    END
    {
        #Close the IE window after us
        $ie.Quit()
    }
}

######################################################################################
#Execution script begins here
######################################################################################

#Generate the output file location
$reportFilePath = $reportFolder + "\SearchTest_Results_" + (Get-Date -Format "dd_MM_yyyy") + ".txt"

#All items from here on in are internal and do not have to be specified or modified unless you wish it.

#Test content - deliberately junk and nonsensical rubbish to trim down search results and avoid false negatives.
#Note: I have no particular insight or interest in the dietary foibles of the politicians listed below.
$testContent = @(
    ("FileA.txt","Miliband loves pie"),
    ("FileB.txt","Osbourne despises soup"),
    ("FileC.txt","Cameron tolerates beans"),
    ("FileD.txt","Clegg loathes eggs which is ironic"),
    ("FileE.txt","Benn likes red meat"),
    ("FileF.txt","Balls desires flan"),
    ("FileG.txt","Cable adores sandwiches"),
    ("FileH.txt","Hunt regrets cake")
)

#Junk content for an additional test to exclude false positive results
$itemToConfirmFailure = @(
    "sdkfslskjladsflkj",
    "lflkfdskjlfdskjfdslkjf",
    "sdkfslsfdjklfkjladsflkj",
    "lflskjfdslkjf"
)

#Only used internally.
$testCTName = "TestSearchContentType"
$startDateTime = Get-Date
$currentComputerName = $env:computername

#Header info
"Automated SharePoint Search Testing Results`n" >> $reportFilePath
"Test started at $startDateTime on Computer $currentComputerName" >> $reportFilePath

#Write the first test to the report
"Test 1 - Confirm search terms do not retrieve values`n" >> $reportFilePath
"Confirms that there are no files that can generate a false positive in the system.`n" >> $reportFilePath

Write-Host "Starting tests, checking that there is no pre-existing content that might cause false positives"

#Run a search for the test content
$deliberatelyFailedResults = @()
$deliberatelyFailedResults += $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
$falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}

$errorText = "Test failed, files found by search engine. Results not reliable"
$sucsessText = "Test Passed, moving to next stage"
$testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
$testText >> $reportFilePath

#Create the test files based on the array above
Create-ASTFiles -filesToCreate $testContent -folderPath $fileSharePath

#Get the search app
$sa = Get-SPEnterpriseSearchServiceApplication -Identity $searchAppName
if ($sa -eq $null)
{
    Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 101 -EntryType Error -Message "Could not find search application $searchAppName"
}

Write-Host "Checking that we are within guidelines for number of Content Sources"

#Test the number of content sources already in place
$numberOfContentSourcesBelowThreshold = Test-ContentSourceCountAcceptable -searchServiceApplication $sa -Verbose:$printVerbose

#Only progress if we're not going to breach the content source limit.
if ($numberOfContentSourcesBelowThreshold)
{
    Write-Host "Within the acceptable number of Content Sources"

    #Get the content source.
    $testCS = Ensure-ASTContentSourceExists -sa $sa -contentSourceName $testCTName -filePath $fileSharePath -Verbose:$printVerbose

    Write-Host "Running the crawl - estimated completion in approximately 2 minutes"

    #Run the crawl and wait for it to complete
    Run-ASTCrawl -contentSource $testCS -Verbose:$printVerbose

    $searchResults = @()
    Write-Host "Crawl Complete, testing links"
    $searchResults += $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $failures = $searchResults | ? {$_.Result -ne "Present"}

    #Write the test to the report
    "Test 2 - Test new content`n" >> $reportFilePath
    "Confirms that search works for our new content.`n" >> $reportFilePath
    $errorText = "Test failed, files were not found"
    $sucsessText = "Passed main test."
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $failures -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath

    #Confirm that the test will fail given junk input.
    $deliberatelyFailedResults = @()
    $deliberatelyFailedResults += $itemToConfirmFailure | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}

    #Write the test to the report
    "Test 3 - Check for junk terms`n" >> $reportFilePath
    "Confirms that search doesn't find some junk values.`n" >> $reportFilePath
    $errorText = "Test failed, files found by search engine when given junk data"
    $sucsessText = "Passed confirmation test - junk values not found"
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath

    #Clean up the content source
    $CSToDelete = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $testCTName}
    $CSToDelete.Delete()

    #Delete the files
    foreach ($combo in $testContent)
    {
        $fileName = $combo[0]
        $file = Get-ChildItem -Path $fileSharePath | ? {$_.name -eq $fileName}
        $file.Delete()
    }

    #Note that the content source may take a minute to be deleted
    Write-Host "Pausing for 1 minute to allow the index to update"
    Sleep -Seconds 60

    #Run a search for the test content
    $deliberatelyFailedResults = @()
    $deliberatelyFailedResults += $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}

    #Write the test to the report
    "Test 4 - Confirm search terms are removed`n" >> $reportFilePath
    "Confirms that the test search content is removed from the system.`n" >> $reportFilePath
    $errorText = "Test failed, test files still found by the search engine after removal"
    $sucsessText = "Passed confirmation test. Test files are not present"
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath
}
else
{
    $errorText = "Error - Unable to create a Content Source as the total number of Content Sources is greater than the Microsoft boundary"
    Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 100 -EntryType Warning -Message $errorText
    $errorText >> $reportFilePath
}

"Automated SharePoint Search Testing Completed at $(Get-Date) `n" >> $reportFilePath
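If you save it as, say, Test-SPSearch.ps1 (the file name is my choice, pick whatever you like), you'd run it along these lines; all the parameters have the defaults shown in the Param block, so they're optional:

#Example invocation - every parameter shown here is optional
.\Test-SPSearch.ps1 -searchAppName "Search Service Application" `
                    -fileSharePath "\\spintdev\TestFolder" `
                    -searchSiteURL "http://sharepoint/sites/search/Pages/default.aspx" `
                    -reportFolder "C:\AutomatedTest" `
                    -printVerbose $true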
So there we have it: a fully functioning automated testing process for SharePoint Search. It would be nice if it sent an email, but I’m planning on rolling this into some SCOM work I’m playing with.
I haven’t tested this on 2013 yet; it would need at least some tweaks to the field IDs, and probably more structural work to get the Search API right for 2013. If anyone is interested, I’ll knock up a new version.
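For what it's worth, on 2013 I'd probably drop the IE automation altogether and query the Search REST API instead. Something along these lines might be a starting point; it's an untested sketch, and the endpoint and property paths are from memory rather than a working script:

#Untested sketch of a 2013 replacement for Check-TestPairValue using the Search REST API
Function Check-2013PairValue ()
{
    [CmdletBinding()]
    Param (
        [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$testPair,
        [Parameter(Mandatory=$true)]$siteURL
    )
    PROCESS
    {
        $testValue  = $testPair[0]
        $searchTerm = $testPair[1]

        #Query the search REST endpoint directly - no browser or hard-coded element IDs required
        $url = "$siteURL/_api/search/query?querytext='$searchTerm'"
        $response = Invoke-RestMethod -Uri $url -UseDefaultCredentials -Headers @{"Accept"="application/json;odata=verbose"}

        #Dig the result titles out of the (deeply nested) response and look for our file name
        $rows = $response.d.query.PrimaryQueryResult.RelevantResults.Table.Rows.results
        $titles = $rows | ForEach-Object { ($_.Cells.results | Where-Object {$_.Key -eq "Title"}).Value }

        if ($titles -match $testValue) { $result = "Present" } else { $result = "Not Found" }

        New-Object PSObject -Property @{ TestCase = $searchTerm; Result = $result }
    }
}

The rest of the plumbing (file creation, content source, crawl) should need far less surgery.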
Hi Alex,
Good idea this script!
Do you already have a 2013 version?
Or is it just a matter of tweaking the field IDs used in the web forms, as you suggest?
Kind regards,
Carry
I don’t have anything for 2013 at the moment. It should be possible to re-write it to work there, but I mostly put this together to prove I could rather than out of any expectation of it being useful.
That is a nice script! I will give it a try on SP2013 and let you know how it goes!
Thanks for sharing!
Max
2013 has a significantly different search system, so I’d be very surprised if you were able to use this script as it is. On the other hand, you might be able to re-work it to conform to the 2013 model. (Belated) good luck!