SharePoint database growth settings

This is a basic topic but one that crops up time and time again.

By default SharePoint will create databases with settings to grow 1 MB at a time. That means that if you add a 5 MB file the database will grow 5 times to fit it in. If you add a 250 MB file (the default largest size file for a 2013 farm) that means a worrying 250 growth operations will be needed.

Why is growth bad?
During a growth operation the database is completely unresponsive. That means that any access at all, read only or not, will have to wait for the growth to complete. That will result in slower responses for users and reduce the number of users that your environment can support at any one time. As such growth is bad.

How long does growing take?
It varies depending on your hardware. In my test environment, a Hyper-V machine with a dynamically growing VHDX on an SSD, I got the following numbers. I'm going to put up a post on how I got these numbers sometime soon; I'll update this post once I've done so.

Growth Size (MB)    Average time (ms)    Average time per MB (ms)
1                   13.40                13.40
10                  46.11                4.61
100                 328.17               3.28
1000                2480.67              2.48

If we plot that data on a chart we get a clearer picture.

Graph of autogrowth duration


The duration of a growth operation scales with its size, but the relationship isn't quite linear: there is a fixed overhead to each growth that makes a single larger growth more efficient. As such it's best to grow as few times as possible, using larger individual growth operations.

On the other hand, as you can see, a single growth operation can take a significant amount of time. Growing by 1 GB took around two and a half seconds. That is a significant delay, and whilst it will be infrequent your users may well notice it, especially on a heavily used site.

For a 250 MB file the 1 MB growth rate would take around 3.35 seconds in total. A 10 MB growth rate gives 1.15 seconds, 100 MB comes in at 0.98 seconds and 1000 MB at 2.48 seconds (but you only have to grow every fourth item).
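If you want to play with those numbers yourself, here's a quick back-of-the-envelope sketch using the averages from the table above (purely illustrative, and the figures are just from my test VM):

#Illustrative only: estimate total growth time for a 250 MB upload using the averages measured above
$avgMsPerGrowth = @{ 1 = 13.40; 10 = 46.11; 100 = 328.17; 1000 = 2480.67 }
$fileSizeMB = 250

foreach ($growthMB in 1, 10, 100, 1000)
{
    #Number of growth operations needed to make room for the file
    $growths = [math]::Ceiling($fileSizeMB / $growthMB)
    $totalSeconds = ($growths * $avgMsPerGrowth[$growthMB]) / 1000
    Write-Host ("{0,4} MB growth rate: {1} growths, ~{2:N2} seconds" -f $growthMB, $growths, $totalSeconds)
}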

Category         Example Growth Rate
Default          1 MB
Max File Size    250 MB
Large Fixed      1000 MB
Nightly          2000 MB
Pre-grow         N/A

Default

This is more or less the worst possible choice. Every time a database grows it needs to lock the database, and whilst each individual growth might only take a short time, the total time is excessive. As we saw, this setting needs around 3.35 seconds of growth operations to accommodate a 250 MB file. The only redeeming feature of this setting is that it is very efficient with storage space.

Max File Size

By setting the file growth increment to the web application's maximum file size you can ensure that at most one growth operation will happen for any file upload. This gives a relatively low individual growth duration, which means the disruption for a single file will be as small as possible. On the other hand it will result in many more individual growth operations than a larger increment would, so on a heavily used site the cumulative delay may be noticeable.
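If you'd rather script that change than click through SSMS, something along these lines should do it. Treat this as a sketch rather than a tested snippet: it assumes the SQL Server PowerShell tools (Invoke-Sqlcmd) are available, that the content database has a single data file, and the server and database names are placeholders for your own.

#Sketch: set a content database's data file to grow in 250 MB increments (the web application's max upload size)
#Server and database names below are placeholders
$sqlInstance = "SQLSERVER\SHAREPOINT"
$databaseName = "WSS_Content_Portal"

#Find the logical name of the data file
$file = Invoke-Sqlcmd -ServerInstance $sqlInstance -Query "SELECT name FROM [$databaseName].sys.database_files WHERE type_desc = 'ROWS'"

#Set the growth increment to 250 MB
Invoke-Sqlcmd -ServerInstance $sqlInstance -Query "ALTER DATABASE [$databaseName] MODIFY FILE (NAME = [$($file.name)], FILEGROWTH = 250MB)"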

Large Fixed

By setting the autogrowth increment to an arbitrarily large size, e.g. 1 GB, you can minimise the number of growth operations that need to occur. This has a much lower total disruption than the default setting or growing by the maximum file size, but when a growth does occur the delay can be significant. This is a popular setting.

Nightly Growth

As an alternative to relying on autogrowth you can grow the databases ahead of the rate at which users put data into them. To do this your SQL management team would set an alert on the database fill rate to ensure that there is always more free space available than can reasonably be expected to be used in a day. If a database drops below that point your SQL admins grow it manually during the next scheduled downtime or period of low load.
This avoids or minimises the aforementioned disruption to users when growth occurs, but it requires an alert and a capable SQL team. If your SQL team misses the alert then either a Max File Size or Large Fixed policy should be in place as a backup.

Pre-grow

The most performant option is to grow your databases to their final size on creation. This requires a highly predictable environment with site collection quotas in use to prevent the databases growing beyond their expected size. It’s also a very storage-space inefficient way to manage your databases.
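For the Nightly and Pre-grow approaches the growth itself is just a manual ALTER DATABASE that your DBAs run during a quiet period. A minimal sketch, again assuming Invoke-Sqlcmd and placeholder names:

#Sketch: manually grow a content database's data file to a fixed size during a maintenance window
#SIZE must be larger than the file's current size; names are placeholders
$sqlInstance = "SQLSERVER\SHAREPOINT"
$databaseName = "WSS_Content_Portal"
$logicalName = "WSS_Content_Portal"

#Grow the data file out to 100 GB so no autogrowth is needed during the day
Invoke-Sqlcmd -ServerInstance $sqlInstance -Query "ALTER DATABASE [$databaseName] MODIFY FILE (NAME = [$logicalName], SIZE = 102400MB)"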

Summary

Which option you should use depends on the environment you're putting SharePoint into. My preference is to set the growth increment to the maximum file size for the web application the database is attached to. This provides the ideal blend of minimising total growth time whilst avoiding large delays when bigger growth operations are needed.
If you have a good SQL team behind you to grow the databases nightly then that is an even better way to avoid disruption to users, but it does require the SQL team to be available and able to assist. If that isn't an option, or you're in a really high-usage environment, then you should pre-grow the databases up front and never have to suffer growth again.

SharePoint Surveys and the mystery of the missing partial response

Note: this applies to SharePoint 2010 and 2013; it is not applicable to SharePoint Online / Office 365.

Fun fact: if you've got branching logic in your survey, SharePoint allows you to save a partially completed response!

This is great news. However, on TechNet someone asked whether it's possible to find these incomplete responses so you can remind the users that they haven't finished.

Normally you’d assume that all you’d need to do is log in with a site collection admin account and you’d magically be able to see all the responses, even those that haven’t been completed. However in this case you’d be wrong. Even with Site Collection admin rights those incomplete responses are hidden from you.

Now, that data is still in the SharePoint list, so rather than mess about with permissions and settings, which could have unforeseen consequences, let's see if we can't pull it out of the list with PowerShell.

The first thing to do is see if we can get the real number of items from the survey. The easiest way to do that is to check the item count.

$web = Get-SPWeb "URL"
$survey = $web.Lists["SurveyName"]
Write-Host "Items in the survey : " $survey.ItemCount

That gives you the number of items, which you can compare to those visible. In my case I had one completed and one partial, leading to a total count of two. Which the script agreed with.

When you look at the item objects in PowerShell you can see that there are a few fields that differ between a completed and a partial response. In this case two of them are of interest.

The differences between a visible, completed item (1) and an incomplete item (2): note the HasPublishedVersion and Level fields.

Of the two I'm going to use the 'HasPublishedVersion' field. Like the 'Level' field it becomes true once a version has been published; however, unlike Level, it remains true even if the user starts to edit the item later and somehow manages to do a partial save.

Let’s extend our script to list out the number in each group, then list the users who created them.

$web = Get-SPWeb "http://portal.tracy.com/sites/GD/"
$survey = $web.lists["Survey"]
$unPublishedEntries = $survey.Items | ? {-not $_.HasPublishedVersion}

Write-Host "Surveys in list: " $survey.ItemCount
Write-Host "Of which entries are incomplete: " $unPublishedEntries.Count

Foreach ($entry in $unPublishedEntries)
{
    Write-Host "User: {0} has not completed their entry" -f $entry["Created By"]
}

And voila, my results give us:

Printout from the script

Cataloging Choice Columns

It’s been a while but it’s time for a new post.

Someone asked on TechNet how to get a summary of all the custom choice columns and the options they have. I didn't have the time to put it together at that moment, but I thought it'd be a good exercise.

This will work for 2010 or 2013 but will not work for Office 365 or SharePoint online. For that a different approach would be needed.

“I inherited a SharePoint 2010 Foundation site that contains about 40 custom columns and about 10 of those custom columns are of the type “Choice”. Is there a way using Powershell or something else to export to a .csv file a list of the custom columns and if they are the type “choice” to show the list of what the various choices are for each column?”

So, let's assume that we're only interested in the site columns. To do that we'll grab the SPWeb object, loop through the fields of the appropriate type and list them out.

#Get the web
$web = Get-SPWeb "http://sharepoint.domain.com/sites/sitecollection/subsite"

#Get the columns (note, these are named fields)
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice"}

#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Loop through each choice column and print the name
foreach ($entry in $columns)
{
    Write-Host ("Field Name: {0}" -f $entry.InternalName)

    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        Write-Host ("  Choice: {0}" -f $choice)
    }
}

That'll list out the columns to the screen, but it's not a great solution: it's printing out too many columns, and it's only printing them to the screen. We need to serialise this into a format that we can use.

Let’s start with serialisation.

There’s loads of ways to do this but my preference is to create custom objects to contain the information we collect, then assign them to an array which we can process later.

For a quick guide to PSObjects have a look here.
So, after changing the Write-Hosts to Write-Verbose and putting in our custom objects we get this!

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice"}

#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Choices" = $choicesArray
                        }
}

Which actually makes things worse, as we no longer get any results! Let's add in some XML work. I'm still not entirely happy with the way PowerShell and XML work together, so this example is a bit clunky, but it works.

#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

#loop through the results and generate an xml object
foreach ($column in $ColumnDetailsArray)
{
    #Create an element to hold the top level item
    $columnElement = $xml.CreateElement("Choice")
    $columnElement.SetAttribute("Name", $column.Name) 

    #Loop through the choices and add entries for each
    foreach ($choice in $column.Choices)
    {
        $choiceElement = $xml.CreateElement("Choice")
        
        $choiceElement.InnerText = $choice

        #Note that you need the pipe to Out-Null to prevent AppendChild writing to the console
        $columnElement.AppendChild($choiceElement) | Out-Null
    }
    #Once it's built add the element to the root node
    $xml.root.AppendChild($columnElement)  | Out-Null
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")

So, this now dumps out the data we've asked for, but it also dumps out all the pre-packaged columns. That isn't easily fixed: there isn't an 'OOTB' flag on fields, but there are a few properties we can use to filter them out.

If we grab a column from the results and run Get-Member on it there are a couple of properties that should be useful for filtering the results:

Sealed – this shows whether the column is marked as not to be edited by human hands. Note that this could give false negatives in scenarios where a column has been deployed via the Content Type Hub, which I think seals columns (it definitely seals content types) as it pushes them to consuming site collections.

Hidden – not relevant in this case but often handy. Here we'll instead filter out columns that sit in the '_Hidden' group.

So if we now add those criteria to the earlier $Columns filtering we get:

$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }

But that’s still not perfect, so instead of filtering the terms out right now let’s make it a bit more useful first. When you look at columns in SharePoint they are organised in Groups. We can see those properties in PowerShell and group our elements using that field.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
#Also filter out the sealed fields
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }


#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Group" = $entry.Group
                        "Choices" = $choicesArray
                    }
}

#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

#Get a unique list of the groups in use in the site
foreach ($group in $ColumnDetailsArray | select Group -Unique)
{
    $groupText = $group.Group
    Write-Host "Group name: " $groupText
    $groupElement = $xml.CreateElement("Group")
    $groupElement.SetAttribute("Name", $groupText)
    
    #loop through the results and add them to the xml object
    foreach ($column in $ColumnDetailsArray | ? {$_.Group -eq $groupText})
    {
        #Create an element to hold the top level item
        $columnElement = $xml.CreateElement("Choice")
        $columnElement.SetAttribute("Name", $column.Name)

        #Loop through the choices and add entries for each
        foreach ($choice in $column.Choices)
        {
            $choiceElement = $xml.CreateElement("Choice")
        
            #Note that you need this Pipe Out-Null to prevent it writing to the console
            $choiceElement.InnerText = $choice
            $columnElement.AppendChild($choiceElement) | Out-Null
        }
        #Once it's built add the element to the root node
        $groupElement.AppendChild($columnElement)  | Out-Null
    }
    $xml.root.AppendChild($groupElement)  | Out-Null
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")

Of course once you have the group name you can filter those options out by using a blacklist of groups to avoid reporting on.


Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#BlackList Group Names
#These are the known groups you get in a non publishing team site:
$blackList = @(
    "_Hidden",                                                                         
    "Base Columns",                                                                    
    "Content Feedback",                                                   
    "Core Contact and Calendar Columns",                                               
    "Core Document Columns",                                                         
    "Core Task and Issue Columns",                                                          
    "Custom Columns",                                                            
    "Display Template Columns",                                                          
    "Document and Record Management Columns",                                                
    "Enterprise Keywords Group",                                                             
    "Extended Columns",                                                             
    "JavaScript Display Template Columns",                                                   
    "Reports",                                                                  
    "Status Indicators"
)

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
#Also filter out the sealed fields
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }


#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Group" = $entry.Group
                        "Choices" = $choicesArray
                    }
}
#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

foreach ($group in $ColumnDetailsArray | select Group -Unique)
{
    $groupText = $group.Group

    #Check to see if the group name is in our blacklist
    if (-not $blackList.Contains($groupText))
    {
        Write-Verbose "Group name: " $groupText
        $groupElement = $xml.CreateElement("Group")
        $groupElement.SetAttribute("Name", $groupText)
    
        #loop through the results and generate an xml
        foreach ($column in $ColumnDetailsArray | ? {$_.Group -eq $groupText})
        {
            #Create an element to hold the top level item
            $columnElement = $xml.CreateElement("Choice")
            $columnElement.SetAttribute("Name", $column.Name)

            #Loop through the choices and add entries for each
            foreach ($choice in $column.Choices)
            {
                $choiceElement = $xml.CreateElement("Choice")
        
                $choiceElement.InnerText = $choice
                #Note that you need this Pipe Out-Null to prevent it writing to the console
                $columnElement.AppendChild($choiceElement) | Out-Null
            }
            #Once it's built add the element to the root node
            $groupElement.AppendChild($columnElement)  | Out-Null
        }
        $xml.root.AppendChild($groupElement)  | Out-Null
    }
    else
    {
        Write-Verbose "Group skipped:" $groupText
    }
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")

And there you have it. A working report that will summarise all custom choice columns in a SPWeb object and save them in an XML file.
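The original question asked for a .csv rather than XML. If that's what you need, one option is to flatten the objects we built earlier and pipe them to Export-Csv instead; a rough sketch, assuming $ColumnDetailsArray has already been populated by the script above:

#Sketch: flatten the column objects into a CSV instead of XML
#Assumes $ColumnDetailsArray has been populated by the script above
$ColumnDetailsArray |
    Select-Object Group, Name, @{Name = "Choices"; Expression = { $_.Choices -join "; " }} |
    Export-Csv -Path "C:\ResultFolder\ColumnSummary.csv" -NoTypeInformation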

Changing Modified, and Created details in SharePoint

Sometimes you need to lie to SharePoint. In this post I'll show you how to change the details of who created an item, who modified it and when they modified it.

When you're doing bulk uploads, dealing with lists where you use the advanced setting that only allows users to edit their own items, or just testing some behaviour, eventually you'll wish you could change the values that SharePoint doesn't let you change.

The first thing is, as always, to find the value we want to change:

#Add the SharePoint snapin
Add-PSSnapin Microsoft.SharePoint.Powershell -ea SilentlyContinue

#set the web url and the list name to work upon
$url = "http://sharepoint/sites/cthub"
$listName = "Shared Documents"
$fileName = "FileName.xlsx"

#Get the appropriate list from the web
$web = get-SPWeb $url
$list = $web.lists[$listName]

#Get the file using the filename
$item = $list.Items | ? {$_.Name -eq $fileName}

#Print out current Created by and Created date
Write-Output ("item created by {0} on {1}" -f $item["Author"].tostring(), $item["Created"] )

#Print out current Modified by and Modified date
Write-Output ("item last modified by {0} on {1}" -f $item["Editor"].tostring(), ("{0:dd-MM-yyyy}" -f $item["Modified"]))

As you can see we access the item properties by treating the $item as a hashtable and use the property name as the key.

#Set the created by values
$userLogin = "ALEXB\AlexB"
$dateToStore = Get-Date "10/02/1984"

$user = Get-SPUser -Web $web | ? {$_.userlogin -eq $userLogin}
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.Tostring()

#Sets the created by field
$item["Author"] = $userString
$item["Created"] = $dateToStore

#Set the modified by values
$item["Editor"] = $userString
$item["Modified"] = $dateToStore


#Store changes without overwriting the existing Modified details.
$item.UpdateOverwriteVersion()

Setting the value is a bit more complicated. To do that you have to build the appropriate user string. In my example the user is already part of the site; if they haven't previously been added to the user information list you'll need an extra step here to insert them.
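One way to handle that extra step (a sketch, not part of the script above) is to call EnsureUser on the web before building the string:

#Sketch: make sure the account is in the site's user information list before building the "ID;#LoginName" string
#EnsureUser adds the user if they're missing and returns the SPUser either way
$user = $web.EnsureUser($userLogin)
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.ToString()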

The second bit that differs from your usual PowerShell update is the use of the UpdateOverwriteVersion() method. There are several update methods in SharePoint but only this one will preserve your changes to Modified By and Created By.

And now, the full script:

<#
Title: Set modified and created by details
Author: Alex Brassington
Category: Proof of Concept Script
Description
This script is to show how to read, modify and otherwise manipulate the created by and modified by details on documents.
This is to enable correction of incorrect data as part of migrations. It is also useful to enable testing of retention policies.
#>

#Add the SharePoint snapin
Add-PSSnapin Microsoft.SharePoint.Powershell -ea SilentlyContinue

#set the web url and the list name to work upon
$url = "http://sharepoint/sites/cthub"
$listName = "Shared Documents"
$fileName = "FileName.xlsx"

#Get the appropriate list from the web
$web = get-SPWeb $url
$list = $web.lists[$listName]

#Get the file using the filename
$item = $list.Items | ? {$_.Name -eq $fileName}

#Print out current Created by and Created date
Write-Output ("item created by {0} on {1}" -f $item["Author"].tostring(), $item["Created"] )

#Print out current Modified by and Modified date
Write-Output ("item last modified by {0} on {1}" -f $item["Editor"].tostring(), ("{0:dd-MM-yyyy}" -f $item["Modified"]))

#Set the created by values
$userLogin = "ALEXB\AlexB"
$dateToStore = Get-Date "10/02/1984"

$user = Get-SPUser -Web $web | ? {$_.userlogin -eq $userLogin}
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.Tostring()


#Sets the created by field
$item["Author"] = $userString
$item["Created"] = $dateToStore

#Set the modified by values
$item["Editor"] = $userString
$item["Modified"] = $dateToStore


#Store changes without overwriting the existing Modified details.
$item.UpdateOverwriteVersion()

Deleting Versions

Someone asked for a script that could delete previous versions of documents for SharePoint 3.0. I don't have a 3.0 dev environment but I do have a 2010 build, and it interested me.

Function Delete-SPVersions ()
{
[CmdletBinding(SupportsShouldProcess=$true)]
param(
    [Parameter(Mandatory=$True)][string]$webUrl, 
    [Parameter(Mandatory=$True)][string]$listName, 
    [Parameter(Mandatory=$False)][int]$numberOfMajorVersions
    ) 

    #Get the web
    $web = Get-SPWeb $webUrl


    #Get the list
    $list = $web.Lists[$listName]

    $list.Items | % {
        #Get the item in a slightly more usable variable
        $item = $_

        Write-Output ("Deleting versions for item {0}" -f $item.Name)
        
        #Get all major versions
        #Note, the version ID goes up by 512 for each major version.
        $majorVersions = $item.Versions | ? { $_.VersionID % 512 -eq 0}

        #get the largest version number
        $latestMajorID = $majorVersions | select VersionID -ExpandProperty VersionID | sort -Descending | select -First 1

        #Slightly lazy way to get the latest major version and format it as a single point decimal
        Write-Output ("   Latest major version to retain is {0:0.0}" -f ($latestMajorID /512))
        
        #Filter the major versions to only those which are lower than the highest number - 512 * $numberOfMajorVersions
        $majorVersionsToDelete = $majorVersions | ? {$_.VersionID -le ($latestMajorID - 512 * $numberOfMajorVersions)}
        if ($majorVersionsToDelete)
        {
            $majorVersionsToDelete | % {
                Write-Verbose ("  Deleting major version {0}" -f $_.VersionLabel)
                if ($pscmdlet.ShouldProcess($_.VersionLabel,"Deleting major version"))
                {
                    $_.Delete()
                }
            }
        }
        else
        {
            Write-Verbose "No major versions to delete"
        }
        
        #Re-fetch the item to ensure that the versions are still valid
        $item = $list.GetItemByUniqueId($item.UniqueId)
        
        #Get all the minor versions
        $minorVersions = $item.Versions | ? { $_.VersionID % 512 -ne 0}

        #Delete Minor versions greater than the last major version kept
        $minorVersionsToDelete = $minorVersions | ? {$_.VersionID -lt $latestMajorID}
        if ($minorVersionsToDelete)
        {
            $minorVersionsToDelete | % {
                Write-Verbose ("Deleting minor version {0}" -f $_.VersionLabel)
                if ($pscmdlet.ShouldProcess($_.VersionLabel,"Deleting minor version"))
                {
                    $_.Delete()
                }
            }
        }
        else
        {
            Write-Verbose "No minor versions to delete"
        }
    }
    $web.Dispose()
}
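To use it, call the function against a web and a list; since it supports ShouldProcess you can do a dry run with -WhatIf first (the URL and list name below are placeholders):

#Dry run: -WhatIf shows which versions would be removed without deleting anything
Delete-SPVersions -webUrl "http://sharepoint/sites/teamsite" -listName "Shared Documents" -numberOfMajorVersions 3 -WhatIf

#Then run it for real, keeping the last 3 major versions of each document
Delete-SPVersions -webUrl "http://sharepoint/sites/teamsite" -listName "Shared Documents" -numberOfMajorVersions 3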

Failed Search Scripting

Sometimes knowing what doesn’t work is as useful as what does work. In that vein here’s how I spent my journey home…

A post on TechNet asked how to deal with long-running search crawls that impact users when they overrun into business hours. In large SharePoint environments that shouldn't really happen, but it's a fairly common concern in smaller shops.

Ideally you’ll tune your search so that it always completes in time but that doesn’t always work. For those edge cases there’s two options:

  1. Pause a specific (or all) crawl(s) during working hours.
  2. Reduce the impact of the crawls during working hours

Pausing a crawl is easy, and it's been covered very well by other people such as Ed Wilson.
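For completeness, pausing every crawl in a search service application only takes a couple of lines. A sketch, assuming the SharePoint snap-in is loaded and your service application has the default name:

#Sketch: pause (and later resume) all content sources in a search service application
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"

#Pause every crawl during business hours
$ssa | Get-SPEnterpriseSearchCrawlContentSource | % { $_.PauseCrawl() }

#Resume them again afterwards
$ssa | Get-SPEnterpriseSearchCrawlContentSource | % { $_.ResumeCrawl() }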

I wanted to drop the performance of the crawl down so that it can still keep going but not impact the end users.

The first step was to find out how to create a crawl rule to reduce the impact of the search

$shRule = Get-SPEnterpriseSearchSiteHitRule -Identity "SharePoint"

#Cripple search
$shRule.HitRate = 1000
$shRule.Behavior = 'DelayBetweenRequests'
$shRule.Update()

#Revive search
$shRule.HitRate = 8
$shRule.Behavior = 'SimultaneousRequests'
$shRule.Update()

It turns out that in the API a crawl rule is known as a hit rule. Hit rules have two important values, the ‘rate’ and the behaviour.

The script above was enough to let me create a rule and set it either to run at a normal pace or with a 16 minute delay between requests. And it worked!

Well, it created a rule and that rule worked. Sadly I'd forgotten that crawler rules are only checked when a crawl starts. If you start a crawl and then punch the delay between items up to 1000 it won't make a blind bit of difference.

It turns out that pausing the crawl doesn’t make the search engine re-check the crawl rate.

So, a failure. The only thing I can think of is using reflection to find out what is happening in the background and then doing something deeply unsupported to modify the values in flight. Maybe another time.

Check Blob Cache settings across a farm

A routine but really annoying task came up today: checking the status of BLOB caching for all web applications. Normally I'd hop into the config files and check, but these are spread over 6 farms.
To make things worse each farm had between 4 and 5 web apps and between 1 and 3 WFEs. In total that meant checking in the region of 50 config files.

Sod that.

Getting data from the config file is easy; after all, it's just an XML document. Once we've got the document it's a nice easy task to suck the actual values out:

$xml = [xml](Get-Content $webConfig)
$blobNode = $xml.configuration.SharePoint.BlobCache

$props = @{
    'Location' = $blobNode.location;
    'MaxSize' = $blobNode.maxsize;
    'Enabled' = $blobNode.enabled
}
New-Object -TypeName PSObject -Property $props

That on its own isn't that helpful. After all, we don't just want the values for one config file, we want them for something like 50, and I don't really want to have to list each one out, I just want them all. Through some rather excessive PowerShelling I know that we can get the config file's physical path from the IIS settings on a web application:
$WebApp = Get-SPWebApplication "http://sharepoint.domain.com"
$settings = $WebApp.IISSettings
#This is horrible code but I haven't found a better way to get the value in a usable format.
$physicalPath = ($settings.Values | select path -Expand Path).Tostring()

That will give us the physical path to the folder containing the web.config file. It's only a small leap to make it loop through all the locations…

#Script to check the blob caching configuration in all webconfig files in your farm on all servers.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

$outputFile = "C:\Folder\TestFile.txt"

#This is a rubbish way of getting the server names where the Foundation Web Application is running
#but I can't find a better one at the moment.
$serversWithFWA = Get-SPServiceInstance `
    | ? { $_.TypeName -eq "Microsoft SharePoint Foundation Web Application" -AND $_.Status -eq "Online"} `
    | Select Server -ExpandProperty Server | select Address

	
	
Get-SPWebApplication | %{
 
    $configResults = @()
	$webAppName = $_.name
    Write-Host "Webapp $webAppName"
    
    #Get the physical path from the iis settings
    $settings =$_.IISSettings
    $physicalPath = ($settings.Values | select path -Expand Path).Tostring()
     
    #foreach server running the Foundation Web Application.
    foreach ($server in $serversWithFWA)
    {
        #Build the UNC path to the file Note that this relies on knowing more or less where your config files are
        #This could be improved using regex.
        $serverUNCPath = $physicalPath.Replace("C:\",("\\" + $server.Address + "\C$\"))
        $webConfig = $serverUNCPath +"\web.config"
 
        #If the file exists then try to read the values
        If(Test-Path ($webConfig))
        {
            $xml = [xml](Get-Content $webConfig)
            $blobNode =$xml.configuration.SharePoint.BlobCache

            $props = @{
				'Server' = $server.Address;
                'Location' = $blobNode.location;
                'MaxSize' = $blobNode.maxsize;
                'Enabled' = $blobNode.enabled
            }
            $configResults += New-Object -TypeName PSObject -Property $props
        }
    }
	#Print the results out for the GUI
	$webAppName
    $configResults | ft
	
	#Output the data into a useful format - start by printing out the file name
	$webAppName >>  $outputFile
	#CSV because the data is immeasurably easier to load into a table etc. later. HTML would be a good alternative
	$configResults | ConvertTo-CSV >> $outputFile
}

Shared Folder Logs In SharePoint

First of all. Do not try this on anything other than a disposable test farm.

I'm 99% certain Microsoft will laugh in your face if you do this to your production server and then try to get support. Doing this on a client site would be reckless and irresponsible; this is posted for interest's sake.

I was down the pub one evening and it occurred to me: what happens if you want to put all your logs on a mapped drive? It quickly occurred to me that I should drink more.

… Six months passed …

In the world of SANs my question doesn’t make that much sense. You build your aggregate, parcel it up into LUNs and attach them to the relevant hosts. We’re not talking about a significant improvement in performance, scalability, maintenance etc.

On the other hand, if you've got a big farm, the idea of having 20 x 10 GB disks with one attached to each server has to add to the complexity. From an admin perspective it's also a pain, as your ULS files are all over the shop. But the main reason I want to do this is because I want to know if it can be done…

To assist me I have a friendly shared drive:

\\SPINTDEV\TestFolder

By default my install was using the C:\Program Files path, not best practice but this is a throwaway VM. It was good enough to start:

Default Log Directory


So, let’s try putting in my shared folder:

Attempting to set a shared folder as the log file path

Unsurprisingly, SharePoint objected to my choice of paths. Normally this would be a good time to realise that this is a bad idea but let’s try something else:

Adding a shared folder as a mapped drive

Perhaps a mapped drive will do it (note I actually used H:\). Once that's created we can try using it instead:

Failing to add a mapped drive as a log folder path


I guess not. It seems SharePoint doesn’t like my idea as much as I do. Time to see if PowerShell also dislikes my clever plan …
Setting the log folder path using PowerShell


*manic cackle*
It turns out that PowerShell will let us set it, even when the GUI complains! Score one for PowerShell! Note at this point a wider lesson: PowerShell will sometimes let you do things you can't do through the GUI. That can be a good thing or a bad thing.
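For reference, the usual way to set this from PowerShell is Set-SPDiagnosticConfig; something along these lines (a sketch, using the mapped drive from earlier):

#Sketch: set the ULS log location from PowerShell, which doesn't apply the same path validation as Central Admin
Set-SPDiagnosticConfig -LogLocation "H:\Logs"

#Confirm the value took
(Get-SPDiagnosticConfig).LogLocation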

Of course when I went back to check my folder I saw an ocean of emptiness, no log files. Event viewer tells us the folly of our ways…

Errors in Windows event viewer

Error 2163 Unable to write log

Error 5402 Unable to create log
So, a resounding failure. On the other hand, that error message includes the registry location of the log path setting.
Regedit.exe open at the location of the log file location value

There we can see our, not working, mapped drive. I wonder what happens if we try for the full house…

Updated log file location to point to the shared folder

And if we force SharePoint to create a new log file using ‘New-SPLogFile’ then..

Log files being created in the correct directory

*Manic cackle*

It works. Nothing new in the event viewer, and log files seem to be being created correctly. The GUI still shows the old folder path, and for any new servers you'll get the same error message we saw above.

So it turns out that yes, you can use Network drives for SharePoint log files but doing so has some serious drawbacks:

  • Probably out of support
  • Inability to edit the diagnostic logs through the GUI afterwards
  • Not thoroughly tested
  • Adds to DR / new server build complexity

There’s probably a lot more but any one of the first three should stop you deploying to a live system.

On the other hand I'd quite like it if MS did open this up. Disallowing UNC paths might help stop some admins from using a consumer-grade NAS as if it were a flash-cached fibre SAN, but it also restricts everyone else and I'm not sure there's a better reason.

Automating SharePoint Search Testing

I was browsing TechNet, as you do, when I found this comment on search best practices:

We recommend that you test the crawling and querying functionality of the server farm after you make configuration changes or apply updates

http://technet.microsoft.com/en-us/library/cc850696(v=office.14).aspx

This chimed with me as a client I worked with had failed to do this and had paid the price when their Production search service went down for a day.

The article continues:

An easy way to do this is to create a temporary content source that is used only for this purpose. To test, we recommend that you crawl ten items — for example .txt files on a file share — and then perform search queries for those files. Make sure that the test items are currently not in the index. It is helpful if they contain unique words that will be displayed at the top of the search results page when queried. After the test is complete, we recommend that you delete the content source that you created for this test. Doing this removes the items that you crawled from the index and those test items will not appear in search results after you are finished testing

To put that in bullet point format:

  1. Test the search system doesn’t already have test content
  2. Create some test content to search
  3. Create a new content source
  4. Crawl the test content
  5. Search for the test content
  6. Check that the test content is there
  7. Remove the test content by blowing away the content source
  8. Confirm it’s no longer there

It's a good test script, and it breaks down nicely into bullet points. With a bit of thought we can turn those into some simple tasks:

  1. Run a search and test the results
  2. Create some files
  3. Create a new content source
  4. Crawl the test content
  5. Run a search and test the results
  6. Remove the test content by blowing away the content source
  7. Run a search and test the results

The common aspect is to run a search three times and test the results. Of course the results will, hopefully, vary depending on when we run that test but we can manage that.

Let’s go with the big one. Running a search:

Function Check-TestPairValue ()
{
<#
.DESCRIPTION
Takes a pipeline bound collection of test values and search terms and searches for
them using the searchPageURL.
Returns either 'Present' or 'Not Found' depending on the result.
Not currently production grade#> 
    [CmdletBinding()]
    Param (
    [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$testPair,
    [Parameter(Mandatory=$true)]$searchPageURL
     )
    BEGIN{
        #Create the IE window in the begin block so if the input is pipelined we don't have to re-open it each time.
        $ie = New-Object -com "InternetExplorer.Application"
        $ie.visible = $true
    }
    PROCESS
    { 
        #Get the test value and the search term from the pair
        $testValue = $testPair[0]
        $searchTerm = $testPair[1]
        
        #Open the navigation page
        $ie.navigate($searchPageURL)

        #Wait for the page to finish loading
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Page loaded"
        
        #Get the search box
        $searchTextBoxID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_InputKeywords"
        $document = $ie.Document
        $searchBox = $document.getElementByID($searchTextBoxID)

        #enter the search terms
        $searchBox.innerText = $searchTerm
        Write-Verbose "Searching for: $searchTerm - Expected result: $testValue"    
        
        #Get the search button
        $searchButtonID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_go"
        
        #Run the search
        $btn = $document.getElementByID($searchButtonID)
        $btn.click()
        
        #Wait for the results to be loaded
        while ($ie.locationurl -eq $searchPageURL)
        {
           start-sleep -Milliseconds 100
        }
        
        Write-Verbose "Left the page, waiting for results page to load"
        #Wait for the results page to load
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Results page loaded"
        #Once loaded check that the results are correct
        $document = $ie.document

        #Check that the search term results contain the test results:
        $firstSearchTermID = "SRB_g_9acbfa38_98a6_4be5_b860_65ed452b3b09_1_Title"
        $firstSearchResult = $document.getElementByID($firstSearchTermID)
        
        $result =""
        
        #test that the title of the file is equal to the search result
        If ($firstSearchResult.innerHTML -match $testValue)
        {
            $result ="Present"
        }
        else
        {
            $result ="Not Found"
        }
        
        Write-Verbose "Test $result"
        
        #Create a new PS Object for our result and let PowerShell pass it out.
        New-Object PSObject -Property @{
            TestCase = $searchTerm
            Result = $result
        }    
    }
    END {
        #Close the IE window after us
        $ie.Quit()
    }
}

Well to be honest that’s the only tricky bit in the process. From there on in it’s plumbing.
We create some test files:


Function Create-SPTestFiles ()
{
[CmdletBinding()]
Param(
    $filesToCreate,
    [string]$folderPath
    )
    
    If (!(Test-Path $folderPath))
    {
    	#Folder doesn't exist.
        Write-Verbose "Folder does not exist - attempting to create"
    	New-Item $folderPath -type directory
    }

    #if the files don’t exist. Create them
    Foreach ($file in $filesToCreate)
    {
         $fileName = $file[0]
    	$filePath = $folderPath + "\" + $fileName
    	If (Test-Path $filePath)
        {
            Write-Verbose "File $fileName already exists. Skipping"
            Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Test content already present."
        }
        else
    	{
            Write-Verbose "Creating $fileName"
    		$file[1] >> $filePath
    	}
    }
    Write-Verbose "All files created"
}

We create a content source (this function isn't perfect here but I'm stealing it from another script):


Function Ensure-TestContentSourceExists ()
{
    [CmdletBinding()]
    Param(
    $sa,
    [string]$contentSourceName,
    [string]$filePath
    )
    $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    if ($testCS)
    {
        Write-Verbose "Content Source $contentSourceName already exists"
    }
    else
    {
        Write-Verbose "Content Source $contentSourceName does not exist, creating"
        New-SPEnterpriseSearchCrawlContentSource -SearchApplication $sa -Type file -name $contentSourceName -StartAddresses $filePath | Out-Null
        $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    }
    #Output the content source
    $testCS
}

Run the crawl and wait for it to finish.


Function Run-TestCrawl ()
{
    [CmdletBinding()]
    Param ($contentSource)
    #Run a crawl for that content source
    $contentSource.StartFullCrawl()

    #Set a flag to allow us to abort if the duration is excessive
    $stillNotStupidDuration = $true
    $startTime = Get-Date
    $crawlTimeout = 5
    $crawlInitalTime = 2

    Write-Verbose "Starting crawl. Waiting for 2 minutes (Default SharePoint minimum search duration)"
    Sleep -Seconds 120
    #Wait for it to finish
    while ($contentSource.CrawlStatus -ne "Idle" -AND $stillNotStupidDuration -eq $true)
    {
        Write-Verbose "Crawl still running at $timeDifference, waiting 10 seconds"
        Sleep -Seconds 10
        $timeDifference = (Get-Date) - $startTime
        if ($timeDifference.Minutes -gt $crawlTimeout)
        {
            $stillNotStupidDuration = $false
        }
        
    }
    Write-Verbose "Crawl complete"
}

Then we’re back to searching and clean up! Easy.

Of course there’s a little bit more plumbing to be done to stick it all together: so here’s a fully functioning script.

Param (

    #Name of the search service application to test
    $searchAppName = "Search Service Application",

    #Path to the shared folder
    #NOTE: THIS HAS TO BE SET UP MANUALLY BEFORE RUNNING THE SCRIPT (it can be scripted but I haven't)
    $fileSharePath = "\\spintdev\TestFolder",


    #The search page
    $searchSiteURL = "http://sharepoint/sites/search/Pages/default.aspx",

    #Folder to write the report into
    $reportFolder = "C:\AutomatedTest",
   
    #Flag to set or reject verbose output
    $printVerbose = $false
)


Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue


Function Process-ASTPassFail ()
{
<#Internal helper function. Will be used for reporting#>
Param($collectionThatShuldBeEmpty,
    $failText,
    $passText
    )

    if ($collectionThatShuldBeEmpty -ne $null)
     {
        Write-Warning $failText
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 102 -EntryType Error -Message $failText
        $thisTestText = $failText + "`n"
    }
    else
    {
        $sucsessText =  $passText
        Write-Host $sucsessText
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 102 -EntryType Information -Message $passText
        $thisTestText = $passText + "`n"
    }
    $thisTestText
}


Function Create-ASTFiles ()
{
<#Creates sometest files for us to search later#>
[CmdletBinding()]
Param(
    $filesToCreate,
    [string]$folderPath
    )
    
    If (!(Test-Path $folderPath))
    {
    	#Folder doesn't exist.
        Write-Verbose "Folder does not exist - attempting to create"
    	New-Item $folderPath -type directory
    }

    #if the files don’t exist. Create them
    Foreach ($file in $filesToCreate)
    {
        $fileName = $file[0]
    	$filePath = $folderPath + "\" + $fileName
    	If (Test-Path $filePath)
        {
            Write-Verbose "File $fileName already exists. Skipping"
            Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Test content already present."
        }
        else
    	{
            Write-Verbose "Creating $fileName"
    		$file[1] >> $filePath
    	}
    }
    Write-Verbose "All files created"
}

Function Test-ContentSourceCountAcceptable()
{
[CmdletBinding()]
Param($searchServiceApplication)
    
    #Check the maximum number of content sources allowed
    #http://technet.microsoft.com/en-us/library/cc262787(v=office.14).aspx
    $maxContentSources = 50
    
    $ContentSources = $searchServiceApplication | Get-SPEnterpriseSearchCrawlContentSource

    #Lazy way to check if there is only one item (note, also works for none)
    if ($ContentSources.Count -ne $null)
    {
        $CTSourceCount = $ContentSources.Count
    }
    else
    {
        #Note that this might be wrong if there are no content sources. Not a problem here but it's not a rigorous number
        $CTSourceCount = 1
    }

    #If we're at or above the limit, warn and return false
    if ($CTSourceCount -ge $maxContentSources)
    {
        #Warn and let slip the dogs of war
        Write-Verbose "Warning: content source count is at or above the Microsoft boundary"
        $false
    }
    else
    {
        #If we're under the MS limit then return true
        $true
    }
}

Function Ensure-ASTContentSourceExists ()
{
<#Check if the content source already exists. This should be re-written to delete it but for development purposes this is more efficient#>
    [CmdletBinding()]
    Param(
    $sa,
    [string]$contentSourceName,
    [string]$filePath
    )
    $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    if ($testCS)
    {
        Write-Verbose "Content Source $contentSourceName already exists. Deleting it."
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 100 -EntryType Warning -Message "Unable to create a Content Source as one already exists"
        $testCS.Delete()     
    }
    else
    {
        Write-Verbose "Content Source $contentSourceName does not exist, creating"
        New-SPEnterpriseSearchCrawlContentSource -SearchApplication $sa -Type file -name $contentSourceName -StartAddresses $filePath | Out-Null
        $testCS = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $contentSourceName}
    }
    #Output the content source - Note that this could result in an error state as a pre-existing one might be re-used.
    $testCS
}

Function Run-ASTCrawl ()
{
<#
.SYNOPSIS
Runs a crawl for a content source and waits until it is complete.
.DESCRIPTION
Runs a crawl for a content source and waits for it to complete, features abort option that will exit the function if the crawl takes too long.
#>
    [CmdletBinding()]
    Param (
    [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$contentSource,
    [Parameter(Mandatory=$false,ValueFromPipeline=$false)]$crawlTimeOut = 5
    )
    #Run a crawl for that content source
    $contentSource.StartFullCrawl()

    
    #Start the stopwatch, Note: replace with stopwatch.
    $startTime = Get-Date
    
    #Inital pause time under which there is no point checking for the crawl to be complete
    $crawlInitalTime = 120
    
    #Set a flag to allow us to abort if the duration is excessive
    $stillNotStupidDuration = $true

    Write-Verbose "Starting crawl. Waiting for $crawlInitalTime seconds (Default SharePoint minimum search duration)"
    Sleep -Seconds $crawlInitalTime
    #Wait for it to finish
    while ($contentSource.CrawlStatus -ne "Idle" -AND $stillNotStupidDuration -eq $true)
    {
        Write-Verbose "Crawl still running at $timeDifference, waiting 10 seconds"
        Sleep -Seconds 10
        $timeDifference = (Get-Date) - $startTime
        if ($timeDifference.Minutes -gt $crawlTimeout)
        {
            $stillNotStupidDuration = $false
        }
        
    }
    if ($stillNotStupidDuration)
    {
        Write-Verbose "Crawl complete"
    }
    else
    {
        Write-Warning "No longer waiting for process to complete. Search not finished, results will be unpredictable"
        Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 103 -EntryType Error -Message "Crawler took longer than the timeout value of $crawlTimeOut so the function exited early."
    }
}


Function Check-ASTPairValue ()
{
<#
.SYNOPSIS
Tests that a search term returns a file with the appropriate name
.DESCRIPTION
Takes a pipeline bound pair of test values and search terms and searches for
them using the searchPageURL page.
Returns either 'Present' or 'Not Found' depending on the result.
.EXAMPLE
$testContent | Check-ASTPairValue -searchPageURL $searchSiteURL 
#> 
    [CmdletBinding()]
    Param (
    [Parameter(Mandatory=$true,ValueFromPipeline=$true)]$testPair,
    [Parameter(Mandatory=$true)]$searchPageURL
     )
    BEGIN{
        #Create the IE window in the begin block so if the input is pipelined we don't have to re-open it each time.
        $ie = New-Object -com "InternetExplorer.Application"
        $ie.visible = $true
    }
    PROCESS
    { 
        #Get the test value and the search term from the pair
        $testValue = $testPair[0]
        $searchTerm = $testPair[1]
        
        #Open the navigation page
        $ie.navigate($searchPageURL)

        #Wait for the page to finish loading
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Page loaded"
        
        #Get the search box
        $searchTextBoxID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_InputKeywords"
        $document = $ie.Document
        $searchBox = $document.getElementByID($searchTextBoxID)

        #enter the search terms
        $searchBox.innerText = $searchTerm
        Write-Verbose "Searching for: $searchTerm - Expected result: $testValue"    
        
        #Get the search button
        $searchButtonID = "ctl00_m_g_2f1edfa4_ab03_461a_8ef7_c30adf4ba4ed_SD2794C0C_go"
        
        #Run the search
        $btn = $document.getElementByID($searchButtonID)
        $btn.click()
        
        #Wait for the results to be loaded
        while ($ie.locationurl -eq $searchPageURL)
        {
           start-sleep -Milliseconds 100
        }
        
        Write-Verbose "Left the page, waiting for results page to load"
        #Wait for the results page to load
        while ($ie.readystate -ne 4)
        {
            start-sleep -Milliseconds 100
        }
        Write-Verbose "Results page loaded"
        #Once loaded check that the results are correct
        $document = $ie.document

        #Check that the search term results contain the test results:
        $firstSearchTermID = "SRB_g_9acbfa38_98a6_4be5_b860_65ed452b3b09_1_Title"
        $firstSearchResult = $document.getElementByID($firstSearchTermID)
        
        $result =""
        
        #test that the title of the file is equal to the search result
        If ($firstSearchResult.innerHTML -match $testValue)
        {
            $result ="Present"
        }
        else
        {
            $result ="Not Found"
        }
        
        Write-Verbose "Test $result"
        
        #Create a new PS Object for our result and let PowerShell pass it out.
        New-Object PSObject -Property @{
            TestCase = $searchTerm
            Result = $result
        }    
    }
    END {
        #Close the IE window after us
        $ie.Quit()
    }
}

######################################################################################
#Execution script begins here
######################################################################################


#Generate the output file location
$reportFilePath = $reportFolder + "\SearchTest_Results_" + (Get-Date -Format "dd_MM_yyyy") + ".txt"



#All items from here on in are internal and do not have to be specified or modified unless you wish it.

#Test content - deliberately junk and nonsensical rubbish to trim down search results and avoid false negatives.
#Note: I have no particular insight or interest in the dietary foibles of the politicians listed below.
$testContent = @(
    ("FileA.txt","Miliband loves pie"),
    ("FileB.txt","Osbourne despises soup"),
    ("FileC.txt","Cameron tolerates beans"),
    ("FileD.txt","Clegg loathes eggs which is ironic"),
    ("FileE.txt","Benn likes red meat"),
    ("FileF.txt","Balls desires flan"),
    ("FileG.txt","Cable adores sandwiches"),
    ("FileH.txt","Hunt regrets cake")
)

#Junk content for an additional test to exclude false positive results
$itemToConfirmFailure = @(
    ("sdkfslskjladsflkj", "lflkfdskjlfdskjfdslkjf"),
    ("sdkfslsfdjklfkjladsflkj", "lflskjfdslkjf")
)

#Only used internally.
$testCTName = "TestSearchContentType"

$startDateTime = Get-Date
$currentComputerName = $env:computername

#Header info
"Automated SharePoint Search Testing Results`n" >> $reportFilePath
"Test started at $startDateTime on Computer $currentComputerName" >> $reportFilePath
        
#Write the first test to the report
"Test 1 - Confirm search terms do not retrieve values `n" >> $reportFilePath
"Confirms that there are no files that can generate a false positive in the system.`n" >> $reportFilePath

Write-Host "Starting tests, checking that there is no pre-existing content that might cause false positives"
 
#Run a search for the testcontent
$deliberatelyFailedResults = @()
$deliberatelyFailedResults +=  $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
$falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}

$errorText = "Test failed, files found by search engine. Results not reliable"
$sucsessText =  "Test Passed, moving to next stage"

$testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
$testText >> $reportFilePath 
#Create the test files based on the array above
Create-ASTFiles -filesToCreate $testContent -folderPath $fileSharePath

#Get the search app
$sa = Get-SPEnterpriseSearchServiceApplication -Identity $searchAppName
if ($sa -eq $null)
{
    Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 101 -EntryType Error -Message "Could not find search application $searchAppName"
    #Nothing below can work without the search service application, so stop here
    exit
}


Write-Host "Checking that we are within guidelines for number of Content Sources"
#Test the number of content sources already in place
$numberOfContentSourcesBelowThreshold = Test-ContentSourceCountAcceptable -searchServiceApplication $sa -Verbose:$printVerbose

#Only progress if we're not going to breach the content source limit.
if ($numberOfContentSourcesBelowThreshold)
{
    Write-Host "Within the acceptable number of Content Sources"
    #Get the content source.
    $testCS = Ensure-ASTContentSourceExists -sa $sa -contentSourceName $testCTName -filePath $fileSharePath -Verbose:$printVerbose
    
    Write-Host "Running the crawl - estimated completion in approximately 2 minutes"
    #Run the crawl and wait for it to complete
    Run-ASTCrawl -contentSource $testCS -Verbose:$printVerbose

    $searchResults = @()
    Write-Host "Crawl Complete, testing links"
    
    $searchResults += $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $failures = $searchResults | ? {$_.Result -ne "Present"}
    
    #Write the test to the report
    "Test 2 - Test new content`n" >> $reportFilePath
    "Confirms that search works for our new content.`n" >> $reportFilePath
    
    $errorText = "Test failed, files were not found"
    $sucsessText =  "Passed main test."
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $failures -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath

    #Confirm that the test will fail given junk input.
    $deliberatelyFailedResults = @()
    $deliberatelyFailedResults +=  $itemToConfirmFailure | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}
    
    #Write the test to the report
    "Test 3 - Check for junk terms`n" >> $reportFilePath
    "Confirms that search doesn't find some junk values.`n" >> $reportFilePath
    
    $errorText = "Test failed, files found by search engine when given junk data"
    $sucsessText =  "Passed confirmation test - junk values not found"
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath 
    
    
    #Clean up the content source 
    $CSToDelete = $sa | Get-SPEnterpriseSearchCrawlContentSource | ? {$_.Name -eq $testCTName}
    $CSToDelete.Delete()
    
    #Delete the files
    foreach ($combo in $testContent)
    {
        $fileName = $combo[0]
        $file = Get-ChildItem -Path $fileSharePath | ? {$_.name -eq $fileName}
        $file.Delete()
    }
    #Note that the content source may take a minute to be deleted
    Write-Host "Pausing for 1 minute to allow the index to update"
    Sleep -Seconds 60   

    #Run a search for the testcontent
    $deliberatelyFailedResults = @()
    $deliberatelyFailedResults +=  $testContent | Check-ASTPairValue -searchPageURL $searchSiteURL -Verbose:$printVerbose
    $falsePositives = $deliberatelyFailedResults | ? {$_.Result -eq "Present"}

    #Write the test to the report
    "Test 4 - Confirm search terms are removed`n" >> $reportFilePath
    "Confirms that the test search content is removed from the system.`n" >> $reportFilePath
    
    $errorText = "Test failed, test files were still found by the search engine after deletion"
    $sucsessText =  "Passed confirmation test. Test files are not present"
    $testText = (Process-ASTPassFail -collectionThatShuldBeEmpty $falsePositives -passText $sucsessText -failText $errorText)
    $testText >> $reportFilePath 
}
else
{
    $errorText = "Error - Unable to create a Content Source as the total number of Content Sources is greater than the Microsoft boundary"
    Write-EventLog -LogName "Windows PowerShell" -Source "PowerShell" -EventId 100 -EntryType Warning -Message $errorText
    $errorText >> $reportFilePath 
}

"Automated SharePoint Search Testing Completed at $(Get-Date) `n" >> $reportFilePath 

So there we have it. A fully functioning automated testing process for SharePoint Search. It would be nice if it sent an email, but I'm planning on rolling this into some SCOM work I'm playing with.
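If you did want the report emailed out in the meantime, a minimal sketch using the built-in Send-MailMessage cmdlet would look something like the below - the SMTP server and the addresses are placeholders you'd swap for your own:

Send-MailMessage -SmtpServer "smtp.yourdomain.local" `
    -From "searchtest@yourdomain.local" `
    -To "sharepointadmins@yourdomain.local" `
    -Subject "Automated SharePoint search test results $(Get-Date -Format 'dd/MM/yyyy')" `
    -Body (Get-Content $reportFilePath | Out-String)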

I haven't tested this on 2013 yet; it'd need at least some tweaks to field IDs and maybe more structural work to get the Search API right for 2013. If anyone is interested I'll knock up a new version.

Gooey SharePoint Scripting

Today I'm going to show you how to script SharePoint through the GUI. Whilst in this example we'll be running the code on the server, the same concepts and approach can be used to script it from any machine that can hit the relevant website…

Our example might seem a little forced but it's based on a real-world experience. We had a client with a fairly complicated Content Type scenario: over 150 Content Types spread over 8 levels of inheritance with untold columns. Then we discovered an issue and needed to publish every single one of those content types. This is the classic example of where PowerShell should be used, but awkwardly they'd been burnt by PowerShell publishing before.
As such we had a flat edict: no PowerShell publishing of content types. It must go through the GUI.

A post I'd seen recently by Dr James McCaffrey popped into my head. It was about using PowerShell to automate the testing of web applications.
Why not use the same process to automate the publishing of the content types?

The first thing to do is to get ourselves an IE window:

$ie = New-Object -com "InternetExplorer.Application"
#This starts by default in a hidden mode so let's show it
$ie.Visible = $true

This isn't much use on its own, so let's send it to a page. In our case we want to go to the page that publishes one of our content types. We know that the publish page itself is an application page that is referenced from a site collection root web with the following URL syntax:

siteCollectionRoot/_layouts/managectpublishing.aspx?ctype=ContentTypeID

Glossing over how to get the ContentTypeID for now we have this:

$pageUrl= "http://sharepoint/sites/cthub/_layouts/managectpublishing.aspx?ctype=0x0100A4CF347707AC054EA9C3735EBDAC1A7C"
$ie.Navigate($pageUrl)

Now PowerShell moves fast, so we'll need to wait for IE to catch up and finish loading the page.

While ($ie.ReadyState -ne 4)
{
	Sleep -Milliseconds 100
}

Now we’re there, let’s get the publish button. Thankfully this button has a consistent ID that we can get using the trusty F12 button in IE.

Identifying an element's ID using F12

The catchily titled "ctl00_PlaceHolderMain_ctl00_RptControls_okButton" button? Depressingly, I think I'm starting to see the naming convention behind these IDs…

$buttonID = "ctl00_PlaceHolderMain_ctl00_RptControls_okButton"
#You have to assign the Document to its own variable first, otherwise the call will fail
$document = $ie.Document
$button = $document.getElementByID($buttonID)

And now all we need to do is to click that button:

$button.Click()

Now you might think that we've done all we need to do here: slap it into a foreach loop and be done with it. Of course you can't do that, as you need to give IE time to send that request using our good old friend JavaScript.
So we wait for the page to redirect us:

 
While ($ie.LocationURL -eq $url)
{
    Start-Sleep -Milliseconds 100
}

Now we can slap it into a foreach loop and with a little bit of work we can come up with something like the code below:

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#URL for the content type hub to use
$CTHubURL= "https://sharepoint/sites/cthub"

#Get the Content Type hub
$site = Get-SPSite $CTHubURL

#Content Types to publish
$ContentTypes = @(
    ("Document Type 1"),
    ("Document Type 2"),
    ("Document Type 3")
)


#Open a new IE window
$ie = New-Object -com "InternetExplorer.Application"

#Make the window visible
$ie.visible = $true

#Loop through the content types and publish them
foreach ($contentTypeName in $ContentTypes)
{
    
    Write-Verbose "Processing $ContentTypeName"
    
    #Content types live at the root web
    $web = $site.rootWeb
    #Get the content type using its name
    $ct =   $web.ContentTypes[$ContentTypeName]
    #Get the GUID for the CT
    $GUID = $ct.ID.ToString()
    #Get the URL for the page based on the content type hub url, the application page that does publishing and the GUID
    $url = $CTHubURL+ "/_layouts/managectpublishing.aspx?ctype=" + $GUID  
    #Go to the page
    $ie.navigate($url)
    #Wait for the page to finish loading
    while ($ie.ReadyState -ne 4)
     {
        start-sleep -Milliseconds 100
     }
     #The ID of the button to press
    $buttonID = "ctl00_PlaceHolderMain_ctl00_RptControls_okButton"
    $document = $ie.Document
    $btn = $document.getElementByID($buttonID)
    #Push the button
    $btn.click()
    #Wait for the page to be re-directed
     while ($ie.locationurl -eq $url)
     {
        start-sleep -Milliseconds 100
     }
     Write-Verbose "Content Type $contentTypeName published"
}
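
One small note on the script above: the Write-Verbose calls won't display anything by default, because the verbose stream is silenced in a normal PowerShell session. If you want to watch the progress messages, turn the stream on before running it (this is standard PowerShell behaviour, nothing SharePoint-specific):

$VerbosePreference = "Continue"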

I don't know about you, but there is something deeply neat about sitting at your desk watching IE do the dull task that you were convinced was going to bring your RSI back with a vengeance, and in half the time you could do it yourself.

This example might not be useful for that many people but the concept is intriguing. There's no reason most of this can't be done without any code on the server at all; the only time we use it is to get the GUIDs, and those can be pre-fetched if need be. Nor does it need any significant rights: as long as the account you use has permission to get into that site collection and publish content types, that's all it needs.
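
As a rough sketch of what that pre-fetching might look like (run once on the server; the CSV path is an arbitrary choice of mine):

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#Dump the name and ID of every content type in the hub to a CSV for the GUI-driving script to use elsewhere
$site = Get-SPSite "https://sharepoint/sites/cthub"
$site.RootWeb.ContentTypes |
    Select-Object Name, @{Name = "ID"; Expression = {$_.Id.ToString()}} |
    Export-Csv -Path "C:\AutomatedTest\ContentTypeIDs.csv" -NoTypeInformation

The publishing loop could then Import-Csv that file and build the managectpublishing.aspx URLs without needing the SharePoint snap-in at all.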

The logical destination of this is Office 365. The scripts and rules for running them on there are limited and limiting; they have to be. But the beauty of scripting is that we don't have to be limited by the detail of code; we can use higher-level components and tools to worry about that for us. In this case, the GUI that Microsoft were kind enough to provide us for when it's too awkward to find the PowerShell console.