Splitting Office 365 Audit Logs

Office 365 audit exports are quite complicated behind the scenes. If you look at the data in CSV format you’ll quickly see there are four fields: Time, User, Action and Detail.

The fun is in the Detail field: it’s really a JSON object holding all of the interesting data in the audit log. Whilst it’s possible to use Excel to expand those objects and convert them into usable CSV content, it’s a bit clumsy.

The script below is a first draft that splits the content up into separate logs for later analysis.

<#
Notes: This function is not optimised for large files. 

For large files it would be desirable to batch write actions so that no more than (for example)
100 rows are held in memory at once. Immediate write activities are possible but are suspected
to result in excessive disk write activity.
#>
Function Split-O365AuditLogs
{
    param(
        $sourcePath,
        $resultsFolderPath
    )
    #Get the content to process
    $content = Import-Csv $sourcePath
    $datumArray = @();

    foreach ($line in $content)
    {
   
        #Details are encoded as JSON data.
        $expandedDetails = $line.Detail | ConvertFrom-Json

        #add the non JSON parameters
        Add-Member -InputObject $expandedDetails -MemberType NoteProperty -Name "Time" -Value $line.Time
        Add-Member -InputObject $expandedDetails -MemberType NoteProperty -Name "User" -Value $line.User
        Add-Member -InputObject $expandedDetails -MemberType NoteProperty -Name "Action" -Value $line.Action
        $datumArray += $expandedDetails
    }
    #Build a list of unique actions
    $actions = $datumArray | Select-Object -ExpandProperty Action -Unique

    foreach ($action in $actions)
    {
    
        $datumArray | ? {$_.Action -eq $action} | ConvertTo-Csv -NoTypeInformation | Out-File -FilePath ("{0}\{1}.csv" -f $resultsFolderPath, $action) -Append
    }
}
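Once the function is loaded you’d call it along these lines (the paths here are illustrative, not from a real export):

```powershell
#Split an exported audit log into one CSV per action type (illustrative paths)
Split-O365AuditLogs -sourcePath "C:\Exports\AuditLog.csv" -resultsFolderPath "C:\Exports\Split"
```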

There’s plenty of room for improvement. It’s highly likely that the UserType field is the key to producing a more concise set of exports that share common fields. If anyone’s interested I’ll give it another go.

Exporting Office 365 audit logs

Open: https://protection.office.com/#/homepage

Log in with your tenant admin account. It may be that you can use an account with fewer rights; however, I can’t test that at the moment.

Office 365 Protection Center

In the Protection Center, click on Reports in the left hand navigation pane, then click on View Reports.

Office 365 Protection Center Reports
Then click on Office 365 audit log report in the main body of the page.

This will open a popup that can show you the audit log.

Select an appropriate Start date, normally one month in the past. By default only the last week is selected, and you cannot pull more than a month of data at one time, probably to keep the data size manageable.
(Correction: you can do more than a month. I just hit issues because I selected the day I started logging, which failed because logging began at mid-day but I asked for logs from the start of the day.)
Select date range for audit report

Click Search to get all the audit logs for that period.

Scroll to the top right corner of the ridiculously small window, which you can’t even re-size in my older version of IE.

Click on Export Results and, in the dropdown, click on Download all results.

Export results from Office 365
This will give you a file download dialog
Confirmation save box

Save the file, or Save As to pick the location

It will then download the file. This may take some time; in limited tests I have seen ~100 KB/sec transfer rates and file sizes of approximately 4-6 MB per user per month.

Zip the file up before sending it anywhere: the content is highly compressible, with roughly a 95% reduction in size.
Compressed and uncompressed log files
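If you have PowerShell 5 or later to hand, Compress-Archive will do the zipping for you (the paths are illustrative):

```powershell
#Compress the audit export before sending it on (illustrative paths)
Compress-Archive -Path "C:\Exports\AuditLog.csv" -DestinationPath "C:\Exports\AuditLog.zip"
```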

Creating SharePoint Site Collection through PowerShell CSOM

This is based on another blog post here: http://blog.scoreman.net/2013/02/create-site-collections-in-sharepoint-online-using-csom/. In that article the author shows how to use the CSOM with C# to create a Site Collection in Office 365.

This script is a near direct translation of that script into a PowerShell version of the code. I’ve also liberally taken inspiration from Chris O’Brien’s excellent series of posts on SharePoint PowerShell and CSOM here: http://www.sharepointnutsandbolts.com/2013/12/Using-CSOM-in-PowerShell-scripts-with-Office365.html

#Add the dlls required for working with Office 365
Add-Type -Path "C:\Program Files\Common Files\microsoft shared\Web Server Extensions\15\ISAPI\Microsoft.SharePoint.Client.dll"  
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.Online.SharePoint.Client.Tenant.dll"

#URLs and prerequisites
$adminSiteUrl = "<Admin URL>"
$newsiteUrl = "<URL of Site Collection to Create>"
$username = "<username>"
$password = Read-Host "Please enter your Password" -AsSecureString

Write-Host "Establishing Connection to Office 365."
#Get the context and feed in the credentials
$ctx = New-Object Microsoft.SharePoint.Client.ClientContext($adminSiteUrl) 
$credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($username, $password)  
$ctx.Credentials = $credentials 

Write-Host "Now configuring the new Site Collection"

#Get the tenant object
$tenant = New-Object Microsoft.Online.SharePoint.TenantAdministration.Tenant($ctx)

#Set the Site Creation Properties values
$properties = New-Object Microsoft.Online.SharePoint.TenantAdministration.SiteCreationProperties
$properties.Url = $newsiteUrl
$properties.Template =  "STS#0"
$properties.Owner = $username
$properties.StorageMaximumLevel = 1000
$properties.UserCodeMaximumLevel = 300

#Create the site using the properties
$tenant.CreateSite($properties) | Out-Null

Write-Host "Creating site collection"
#Create the site in the tenancy
$ctx.ExecuteQuery()
Write-Host "Site Creation request completed. Note that the creation process is asynchronous and provisioning may take a short while."

I’ve tested this on Office 365 but haven’t tried it with On-Premise SharePoint 2013 so far.

SharePoint database growth settings

This is a basic topic but one that crops up time and time again.

By default SharePoint will create databases with settings to grow 1 MB at a time. That means that if you add a 5 MB file the database will grow 5 times to fit it in. If you add a 250 MB file (the default maximum file size for a 2013 farm) a worrying 250 growth operations will be needed.

Why is growth bad?
During a growth operation the database is completely unresponsive. That means that any access at all, read only or not, will have to wait for the growth to complete. That will result in slower responses for users and reduce the number of users that your environment can support at any one time. As such growth is bad.

How long does growing take?
It varies depending on your hardware. In my test environment, a Hyper-V machine with a dynamically growing VHDX on an SSD, I got the following numbers. I’m going to put up a post on how I got these numbers sometime soon; I’ll update this post once I’ve done so.

Growth Size (MB)    Average time (ms)    Average time per MB (ms)
1                   13.40                13.40
10                  46.11                4.61
100                 328.17               3.28
1000                2480.67              2.48

If we plot that data on a chart we get a clearer image

Graph of autogrowth duration



The duration of a growth operation scales with its size, but the relationship isn’t quite linear: there is a fixed overhead to each growth operation that makes a single larger growth more efficient. As such it’s best to grow as few times as possible, with larger individual growth operations.

On the other hand, as you can see, a single growth operation can take a significant amount of time. Growing by 1 GB took around two and a half seconds. That is a significant delay, and whilst it will be infrequent your users may well notice, especially on a heavily used site.

For a 250 MB file the 1 MB growth rate would take about 3.35 seconds in total. A 10 MB growth rate gives 1.15 seconds, 100 MB comes in at 0.98 and 1000 MB at 2.48 seconds (but you only have to grow once every four such files).
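Those totals come straight from the table above: the number of growth operations needed multiplied by the average time per operation. A quick sketch:

```powershell
#Rough cost model using the measured averages (ms per growth operation)
$avgTimeMs = @{ 1 = 13.40; 10 = 46.11; 100 = 328.17; 1000 = 2480.67 }
$fileSizeMB = 250

foreach ($growthMB in 1, 10, 100, 1000)
{
    #Number of growth operations needed to fit the file
    $operations = [Math]::Ceiling($fileSizeMB / $growthMB)
    $totalSeconds = $operations * $avgTimeMs[$growthMB] / 1000
    Write-Host ("{0,4} MB growth: {1,3} operations, {2:0.00} seconds total" -f $growthMB, $operations, $totalSeconds)
}
```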

Category        Example Growth Rate
Default         1 MB
Max File Size   250 MB
Large Fixed     1000 MB
Nightly         2000 MB
Pre-grow        N/A

Default

This is more or less the worst possible choice. Every time a database grows it needs to lock the entire system, and whilst each individual lock might only last a short time, the total time is excessive. As we saw, this will take around 3.35 seconds to perform all the growth needed for a 250 MB file. The only redeeming feature of this setting is that it is very efficient with storage space.

Max File Size

By setting the file growth rate to the web application maximum file size you can ensure that only a single growth operation will happen for any file upload. This gives a relatively low individual growth duration which means the disruption for a single file will be as small as possible. On the other hand it will result in a lot more individual growth operations than a larger size. That means that in a highly used site you may have statistically significant and detectable amounts of downtime.
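For reference, a sketch of setting a fixed growth rate via T-SQL, assuming the SqlServer module’s Invoke-Sqlcmd is available. The server, database and logical file names below are placeholders; check yours with sp_helpfile first.

```powershell
#Set a fixed 250 MB autogrowth on a content database's data file.
#Server, database and logical file names are placeholders - adjust to suit.
Invoke-Sqlcmd -ServerInstance "SQLSERVER\SHAREPOINT" -Query @"
ALTER DATABASE [WSS_Content]
MODIFY FILE (NAME = N'WSS_Content', FILEGROWTH = 250MB);
"@
```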

Large Fixed

By setting the autogrowth rate to an arbitrarily large size, e.g. 1 GB, you can minimise the number of growth operations that need to occur. This has a much lower total disruption than the Default setting or the maximum file size setting, but when a growth does occur the delay can be significant. This is a popular setting.

Nightly Growth

As an alternative to autogrowth settings you can grow the databases ahead of the rate that users put data into them. To do this your SQL management team would set an alert on the database fill rate to ensure that there is more free space available than can reasonably be expected to be used in a day. If the database drops below that point then your SQL admins will grow the databases manually during the next scheduled downtime or period of low load.
This avoids or minimises the aforementioned disruption to users when growth occurs, but requires an alert and a capable SQL team. If your SQL team might miss the alert then either a Max File Size or Large Fixed policy should be in place as a backup.

Pre-grow

The most performant option is to grow your databases to their final size on creation. This requires a highly predictable environment with site collection quotas in use to prevent the databases growing beyond their expected size. It’s also a very storage-space inefficient way to manage your databases.

Summary

Which option you should use depends on the environment you’re putting SharePoint into. My preference is to set the growth rate to the maximum file size of the web application the database is attached to. This provides a good blend of minimising total growth time whilst avoiding large delays when larger growth operations are needed.
If you have a good SQL team behind you to grow the databases nightly then that is an even better way to avoid disruption to users. However, that does require the SQL team to be available and able to assist; if that isn’t available, or you’re in a really high usage environment, then you should pre-grow the databases up front and never have to suffer growth again.

SharePoint Surveys and the mystery of the missing partial response

Note: this applies to SharePoint 2010 and 2013, it is not viable for SharePoint online / Office 365.

Fun fact: if you’ve got branching logic in your survey, SharePoint allows users to save a partially completed response!

This is great news. However, on TechNet someone asked if it were possible to find these incomplete responses in order to remind the users that they haven’t finished.

Normally you’d assume that all you need to do is log in with a site collection admin account and you’d magically be able to see all the responses, even those that haven’t been completed. In this case, however, you’d be wrong: even with Site Collection admin rights those incomplete responses are hidden from you.

Now, that data is still in the SharePoint list, and rather than mess about with permissions and settings, which could have unforeseen consequences, let’s see if we can pull it out of the list with PowerShell.

The first thing to do is see if we can get the real number of items from the survey. The easiest way to do that is to check the item count.

$web = Get-SPWeb "URL"
$survey = $web.Lists["SurveyName"]
Write-Host "Items in the survey : " $survey.ItemCount

That gives you the real number of items, which you can compare to those visible. In my case I had one completed and one partial response, leading to a total count of two, which the script agreed with.

When you look at the object in PowerShell you can see that there are a few fields that differ between a completed and a partial response. In this case there are two of interest.

Image showing different HasPublishedVersion and Level for two items.

The differences between a visible, completed item (1) and an incomplete item (2)

Of the two I’m going to use the ‘HasPublishedVersion’ field. Like the ‘Level’ field it becomes true as soon as a version has been completed; unlike Level, however, it remains true even if the user starts to edit the item later and somehow manages to do a partial save.

Let’s extend our script to list out the number in each group, then list the users who created them.

$web = Get-SPWeb "http://portal.tracy.com/sites/GD/"
$survey = $web.lists["Survey"]
$unPublishedEntries = $survey.Items | ? {-not $_.HasPublishedVersion}

Write-Host "Surveys in list: " $survey.ItemCount
Write-Host "Of which entries are incomplete: " $unPublishedEntries.Count

Foreach ($entry in $unPublishedEntries)
{
    Write-Host ("User: {0} has not completed their entry" -f $entry["Created By"])
}

And voila, my results give us:

PowerShell results showing the results from a script

Printout from script

Cataloging Choice Columns

It’s been a while but it’s time for a new post.

Someone asked on TechNet how to get a summary of all the custom choice columns and the options they have. I didn’t have the time to put it together at that moment but I thought it’d be a good exercise.

This will work for 2010 or 2013 but will not work for Office 365 or SharePoint online. For that a different approach would be needed.

“I inherited a SharePoint 2010 Foundation site that contains about 40 custom columns and about 10 of those custom columns are of the type “Choice”. Is there a way using Powershell or something else to export to a .csv file a list of the custom columns and if they are the type “choice” to show the list of what the various choices are for each column?”

So, let’s assume that we’re only interested in the site columns. To do that we’ll have to grab the SPWeb object and loop through the columns there that are of the appropriate type and list them out.

#Get the web
$web = Get-SPWeb "http://sharepoint.domain.com/sites/sitecollection/subsite"

#Get the columns (note, these are named fields)
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice"}

#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Loop through each choice and print the name
foreach ($entry in $Columns)
{
    Write-Host ("Field Name: {0}" -f $entry.InternalName)

    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        Write-Host ("  Choice: {0}" -f $choice)
    }
}


That’ll list the columns out, but it’s not a great solution: it picks up too many columns, and it only prints them to the screen. We need to serialise this into a format that we can use.

Let’s start with serialisation.

There’s loads of ways to do this but my preference is to create custom objects to contain the information we collect, then assign them to an array which we can process later.

For a quick guide to PSObjects have a look here
So, after changing the Write-Hosts to Write-Verbose and putting in our custom objects, we get this!

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice"}

#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Choices" = $choicesArray
                        }
}

Which actually makes things worse, as we no longer get any results! Let’s add in some XML work. I’m still not entirely happy with the way PowerShell and XML work together, so this example is a bit clunky, but it works.

#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

#loop through the results and generate an XML object
foreach ($column in $ColumnDetailsArray)
{
    #Create an element to hold the top level item
    $columnElement = $xml.CreateElement("Choice")
    $columnElement.SetAttribute("Name", $column.Name) 

    #Loop through the choices and add entries for each
    foreach ($choice in $column.Choices)
    {
        $choiceElement = $xml.CreateElement("Choice")
        $choiceElement.InnerText = $choice

        #Note that you need the pipe to Out-Null to prevent it writing to the console
        $columnElement.AppendChild($choiceElement) | Out-Null
    }
    #Once it's built add the element to the root node
    $xml.root.AppendChild($columnElement)  | Out-Null
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")

So, this now dumps out the data we’ve asked for, but it also dumps out all the pre-packaged columns. This isn’t easily fixed: there isn’t an ‘OOTB’ flag on fields, but there are a few properties we can use to filter them out.

If we grab a column from the results and run Get-Member on it, there are a couple of fields that should be useful for filtering the results:

Sealed – This shows whether the column is not supposed to be edited by human hands. Note that this could give false negatives in scenarios where a column has been deployed via the Content Type Hub, which I think seals columns (it definitely seals Content Types) as it pushes them to consuming site collections.

Hidden – Not relevant in this case but often handy. Here we’ll instead filter out columns that belong to the ‘_Hidden’ group.

So if we now add that criteria to the earlier $columns filtering process we get

$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }

But that’s still not perfect, so instead of filtering the terms out right now let’s make it a bit more useful first. When you look at columns in SharePoint they are organised in Groups. We can see those properties in PowerShell and group our elements using that field.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
#Also filter out the sealed fields
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }


#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Group" = $entry.Group
                        "Choices" = $choicesArray
                    }
}

#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

#Get a unique list of the groups in use in the site
foreach ($group in $ColumnDetailsArray | select Group -Unique)
{
    $groupText = $group.Group
    Write-Host "Group name: " $groupText
    $groupElement = $xml.CreateElement("Group")
    $groupElement.SetAttribute("Name", $groupText)
    
    #loop through the results and add them to the xml object
    foreach ($column in $ColumnDetailsArray | ? {$_.Group -eq $groupText})
    {
        #Create an element to hold the top level item
        $columnElement = $xml.CreateElement("Choice")
        $columnElement.SetAttribute("Name", $column.Name)

        #Loop through the choices and add entries for each
        foreach ($choice in $column.Choices)
        {
            $choiceElement = $xml.CreateElement("Choice")
        
            #Note that you need this Pipe Out-Null to prevent it writing to the console
            $choiceElement.InnerText = $choice
            $columnElement.AppendChild($choiceElement) | Out-Null
        }
        #Once it's built add the element to the root node
        $groupElement.AppendChild($columnElement)  | Out-Null
    }
    $xml.root.AppendChild($groupElement)  | Out-Null
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")
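With made-up column and group names, the saved file looks roughly like this (note that the script reuses the ‘Choice’ element name for both columns and their values):

```xml
<root>
  <summary rundate="01012014" web="CT Hub" />
  <Group Name="Custom Columns">
    <Choice Name="ProjectStatus">
      <Choice>Open</Choice>
      <Choice>On Hold</Choice>
      <Choice>Closed</Choice>
    </Choice>
  </Group>
</root>
```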

Of course once you have the group name you can filter those options out by using a blacklist of groups to avoid reporting on.


Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

#BlackList Group Names
#These are the known groups you get in a non publishing team site:
$blackList = @(
    "_Hidden",                                                                         
    "Base Columns",                                                                    
    "Content Feedback",                                                   
    "Core Contact and Calendar Columns",                                               
    "Core Document Columns",                                                         
    "Core Task and Issue Columns",                                                          
    "Custom Columns",                                                            
    "Display Template Columns",                                                          
    "Document and Record Management Columns",                                                
    "Enterprise Keywords Group",                                                             
    "Extended Columns",                                                             
    "JavaScript Display Template Columns",                                                   
    "Reports",                                                                  
    "Status Indicators"
)

#Get the web
$web = Get-SPWeb "http://portal.tracy.com/sites/cthub"

#Get the columns (note, these are named fields) 
#Also filter out the sealed fields
$Columns = $web.Fields | ? { $_.TypeAsString -eq "Choice" -and 
        -not $_.Sealed -and $_.Group -ne "_Hidden"
    }


#Print out the number of columns
Write-Host "Number of columns found: " $Columns.Count

#Create empty array
$ColumnDetailsArray = @()

#Loop through each choice column and create an object to hold it
foreach ($entry in $columns)
{
    $choicesArray = @()
    Write-Verbose ("Field Name: {0}" -f $entry.InternalName)
    
    #Loop through the choices and print those out
    foreach ($choice in $entry.Choices)
    {
        #Add each choice to the (local) array
        Write-Verbose ("  Choice: {0}" -f $choice)
        $choicesArray += $choice
    }
    #Create a result object to hold our data
    $ColumnDetailsArray += New-Object PSObject -Property @{
                        "Name" = $entry.InternalName
                        "Group" = $entry.Group
                        "Choices" = $choicesArray
                    }
}
#Create a starter XML for the system to work with
[xml] $xml = "<root><summary rundate='{0}' web='{1}'/></root>" -f 
    (Get-Date -f ddMMyyyy), 
    $web.Title

foreach ($group in $ColumnDetailsArray | select Group -Unique)
{
    $groupText = $group.Group

    #Check to see if the group name is in our blacklist
    if (-not $blackList.Contains($groupText))
    {
        Write-Verbose ("Group name: {0}" -f $groupText)
        $groupElement = $xml.CreateElement("Group")
        $groupElement.SetAttribute("Name", $groupText)
    
        #loop through the results and generate an XML
        foreach ($column in $ColumnDetailsArray | ? {$_.Group -eq $groupText})
        {
            #Create an element to hold the top level item
            $columnElement = $xml.CreateElement("Choice")
            $columnElement.SetAttribute("Name", $column.Name)

            #Loop through the choices and add entries for each
            foreach ($choice in $column.Choices)
            {
                $choiceElement = $xml.CreateElement("Choice")
        
                $choiceElement.InnerText = $choice
                #Note that you need this Pipe Out-Null to prevent it writing to the console
                $columnElement.AppendChild($choiceElement) | Out-Null
            }
            #Once it's built add the element to the root node
            $groupElement.AppendChild($columnElement)  | Out-Null
        }
        $xml.root.AppendChild($groupElement)  | Out-Null
    }
    else
    {
        Write-Verbose ("Group skipped: {0}" -f $groupText)
    }
}
$xml.Save("C:\ResultFolder\ColumnSummary.xml")

And there you have it. A working report that will summarise all custom choice columns in a SPWeb object and save them in an XML file.

Changing Modified, and Created details in SharePoint

Sometimes you need to lie to SharePoint. In this post I’ll show you how to change the details of who created an item, who modified it and when they modified it.

Whether you’re doing bulk uploads, dealing with lists where you want to use the advanced setting that only allows users to edit their own items, or just testing some behaviour, eventually you’ll wish you could change the values that SharePoint doesn’t let you change.

The first thing is, as always, to find the value we want to change:

#Add the SharePoint snapin
Add-PSSnapin Microsoft.SharePoint.Powershell -ea SilentlyContinue

#set the web url and the list name to work upon
$url = "http://sharepoint/sites/cthub"
$listName = "Shared Documents"
$fileName = "FileName.xlsx"

#Get the appropriate list from the web
$web = get-SPWeb $url
$list = $web.lists[$listName]

#Get the file using the filename
$item = $list.Items | ? {$_.Name -eq $fileName}

#Print out current Created by and Created date
Write-Output ("item created by {0} on {1}" -f $item["Author"].tostring(), $item["Created"] )

#Print out current Modified by and Modified date
Write-Output ("item last modified by {0} on {1}" -f $item["Editor"].ToString(), $item["Modified"].ToString("dd-MM-yyyy"))

As you can see we access the item properties by treating the $item as a hashtable and use the property name as the key.

#Set the created by values
$userLogin = "ALEXB\AlexB"
$dateToStore = Get-Date "10/02/1984"

$user = Get-SPUser -Web $web | ? {$_.userlogin -eq $userLogin}
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.Tostring()

#Sets the created by field
$item["Author"] = $userString
$item["Created"] = $dateToStore

#Set the modified by values
$item["Editor"] = $userString
$item["Modified"] = $dateToStore


#Store changes without overwriting the existing Modified details.
$item.UpdateOverwriteVersion()

Setting the value is a bit more complicated: you have to build the appropriate user string. In my example the user is already part of the site; if they haven’t previously been added to the user information list you’ll need an extra step here to insert them.
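For that missing-user case, SPWeb.EnsureUser should do the job: it adds the login to the user information list if necessary and returns the SPUser. A sketch:

```powershell
#Make sure the account exists in the site's user information list before building the string
$user = $web.EnsureUser($userLogin)
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.ToString()
```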

The second bit that differs from your usual PowerShell update is the use of the UpdateOverwriteVersion() method. There are several update methods in SharePoint but only this one will preserve your changes to the Modified and Created details.

And now, the full script:

<#
Title: Set modified and created by details
Author: Alex Brassington
Category: Proof of Concept Script
Description
This script is to show how to read, modify and otherwise manipulate the created by and modified by details on documents.
This is to enable correction of incorrect data as part of migrations. It is also useful to enable testing of retention policies.
#>

#Add the SharePoint snapin
Add-PSSnapin Microsoft.SharePoint.Powershell -ea SilentlyContinue

#set the web url and the list name to work upon
$url = "http://sharepoint/sites/cthub"
$listName = "Shared Documents"
$fileName = "FileName.xlsx"

#Get the appropriate list from the web
$web = get-SPWeb $url
$list = $web.lists[$listName]

#Get the file using the filename
$item = $list.Items | ? {$_.Name -eq $fileName}

#Print out current Created by and Created date
Write-Output ("item created by {0} on {1}" -f $item["Author"].tostring(), $item["Created"] )

#Print out current Modified by and Modified date
Write-Output ("item last modified by {0} on {1}" -f $item["Editor"].ToString(), $item["Modified"].ToString("dd-MM-yyyy"))

#Set the created by values
$userLogin = "ALEXB\AlexB"
$dateToStore = Get-Date "10/02/1984"

$user = Get-SPUser -Web $web | ? {$_.userlogin -eq $userLogin}
$userString = "{0};#{1}" -f $user.ID, $user.UserLogin.Tostring()


#Sets the created by field
$item["Author"] = $userString
$item["Created"] = $dateToStore

#Set the modified by values
$item["Editor"] = $userString
$item["Modified"] = $dateToStore


#Store changes without overwriting the existing Modified details.
$item.UpdateOverwriteVersion()

Deleting Versions

Someone asked for a script that could delete previous versions in SharePoint (WSS) 3.0. I don’t have a 3.0 dev environment, but I do have a 2010 build, and it interested me.

Function Delete-SPVersions
{
[CmdletBinding(SupportsShouldProcess=$true)]
param(
    [Parameter(Mandatory=$True)][string]$webUrl, 
    [Parameter(Mandatory=$True)][string]$listName, 
    [Parameter(Mandatory=$False)][int]$numberOfMajorVersions
    ) 

    #Get the web
    $web = Get-SPWeb $webUrl


    #Get the list
    $list = $web.Lists[$listName]

    $list.Items | % {
        #Get the item in a slightly more usable variable
        $item = $_

        Write-Output ("Deleting versions for item {0}" -f $item.Name)
        
        #Get all major versions
        #Note, the version ID goes up by 512 for each major version.
        $majorVersions = $item.Versions | ? { $_.VersionID % 512 -eq 0}

        #get the largest version number
        $latestMajorID = $majorVersions | select VersionID -ExpandProperty VersionID | sort -Descending | select -First 1

        #Slightly lazy way to get the latest major version and format it as a single point decimal
        Write-Output ("   Latest major version to retain is {0:0.0}" -f ($latestMajorID /512))
        
        #Filter the major versions to only those which are lower than the highest number - 512 * $numberOfMajorVersions
        $majorVersionsToDelete = $majorVersions | ? {$_.VersionID -le ($latestMajorID - 512 * $numberOfMajorVersions)}
        if ($majorVersionsToDelete)
        {
            $majorVersionsToDelete | % {
                Write-Verbose ("  Deleting major version {0}" -f $_.VersionLabel)
                if ($pscmdlet.ShouldProcess($_.VersionLabel,"Deleting major version"))
                {
                    $_.Delete()
                }
            }
        }
        else
        {
            Write-Verbose "No major versions to delete"
        }
        
        #Re-fetch the item to ensure that the versions are still valid
        $item = $list.GetItemByUniqueId($item.UniqueId)
        
        #Get all the minor versions
        $minorVersions = $item.Versions | ? { $_.VersionID % 512 -ne 0}

        #Delete Minor versions greater than the last major version kept
        $minorVersionsToDelete = $minorVersions | ? {$_.VersionID -lt $latestMajorID}
        if ($minorVersionsToDelete)
        {
            $minorVersionsToDelete | % {
                Write-Verbose ("Deleting minor version {0}" -f $_.VersionLabel)
                if ($pscmdlet.ShouldProcess($_.VersionLabel,"Deleting minor version"))
                {
                    $_.Delete()
                }
            }
        }
        else
        {
            Write-Verbose "No minor versions to delete"
        }
    }
    $web.Dispose()
}
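
Since the function supports ShouldProcess, you can do a dry run before committing to anything. A minimal usage sketch — the URL and list name below are placeholders, not real values:

```powershell
#Preview what would be deleted without touching anything
Delete-SPVersions -webUrl "http://sharepoint.domain.com/sites/demo" -listName "Documents" -numberOfMajorVersions 2 -WhatIf

#Run for real, with verbose output showing each version as it goes
Delete-SPVersions -webUrl "http://sharepoint.domain.com/sites/demo" -listName "Documents" -numberOfMajorVersions 2 -Verbose
```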

Failed Search Scripting

Sometimes knowing what doesn’t work is as useful as what does work. In that vein here’s how I spent my journey home…

A post on TechNet asked about how to deal with long-running search crawls that were impacting users when they overran into business hours. In large SharePoint environments that shouldn't really happen, but it's a fairly common concern in smaller shops.

Ideally you'll tune your search so that it always completes in time, but that doesn't always work. For those edge cases there are two options:

  1. Pause a specific (or all) crawl(s) during working hours.
  2. Reduce the impact of the crawls during working hours

Pausing a crawl is easy; it's also been done very well by other people such as:
Ed Wilson

I wanted to drop the performance of the crawl down so that it can still keep going but not impact the end users.

The first step was to find out how to create a crawl rule to reduce the impact of the search:

$shRule = Get-SPEnterpriseSearchSiteHitRule -Identity "SharePoint"

#Cripple search
$shRule.HitRate = 1000
$shRule.Behavior = 'DelayBetweenRequests'
$shRule.Update()

#Revive search
$shRule.HitRate = 8
$shRule.Behavior = 'SimultaneousRequests'
$shRule.Update()

It turns out that in the API a crawl rule is known as a hit rule. Hit rules have two important values: the 'rate' and the behaviour.

The script above was enough to let me create a rule and set it to either run at a normal pace or with a 16 minute delay between requests. And it worked!

Well, it created a rule and that rule worked. Sadly I'd forgotten that the crawler rules are only checked when you start a crawl. If you start a crawl and then punch the delay between items up to 1000 it won't make a blind bit of difference.

It turns out that pausing the crawl doesn’t make the search engine re-check the crawl rate.

So, a failure. The only thing I can think of is using reflection to find out what is happening in the background and then doing something deeply unsupported to modify the values in flight. Maybe another time.
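
The one blunt but supported alternative is to stop the crawl and start it again after changing the rule, since the rule is only read at crawl start. A sketch — the content source name here is the default "Local SharePoint sites", which may differ in your farm, and I haven't battle-tested this:

```powershell
$ssa = Get-SPEnterpriseSearchServiceApplication
$contentSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

#Stop the running crawl so the updated hit rule will be read on the next start
$contentSource.StopCrawl()

#(In practice you'd wait for the crawl to actually reach an idle state before restarting.)
$contentSource.StartFullCrawl()
```

Of course, restarting a full crawl throws away the progress made so far, which rather defeats the point of trying to keep it running.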

Check Blob Cache settings across a farm

A routine but really annoying task came up today: checking the status of BLOB caching for all web applications. Normally I'd hop into the config files and check, but these are spread over 6 farms.
To make things worse each farm had between 4 and 5 web apps and between 1 and 3 WFEs. In total that meant checking in the region of 50 config files.

Sod that.

Getting data from the config file is easy, after all it's just an XML document. Once we've got the document it's a nice easy task to suck the actual values out:

$xml = [xml](Get-Content $webConfig)
$blobNode = $xml.configuration.SharePoint.BlobCache

$props = @{
    'Location' = $blobNode.location;
    'MaxSize' = $blobNode.maxsize;
    'Enabled' = $blobNode.enabled
}
New-Object -TypeName PSObject -Property $props
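
For reference, the BlobCache element those three properties come from looks something like this in web.config — the values below are illustrative SharePoint 2010 defaults (with the file-type list in the path attribute trimmed), not necessarily what your farm has:

```xml
<configuration>
  <SharePoint>
    <BlobCache location="C:\BlobCache\14"
               path="\.(gif|jpg|jpeg|png|css|js)$"
               maxSize="10"
               enabled="false" />
  </SharePoint>
</configuration>
```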

That on its own isn't that helpful. After all, we don't just want to get the values for ONE config file, we want to get them for something like 50. I also don't really want to have to list each one out, I just want them all... Through some rather excessive PowerShelling I know that we can get the config file's physical path from the IIS settings on a web application:
$WebApp = Get-SPWebApplication "http://sharepoint.domain.com" 
$settings = $WebApp.IISSettings
#This is horrible code but I haven't found a better way to get the value in a usable format.
$physicalPath = ($settings.Values | select path -Expand Path).Tostring()

That will give us the physical path to the folder containing the web.config file. It's only a small leap to make it loop through all the locations…

#Script to check the blob caching configuration in all webconfig files in your farm on all servers.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ea SilentlyContinue

$outputFile = "C:\Folder\TestFile.txt"

#This is a rubbish way of getting the server names where the Foundation Web Application is running but I can't
#find a better one at the moment.
$serversWithFWA = Get-SPServiceInstance `
    | ? { $_.TypeName -eq "Microsoft SharePoint Foundation Web Application" -and $_.Status -eq "Online"} `
    | select -ExpandProperty Server | select Address

Get-SPWebApplication | %{
 
    $configResults = @()
    $webAppName = $_.name
    Write-Host "Webapp $webAppName"
    
    #Get the physical path from the iis settings
    $settings = $_.IISSettings
    $physicalPath = ($settings.Values | select path -Expand Path).Tostring()
     
    #foreach server running the Foundation Web Application.
    foreach ($server in $serversWithFWA)
    {
        #Build the UNC path to the file Note that this relies on knowing more or less where your config files are
        #This could be improved using regex.
        $serverUNCPath = $physicalPath.Replace("C:\",("\\" + $server.Address + "\C$\"))
        $webConfig = $serverUNCPath +"\web.config"
 
        #If the file exists then try to read the values
        If(Test-Path ($webConfig))
        {
            $xml = [xml](Get-Content $webConfig)
            $blobNode = $xml.configuration.SharePoint.BlobCache

            $props = @{
                'Server' = $server.Address;
                'Location' = $blobNode.location;
                'MaxSize' = $blobNode.maxsize;
                'Enabled' = $blobNode.enabled
            }
            $configResults += New-Object -TypeName PSObject -Property $props
        }
    }
    #Print the results out for the GUI
    $webAppName
    $configResults | ft

    #Output the data into a useful format - start by printing out the file name
    $webAppName >> $outputFile
    #CSV because the data is immeasurably easier to load into a table etc. later. HTML would be a good alternative
    #-NoTypeInformation keeps the #TYPE header line out of the file
    $configResults | ConvertTo-Csv -NoTypeInformation >> $outputFile
}