Function: Search Files in Datastores with PowerCLI vimdatastore PSProvider

From a very simplistic point of view, datastores in vSphere are like "hard drives" where VM files are stored. Normally, users and administrators have very little direct interaction with datastore files; they are managed via API by vCenter and ESXi, and ultimately by the users who log into the vSphere Web Client. However, there may be situations where you need to access a datastore directly to manage or look for files. Here are two potential scenarios:

  1. You upload an ISO, OVA / OVF, or any other file that makes sense, then you forget where it is.
  2. A VM is unregistered from inventory, then you need to either re-add it or delete the orphaned files but you can't remember in which datastore it originally was. Assuming there are at least 10 datastores, this might be a little cumbersome.

For times like these, there is a very convenient way to access datastores using a PSProvider, vimdatastore. This provider, installed with PowerCLI, lets you browse datastores just like cmd (and PowerShell) navigates the filesystem, as opposed to the GUI method, which requires plenty of clicking and scrolling.

So, today I would like to share a simple function that takes advantage of the ability provided by vimdatastore to search for files using the Filter parameter of the Get-ChildItem cmdlet. As a prerequisite, make sure PowerCLI is installed and that you have an active connection to a vCenter. If PowerCLI is installed and correctly loaded by PowerShell, Get-PSProvider should return something similar to this:

Notice how VimDatastore is not listed the first time Get-PSProvider is run, then after a random PowerCLI cmdlet is executed, the VMware modules are loaded and the second time Get-PSProvider lists VimDatastore in the output (as well as VimInventory which is out of the scope of this post).
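If you would rather check for the provider from a script than read the table by eye, a small helper along these lines should do. This is only a sketch: Test-ProviderLoaded is a hypothetical name, not part of PowerCLI, and it relies on nothing more than the built-in Get-PSProvider cmdlet.

```powershell
# Hypothetical helper (not part of PowerCLI): returns $true when a provider
# with the given name is currently registered in the session.
function Test-ProviderLoaded {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Name
    )

    # Get-PSProvider with no arguments lists every loaded provider;
    # -contains compares names without regard to case.
    (Get-PSProvider).Name -contains $Name
}

# FileSystem is always present; VimDatastore only appears once the
# VMware modules have been loaded into the session.
Test-ProviderLoaded -Name 'FileSystem'
Test-ProviderLoaded -Name 'VimDatastore'
```

If the second call returns False, run any PowerCLI cmdlet first, as described above, and check again.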

Working with vimdatastore

To access the datastores via vimdatastore provider, we need to type the following:

Set-Location vmstore:

The prompt changes to 'PS vmstore:\>'. From here on, everything is case sensitive until we leave the vimdatastore provider. In other words, 'MyDS' is not the same as 'myds'.

After that, we use Set-Location or its alias, 'cd', to get to the datastores. The first level is the datacenter, so we need to type cd <datacenter name>, e.g. 'cd MyDC'. At the datacenter level, running the ls command (an alias for Get-ChildItem) lists all the datastores. Now we can simply type cd <datastore name>, e.g. 'cd MyDS', and the location changes to the root of the datastore. Let's take a look at how the output of these commands looks in the console.

PS C:\> Get-Datacenter

Name                                    
----                                    
MyDC                              

PS C:\> cd vmstore:
PS vmstore:\> cd MyDC
PS vmstore:\MyDC> ls

Name                           Type                 Id             
----                           ----                 --             
ma-ds-52558a80-2928b538-73f... VmfsDatastore        Datastore-da...
ma-ds-52e72c34-693e519a-30a... VmfsDatastore        Datastore-da...
Datastore_01                   VmfsDatastore        Datastore-da...
Datastore_02                   VmfsDatastore        Datastore-da...
Datastore_03                   VmfsDatastore        Datastore-da...
Datastore_04                   VmfsDatastore        Datastore-da...
Datastore_05                   VmfsDatastore        Datastore-da...
Datastore_06                   VmfsDatastore        Datastore-da...
Datastore_07                   VmfsDatastore        Datastore-da...
Datastore_08                   VmfsDatastore        Datastore-da...
Datastore_09                   VmfsDatastore        Datastore-da...
Datastore_10                   VmfsDatastore        Datastore-da...
Datastore_11                   VmfsDatastore        Datastore-da...
Datastore_12                   VmfsDatastore        Datastore-da...
Datastore_13                   VmfsDatastore        Datastore-da...
Datastore_14                   VmfsDatastore        Datastore-da...
Datastore_15                   VmfsDatastore        Datastore-da...

PS vmstore:\MyDC> cd 'Datastore_10'
PS vmstore:\MyDC\Datastore_10> ls

Name                           Type                 Id             
----                           ----                 --             
.sdd.sf                        DatastoreFolder                     
DSFolder103A                   DatastoreFolder                     
DSFolder127-10                 DatastoreFolder                     
DSFolder037-01                 DatastoreFolder                     
DSFolderRC100-A08              DatastoreFolder                     
DSFolderAP013B                 DatastoreFolder                     
DSFolderPP102                  DatastoreFolder                     
079                            DatastoreFolder                     
1231XCRWL_3_9_2016             DatastoreFolder                     
DSFolderVS164-09               DatastoreFolder                     
DSFolderRCL077                 DatastoreFolder                                          
DSFolderAP110A                 DatastoreFolder                     
DSFolderOPI098B                DatastoreFolder                     
DSFolderAP098B                 DatastoreFolder                     
DSFolderAP098C                 DatastoreFolder                     
DSFolderAP098D                 DatastoreFolder                     
DSFolderAP098E                 DatastoreFolder                     
DSFolderAP098F                 DatastoreFolder                     
DSFolderOPI098A                DatastoreFolder                     
DSFolderPP098                  DatastoreFolder                     
DSFolderAP098A                 DatastoreFolder                     
DSFolderOPI098C                DatastoreFolder                     
DSFolderOPI098D                DatastoreFolder                     
SLI22_A00.iso                  DatastoreFile                       

Building the Function

As a first step for our Search-DatastoreItem function (it seems an appropriate Verb-Noun combination), we are going to build the parameter block between the function's opening and closing curly brackets.

Function Search-DatastoreItem {

    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true,
                   Position=0)]
        [string]$Expression,
        
        [Parameter(ValueFromPipeline,
                   Position=1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )

}

We start by adding the CmdletBinding attribute. This gives our function access to the common parameters, with the added benefit of being able to write useful output to the Verbose or Information streams. See about_CommonParameters for more info.

Then we add the parameters:

  1. A mandatory parameter, in position 0, of type string and with name Expression. The search string or expression will be the value for this parameter.
  2. An optional parameter, in position 1, of type string (array), with alias DS, named Datastore and with default value of '*'. From this we can conclude that if the value for Datastore is omitted, all datastores are searched, since * is a wildcard.

Based on the information above we can easily tell that any of the following is correct:

  • Search-DatastoreItem -DS 'MyDS1', 'MyDS2', 'MyDS3' -Expression '*sql*'
  • Get-Datastore 'MyDS1', 'MyDS2', 'MyDS3' | Search-DatastoreItem '*SQL*'
  • Search-DatastoreItem '*SQL*'
  • Get-Content 'MyDSList' | Search-DatastoreItem '*sql*'
  • Search-DatastoreItem '*SQL*' 'MyDS'
  • Etc.
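To see how these forms bind without touching a vCenter, here is a minimal sketch using a stand-in function. Show-SearchBinding is a hypothetical name; it copies the parameter block from above and simply echoes what each parameter received, so it says nothing about how the real search behaves.

```powershell
# Hypothetical stand-in with the same parameter block as Search-DatastoreItem.
# It only reports which values were bound; no datastore is contacted.
function Show-SearchBinding {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true,
                   Position = 0)]
        [string]$Expression,

        [Parameter(ValueFromPipeline,
                   Position = 1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )

    process {
        foreach ($Name in $Datastore) {
            [pscustomobject]@{
                Expression = $Expression
                Datastore  = $Name
            }
        }
    }
}

# Named parameters, using the DS alias
Show-SearchBinding -DS 'MyDS1', 'MyDS2', 'MyDS3' -Expression '*sql*'

# Datastore names arriving from the pipeline, Expression by position
'MyDS1', 'MyDS2' | Show-SearchBinding '*SQL*'

# Datastore omitted, so the default '*' applies
Show-SearchBinding '*SQL*'

# Both parameters by position
Show-SearchBinding '*SQL*' 'MyDS'
```

Note that with pipeline input the process block runs once per incoming item, which is why the stand-in emits one row per datastore name.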

Remember that in the vimdatastore space, everything is case sensitive, however, when typing the command to execute the Search-DatastoreItem function, capitalization for the value of the Expression parameter is not relevant. We will use the Filter parameter of Get-ChildItem which is NOT case sensitive.

If the syntax of any of the commands in the list above is confusing or not clear enough let me know in the comments and I will try to elaborate. The idea is to show how the function's parameters we have defined can be utilized in many different ways.

By the way, the wildcard character (*) means anything, so we are searching for anything that has SQL in the middle, or not, because * also means nothing. Therefore, for a match to happen, it is not mandatory to have a string before or after SQL; the * matches anything, including no characters at all. So 'SQL' is a match for '*SQL*', and so are 'My-SQL', 'SQL-installer' and 'My-SQL-installer'.
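The wildcard behavior is easy to try locally. The sketch below uses the -like operator rather than the Filter parameter, so it only illustrates the pattern semantics (including the case insensitivity), not the provider itself. The sample names are made up for the example.

```powershell
# Made-up file names used only to exercise the '*SQL*' pattern.
$Names = 'SQL', 'My-SQL', 'SQL-installer', 'My-SQL-installer', 'my-sql-notes.txt', 'MySQ.iso'

# -like is case insensitive by default, and * matches zero or more characters.
$Matched = $Names | Where-Object { $_ -like '*SQL*' }

$Matched
```

Everything except 'MySQ.iso' is returned: the lowercase 'my-sql-notes.txt' matches because the comparison ignores case, and plain 'SQL' matches because each * is allowed to match nothing.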

OK. Back to the code! Here are the BEGIN and PROCESS blocks.

BEGIN {
    $Counter = 0
}
 
PROCESS {
    
    $DSNames = Get-Datastore -Name $Datastore | Sort-Object Name | Select-Object -ExpandProperty Name

    foreach ($DS in $DSNames) {

        $Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)

        Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"

        Write-Verbose "Searching $DS"

        Set-Location "vmstore:\$(Get-Datacenter)"

        Set-Location $DS

        Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize

        Write-Verbose "Done with $DS"

        Set-Location ..
     
        $Counter ++
    }
}

In the BEGIN block there is only one line, which creates the $Counter variable and assigns 0 as its value. This variable will be used later to count how many times the foreach loop has executed. This helps us calculate progress by comparing the number of iterations completed with the total number of objects that must be processed.

Within the PROCESS block, we start by building a collection of datastore names, which are the ones that will be "scanned" for files. The Get-Datastore cmdlet finds the datastores based on the value entered for the Datastore parameter, and Select-Object -ExpandProperty Name keeps only their names. Remember the default value for this parameter is *, meaning all datastores that are accessible to the connected server.

Next, there is a foreach loop, which contains most of the lines of code that get the job done. It processes all datastores from the previous line (stored in the DSNames variable). We are going to break down this foreach construct into two blocks. The first one consists of just two lines:

$Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)

Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"

The first line creates a variable whose value is the result of dividing 100 by the total number of datastores to process, multiplied by the value of the Counter variable (which will be 0 until we change it). This returns the percentage of progress based on the total number of datastores and the number that have been searched. At the same time, the -f operator along with the 0:n2 format string ensures that we get the result formatted with two decimals. This is recommended because there might be numbers with many decimals, which would not look good in the output.
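One thing to keep in mind is that the -f operator returns a string, formatted according to the current culture, while the PercentComplete parameter of Write-Progress expects an integer. A possible variation, shown here only as a sketch, keeps the value numeric with [math]::Round. Get-PercentComplete is a hypothetical helper name, and the guard against a zero total is an addition that the original code does not have.

```powershell
# Hypothetical helper: same arithmetic as the line above, but the result
# stays a number instead of becoming a culture-formatted string.
function Get-PercentComplete {
    param (
        [int]$Counter,
        [int]$Total
    )

    # Assumption: report 0 when there is nothing to process,
    # rather than dividing by zero.
    if ($Total -le 0) { return 0 }

    [math]::Round(100 / $Total * $Counter, 2)
}

Get-PercentComplete -Counter 0 -Total 3
Get-PercentComplete -Counter 1 -Total 3
Get-PercentComplete -Counter 3 -Total 3
```

With three datastores, the helper reports 0 before the first search starts, 33.33 after one has finished, and 100 once all three are done.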

The second line (line 3 in the code above) uses the Write-Progress cmdlet to "draw" and display a progress bar that will look like the one in the image below.

By comparing the output in the image with the Write-Progress command from the code, it is easy to figure out what each parameter does.

This would be the second block:

Write-Verbose "Searching $DS"

Set-Location "vmstore:\$(Get-Datacenter)"

Set-Location $DS

Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize

Write-Verbose "Done with $DS"

Set-Location ..
     
$Counter ++

Here we are taking advantage of the PowerShell output streams and adding some informational messages with Write-Verbose. Verbose output is available thanks to the CmdletBinding attribute added at the beginning of our function which enables common parameters, including the Verbose parameter. Write-Verbose will print messages only when the Verbose parameter is explicitly added to the command, e.g. 'Search-DatastoreItem -Expression '*centos*' -Verbose'. More information on output streams here.

Having explained Write-Verbose, the remaining lines in this block are the core of the function. This is the code that actually performs the file search and sends the results to the output. This would be the sequence of commands:

  1. First the function changes the location to the vmstore provider at the datacenter level. The Get-Datacenter cmdlet returns the datacenter name. IMPORTANT: This assumes there is only one datacenter in the vCenter. If there is more than one, some edits will be required; Datacenter may have to become a parameter. - Command: Set-Location "vmstore:\$(Get-Datacenter)"
  2. Then the location is again changed, this time to the datastore that is being processed by the foreach loop. We use the DS variable for this. Remember DS is an item of the DSNames collection. - Command: Set-Location $DS
  3. The function's Expression parameter value is assigned to the Filter parameter value of Get-ChildItem. This is the single most important line of code because it performs the search in each datastore. It also formats the output using Select-Object and Format-Table. Notice the use of the Recurse parameter, without it, PowerShell / PowerCLI would only look for the file in the root of each datastore, and would not look into each datastore folder - Command: Get-ChildItem -Filter $Expression -Recurse | Select-Object Name, FolderPath, ItemType | Format-Table -AutoSize
  4. Once the search is over, the function changes back to the parent directory (datacenter), so that we can start over and do the same with the next datastore. For this we use .. (two periods) which points to the parent location of the current folder. - Command: Set-Location ..
  5. Finally the Counter variable is increased by one, so the next iteration will reflect an updated progress in our progress bar. This means one datastore search was completed. - Command: $Counter++
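The effect of the Recurse parameter in step 3 can be reproduced on the local filesystem. The sketch below assumes the FileSystem provider is a fair stand-in for vimdatastore on this one point. The folder and file names are invented for the demo, and everything is created under the temp directory and removed afterwards.

```powershell
# Build a throwaway folder tree: <temp>\recurse-demo-<guid>\DSFolder01\office-setup.iso
$Root = Join-Path ([System.IO.Path]::GetTempPath()) ("recurse-demo-" + [guid]::NewGuid())
$Sub  = Join-Path $Root 'DSFolder01'
New-Item -ItemType Directory -Path $Sub -Force | Out-Null
New-Item -ItemType File -Path (Join-Path $Sub 'office-setup.iso') | Out-Null

# Without -Recurse only the root is inspected, so the nested file is missed.
$TopLevelOnly = @(Get-ChildItem -Path $Root -Filter '*office*')

# With -Recurse the subfolder is searched as well.
$WithRecurse = @(Get-ChildItem -Path $Root -Filter '*office*' -Recurse)

"Without -Recurse: $($TopLevelOnly.Count) match(es)"
"With -Recurse:    $($WithRecurse.Count) match(es)"

# Clean up the demo folder.
Remove-Item -Path $Root -Recurse -Force
```

The first search finds nothing because the only matching file sits one level down; the second finds it.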

Once the function completes the search in the last datastore, it exits the foreach loop and the last command of the PROCESS block is executed (it appears in the complete code below). This command uses the HOMEDRIVE environment variable, which normally holds C:, the letter of the drive where the user's files are located. The period in front of the variable is the dot sourcing operator, which tells PowerShell to run the command named by the variable's value; on Windows, PowerShell defines a built-in function for each drive letter (C:, D:, and so on) that changes the location to that drive. The purpose of this line is to exit the vimdatastore provider and go back to the FileSystem provider - Command: .$env:HOMEDRIVE
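A possible alternative, if you would rather land back exactly where you started instead of on the home drive, is the built-in Push-Location / Pop-Location pair. This is a sketch of the idea, not what the function in this post does. Invoke-AtLocation is a hypothetical helper name, and the demo uses the temp folder so it can run without a vCenter connection.

```powershell
# Hypothetical helper: run a script block at a given location, then return
# to the previous location even if the script block throws.
function Invoke-AtLocation {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Path,

        [Parameter(Mandatory = $true)]
        [scriptblock]$ScriptBlock
    )

    Push-Location -Path $Path      # remember the current location, then move
    try {
        & $ScriptBlock
    }
    finally {
        Pop-Location               # always go back to where we were
    }
}

$Before = (Get-Location).Path
$Inside = Invoke-AtLocation -Path ([System.IO.Path]::GetTempPath()) -ScriptBlock { (Get-Location).Path }
$After  = (Get-Location).Path

"Started at:  $Before"
"Ran at:      $Inside"
"Returned to: $After"
```

In the datastore function, the same pattern would presumably take a vmstore: path instead of the temp folder, which would remove the need for both the Set-Location .. step and the final jump to the home drive.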

Finally, the END block is empty, so there is not much to comment about it. Let's now look at the complete code for the function.

Function Search-DatastoreItem {

    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true,
                   Position=0)]
        [string]$Expression,
        
        [Parameter(ValueFromPipeline,
                   Position=1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )

    BEGIN {

        $Counter = 0

    }
    
    PROCESS {

        $DSNames = Get-Datastore -Name $Datastore | Where-Object {$_.Name -notlike 'ma-ds-*'} | Sort-Object Name | Select-Object -ExpandProperty Name

        foreach ($DS in $DSNames) {
            
            $Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)
            Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"

            Set-Location "vmstore:\$(Get-Datacenter)"
            
            Write-Verbose "Searching $DS"

            Set-Location $DS

            Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize

            Write-Verbose "Done with $DS"

            Set-Location ..
            $Counter ++
        }

        .$env:HOMEDRIVE

    }

    END {}

}

 

Output

This image shows a sample of the output of the function, after searching for '*office*' in 3 different datastores.

So, if you ever need to quickly find a file in your datastores, feel free to give this a try. I hope you find this post useful. Feel free to leave comments and questions below and thanks for reading this far.

The full and commented source code for this post is available in my GitHub repository. It also includes some comment-based help.