From a very simplistic point of view, datastores in vSphere are like "hard drives" where VM files are stored. Normally, users and administrators have very little direct interaction with datastore files; they are managed via API by vCenter and ESXi, and ultimately by the users who log in to the vSphere Web Client. However, there may be situations where you need to access a datastore directly to manage or look for files. Here are two potential scenarios:
- You upload an ISO, OVA / OVF, or any other file that makes sense, then you forget where it is.
- A VM is unregistered from inventory, then you need to either re-add it or delete the orphaned files but you can't remember in which datastore it originally was. Assuming there are at least 10 datastores, this might be a little cumbersome.
For times like these, there is a very convenient way to access datastores: a PSProvider named vimdatastore. This provider, installed with PowerCLI, lets you browse datastores the same way cmd (and PowerShell) navigates the filesystem, as opposed to the GUI method, which requires plenty of clicking and scrolling.
So, today I would like to share a simple function that takes advantage of vimdatastore to search for files using the Filter parameter of the Get-ChildItem cmdlet. As a prerequisite, make sure PowerCLI is installed and that you have an active connection to a vCenter. If PowerCLI is installed and correctly loaded by PowerShell, Get-PSProvider should list VimDatastore among the available providers.
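If you prefer to check for the provider from a script rather than by eye, something along these lines could work. This is only a minimal sketch: Test-VimDatastoreProvider is a hypothetical helper name that is not part of PowerCLI, and it only checks that a provider called VimDatastore is loaded in the current session, not that you are connected to a vCenter.

```powershell
# Hypothetical helper (not part of PowerCLI): returns $true when a provider
# named VimDatastore is loaded in the current session, $false otherwise.
function Test-VimDatastoreProvider {
    $provider = Get-PSProvider | Where-Object { $_.Name -eq 'VimDatastore' }
    return [bool]$provider
}

if (Test-VimDatastoreProvider) {
    'VimDatastore provider is available.'
}
else {
    'VimDatastore provider not found. Is PowerCLI installed and loaded?'
}
```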
Working with vimdatastore
To access the datastores via the vimdatastore provider, we need to type the following:
Set-Location vmstore:
The prompt changes to 'PS vmstore:\>'. From here on, everything is case sensitive until we leave the vimdatastore provider; in other words, 'MyDS' is not the same as 'myds'.
After that, we use Set-Location or its alias, cd, to get to the datastores. The first level is the datacenter, so we need to type cd <datacenter name>, e.g. 'cd MyDC'. At the datacenter level, running ls (an alias for Get-ChildItem) lists all the datastores. Now we can simply type cd <datastore name>, e.g. 'cd MyDS', and the location changes to the root of the datastore. Let's take a look at how the output of these commands would look in the console.
PS C:\> Get-Datacenter
Name
----
MyDC
PS C:\> cd vmstore:
PS vmstore:\> cd MyDC
PS vmstore:\MyDC> ls
Name Type Id
---- ---- --
ma-ds-52558a80-2928b538-73f... VmfsDatastore Datastore-da...
ma-ds-52e72c34-693e519a-30a... VmfsDatastore Datastore-da...
Datastore_01 VmfsDatastore Datastore-da...
Datastore_02 VmfsDatastore Datastore-da...
Datastore_03 VmfsDatastore Datastore-da...
Datastore_04 VmfsDatastore Datastore-da...
Datastore_05 VmfsDatastore Datastore-da...
Datastore_06 VmfsDatastore Datastore-da...
Datastore_07 VmfsDatastore Datastore-da...
Datastore_08 VmfsDatastore Datastore-da...
Datastore_09 VmfsDatastore Datastore-da...
Datastore_10 VmfsDatastore Datastore-da...
Datastore_11 VmfsDatastore Datastore-da...
Datastore_12 VmfsDatastore Datastore-da...
Datastore_13 VmfsDatastore Datastore-da...
Datastore_14 VmfsDatastore Datastore-da...
Datastore_15 VmfsDatastore Datastore-da...
PS vmstore:\MyDC> cd 'Datastore_10'
PS vmstore:\MyDC\Datastore_10> ls
Name Type Id
---- ---- --
.sdd.sf DatastoreFolder
DSFolder103A DatastoreFolder
DSFolder127-10 DatastoreFolder
DSFolder037-01 DatastoreFolder
DSFolderRC100-A08 DatastoreFolder
DSFolderAP013B DatastoreFolder
DSFolderPP102 DatastoreFolder
079 DatastoreFolder
1231XCRWL_3_9_2016 DatastoreFolder
DSFolderVS164-09 DatastoreFolder
DSFolderRCL077 DatastoreFolder
DSFolderAP110A DatastoreFolder
DSFolderOPI098B DatastoreFolder
DSFolderAP098B DatastoreFolder
DSFolderAP098C DatastoreFolder
DSFolderAP098D DatastoreFolder
DSFolderAP098E DatastoreFolder
DSFolderAP098F DatastoreFolder
DSFolderOPI098A DatastoreFolder
DSFolderPP098 DatastoreFolder
DSFolderAP098A DatastoreFolder
DSFolderOPI098C DatastoreFolder
DSFolderOPI098D DatastoreFolder
SLI22_A00.iso DatastoreFile
Building the Function
As a first step for our Search-DatastoreItem function (it seems an appropriate Verb-Noun combination), we are going to build the parameter block inside the function's opening and closing curly brackets.
Function Search-DatastoreItem {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory=$true,
                   Position=0)]
        [string]$Expression,
        [Parameter(ValueFromPipeline,
                   Position=1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )
}
We start by adding the CmdletBinding attribute. This gives our function access to the common parameters, with the added benefit of being able to write useful output to the Verbose or Information streams; see about_CommonParameters for more info.
Then we add the parameters:
- A mandatory parameter, in position 0, of type string and with name Expression. The search string or expression will be the value for this parameter.
- An optional parameter, in position 1, of type string (array), with alias DS, named Datastore and with default value of '*'. From this we can conclude that if the value for Datastore is omitted, all datastores are searched, since * is a wildcard.
Based on the information above we can easily tell that any of the following is correct:
- Search-DatastoreItem -DS 'MyDS1', 'MyDS2', 'MyDS3' -Expression '*sql*'
- Get-Datastore 'MyDS1', 'MyDS2', 'MyDS3' | Search-DatastoreItem '*SQL*'
- Search-DatastoreItem '*SQL*'
- Get-Content 'MyDSList' | Search-DatastoreItem '*sql*'
- Search-DatastoreItem '*SQL*' 'MyDS'
- Etc.
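To see how these call styles bind to the two parameters without touching a vCenter, you can experiment with a stand-in function that uses the same parameter block and simply echoes what it received. This is only a sketch: Show-Binding is a made-up name, and plain strings are piped in instead of datastore objects.

```powershell
# Stand-in with the same parameter block as Search-DatastoreItem.
# Show-Binding is a hypothetical name; it only echoes the bound values.
function Show-Binding {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true, Position = 0)]
        [string]$Expression,

        [Parameter(ValueFromPipeline, Position = 1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )
    process {
        foreach ($Name in $Datastore) {
            "Expression=$Expression Datastore=$Name"
        }
    }
}

# Named parameters, using the DS alias
Show-Binding -DS 'MyDS1', 'MyDS2' -Expression '*sql*'

# Positional: first value is Expression, second is Datastore
Show-Binding '*SQL*' 'MyDS'

# Datastore omitted, so the default '*' is used
Show-Binding '*SQL*'

# Datastore names arriving through the pipeline, one per PROCESS run
'MyDS1', 'MyDS2' | Show-Binding '*sql*'
```

Note that with pipeline input the PROCESS block runs once per incoming item, which is worth keeping in mind for anything that counts iterations across the whole run.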
Remember that in the vimdatastore space everything is case sensitive. However, when typing the command to execute the Search-DatastoreItem function, capitalization of the value of the Expression parameter is not relevant, because we will use the Filter parameter of Get-ChildItem, which is NOT case sensitive.
If the syntax of any of the commands in the list above is confusing or not clear enough let me know in the comments and I will try to elaborate. The idea is to show how the function's parameters we have defined can be utilized in many different ways.
By the way, the wildcard character (*) means anything, so we are searching for anything that has SQL in the middle, or not, because * also means nothing. Therefore, for a match to happen, it is not mandatory to have a string before or after SQL; each * matches any characters, including no characters at all. So 'SQL' is a match for '*SQL*', and so are 'My-SQL', 'SQL-installer' and 'My-SQL-installer'.
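If you want to try the wildcard behavior without a vCenter at hand, PowerShell's -like operator follows the same * semantics and is also case insensitive. The snippet below is only an illustration with made-up file names; the Filter parameter itself is evaluated by the provider, so treat -like as a stand-in rather than the exact same mechanism.

```powershell
# Illustration only: made-up names tested against the '*SQL*' pattern
# with the case-insensitive -like operator.
$pattern = '*SQL*'
$names   = 'SQL', 'My-SQL', 'SQL-installer', 'My-SQL-installer', 'mysql.iso', 'CentOS.iso'

# Keep only the names that match the pattern
$hits = $names | Where-Object { $_ -like $pattern }
$hits
```

Everything except 'CentOS.iso' comes back, including the lowercase 'mysql.iso'.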
OK. Back to the code! Here are the BEGIN and PROCESS blocks.
BEGIN {
    $Counter = 0
}
PROCESS {
    $DSNames = Get-Datastore -Name $Datastore | Sort-Object Name | Select-Object -ExpandProperty Name
    foreach ($DS in $DSNames) {
        $Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)
        Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"
        Write-Verbose "Searching $DS"
        Set-Location "vmstore:\$(Get-Datacenter)"
        Set-Location $DS
        Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize
        Write-Verbose "Done with $DS"
        Set-Location ..
        $Counter ++
    }
    .$env:HOMEDRIVE
}
In the BEGIN block there is only one line, which creates the $Counter variable and assigns 0 as its value. This variable will be used later to count how many times the foreach loop has executed. That helps us calculate progress by comparing the number of completed iterations with the total number of objects that must be processed.
Within the PROCESS block, we start by building a collection of datastore names; these are the datastores that will be "scanned" for files. The Get-Datastore cmdlet finds the objects based on the value entered for the Datastore parameter, and Select-Object -ExpandProperty Name keeps only their names. Remember the default value for this parameter is *, meaning all datastores that are accessible to the connected server.
Next, there is a foreach loop, which contains most of the lines of code that get the job done. It processes all datastores from the previous line (stored in the DSNames variable). We are going to break down this foreach construct into two blocks. The first one consists of just two lines:
$Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)
Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"
The first line creates a variable whose value is the result of dividing 100 by the total number of datastores to process, multiplied by the value of the Counter variable (which will be 0 until we change it). This returns the percentage of progress based on the total number of datastores and the number that have already been searched. At the same time, the -f operator along with the 0:n2 format string ensures that the result is formatted with two decimals. This is recommended because the division can produce numbers with many decimals, which would not look good in the output.
The second line uses the Write-Progress cmdlet to "draw" and display a progress bar that will look like the one in the image below.
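To make the arithmetic concrete, here is a small worked sketch of the same calculation outside the function. Get-PercentComplete is a hypothetical helper, not part of the original code. It rounds to two decimals and, as an extra precaution that the original line does not take, keeps the result between 0 and 100, which is the range the PercentComplete parameter accepts for a real percentage.

```powershell
# Hypothetical helper: same formula as in the function (100 / total * completed),
# rounded to two decimals and clamped to the 0..100 range.
function Get-PercentComplete {
    param (
        [int]$Completed,
        [int]$Total
    )
    if ($Total -le 0) { return 0 }   # nothing to process, avoid dividing by zero
    $percent = [math]::Round(100 / $Total * $Completed, 2)
    return [math]::Min(100, [math]::Max(0, $percent))
}

# With 3 datastores: before the first search, after one, and after all three
Get-PercentComplete -Completed 0 -Total 3
Get-PercentComplete -Completed 1 -Total 3
Get-PercentComplete -Completed 3 -Total 3

# The -f operator is only needed for display; its output follows the
# current culture's number format.
$label = '{0:n2}' -f (Get-PercentComplete -Completed 1 -Total 3)
```

Keeping the number and its formatted label separate means the numeric value can go to PercentComplete while the string is used in the Status text.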
This would be the second block:
Write-Verbose "Searching $DS"
Set-Location "vmstore:\$(Get-Datacenter)"
Set-Location $DS
Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize
Write-Verbose "Done with $DS"
Set-Location ..
$Counter ++
Here we are taking advantage of the PowerShell output streams and adding some informational messages with Write-Verbose. Verbose output is available thanks to the CmdletBinding attribute added at the beginning of our function, which enables the common parameters, including the Verbose parameter. Write-Verbose prints messages only when the Verbose parameter is explicitly added to the command, e.g. Search-DatastoreItem -Expression '*centos*' -Verbose. More information on output streams here.
Having explained Write-Verbose, the remaining lines in this block are the core of the function: this is the code that actually performs the file search and sends the results to the output. This is the sequence of commands:
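The effect of the Verbose switch is easy to try in isolation. The following sketch uses a throwaway function, Show-VerboseDemo (a hypothetical name), that writes one verbose message and one regular output value, and it assumes the session's $VerbosePreference is at its default of SilentlyContinue.

```powershell
# Throwaway function (hypothetical name) with one verbose message and one output value
function Show-VerboseDemo {
    [CmdletBinding()]
    param ()
    Write-Verbose 'Searching MyDS'
    'result'
}

# Without -Verbose only the regular output is shown
Show-VerboseDemo

# With -Verbose the message is also written to the verbose stream
Show-VerboseDemo -Verbose
```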
- First the function changes the location to the vmstore provider at the datacenter level. The Get-Datacenter cmdlet returns the datacenter name. IMPORTANT: This assumes there is only one datacenter in the vCenter. If there is more than one, some edits will be required; Datacenter may have to become a parameter. - Command: Set-Location "vmstore:\$(Get-Datacenter)"
- Then the location is again changed, this time to the datastore that is being processed by the foreach loop. We use the DS variable for this. Remember DS is an item of the DSNames collection. - Command: Set-Location $DS
- The function's Expression parameter value is passed to the Filter parameter of Get-ChildItem. This is the single most important line of code because it performs the search in each datastore. It also formats the output using Select-Object and Format-Table. Notice the use of the Recurse parameter: without it, PowerShell / PowerCLI would only look for the file in the root of each datastore and would not look into each datastore folder. - Command: Get-ChildItem -Filter $Expression -Recurse | Select-Object Name, FolderPath, ItemType | Format-Table -AutoSize
- Once the search is over, the function changes back to the parent directory (datacenter), so that we can start over and do the same with the next datastore. For this we use .. (two periods) which points to the parent location of the current folder. - Command: Set-Location ..
- Finally the Counter variable is increased by one, so the next iteration will reflect an updated progress in our progress bar. This means one datastore search was completed. - Command: $Counter++
Once the function completes the search in the last datastore, it exits the foreach loop and the last command is executed. This command uses the HOMEDRIVE environment variable, which normally resolves to C: and is the letter of the drive where the user's files are located. The period in front of the variable is PowerShell's dot sourcing operator: it runs the command whose name is stored in the variable, and on Windows PowerShell defines a function for each drive letter (C:, D: and so on) that changes the location to that drive. The purpose of this line is to exit the vimdatastore provider and go back to the FileSystem provider - Command: .$env:HOMEDRIVE
Finally, the END block is empty, so there is not much to comment on. Note that the complete version below also filters out datastores whose names start with 'ma-ds-', like the first two in the listing shown earlier. Let's now look at the complete code for the function.
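As a side note, a different way to get back to where you started is to let PowerShell remember the previous location with Push-Location and Pop-Location. This is not what the function in this post does; it is only a sketch of the alternative, demonstrated against the temp folder of the FileSystem provider. Invoke-AtLocation is a hypothetical helper name. Since Push-Location is not specific to one provider, the same pattern should carry over to vmstore: paths, although that part is not tested here.

```powershell
# Hypothetical helper: runs a script block at a given location and always
# returns to the caller's original location, even if the script block fails.
function Invoke-AtLocation {
    param (
        [Parameter(Mandatory = $true)]
        [string]$Path,

        [Parameter(Mandatory = $true)]
        [scriptblock]$ScriptBlock
    )
    Push-Location -Path $Path      # remember the current location, then move
    try {
        & $ScriptBlock
    }
    finally {
        Pop-Location               # go back to the remembered location
    }
}

$before = (Get-Location).Path
$inside = Invoke-AtLocation -Path ([System.IO.Path]::GetTempPath()) -ScriptBlock { (Get-Location).Path }
$after  = (Get-Location).Path      # same as $before
```

The upside of this approach is that the user ends up exactly where they were, instead of at the root of the home drive.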
Function Search-DatastoreItem {
    [CmdletBinding()]
    param (
        # Search string or wildcard expression, e.g. '*sql*'
        [Parameter(Mandatory=$true,
                   Position=0)]
        [string]$Expression,
        # Datastore name(s) to search; all datastores by default
        [Parameter(ValueFromPipeline,
                   Position=1)]
        [Alias('DS')]
        [string[]]$Datastore = '*'
    )
    BEGIN {
        # Number of datastores already searched, used for the progress bar
        $Counter = 0
    }
    PROCESS {
        # Names of the datastores to search, skipping those named 'ma-ds-*'
        $DSNames = Get-Datastore -Name $Datastore | Where-Object {$_.Name -notlike 'ma-ds-*'} | Sort-Object Name | Select-Object -ExpandProperty Name
        foreach ($DS in $DSNames) {
            $Percent = '{0:n2}' -f (100/$DSNames.Count*$Counter)
            Write-Progress -Activity 'Searching datastores' -PercentComplete $Percent -CurrentOperation "Searching $DS" -Status "$Percent% Complete - $Counter of $($DSNames.Count) datastore(s) completed"
            # Assumes a single datacenter in the connected vCenter
            Set-Location "vmstore:\$(Get-Datacenter)"
            Write-Verbose "Searching $DS"
            Set-Location $DS
            # Recursive search within the current datastore
            Get-ChildItem -Filter $Expression -Recurse | Select-Object Name,FolderPath,ItemType | Format-Table -AutoSize
            Write-Verbose "Done with $DS"
            Set-Location ..
            $Counter ++
        }
        # Leave the vimdatastore provider and go back to the home drive
        .$env:HOMEDRIVE
    }
    END {}
}
Output
This image shows a sample of the output of the function, after searching for '*office*' in 3 different datastores.
So, if you ever need to quickly find a file in your datastores, feel free to give this a try. I hope you find this post useful. Feel free to leave comments and questions below and thanks for reading this far.
The full and commented source code for this post is available in my GitHub repository. It also includes some comment-based help.