Tuesday, March 19, 2013

Looking for memory leaks in svchost

I'm running into an issue on multiple Windows Server 2008 R2 systems where a service somewhere is leaking memory. We see svchost processes growing to hundreds of MB, or even several GB, of RAM. As a first step, I wanted a list of systems that may have this problem, along with the services hosted in the offending process. Using PowerShell, I came up with this interesting custom PSObject construct to work with. You can pipe the names of your machines into this code (for example, inside a ForEach-Object block, since it relies on $_) and then deal with the output in other ways later:


new-object psobject -property @{
 Computer=$_
 Services=((gwmi win32_service -computername $_ -filter "processid = $((get-process -computername $_ svchost|sort ws -Descending|select -first 1| tee-object -variable temp).id)"| select name |convertto-csv -notypeinformation) -join ",")
 Memory="{0,-22:n2}" -f ($temp.ws/1MB)
}


For those not familiar with what is happening here: the first line creates a new object, and the @{} hashtable passed to -Property supplies the properties to add to it. (This is a hashtable literal, not splatting, which uses the @variable syntax.) The second line creates a "Computer" property holding the name you piped in.
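
As a small illustration of that construct, here is a minimal sketch with made-up values (the computer name and memory figure are placeholders, not output from a real run). It only shows how New-Object turns the keys of a hashtable into properties:

```powershell
# Hypothetical values, only to show how -Property consumes a hashtable.
$props = @{
    Computer = 'MyServer.contoso.com'
    Memory   = 1713.16
}
$obj = New-Object psobject -Property $props

# Each hashtable key is now a property on the object.
$obj.Computer
$obj.Memory
```

A plain hashtable does not guarantee key order, which is why the properties in the output further down do not appear in the order they were written.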


The next line creates a "Services" property. It uses Get-WmiObject on Win32_Service to find the services whose ProcessId matches a specific PID. That PID comes from the Get-Process cmdlet inside a subexpression, $( ), which runs a pipeline of its own: it finds all svchost processes, sorts them by working set size, picks the largest, and stores that process in the variable $temp with Tee-Object. The .Id of the result is then expanded into the string given to the -Filter parameter of gwmi. The gwmi result is reduced to the names of the services and converted to CSV, and the -join operator wrapped around all of that turns it into a single CSV-style line.
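
The same logic can be unrolled into separate steps, which may be easier to follow. This is a sketch rather than the original one-liner: it runs against the local machine only, the variable names are my own, and it uses Select-Object -ExpandProperty instead of ConvertTo-Csv, so the result has no "name" header and no quotes. It assumes Windows PowerShell, where Get-WmiObject is available.

```powershell
# Step-by-step version of the Services expression, run against the local
# machine for simplicity (add -ComputerName as in the one-liner for remote use).

# 1. Find the svchost process with the largest working set.
$biggest = Get-Process -Name svchost |
    Sort-Object -Property WS -Descending |
    Select-Object -First 1

# 2. Ask WMI for the services hosted in that process.
$services = Get-WmiObject -Class Win32_Service -Filter "ProcessId = $($biggest.Id)"

# 3. Reduce to the service names and join them into one comma-separated line.
$serviceList = ($services | Select-Object -ExpandProperty Name) -join ','

$serviceList
```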


After that, we create a "Memory" property, which takes the $temp variable we created with Tee-Object and runs it through the format operator, -f, to produce a value with two decimal places, left-aligned in a 22-character field. The parenthesized expression converts the working set from bytes into megabytes by dividing by the 1MB shortcut.
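
To see the format string in isolation, here is a short sketch using a made-up working-set size in place of $temp.ws:

```powershell
# Hypothetical working-set size in bytes (1.5 GB), standing in for $temp.ws.
$workingSetBytes = 1610612736

# 1MB is a built-in numeric constant (1048576), so this yields megabytes.
$megabytes = $workingSetBytes / 1MB

# {0,-22:n2}: argument 0, left-aligned in a 22-character field,
# number format with two decimal places and group separators.
$text = "{0,-22:n2}" -f $megabytes

"[$text]"   # brackets make the trailing padding visible
```

The separators follow the current culture; on an en-US system the value shows as 1,536.00 followed by padding spaces.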


The output isn't too beautiful, but for an ad hoc look at a large list of machines, it's usable. With some further manipulation or sorting we can take this further.

Services : "name","AeLookupSvc","BITS","Browser","CertPropSvc","gpsvc","IKEEXT","iphlpsvc","LanmanServer","ProfSvc","Schedule","SENS","SessionEnv","Winmgmt","wuauserv"
Memory   : 1,713.16
Computer : MyServer.contoso.com
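
As one possible form of that further manipulation, here is a sketch that sorts a set of results by memory use. The three sample objects are made up to mimic the shape of the real output. Because Memory is stored as a formatted string, the sketch parses it back into a number before sorting:

```powershell
# Hypothetical results, shaped like the objects produced above.
$results = @(
    New-Object psobject -Property @{ Computer = 'ServerA'; Memory = "{0,-22:n2}" -f 310.5 }
    New-Object psobject -Property @{ Computer = 'ServerB'; Memory = "{0,-22:n2}" -f 1713.16 }
    New-Object psobject -Property @{ Computer = 'ServerC'; Memory = "{0,-22:n2}" -f 95.25 }
)

# Memory is a formatted string, so parse it back to a number before sorting;
# a plain string sort would rank "95.25" above "1,713.16".
$sorted = $results | Sort-Object -Property { [double]::Parse($_.Memory) } -Descending

$sorted | Select-Object Computer, Memory
```

Storing the raw number in a separate property and formatting it only for display would avoid the parsing step altogether.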

Googling around a bit shows there have been some known issues with gpsvc and iphlpsvc, as well as some people complaining about winmgmt causing problems. We can use the sc.exe config option to set these services to run in their own memory space (sc.exe \\computername config svcname type= own) to see if that isolates the problem. Stopping the various services did not clear up the problem; gpsvc (Group Policy), however, is normally locked down for every account except SYSTEM, so it's not stoppable.

For further investigation, Get-ChildItem in PowerShell, together with the .VersionInfo.FileVersion property on its results, can get us version details of files, so we can look for version differences between a good machine and a bad machine and see if we may be missing a patch somewhere. If we see growing handle counts, we can do some fun pipelining in PowerShell with the Sysinternals handle.exe tool to see if we can find a pattern there. Maybe something like:

.\handle.exe -a -p <PID> | where {$_ -match ": "} | % {$_.substring($_.indexof(": "))} |group-object |sort count|select count,name -last 10


                        Count Name
                        ----- ----
                           51 : ALPC Port
                           61 : Semaphore
                          106 : EtwRegistration
                          845 : File  (---)   \Device\Afd
                         1862 : Event
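
For the file version comparison mentioned above, something along these lines might work. It is only a sketch under several assumptions: the machine names are placeholders, the C$ administrative share is reachable on both, and gpsvc.dll and iphlpsvc.dll under System32 are assumed to be the binaries behind those two services.

```powershell
# Hypothetical machine names; replace with a known-good and a suspect server.
$machines = 'GoodServer', 'BadServer'
# Assumed service binaries to compare (under System32 on each machine).
$files = 'gpsvc.dll', 'iphlpsvc.dll'

$versions = foreach ($machine in $machines) {
    foreach ($file in $files) {
        # Assumes the C$ administrative share is reachable.
        $path = "\\$machine\C`$\Windows\System32\$file"
        $item = Get-Item -Path $path -ErrorAction SilentlyContinue
        New-Object psobject -Property @{
            Computer    = $machine
            File        = $file
            # Left empty when the file could not be read.
            FileVersion = $(if ($item) { $item.VersionInfo.FileVersion })
        }
    }
}

# List the versions side by side so differences stand out.
$versions | Sort-Object File, Computer | Format-Table File, Computer, FileVersion -AutoSize
```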
