Monitoring and killing smbd
30 Jun 2023
We have several clients using macOS to share data from their Xsans. Occasionally the smbd process climbs above 100% CPU and appears to hang. Connected users report slow file browsing or being unable to navigate to new folders, and new connections time out before the user is prompted for authentication.
We have found that killing the smbd process resolves the issue, and connected users don’t lose the mount. However, if files are open and in use, e.g. video content in Adobe Premiere, the application generally crashes and unsaved changes are lost.
We haven’t been able to find a specific cause for this issue, but we have automated the restart of smbd with a LaunchDaemon and a script. The LaunchDaemon runs the script every 60 seconds.
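As a rough illustration of the scheduling side, the sketch below builds a LaunchDaemon property list that runs a script every 60 seconds. `Label`, `ProgramArguments`, `StartInterval` and `RunAtLoad` are standard launchd keys, but the label and script path here are hypothetical placeholders, not the names used in our repository.

```python
import plistlib

# Hypothetical label and script path, chosen only for this example.
job = {
    "Label": "com.example.smbdmonitor",
    "ProgramArguments": ["/bin/sh", "/usr/local/bin/smbd_monitor.sh"],
    "StartInterval": 60,  # launchd starts the job every 60 seconds
    "RunAtLoad": True,    # also run once when the job is loaded
}

# Serialize to the XML plist format that launchd reads.
plist_bytes = plistlib.dumps(job)
print(plist_bytes.decode("utf-8"))
```

A file like this would normally be saved in /Library/LaunchDaemons and loaded with launchctl.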
The script checks that smbd is running and then captures the CPU utilization for the process. Each reading is written to a file that keeps the 10 most recent results. These 10 results are summed, and if the sum is above a threshold, the smbd process is killed. We have found that a threshold of 1050 works well in most environments running macOS versions earlier than 13.4. With one sample per minute, this means that smbd has averaged more than 105% of a CPU over the last 10 minutes.
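The rolling-sum logic described above can be sketched roughly as follows. This is not the script from our repository, just a minimal Python illustration. It assumes the CPU reading comes from output shaped like `ps -A -o pcpu= -o comm=` (one "percent command" pair per line); the function names and the choice to wait for a full 10-sample window before killing are assumptions made for this example.

```python
from pathlib import Path
from typing import List, Optional

WINDOW = 10       # number of one-minute samples kept
THRESHOLD = 1050  # sum of samples that triggers a kill (pre-13.4 value)


def parse_smbd_cpu(ps_output: str) -> Optional[float]:
    """Sum %CPU for every smbd line; return None if smbd is not running.

    Assumes each line looks like ' 12.5 /usr/sbin/smbd'.
    """
    total, found = 0.0, False
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        cpu_text, command = parts
        if Path(command.strip()).name == "smbd":
            total += float(cpu_text)
            found = True
    return total if found else None


def record_sample(history_file: Path, cpu_percent: float,
                  window: int = WINDOW) -> List[float]:
    """Append a reading to the history file, keeping only the newest `window`."""
    samples: List[float] = []
    if history_file.exists():
        samples = [float(value) for value in history_file.read_text().split()]
    samples.append(cpu_percent)
    samples = samples[-window:]
    history_file.write_text("\n".join(f"{s:.1f}" for s in samples) + "\n")
    return samples


def should_kill(samples: List[float], threshold: float = THRESHOLD,
                window: int = WINDOW) -> bool:
    """True once a full window of samples sums to more than the threshold.

    Requiring a full window is an assumption of this sketch.
    """
    return len(samples) >= window and sum(samples) > threshold
```

For example, ten consecutive readings of 110% sum to 1100, which is above 1050 and would trigger a kill, while a single idle minute in the window (nine readings of 110% and one of 0%) sums to 990 and would not.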
With macOS 13.4, smbd seems to have some multithreading capabilities and can handle higher sustained loads. In these cases we have used a threshold of 3500 (i.e. smbd averaging more than 350% CPU over 10 minutes) and have found smbd to be more reliable. Has anyone else noticed SMB server performance differences in macOS 13.4?
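One way to pick between the two thresholds automatically is to compare the macOS version against 13.4. The sketch below is only an illustration of that idea; the helper names are made up for this example, and the two values are simply the ones mentioned above, not universal recommendations.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '13.4.1' into a comparable tuple (13, 4, 1)."""
    return tuple(int(part) for part in text.strip().split("."))


def threshold_for(macos_version: str) -> int:
    """Return 3500 for macOS 13.4 and later, otherwise 1050."""
    return 3500 if parse_version(macos_version) >= (13, 4) else 1050


def average_percent(threshold: int, samples: int = 10) -> float:
    """Average per-sample CPU percentage that a summed threshold corresponds to."""
    return threshold / samples
```

On a Mac, the version string could be taken from `platform.mac_ver()[0]` or from `sw_vers -productVersion`. `average_percent(3500)` gives 350.0, matching the 350% figure above.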
The script also logs the 10-minute sums and notes when an smbd kill occurs. By default, the log is written to /Users/Shared/smbdMonitor.log.
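As a sketch of what such logging might look like, the snippet below formats one entry per run and appends it to the log file. The line format (timestamp, sum, threshold, optional kill note) is an assumption for illustration and is not necessarily what our script writes.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

DEFAULT_LOG = Path("/Users/Shared/smbdMonitor.log")


def format_entry(total: float, threshold: int, killed: bool,
                 now: Optional[datetime] = None) -> str:
    """Build one log line; the layout is hypothetical."""
    now = now or datetime.now()
    line = f"{now:%Y-%m-%d %H:%M:%S} sum={total:.1f} threshold={threshold}"
    if killed:
        line += " smbd killed"
    return line


def append_entry(log_path: Path, entry: str) -> None:
    """Append a single line to the log file, creating it if needed."""
    with open(log_path, "a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
```

A run that triggered a kill would then call something like `append_entry(DEFAULT_LOG, format_entry(1100.0, 1050, True))`.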
The script and LaunchDaemon are available on our GitHub.
On a related note, at one location we’ve had issues with DNS records, particularly reverse lookups (PTR records), going missing. We definitely saw more slowdowns and connection problems while the DNS records were gone. This is likely because the file servers need to perform directory service lookups for permissions and ownership. So double-check DNS if you are seeing smbd issues.
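A quick way to spot a missing PTR record is to do a reverse lookup and then confirm that the returned name resolves back to the same address. The sketch below uses Python's standard `socket.gethostbyaddr` and `socket.gethostbyname_ex`; the function name and the result layout are assumptions for this example, and the resolver arguments exist only so the check can be exercised without a live DNS server.

```python
import socket
from typing import Callable, Dict


def check_reverse_dns(ip: str,
                      reverse: Callable = socket.gethostbyaddr,
                      forward: Callable = socket.gethostbyname_ex) -> Dict:
    """Report the PTR name for `ip` and whether it resolves back to `ip`."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return {"ip": ip, "ptr": None, "forward_match": False}
    try:
        addresses = forward(hostname)[2]
    except OSError:
        addresses = []
    return {"ip": ip, "ptr": hostname, "forward_match": ip in addresses}
```

Running `check_reverse_dns` against each file server and client address and looking for entries where `ptr` is `None` would flag the kind of missing reverse records described above.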