Azure Session Hosts SMB crash BSOD - PAGE_FAULT_IN_NONPAGED_AREA (50) - mrxsmb20

James Simpson 0 Reputation points
2025-11-14T01:30:45.9366667+00:00

Hi Everyone,

 

We have an AVD environment for a customer. For the past 18 months it has been rock-solid - really, really reliable. We recently "upgraded" them from 2 x large session host VMs to 4 x smaller session hosts for load balancing, redundancy and future scale-out economies. We built the new VMs on the Windows 11 24H2 multi-session image from the AVD gallery. The entire rest of the environment (file server, app server, AD DCs, vNetwork, etc.) remained the same. The session hosts are hybrid-joined to both the "local" Active Directory and Intune/Entra.

 

Since moving their users onto the freshly built VMs, we've been having problems with random blue screens on the session hosts. It's ALWAYS the same error - a kernel bug check and reboot caused by a memory page fault (0x50 - "page fault in nonpaged area") in mrxsmb20.sys (the SMB2/SMB3 redirector driver) during a memory copy (memcpy). All four hosts have exhibited the problem at various times, but the frequency is random. One has crashed once in six weeks. A different host had been running perfectly, then got stuck on a Friday and crashed out 4 times in 2 hours - it has been faultless since (with no other changes made).

 

Since then we've tried a number of things, including changing the version of FSLogix on the session hosts, changing the RDP settings on the host pool, and even completely rebuilding the 4 x session hosts using the latest Windows 25H2 image. However, the blue-screening keeps happening.

 

I strongly suspect it's to do with the FSLogix profile containerisation, because that operates over SMB to their internal file server (unchanged for 18 months), but I can't prove it. I suspect it's user-induced in the sense that something in one of the user sessions attempts to do something that freaks the redirector out and crashes it. That said, I don't think it's something a user is knowingly doing (no one would have been working at 3 a.m., but there would have been disconnected sessions still logged in). It doesn't appear to be load-related (again, no one was working at 3 a.m.). It's not uptime-related (a host can crash, start back up, and then crash again). We have been focussing on the session hosts because they are the thing that has changed, but now we're looking wider.
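
For anyone who wants to sanity-check the same things on their own hosts, this is roughly how we have been looking at the FSLogix/SMB side from an affected session host. It's a minimal PowerShell sketch; the registry values shown are the standard FSLogix ones and nothing here is specific to our environment.

```powershell
# Sketch: inspect the FSLogix profile-container settings and the SMB
# session to the profile file server from an affected session host.

# FSLogix profile container configuration (registry-based)
Get-ItemProperty -Path 'HKLM:\SOFTWARE\FSLogix\Profiles' |
    Select-Object Enabled, VHDLocations

# Active SMB connections from this host, including the negotiated dialect
Get-SmbConnection |
    Select-Object ServerName, ShareName, Dialect, NumOpens

# SMB client (redirector) settings in effect on this host
Get-SmbClientConfiguration |
    Select-Object EnableMultiChannel, FileInfoCacheLifetime, DirectoryCacheLifetime
```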

  

Some other things that have been tried: we have done the normal SFC scan, DISM scans, chkdsk and memory testing. We have also moved all the session hosts onto new Azure hosts, so it should all be new hardware; crashes were still happening. We have even fully rebuilt the servers onto the latest image (the 25H2 rebuild mentioned above).
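
For reference, those checks were along these lines (run from an elevated prompt; a sketch rather than the exact commands we used):

```powershell
# Sketch of the standard integrity checks mentioned above (elevated prompt).
sfc /scannow                                  # System File Checker
DISM /Online /Cleanup-Image /RestoreHealth    # component store repair
chkdsk C: /scan                               # online NTFS scan

# Version of the SMB2/SMB3 redirector driver named in the bugcheck
(Get-Item "$env:SystemRoot\System32\drivers\mrxsmb20.sys").VersionInfo.ProductVersion
```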

We have ensured all drivers are up to date and the latest Windows updates are installed. We are at a loss as to what is triggering this SMB redirector crash. Any assistance or guidance would be appreciated. I have added the debug report below for anyone who wants to take a look.

Bugcheck Analysis

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffb8825c90e000, memory referenced.
Arg2: 0000000000000000, X64: bit 0 set if the fault was due to a not-present PTE.
                bit 1 is set if the fault was due to a write, clear if a read.
                bit 3 is set if the processor decided the fault was due to a corrupted PTE.
                bit 4 is set if the fault was due to attempted execute of a no-execute PTE.
                - ARM64: bit 1 is set if the fault was due to a write, clear if a read.
                bit 3 is set if the fault was due to attempted execute of a no-execute PTE.
Arg3: fffff8012ef9690e, If non-zero, the instruction address which referenced the bad memory
                address.
Arg4: 0000000000000000, (reserved)

Debugging Details:
------------------


KEY_VALUES_STRING: 1

    Key  : AV.Type
    Value: Read

    Key  : Analysis.CPU.mSec
    Value: 3453

    Key  : Analysis.Elapsed.mSec
    Value: 39469

    Key  : Analysis.IO.Other.Mb
    Value: 0

    Key  : Analysis.IO.Read.Mb
    Value: 1

    Key  : Analysis.IO.Write.Mb
    Value: 33

    Key  : Analysis.Init.CPU.mSec
    Value: 1140

    Key  : Analysis.Init.Elapsed.mSec
    Value: 11168

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 134

    Key  : Analysis.Version.DbgEng
    Value: 10.0.29457.1000

    Key  : Analysis.Version.Description
    Value: 10.2506.23.01 amd64fre

    Key  : Analysis.Version.Ext
    Value: 1.2506.23.1

    Key  : Bugcheck.Code.KiBugCheckData
    Value: 0x50

    Key  : Bugcheck.Code.LegacyAPI
    Value: 0x50

    Key  : Bugcheck.Code.TargetModel
    Value: 0x50

    Key  : Dump.Attributes.AsUlong
    Value: 0x20800

    Key  : Failure.Bucket
    Value: AV_R_(null)_mrxsmb20!memcpy

    Key  : Failure.Exception.IP.Address
    Value: 0xfffff8012ef9690e

    Key  : Failure.Exception.IP.Module
    Value: mrxsmb20

    Key  : Failure.Exception.IP.Offset
    Value: 0x3690e

    Key  : Failure.Hash
    Value: {a5546a08-4f6a-9f06-ba62-dfbeba1e8028}

    Key  : Hypervisor.Enlightenments.ValueHex
    Value: 0x2090ebf4

    Key  : Hypervisor.Flags.AnyHypervisorPresent
    Value: 1

    Key  : Hypervisor.Flags.ApicEnlightened
    Value: 1

    Key  : Hypervisor.Flags.ApicVirtualizationAvailable
    Value: 0

    Key  : Hypervisor.Flags.AsyncMemoryHint
    Value: 0

    Key  : Hypervisor.Flags.CoreSchedulerRequested
    Value: 0

    Key  : Hypervisor.Flags.CpuManager
    Value: 0

    Key  : Hypervisor.Flags.DeprecateAutoEoi
    Value: 0

    Key  : Hypervisor.Flags.DynamicCpuDisabled
    Value: 1

    Key  : Hypervisor.Flags.Epf
    Value: 0

    Key  : Hypervisor.Flags.ExtendedProcessorMasks
    Value: 1

    Key  : Hypervisor.Flags.HardwareMbecAvailable
    Value: 1

    Key  : Hypervisor.Flags.MaxBankNumber
    Value: 0

    Key  : Hypervisor.Flags.MemoryZeroingControl
    Value: 0

    Key  : Hypervisor.Flags.NoExtendedRangeFlush
    Value: 0

    Key  : Hypervisor.Flags.NoNonArchCoreSharing
    Value: 1

    Key  : Hypervisor.Flags.Phase0InitDone
    Value: 1

    Key  : Hypervisor.Flags.PowerSchedulerQos
    Value: 0

    Key  : Hypervisor.Flags.RootScheduler
    Value: 0

    Key  : Hypervisor.Flags.SynicAvailable
    Value: 1

    Key  : Hypervisor.Flags.UseQpcBias
    Value: 0

    Key  : Hypervisor.Flags.Value
    Value: 4853997

    Key  : Hypervisor.Flags.ValueHex
    Value: 0x4a10ed

    Key  : Hypervisor.Flags.VpAssistPage
    Value: 1

    Key  : Hypervisor.Flags.VsmAvailable
    Value: 1

    Key  : Hypervisor.RootFlags.AccessStats
    Value: 0

    Key  : Hypervisor.RootFlags.CrashdumpEnlightened
    Value: 0

    Key  : Hypervisor.RootFlags.CreateVirtualProcessor
    Value: 0

    Key  : Hypervisor.RootFlags.DisableHyperthreading
    Value: 0

    Key  : Hypervisor.RootFlags.HostTimelineSync
    Value: 0

    Key  : Hypervisor.RootFlags.HypervisorDebuggingEnabled
    Value: 0

    Key  : Hypervisor.RootFlags.IsHyperV
    Value: 0

    Key  : Hypervisor.RootFlags.LivedumpEnlightened
    Value: 0

    Key  : Hypervisor.RootFlags.MapDeviceInterrupt
    Value: 0

    Key  : Hypervisor.RootFlags.MceEnlightened
    Value: 0

    Key  : Hypervisor.RootFlags.Nested
    Value: 0

    Key  : Hypervisor.RootFlags.StartLogicalProcessor
    Value: 0

    Key  : Hypervisor.RootFlags.Value
    Value: 0

    Key  : Hypervisor.RootFlags.ValueHex
    Value: 0x0

    Key  : SecureKernel.HalpHvciEnabled
    Value: 0

    Key  : WER.OS.Branch
    Value: ge_release

    Key  : WER.OS.Version
    Value: 10.0.26100.1

    Key  : WER.System.BIOSRevision
    Value: 4.1.0.0


BUGCHECK_CODE:  50

BUGCHECK_P1: ffffb8825c90e000

BUGCHECK_P2: 0

BUGCHECK_P3: fffff8012ef9690e

BUGCHECK_P4: 0

FILE_IN_CAB:  MEMORY.DMP

VIRTUAL_MACHINE:  HyperV

DUMP_FILE_ATTRIBUTES: 0x20800

FAULTING_THREAD:  ffff828b21fb64c0

READ_ADDRESS:  ffffb8825c90e000 

MM_INTERNAL_CODE:  0

BLACKBOXBSD: 1 (!blackboxbsd)


BLACKBOXNTFS: 1 (!blackboxntfs)


BLACKBOXPNP: 1 (!blackboxpnp)


BLACKBOXWINLOGON: 1 (!blackboxwinlogon)


PROCESS_NAME:  System

STACK_TEXT:  
ffff9280`971364f8 fffff801`98ad654f     : 00000000`00000050 ffffb882`5c90e000 00000000`00000000 ffff9280`97136760 : nt!KeBugCheckEx
ffff9280`97136500 fffff801`98640510     : 00000000`00000000 ffff8000`00000000 ffffb882`5c90e000 0000007f`fffffff8 : nt!MiSystemFault+0x3053a3
ffff9280`971365f0 fffff801`98aacfcb     : 00000000`00000000 00000000`00002070 ffff828b`126e0460 ffff828b`0ea3f890 : nt!MmAccessFault+0x630
ffff9280`97136760 fffff801`2ef9690e     : fffff801`2ef716bb ffff828b`256f9c38 ffff828b`256f9818 ffffd710`16d08137 : nt!KiPageFault+0x38b
ffff9280`971368f8 fffff801`2ef716bb     : ffff828b`256f9c38 ffff828b`256f9818 ffffd710`16d08137 fffff801`2ee78cc2 : mrxsmb20!memcpy+0x10e
ffff9280`97136900 fffff801`2ee7c471     : 00000000`00000000 00000000`00000000 00000000`00000103 ffff828b`4a3fe800 : mrxsmb20!Smb2Write_Start+0x60b
ffff9280`97136a10 fffff801`2efb299a     : ffff9280`97136b01 00000000`00000000 ffff828b`00000000 ffff828b`00000000 : mrxsmb!SmbCeInitiateExchange+0xbf1
ffff9280`97136af0 fffff801`2ee97ca4     : ffff828b`4a3fe818 00000000`00000000 ffffe209`d661e240 ffff828b`126e0460 : mrxsmb20!MRxSmb2Write+0x1da
ffff9280`97136b60 fffff801`2da86168     : ffffe209`d74dc9e8 ffff828b`126e0460 ffff9280`97136c29 ffffe20a`0df7e7d0 : mrxsmb!SmbShellWrite+0x24
ffff9280`97136b90 fffff801`2da49d32     : ffff828b`126e0460 ffff828b`126e0460 00000000`00000000 00000000`00000001 : csc!CscWrite+0x298
ffff9280`97136c90 fffff801`2da481ca     : ffff828b`1f46e1b8 ffffe209`d661e240 fffff801`2da14048 ffff828b`1f46e010 : rdbss!RxLowIoSubmit+0x282
ffff9280`97136d00 fffff801`2d9e7f7a     : ffff828b`20e5b043 ffff828b`3e08c401 fffff801`2da14048 fffff801`2da14048 : rdbss!RxLowIoWriteShell+0x8a
ffff9280`97136d30 fffff801`2da512b7     : fffff801`2da16880 ffff828b`3e08c401 ffff828b`1f46e010 00000000`00000000 : rdbss!RxCommonFileWrite+0x8ba
ffff9280`97136f20 fffff801`2d9e31fb     : ffff828b`126e0460 ffff828b`1f46e010 ffff828b`3e08c400 ffff828b`00000000 : rdbss!RxCommonWrite+0xd7
ffff9280`97136f50 fffff801`2da4be04     : ffff828b`3220f300 fffff801`2a01877f 00000000`00000000 fffff801`986e9752 : rdbss!RxFsdCommonDispatch+0x69b
ffff9280`97137120 fffff801`2eef7886     : 00000000`c0410002 ffff828b`3e53b3b0 ffffe20a`0df7ecd0 00000000`00000000 : rdbss!RxFsdDispatch+0x84
ffff9280`97137170 fffff801`987ab63d     : fffff801`2c62a010 ffff828b`3e08c460 ffff828b`1f46e200 ffffe209`b7655710 : mrxsmb!MRxSmbFsdDispatch+0xa6
ffff9280`971371b0 fffff801`2c639f03     : ffff828b`3e08c468 ffff828b`1f46e010 ffff828b`1f46e010 ffff828b`1f46e200 : nt!IofCallDriver+0xcd
ffff9280`971371f0 fffff801`2c639b89     : ffffe209`b7655710 00000000`00000000 00000000`00000000 ffff828b`1f46e010 : mup!MupStateMachine+0x1b3
ffff9280`97137270 fffff801`987ab63d     : ffff828b`0c678b20 00000000`00000000 ffff828b`45cbf150 ffff828b`1f46e010 : mup!MupFsdIrpPassThrough+0xd9
ffff9280`971372e0 fffff801`2a018d8d     : ffff828b`3e53b3b0 ffff828b`3220f300 ffff9280`971373f0 fffff801`2a02c72f : nt!IofCallDriver+0xcd
ffff9280`97137320 fffff801`2a02c1a0     : ffff9280`971373f0 ffff828b`00000000 ffff828b`0c678b00 00000000`00000000 : FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x12d
ffff9280`97137390 fffff801`987ab63d     : 00000000`00000001 ffff828b`1f46e010 ffff828b`24d6c048 ffff828b`24d6c160 : FLTMGR!FltpDispatch+0x280
ffff9280`97137430 fffff801`306d4513     : ffff828b`1f46e010 00000000`00000000 ffff828b`24d6c8c0 fffff801`986e9752 : nt!IofCallDriver+0xcd
ffff9280`97137470 fffff801`306d3c48     : ffff9280`971378ac ffff9280`97137660 00000000`00000000 ffff828b`1f46e010 : vhdmp!VhdmpiCallDriverForEnteredSafeFileReference+0x1f3
ffff9280`971374f0 fffff801`306ddfe1     : ffff828b`24d6c000 fffff801`306dddc4 00000000`00000007 ffff828b`12c2db00 : vhdmp!VhdmpiFileWrapperCallDriver+0x78
ffff9280`97137520 fffff801`306ddbee     : ffff828b`216554b0 ffff828b`24d6c000 ffff9280`97137660 fffff801`988ecfae : vhdmp!VhdmpiCallDriverWithoutBlocking+0x111
ffff9280`97137580 fffff801`306d93a7     : ffff9280`971376b0 ffff828b`24d6c8c0 00000000`00000001 00000000`00000000 : vhdmp!VhdmpiVhd2FastPathSubIoRoutineEx+0xde
ffff9280`971375b0 fffff801`306d8eb1     : ffff9280`97137730 ffff828b`216554b0 00000000`00000001 ffff828b`21fb64c0 : vhdmp!Vhd2iIssueReadWriteInitialized+0x1b7
ffff9280`97137630 fffff801`306dedb1     : 00000000`00000000 00000000`00000000 ffff828b`08698020 ffff828b`216554b0 : vhdmp!VhdmpiVhd2FastPathIo+0x181
ffff9280`97137900 fffff801`306de980     : 00000000`00000000 fffff801`986dc03b 00000000`00000000 ffff9280`97137990 : vhdmp!VhdmpiStartSrbExtensionAfterRct+0x1a1
ffff9280`97137940 fffff801`307ab9f6     : ffff9280`97137ad0 fffff801`993cfb00 00000000`00000000 ffff828b`08698020 : vhdmp!VhdmpiStartSrbExtensionAndRelease+0x280
ffff9280`971379a0 fffff801`986db7ec     : ffff828b`21fb64c0 ffff828b`21fb6400 ffff9280`97137a00 ffff828b`08698020 : vhdmp!VhdmpiSrbExtensionWorkerRoutine+0x36
ffff9280`971379d0 fffff801`98881afa     : ffff828b`21fb64c0 ffff828b`21fb64c0 fffff801`986db200 ffff828b`08698020 : nt!ExpWorkerThread+0x5ec
ffff9280`97137bb0 fffff801`98a9ef84     : fffff801`28483180 ffff828b`21fb64c0 fffff801`98881aa0 00000000`00000072 : nt!PspSystemThreadStartup+0x5a
ffff9280`97137c00 00000000`00000000     : ffff9280`97138000 ffff9280`97131000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x34


SYMBOL_NAME:  mrxsmb20!memcpy+10e

MODULE_NAME: mrxsmb20

IMAGE_NAME:  mrxsmb20.sys

STACK_COMMAND: .process /r /p 0xffff828b086a9040; .thread 0xffff828b21fb64c0 ; kb

BUCKET_ID_FUNC_OFFSET:  10e

FAILURE_BUCKET_ID:  AV_R_(null)_mrxsmb20!memcpy

OS_VERSION:  10.0.26100.1

BUILDLAB_STR:  ge_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {a5546a08-4f6a-9f06-ba62-dfbeba1e8028}

Followup:     MachineOwner
---------


5 answers

Sort by: Most helpful
  1. Matt Russell 41 Reputation points
    2025-11-21T00:12:11.7066667+00:00

    Hi all,

    Closing the loop on this one - I believe we have identified the cause and have remediated it. Posting here for future reference in case anyone runs into the same issue.

    Reviewing the stack trace above shows involvement of csc.sys, which is the Client-Side Caching / Offline Files functionality for Windows Server file shares. We determined that, through a config error, Offline Files / Client-Side Caching was enabled on the file share that houses the FSLogix profile containers. This is a massive no-no for an FSLogix deployment. From what we can tell, it's 100% my fault - I created that share about 18 months ago and for whatever reason did not disable the Offline Files capability.

    It had been running fine in this state for the past 16 months with only 2 x session host VMs and had given no problems. However, once we upgraded to the 4 x new session host VMs, the issues started. I don't know whether CSC engaged due to the additional load of the new VMs, or whether the newer OS version preferred CSC where the previous OS version was ignoring it. Either way, it was clearly the problem.

    We have disabled Offline Files on the share, and we have also disabled Client-Side Caching on the session host VMs via GPO. The environment has now run in this state for an entire week without a crash. We're still monitoring the situation - if we can get through another week without any issues then I'll be happy to call it done.

    Whilst I'm frustrated that it was my misconfiguration that caused the issue, I'm relieved that we identified the issue and have fixed it, and that the customer environment is now stable once more.

    For anyone else seeing similar behaviour, check the stack trace for any involvement of CSC, and check the state of the Offline Files settings on the SMB share that houses the FSLogix VHD containers.
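
    If it helps, the checks and changes we made were roughly along these lines. 'Profiles' is a placeholder share name, and the client-side setting is shown in its registry form purely for illustration - we actually applied it through the "Allow or Disallow use of the Offline Files feature" Group Policy setting.

    ```powershell
    # --- On the file server: review caching on each share, then disable
    #     client-side caching on the FSLogix profile share.
    #     'Profiles' is a placeholder share name. ---
    Get-SmbShare | Select-Object Name, Path, CachingMode
    Set-SmbShare -Name 'Profiles' -CachingMode None -Force

    # --- On the session hosts: the registry value that backs the
    #     "Allow or Disallow use of the Offline Files feature" policy,
    #     as we understand it (0 = Offline Files disallowed).
    #     We set this via GPO rather than directly in the registry. ---
    New-Item -Path 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\NetCache' -Force | Out-Null
    Set-ItemProperty -Path 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\NetCache' `
        -Name 'Enabled' -Value 0 -Type DWord
    ```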

    Cheers,
    Matt

    1 person found this answer helpful.

  2. John Wielemaker 16 Reputation points
    2025-11-14T10:25:03.11+00:00

    After some debugging we found that we do not have this problem in our validation environment. We dug deeper and found that the problem started after an update to the SxS network stack. We are running the following versions:

    Production pool: 1.0.2507.25425 - installed on the existing production environment on 25-10 due to scheduled weekend updates. This pool has the issues.

    Validation pool: 1.0.2507.25700 - installed on the existing validation environment on 13-11.

    Validation pool: 1.0.2507.25450 - installed on the existing validation environment on 31-10.

    At the moment we are installing the 25700 version manually on a couple of servers in production to check whether the problem is resolved.
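
    For anyone wanting to compare versions on their own hosts, the installed Remote Desktop agent and SxS network stack builds can be read from the standard uninstall registry keys - a rough sketch:

    ```powershell
    # Sketch: list installed Remote Desktop agent / SxS network stack versions
    # from the standard uninstall registry keys (check the WOW6432Node hive
    # as well if nothing shows up here).
    Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall\*' |
        Where-Object { $_.DisplayName -like '*Remote Desktop*' -or $_.DisplayName -like '*SxS*' } |
        Select-Object DisplayName, DisplayVersion |
        Sort-Object DisplayName
    ```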


  3. Matt Russell 41 Reputation points
    2025-12-02T00:45:57.67+00:00

    Update to this - we haven't fixed it.

    We had 2 weeks without any noticeable impact to the customer, but late last week 3 of the AVD session hosts crashed out within hours of each other. It's the exact same bugcheck and the exact same faulting module and operation. The only difference I can see in the crash dumps is that we are no longer seeing Client-Side Caching in the stack trace.

    So we're back actively working on trying to figure out what's going on here.

    We've got a support job open with our Pax8 team. They currently suspect ThreatLocker (for no other reason, I believe, than that it has a filter driver in the stack), so we're working through the troubleshooting with both them and ThreatLocker on that front.
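
    For anyone following along, a quick way to see which file-system minifilter drivers are actually loaded on a host (ThreatLocker's included) is fltmc, along these lines:

    ```powershell
    # Sketch: enumerate the loaded file-system minifilter drivers and their
    # altitudes to see what sits in the I/O path alongside the SMB redirector.
    fltmc filters

    # Instances attached to a specific volume (C: in this example)
    fltmc instances -v C:
    ```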

    We'll continue to update this thread with more info as it comes to hand. If anyone else has any other info please feel free to add it in here and we'll see if we can't figure out what's causing this.

    It's doing our heads in, and obviously is quite disruptive to the customer and their team.


  4. Q&A Assist
    2025-11-14T01:30:59.2933333+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    The issue you are encountering with the Azure Virtual Desktop (AVD) environment, specifically the blue screen error with the code PAGE_FAULT_IN_NONPAGED_AREA (50) related to mrxsmb20.sys, indicates a problem with memory access that is typically associated with driver issues or memory corruption. Given that this has started occurring after migrating to new session hosts, here are some troubleshooting steps and considerations:

    1. Driver Issues: Ensure that all drivers, particularly network drivers and any related to SMB, are up-to-date. Since the error is linked to mrxsmb20.sys, it may be beneficial to check for updates or known issues with this driver in the context of Windows 11 multi-session.
    2. FSLogix Configuration: Since you suspect FSLogix might be involved, review the FSLogix configuration settings. Ensure that the profile containers are correctly configured and that there are no issues with the SMB shares being used. You might also want to test with a different profile container setup or temporarily disable FSLogix to see if the crashes persist.
    3. Memory Diagnostics: Although you mentioned running memory tests, consider running a more thorough diagnostic tool or using Windows Memory Diagnostic to rule out any hardware memory issues that could lead to these faults.
    4. Event Logs: Check the Windows Event Viewer for any warnings or errors that occur around the time of the crashes (see the sketch after this list). This may provide additional context or clues about what processes or actions are triggering the blue screens.
    5. Load Testing: Since the issue seems to be sporadic and not directly related to user load, consider simulating load on the session hosts to see if you can reproduce the issue under controlled conditions. This could help identify if specific user actions or workloads are contributing to the crashes.
    6. Azure Support: If the issue persists despite these efforts, consider reaching out to Azure Support for assistance. They can provide deeper insights into the Azure infrastructure and help diagnose potential issues related to the virtual machine setup.
    7. Debugging Tools: Utilize debugging tools to analyze the memory dump files generated during the crashes. This can provide more detailed information about the state of the system at the time of the crash and help pinpoint the exact cause.

    By following these steps, you should be able to gather more information and potentially resolve the blue screen issues affecting your AVD environment.


  5. John Wielemaker 16 Reputation points
    2025-11-14T09:22:43.8566667+00:00

    James, we have the same issue at a customer site. The problem started there on 28 October and we have been troubleshooting a lot of things to find the cause. The Azure Virtual Desktop machines run on an Azure Local cluster and we rolled out the latest 24H2 multi-session image from the marketplace. What we have tried so far: disabling antivirus; reading the dump file (exactly the same result as your dump file); changing the underlying host (Azure Local cluster); reinstalling the OS with the latest 24H2 build; checking the latency on the underlying storage (no issues); and confirming SMB1 is disabled. I have also opened a case for this. Maybe we can share details about the progress of our cases so we can find out as quickly as possible what the problem is. We suspect the Azure Virtual Desktop agent updates.
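
    If it helps to compare notes, the SMB1 state and the basic SMB client/server settings can be checked from PowerShell roughly like this:

    ```powershell
    # Sketch: confirm SMB1 is removed/disabled and review basic SMB settings
    # on a session host.
    Get-WindowsOptionalFeature -Online -FeatureName SMB1Protocol |
        Select-Object FeatureName, State

    Get-SmbServerConfiguration | Select-Object EnableSMB1Protocol, EnableSMB2Protocol
    Get-SmbClientConfiguration | Select-Object EnableMultiChannel, RequireSecuritySignature
    ```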

