https://github.com/ditrixnew/windows_gpu_recovery
Flutter Windows plugin that recovers from `EGL_CONTEXT_LOST` / `DXGI_ERROR_DEVICE_REMOVED` after system sleep, Hyper-V save/restore, TDR timeout, or GPU driver reset.
https://github.com/ditrixnew/windows_gpu_recovery
angle d3d11 flutter gpu windows
Last synced: 2 months ago
JSON representation
Flutter Windows plugin that recovers from `EGL_CONTEXT_LOST` / `DXGI_ERROR_DEVICE_REMOVED` after system sleep, Hyper-V save/restore, TDR timeout, or GPU driver reset.
- Host: GitHub
- URL: https://github.com/ditrixnew/windows_gpu_recovery
- Owner: DitriXNew
- License: mit
- Created: 2026-04-02T03:23:49.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2026-04-02T03:50:48.000Z (3 months ago)
- Last Synced: 2026-04-04T10:39:23.387Z (3 months ago)
- Topics: angle, d3d11, flutter, gpu, windows
- Language: C++
- Homepage:
- Size: 77.1 KB
- Stars: 5
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# windows_gpu_recovery
[](https://pub.dev/packages/windows_gpu_recovery)
[](https://pub.dev/packages/windows_gpu_recovery/score)
[](LICENSE)
[](https://pub.dev/packages/windows_gpu_recovery)
Flutter Windows plugin that recovers from `EGL_CONTEXT_LOST` / `DXGI_ERROR_DEVICE_REMOVED` after system sleep, Hyper-V save/restore, TDR timeout, or GPU driver reset. Without this plugin, the app freezes permanently with a white screen.
> **The recovery does NOT work when a debugger is attached** (VS Code `flutter run`, `F5`, or any external debugger). This is a Win32 debugging model limitation — the debugger intercepts exceptions before the plugin can handle them. **Run the compiled exe directly for testing and production.**
## Demo
https://github.com/user-attachments/assets/b769f469-1701-487c-9459-facba1dde675
## What it does
After GPU device loss, a Flutter Windows app freezes forever — rendering stops, the window goes white, Windows marks it "Not Responding". CPU logic (Dart isolate, timers, method channels) continues working, but no frames are drawn.
This plugin:
1. **Detects** GPU device loss via a persistent sentinel D3D11 device (2-second polling)
2. **Destroys** the dead Flutter engine with crash protection (Vectored Exception Handler catches ANGLE cleanup crashes)
3. **Recreates** a fresh `FlutterViewController` with new ANGLE display, new D3D device, new Skia context
4. **Preserves state** — app saves state to SharedPreferences before recreation, restores on startup
The window stays open. The process stays alive. User sees a brief white flash (~1-3 seconds), then the app is back with restored state.
## Integration
### 1. Add the dependency
```yaml
dependencies:
windows_gpu_recovery:
path: ../windows_gpu_recovery # or git URL
```
### 2. Modify `windows/runner/flutter_window.cpp`
Add the `WM_GPU_RECOVERY` handler. This is the only C++ change needed:
```cpp
#include
// In FlutterWindow::MessageHandler, BEFORE the Flutter message dispatch:
if (message == WM_GPU_RECOVERY) {
// Write a recovery marker file for the new Dart VM.
// (Your state-saving logic here — or use SharedPreferences from Dart side)
// Destroy old engine — VEH catches ANGLE cleanup crashes.
// eglTerminate clears the per-process Display singleton.
flutter_controller_.reset();
// Poll until GPU is ready (adapter recovered).
// D3D11CreateDevice every 200ms, max 10 seconds.
// (See example_with_recovery for full polling implementation)
Sleep(3000); // simple version
// Create fresh engine.
RECT frame = GetClientArea();
flutter_controller_ = std::make_unique(
frame.right - frame.left, frame.bottom - frame.top, project_);
RegisterPlugins(flutter_controller_->engine());
SetChildContent(flutter_controller_->view()->GetNativeWindow());
flutter_controller_->ForceRedraw();
return 0;
}
```
### 3. Persist state on Dart side
The Dart VM restarts on recreation (new isolate). SharedPreferences (on disk) survives:
```dart
// Save state continuously (every significant change):
final prefs = await SharedPreferences.getInstance();
await prefs.setInt('counter', _counter);
// On startup, check for recovery marker:
final exeDir = File(Platform.resolvedExecutable).parent.path;
final marker = File('$exeDir/gpu_recovery.marker');
if (marker.existsSync()) {
marker.deleteSync();
final savedCounter = prefs.getInt('counter') ?? 0;
// Restore state, show banner, etc.
}
```
### 4. No Dart-side initialization needed
The plugin works entirely at the native C++ level. Just adding the dependency is enough — the sentinel device and VEH are set up automatically on plugin registration.
## Testing
> Recovery does NOT work under a debugger. Always test by running the exe directly.
### Triggering GPU device loss
**`dxcap -forcetdr`** (recommended — clean, repeatable):
```cmd
:: Run as Administrator. Part of Windows SDK / Visual Studio.
dxcap -forcetdr
```
Install: `C:\Program Files (x86)\Windows Kits\10\bin\\x64\dxcap.exe`.
Other methods:
- **System sleep** — close laptop lid or Start > Sleep (real hardware only, not VMs)
- **Hyper-V** — Save VM + Start VM from host (no `WM_POWERBROADCAST` delivered)
- **Device Manager** — Disable/Enable GPU adapter (not over RDP)
- **PowerShell** — `Disable-PnpDevice` / `Enable-PnpDevice` (as Administrator)
### A/B test procedure
1. `flutter build windows --release` in both `example` and `example_without_recovery`
2. Launch both exe files directly (not via VS Code)
3. Click the counter several times
4. Run `dxcap -forcetdr` as Administrator
5. Observe:
- **Without plugin** (red theme): frozen white screen, counter lost
- **With plugin** (green theme): brief white flash, counter restored, green banner
### Logs
The `example` app writes to `gpu_recovery.log` next to the exe:
```
[GPU_RECOVERY] Sentinel D3D device created
[GPU_RECOVERY] Plugin initialized
[GPU_RECOVERY] Watchdog started (2 s interval)
[GPU_RECOVERY] Device loss detected — activating exception handler
[GPU_RECOVERY] Posting WM_GPU_RECOVERY
[GPU_RECOVERY] WM_GPU_RECOVERY received
[GPU_RECOVERY] Destroying old engine...
[GPU_RECOVERY] Old engine destroyed
[GPU_RECOVERY] GPU ready after 200 ms
[GPU_RECOVERY] Creating new engine...
[GPU_RECOVERY] Plugin initialized
[GPU_RECOVERY] Engine recreated successfully
```
## What survives vs what restarts
```
SURVIVES (same process): RESTARTS (new engine):
Win32 window (same HWND) Dart VM (new isolate)
C++ process (same PID) Riverpod/Provider state
SharedPreferences (on disk) Navigation stack
FlutterSecureStorage (on disk) In-memory caches
Files, network connections Stream subscriptions
```
## Project structure
```
windows_gpu_recovery/
lib/windows_gpu_recovery.dart — library declaration (no Dart API needed)
windows/
windows_gpu_recovery_plugin.cpp — sentinel device + VEH + watchdog
windows_gpu_recovery_plugin.h
include/.../gpu_recovery_message.h — WM_GPU_RECOVERY constant
CMakeLists.txt — links dxgi, d3d11
example/ — counter app + plugin + state restore
example_without_recovery/ — counter app, no plugin (freezes)
```
---
How it works internally
## Detection: sentinel D3D11 device
A persistent D3D11 device is created on the same DXGI adapter that ANGLE uses (`FlutterDesktopViewGetGraphicsAdapter`). This sentinel device stays alive for the plugin's lifetime. When the GPU resets, `GetDeviceRemovedReason()` on the sentinel returns `DXGI_ERROR_DEVICE_REMOVED` — permanently, even after the adapter recovers (~100ms).
A freshly created test device would succeed (the adapter is back), so polling with temporary devices gives false negatives. The sentinel device shares the fate of ANGLE's internal device — both get the same `DEVICE_REMOVED` status.
## Recovery: VEH-protected engine recreation
### The problem
`flutter_controller_.reset()` calls the engine destructor → `eglTerminate` → ANGLE cleanup → `Renderer11::release()` → tries to `Release()` dead D3D COM objects → `ACCESS_VIOLATION` crash.
Without crash protection, the process dies. With `.release()` (leak without destructor), `eglTerminate` is never called → the per-process ANGLE `EGLDisplay` singleton stays in `mInitialized = true` state → new engine gets the same dead display.
### The solution
A Vectored Exception Handler (VEH) is installed via `AddVectoredExceptionHandler`. It catches `EXCEPTION_ACCESS_VIOLATION` inside `flutter_windows.dll`'s address range and skips the faulting instruction:
1. Decodes x64 instruction length (handles REX prefixes, ModRM, SIB, displacements)
2. Advances RIP past the crashing instruction
3. Sets RAX=0, RDX=0 (fake null/success return values)
4. Sets EFLAGS ZF=1 (dead pointers treated as null in subsequent conditional jumps)
With VEH active, `flutter_controller_.reset()` runs the full destructor. ANGLE's cleanup partially crashes (VEH catches), but `eglTerminate` reaches `Display::mInitialized = false` — clearing the singleton. The new `FlutterViewController` calls `eglInitialize`, which sees `mInitialized == false` and performs a full reinitialization: new `Renderer11`, new `D3D11Device`, new EGL contexts.
### Why approach 6 (without VEH) failed
In approach 6, `.release()` was used to avoid crashes. The engine was leaked, `eglTerminate` never ran, the Display singleton stayed poisoned. The VEH (developed for approach 9's binary patching experiments) turned out to be the critical enabler — it lets the destructor run safely, which clears the singleton.
### Debugger limitation
When a debugger is attached, Windows sends first-chance exceptions to the debugger **before** the VEH. The Dart debug adapter treats any unhandled exception as fatal and terminates the process. This is a Win32 debugging model constraint, not a plugin bug. The plugin works correctly when no debugger is attached.
---
Investigation log — for those who enjoy a detective story
## The problem
Flutter Windows applications become permanently unresponsive after any event that triggers a D3D11 device removal. The Flutter engine has no recovery path. Issue [flutter#124194](https://github.com/flutter/flutter/issues/124194) has been open since April 2023 at P2 priority with no fix or assigned engineer.
We conducted an exhaustive investigation: 11 approaches, 5 binary patches to `flutter_windows.dll`, a custom x64 instruction length decoder, a Vectored Exception Handler, and interactive reverse engineering with x64dbg.
## The rendering stack
```
Layer 7: Vsync scheduler (DWM) — DEAD (swap chain gone)
Layer 6: Flutter frame scheduling — IDLE (no vsync = no frames)
Layer 5: Skia GrDirectContext — ABANDONED (auto-detected)
Layer 4: EGL Context (mContextLost) — PERMANENTLY LOST
Layer 3: EGL Display (mDeviceLost) — PERMANENTLY LOST
Layer 2: ANGLE Renderer11 (D3D11) — REMOVED
Layer 1: D3D11 adapter (hardware) — RECOVERED in ~100ms
```
## Approach 1: InvalidateRect
Force repaint. Result: `eglMakeCurrent` returns `EGL_CONTEXT_LOST`. Can't paint with dead context.
## Approach 2: WM_POWERBROADCAST + WM_SIZE
Intercept resume from sleep, force resize. Result: swap chain resized but EGL context is dead. Also, `WM_POWERBROADCAST` not delivered in Hyper-V.
## Approach 3: RunOnSeparateThread
`UIThreadPolicy::RunOnSeparateThread`. Result: fixes performance regression (#178916), not context loss.
## Approach 4: SW_HIDE/SW_SHOW on Flutter child HWND
Plugin accesses `FlutterDesktopViewGetHWND`. Hide/show triggers surface recreation. Result: new surface on dead context → still fails.
## Approach 5: Sentinel D3D11 device
Create persistent device on ANGLE's adapter, poll `GetDeviceRemovedReason()`. **Works for detection.** Temporary test devices give false negatives (adapter recovers in ~100ms).
## Approach 6: FlutterViewController recreation
Destroy old controller, create new. Finding: `.reset()` crashes (ANGLE cleanup). `.release()` leaks but new engine gets same dead `EGLDisplay` singleton. Dead end without VEH.
## Approach 7: Binary patching — ANGLE recovery (5 patches)
ANGLE has `restoreLostDevice()` infrastructure. Blocked by two checks:
**Patches 1-2:** NOP `isResetNotificationEnabled` check in `restoreLostDevice`:
```
Pattern: 89 c1 b0 01 ba 0e 30 00 00 84 c9 75
Patch: 75 XX → 90 90
```
**Patches 3-5:** Skip `isContextLost` checks in `ValidateContext` (block before `ValidateDisplay`):
```
Patch 3: 0x00b33247: 74 → EB
Patch 4: 0x00b3585b: 74 → EB
Patch 5: 0x00b387f5: 74 → EB
```
Result: `restoreLostDevice` reached but `Renderer11::resetDevice()` → `release()` crashes on dead COM objects.
## Approach 8: ForceRedraw loop
After device loss, DWM stops vsync. `ForceRedraw` bypasses via `ScheduleFrame`. Result: frames scheduled but all fail on dead ANGLE context. With patches, `release()` crashes.
## Approach 9: Vectored Exception Handler
Custom x64 instruction decoder catches ACCESS_VIOLATION inside `flutter_windows.dll`. Skips faulting instructions with ZF=1. Result: 300+ AVs caught, app stays alive, but `release()` + `initialize()` compiler-inlined → skipping release skips initialize too.
## Approach 10: x64dbg reverse engineering
Found `GpuSurfaceGLSkia::AcquireFrame` by error string reference. Traced to `GrDirectContext*` at `[this+0x10]`. Found `fAbandoned` at `[sub+0xD8]` — already true. Found `releaseResourcesAndAbandonContext` — never called. Manually invoked via register injection — early return (already abandoned). Nobody creates replacement `GrDirectContext`.
## Approach 10b-c: Code cave + JMP patch
Found `release()` + `initialize()` inlined into 630-byte function. Crash at +431 (`cmp [r13+0x38E0], 0`). Found conditional `je +0x23` at +424 that skips to initialize. Patched to `jmp` (one byte). Result: release skipped, but initialize reads dead Renderer11 fields → VEH returns 0 → `D3D11CreateDevice` gets wrong params.
## 9 barriers deep
| # | Barrier | How we broke it |
|---|---------|----------------|
| 1 | restoreLostDevice blocked | Patch: NOP jnz |
| 2 | isContextLost blocks first | Patch: je → jmp |
| 3 | Vsync dead | ForceRedraw loop |
| 4 | release() crashes | VEH |
| 5 | release+initialize inlined | Patch: skip release |
| 6 | initialize reads dead fields | VEH returns 0 |
| 7 | Skia auto-abandoned | x64dbg confirmed |
| 8 | EGLDisplay singleton | ANGLE architecture |
| 9 | Renderer11 layout unknown | No PDB/RTTI |
## Approach 11: The breakthrough
Instead of fixing ANGLE from below (9 barriers), attack from above: destroy the entire engine and create a new one. Approaches 6 failed because `.release()` skipped `eglTerminate`. But the VEH from approach 9 lets `.reset()` run safely — the destructor calls `eglTerminate`, which clears `Display::mInitialized`, which lets the new engine reinitialize ANGLE from scratch.
The VEH that failed for approach 9 (couldn't fix ANGLE internally) became the critical enabler for approach 11 (protects the destructor during engine recreation).
---
## What a proper Flutter engine fix would look like
This plugin is a workaround — it restarts the Dart VM. A proper engine fix (~200 lines) would preserve the VM:
1. `egl/surface.cc` — detect `EGL_CONTEXT_LOST` specifically, signal to engine
2. `egl/manager.cc` — add `Reinitialize()` with SEH-protected `eglTerminate`
3. `flutter_windows_engine.cc` — `HandleDeviceLost()`: suspend raster thread, `GrDirectContext::abandonContext()`, reinitialize EGL, recreate compositor
4. `flutter_window.cc` — handle `WM_POWERBROADCAST`
## References
- [flutter#124194](https://github.com/flutter/flutter/issues/124194) — App freezes when GPU disabled (P2, open, no fix)
- [flutter#88649](https://github.com/flutter/flutter/issues/88649) — Desktop app freezes after sleep (closed, no fix)
- [flutter#163732](https://github.com/flutter/flutter/issues/163732) — Black/white screen after sleep in Parallels
- [ANGLE commit (2016)](https://chromium.googlesource.com/angle/angle/+/2823) — "device-loss at the display level is non-recoverable"
- [ANGLE Renderer11.cpp](https://chromium.googlesource.com/angle/angle/+/refs/heads/main/src/libANGLE/renderer/d3d/d3d11/Renderer11.cpp)
- [Microsoft: Handle device removed scenarios](https://learn.microsoft.com/en-us/windows/uwp/gaming/handling-device-lost-scenarios)
## License
MIT