We’re self-hosting the words-cloud Docker image (version 25.6, and also testing 26.5) with a ~2GB container memory limit. The compare online API throws a managed OOM (System.OutOfMemoryException) while the container’s peak memory is only ~1.5GiB. This seems to line up with .NET’s default GC heap hard limit of 75% of the container memory limit.
GC.GetGCMemoryInfo().TotalAvailableMemoryBytes reports a default ceiling of 1536MiB. Setting DOTNET_GCHeapHardLimitPercent=0x5A (the value is read as hex, i.e. 90%) raises it to 1843MiB. With the limit raised to 90%, concurrent compares that previously hit OOM at around 5 parallel requests now run cleanly with up to 6. So this change seems to help at the margin.
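For reference, the arithmetic behind those two ceilings can be checked with a short sketch. It assumes the container limit is exactly 2GiB (2048MiB), which is what the 1536MiB figure implies; the function name is just for illustration.

```python
# Assumption: the container memory limit is exactly 2 GiB (2048 MiB).
CONTAINER_LIMIT_MIB = 2048


def gc_hard_limit_mib(container_limit_mib: int, percent: int) -> int:
    """GC heap ceiling in whole MiB for a given hard-limit percentage."""
    return container_limit_mib * percent // 100


default_pct = 75                # .NET default inside a memory-limited container
override_pct = int("0x5A", 16)  # the env var value is hex, so 0x5A is 90

print(gc_hard_limit_mib(CONTAINER_LIMIT_MIB, default_pct))   # 1536
print(gc_hard_limit_mib(CONTAINER_LIMIT_MIB, override_pct))  # 1843
```

Both results match what TotalAvailableMemoryBytes reported, so the override appears to be applied as expected.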
However, at 90% the peak usage comes right up against the ~2GB container limit, which leaves little room for native (non-GC) memory. So far the failures I’m seeing are graceful per-request OOMs, but my concern is that the kernel OOM killer could instead kill the process and cause a full pod restart. As one data point, on 26.5 I saw a one-off SIGSEGV during a high-concurrency test.
Do you recommend configuring DOTNET_GCHeapHardLimitPercent or DOTNET_GCHeapHardLimit, or would you advise that we leave the default in place? If you do recommend overriding it, what value would you suggest? Some guidance on a safe amount of headroom between the GC heap hard limit and the container memory limit would be helpful.
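To make the headroom question concrete, this is roughly how I would translate a chosen native-memory reserve into the setting. It is only a sketch: the helper names are hypothetical, the 2048MiB limit is an assumption, and the 256MiB and 512MiB reserves are purely illustrative, not values I am proposing.

```python
def hard_limit_percent_for_reserve(container_limit_mib: int,
                                   native_reserve_mib: int) -> int:
    """Largest whole percentage whose GC ceiling still leaves the reserve free."""
    if not 0 <= native_reserve_mib < container_limit_mib:
        raise ValueError("reserve must be non-negative and below the container limit")
    return (container_limit_mib - native_reserve_mib) * 100 // container_limit_mib


def env_value(percent: int) -> str:
    """Format the percentage as hex, as DOTNET_GCHeapHardLimitPercent expects."""
    return f"0x{percent:X}"


# Illustrative reserves only; the right amount of headroom is the open question.
for reserve_mib in (256, 512):
    pct = hard_limit_percent_for_reserve(2048, reserve_mib)
    print(f"reserve={reserve_mib}MiB -> "
          f"DOTNET_GCHeapHardLimitPercent={env_value(pct)} ({pct}%)")
```

Note that a 512MiB reserve on a 2048MiB limit works out to 75%, which is the default.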
Thanks.