问题背景
应用使用G1 GC,在某一时刻后,开始出现频繁的full gc
排查
通过gc 日志获取一些线索
存在大量大对象分配,大部分触发GC的原因为大对象分配,相关日志片段:[GC pause (G1 Humongous Allocation)
存在转移失败(to-space exhausted),转移失败一般发生在young gc后survivor区和老年代空间没有足够空间容纳存活对象时,转移失败会导致较长的young gc耗时。转移失败时,回收器会将未成功复制的分区全部置为老年代分区,并且一般之后会紧跟着一次full gc进行全堆的回收。但是分析日志后并不是每次转移失败后都会进行full gc,这里猜测是因为JDK8u60之后,在年轻代回收的最后(对象转移之后),会进行巨型对象的回收。因此释放了内存,内存占用回到可使用的状态
年轻代大小过大,G1中一般不需要设置年轻代大小,根据预设的停顿时间,G1的停顿预测模型会动态的调整年轻代的大小。阈值为全堆大小的15% ~ 60%。观察到发生转移失败时,年轻代大小被调整到了全堆60%的上限值。过大的年轻代,以及大量的大对象分区,挤占了survivor区和老年代的空间,导致没有足够的空间容纳存活对象(过大的年轻代也有可能导致存活对象变多),最终转移失败。
调优建议
- 调整分区大小,让原本的大对象变为正常的年轻代对象
- 可以限制年轻代大小,降低年轻代大小,提高年轻代回收次数,变相增加大对象的回收次数
- 从应用层面避免过多的大对象分配
解决
通过代码review和内存dump,发现大对象均为用于记日志的序列化string,并且这些日志并不是必需的。最终通过废弃这些记日志的代码来解决问题。
转移失败的GC日志
2020-10-03T23:03:29.457+0800: 330.855: [GC pause (G1 Humongous Allocation) (young) (to-space exhausted), 4.0495836 secs]
[Parallel Time: 1773.8 ms, GC Workers: 8]
[GC Worker Start (ms): Min: 330857.0, Avg: 330857.1, Max: 330857.1, Diff: 0.2]
[Ext Root Scanning (ms): Min: 1.1, Avg: 1.8, Max: 5.4, Diff: 4.3, Sum: 14.2]
[Update RS (ms): Min: 0.0, Avg: 3.4, Max: 4.5, Diff: 4.5, Sum: 27.6]
[Processed Buffers: Min: 0, Avg: 50.9, Max: 81, Diff: 81, Sum: 407]
[Scan RS (ms): Min: 25.9, Avg: 28.7, Max: 29.7, Diff: 3.7, Sum: 229.3]
[Code Root Scanning (ms): Min: 0.0, Avg: 1.2, Max: 3.5, Diff: 3.5, Sum: 9.2]
[Object Copy (ms): Min: 1736.7, Avg: 1738.4, Max: 1738.9, Diff: 2.2, Sum: 13907.0] // 对象复制转移
[Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.5]
[Termination Attempts: Min: 1, Avg: 1.9, Max: 4, Diff: 3, Sum: 15]
[GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.6]
[GC Worker Total (ms): Min: 1773.5, Avg: 1773.6, Max: 1773.7, Diff: 0.2, Sum: 14188.5]
[GC Worker End (ms): Min: 332630.6, Avg: 332630.6, Max: 332630.7, Diff: 0.1]
[Code Root Fixup: 0.3 ms]
[Code Root Purge: 0.0 ms]
[Clear CT: 2.1 ms]
[Other: 2273.5 ms]
[Evacuation Failure: 2257.7 ms] // 转移失败的额外耗时
[Choose CSet: 0.0 ms]
[Ref Proc: 0.2 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.7 ms]
[Humongous Register: 1.1 ms]
[Humongous Reclaim: 9.9 ms] // 大对象分区回收
[Free CSet: 2.7 ms]
[Eden: 14.4G(15.4G)->0.0B(13.9G) Survivors: 184.0M->8192.0K Heap: 22.9G(26.0G)->9683.8M(26.0G)] // 年轻代大小15.4G
转载请注明:学时网 » G1 to-space exhausted 的问题