一、背景
java.util.concurrent.TimeoutException: android.content.res.AssetManager$AssetInputStream.finalize() timed out after 10 seconds
at android.content.res.AssetManager$AssetInputStream.close(AssetManager.java:812)
at android.content.res.AssetManager$AssetInputStream.finalize(AssetManager.java:845)
at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:202)
at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:185)
at java.lang.Thread.run(Thread.java:833)
这是一个高频问题,必须要诊治一下了,去查看一下这个crash在我们发出去的最新版本上的上报情况,看到所有重写finalize方法的对象,都有可能发生这个异常,堆栈有几十种,其中AssetManager是个大户,发生了289次,ThreadedRenderer发生了33次。其余的类大部分都是我们自己的,比如CursorWindow,SQliteCursor等
二、初步分析
android.content.res.AssetManager$AssetInputStream.finalize() timed out after 10 seconds,从报错的直观意思上来看,是由于finalize方法超时了,下面来模拟一下。
class MyObject extends View {
public MyObject(Context context) {
super(context);
}
@Override
protected void finalize() throws Throwable {
System.out.println("WANG finalize begain");
try{
Thread.sleep(2000);
}finally {
super.finalize();
System.out.println("WANG finalize end");
}
}
}
view.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View v) {
for (int i = 0; i < 1000; i++) {
MyObject object= new MyObject(MainActivity.this);
}
}
});
当每次被GC的时候,MyObject可以被正常的回收,如下图。
看一下FinalizerWatchdogDaemon线程的trace,Sleeping状态,可能抓的时候,正处于sleep。
"FinalizerWatchdogDaemon" daemon prio=5 tid=5 Sleeping
| group="system" sCount=1 dsCount=0 flags=1 obj=0x12f40350 self=0x6fad268c00
| sysTid=1347 nice=4 cgrp=default sched=0/0 handle=0x6f95d264f0
| state=S schedstat=( 118415628 218405939 1140 ) utm=7 stm=4 core=6 HZ=100
| stack=0x6f95c23000-0x6f95c25000 stackSize=1041KB
| held mutexes=
at java.lang.Thread.sleep(Native method)
- sleeping on <0x0edad885> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:373)
- locked <0x0edad885> (a java.lang.Object)
at java.lang.Thread.sleep(Thread.java:314)
at java.lang.Daemons$FinalizerWatchdogDaemon.sleepFor(Daemons.java:344)
at java.lang.Daemons$FinalizerWatchdogDaemon.waitForFinalization(Daemons.java:366)
at java.lang.Daemons$FinalizerWatchdogDaemon.runInternal(Daemons.java:283)
at java.lang.Daemons$Daemon.run(Daemons.java:105)
at java.lang.Thread.run(Thread.java:764)
现在修改一下程序,把睡眠2s改成睡眠20s,前台运行一段时间之后,突然崩溃了(和机型相关,有一些新出来的机型,系统已经优化,这个问题就不会再出现),FinalizerWatchdogDaemon线程报错了。
04-23 15:27:57.153 26407-26416/com.example.myapplication E/AndroidRuntime: FATAL EXCEPTION: FinalizerWatchdogDaemon
Process: com.example.myapplication, PID: 26407
java.util.concurrent.TimeoutException: com.example.myapplication.MainActivity$MyObject.finalize() timed out after 10 seconds
at java.lang.Thread.sleep(Native Method)
at java.lang.Thread.sleep(Thread.java:1031)
at java.lang.Thread.sleep(Thread.java:985)
at com.example.myapplication.MainActivity$MyObject.finalize(MainActivity.java:36)
at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:202)
at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:185)
at java.lang.Thread.run(Thread.java:818)
所以我们知道,这个crash和finalize的耗时有一定关系,那我们在看看我们的代码是怎么写的?
256 /**
257 * Release the native resources, if they haven't been released yet.
258 */
259 @Override
260 protected void finalize() {
261 try {
262 // if the cursor hasn't been closed yet, close it first
263 if (mWindow != null) {
264 close();
265 }
266 } finally {
267 super.finalize();
268 }
269 }
138 @Override
139 protected void finalize() throws Throwable {
140 try {
141 dispose();
142 } finally {
143 super.finalize();
144 }
145 }
146
147 private void dispose() {
148 if (mWindowPtr != 0) {
149 recordClosingOfWindow(mWindowPtr);
150 nativeDispose(mWindowPtr);
151 mWindowPtr = 0;
152 }
153 }
154
由此可见,finalize中都做一些额外的工作,正是这些额外的任务,在系统资源紧张,GC的时候很繁忙导致Finalizer对象回收超时,触发了App的crash。
三、深入分析
FinalizerWatchdogDaemon线程顾名思义,带有一个watchdog,说明和一个看门狗的性质是一样的,超过一定的时候不喂狗,就会被狗咬,看看源码是不是这个样子的。
http://androidxref.com/9.0.0_r3/xref/libcore/libart/src/main/java/java/lang/Daemons.java#37
30/**
31 * Calls Object.finalize() on objects in the finalizer reference queue. The VM
32 * will abort if any finalize() call takes more than the maximum finalize time
33 * to complete.
34 *
35 * @hide
36 */
37public final class Daemons {
38 private static final int NANOS_PER_MILLI = 1000 * 1000;
39 private static final int NANOS_PER_SECOND = NANOS_PER_MILLI * 1000;
40 private static final long MAX_FINALIZE_NANOS = 10L * NANOS_PER_SECOND;
41
//开启四个线程
42 public static void start() {
43 ReferenceQueueDaemon.INSTANCE.start();
44 FinalizerDaemon.INSTANCE.start();
45 FinalizerWatchdogDaemon.INSTANCE.start();
46 HeapTaskDaemon.INSTANCE.start();
47 }
48
55 //停止4个线程
56 public static void stop() {
57 HeapTaskDaemon.INSTANCE.stop();
58 ReferenceQueueDaemon.INSTANCE.stop();
59 FinalizerDaemon.INSTANCE.stop();
60 FinalizerWatchdogDaemon.INSTANCE.stop();
61 }
......
62}
Daemons类的开头注释很清楚,从finalizer reference queue(终结引用队列)中取出对象,调用对象的finalize()方法的时候.如果超出了最大终结时间(一般为MAX_FINALIZE_NANOS,值是10秒),JVM就会中止。这个类中开启了4个线程。
FinalizerDaemon:对于重写了成员函数finalize的对象,它们被GC决定回收时,并没有马上被回收,而是被放入到一个队列中,等待FinalizerDaemon守护线程去调用它们的成员函数finalize,然后再被回收。
FinalizerWatchdogDaemon:与上面的线程对应,用来监控FinalizerDaemon线程的执行。一旦检测那些重写了finalize的对象在执行成员函数finalize时超出一定时间,那么就会退出JVM。
我们来看看FinalizerDaemon线程的实现。
193 private static class FinalizerDaemon extends Daemon {
194 private static final FinalizerDaemon INSTANCE = new FinalizerDaemon();
195 private final ReferenceQueue<Object> queue = FinalizerReference.queue;
196 private final AtomicInteger progressCounter = new AtomicInteger(0);
197 // Object (not reference!) being finalized. Accesses may race!
198 private Object finalizingObject = null;
199
200 FinalizerDaemon() {
201 super("FinalizerDaemon");
202 }
203
204 @Override public void runInternal() {
205 // This loop may be performance critical, since we need to keep up with mutator
206 // generation of finalizable objects.
207 // We minimize the amount of work we do per finalizable object. For example, we avoid
208 // reading the current time here, since that involves a kernel call per object. We
209 // limit fast path communication with FinalizerWatchDogDaemon to what's unavoidable: A
210 // non-volatile store to communicate the current finalizable object, e.g. for
211 // reporting, and a release store (lazySet) to a counter.
212 // We do stop the FinalizerWatchDogDaemon if we have nothing to do for a
213 // potentially extended period. This prevents the device from waking up regularly
214 // during idle times.
215
216 // Local copy of progressCounter; saves a fence per increment on ARM and MIPS.
217 int localProgressCounter = progressCounter.get();
218
219 while (isRunning()) {
220 try {
221 // Use non-blocking poll to avoid FinalizerWatchdogDaemon communication
222 // when busy.
223 FinalizerReference<?> finalizingReference = (FinalizerReference<?>)queue.poll();
224 if (finalizingReference != null) {
225 finalizingObject = finalizingReference.get();
226 progressCounter.lazySet(++localProgressCounter);
227 } else {
228 finalizingObject = null;
229 progressCounter.lazySet(++localProgressCounter);
230 // Slow path; block.
231 FinalizerWatchdogDaemon.INSTANCE.goToSleep();
232 finalizingReference = (FinalizerReference<?>)queue.remove();
233 finalizingObject = finalizingReference.get();
234 progressCounter.set(++localProgressCounter);
235 FinalizerWatchdogDaemon.INSTANCE.wakeUp();
236 }
237 doFinalize(finalizingReference);
238 } catch (InterruptedException ignored) {
239 } catch (OutOfMemoryError ignored) {
240 }
241 }
242 }
243
244 @FindBugsSuppressWarnings("FI_EXPLICIT_INVOCATION")
245 private void doFinalize(FinalizerReference<?> reference) {
246 FinalizerReference.remove(reference);
247 Object object = reference.get();
248 reference.clear();
249 try {
//上面从reference取出对象,调用它的finalize
250 object.finalize();
251 } catch (Throwable ex) {
252 // The RI silently swallows these, but Android has always logged.
253 System.logE("Uncaught exception thrown by finalizer", ex);
254 } finally {
255 // Done finalizing, stop holding the object as live.
256 finalizingObject = null;
257 }
258 }
在看看FinalizerWatchdogDaemon线程的实现
261 /**
262 * The watchdog exits the VM if the finalizer ever gets stuck. We consider
263 * the finalizer to be stuck if it spends more than MAX_FINALIZATION_MILLIS
264 * on one instance.
265 */
266 private static class FinalizerWatchdogDaemon extends Daemon {
267 private static final FinalizerWatchdogDaemon INSTANCE = new FinalizerWatchdogDaemon();
268
269 private boolean needToWork = true; // Only accessed in synchronized methods.
270
271 FinalizerWatchdogDaemon() {
272 super("FinalizerWatchdogDaemon");
273 }
274
275 @Override public void runInternal() {
276 while (isRunning()) {
277 if (!sleepUntilNeeded()) {
278 // We have been interrupted, need to see if this daemon has been stopped.
279 continue;
280 }
281 final Object finalizing = waitForFinalization();
282 if (finalizing != null && !VMRuntime.getRuntime().isDebuggerActive()) {
283 finalizerTimedOut(finalizing);
284 break;
285 }
286 }
287 }
waitForFinalization方法用来监控FinalizerDaemon线程的回收,如果超时waitForFinalization方法返回值就不为空,触发了finalizerTimedOut方法的执行,进程被杀死,JVM退出。
396 private static void finalizerTimedOut(Object object) {
397 // The current object has exceeded the finalization deadline; abort!
398 String message = object.getClass().getName() + ".finalize() timed out after "
399 + (MAX_FINALIZE_NANOS / NANOS_PER_SECOND) + " seconds";
400 Exception syntheticException = new TimeoutException(message);
401 // We use the stack from where finalize() was running to show where it was stuck.
402 syntheticException.setStackTrace(FinalizerDaemon.INSTANCE.getStackTrace());
403
404 // Send SIGQUIT to get native stack traces.
405 try {
//杀死进程
406 Os.kill(Os.getpid(), OsConstants.SIGQUIT);
407 // Sleep a few seconds to let the stack traces print.
408 Thread.sleep(5000);
409 } catch (Exception e) {
410 System.logE("failed to send SIGQUIT", e);
411 } catch (OutOfMemoryError ignored) {
412 // May occur while trying to allocate the exception.
413 }
426 if (Thread.getUncaughtExceptionPreHandler() == null &&
427 Thread.getDefaultUncaughtExceptionHandler() == null) {
428 // If we have no handler, log and exit.
429 System.logE(message, syntheticException);
430 System.exit(2);
431 }
436 Thread.currentThread().dispatchUncaughtException(syntheticException);
437 }
438 }
从Android 5.0开始,每个View都包含了一个或者多个的Finalizer对象,如果页面的View比较多,甚至还有Activity等对象泄漏的话,那么FinalizerDaemon线程需要检测的对象越来越多,负??上攵诘投松璞干?,有可能就会来不及回收而引起性能和稳定性问题,给了FinalizerWatchdogDaemon线程可乘之机.
四、解决方案
从第三部分我们知道这个问题的RootCasue是对象回收超时,那么修复的方法基本上有两种,一种是破坏FinalizerWatchdogDaemon线程,使之stop。
public void fix1() {
try {
Class clazz = Class.forName("java.lang.Daemons$FinalizerWatchdogDaemon");
Method method = clazz.getSuperclass().getDeclaredMethod("stop");
method.setAccessible(true);
Field field = clazz.getDeclaredField("INSTANCE");
field.setAccessible(true);
method.invoke(field.get(null));
} catch (Throwable e) {
e.printStackTrace();
}
}
这种方案从DEMO中确实可以解决timeout的异常问题,但是风险未知。
另外想法一种是加长时间到60s,如下
private static final int NANOS_PER_MILLI = 1000 * 1000;
private static final int NANOS_PER_SECOND = NANOS_PER_MILLI * 1000;
private static final long MAX_FINALIZE_NANOS = 10L * NANOS_PER_SECOND;
public void fix2() {
try {
Class<?> c = Class.forName("java.lang.Daemons");
Field maxField = c.getDeclaredField("MAX_FINALIZE_NANOS");
maxField.setAccessible(true);
maxField.set(null, MAX_FINALIZE_NANOS * 6);
} catch (ClassNotFoundException e) {
e.printStackTrace();
} catch (NoSuchFieldException e) {
e.printStackTrace();
} catch (IllegalAccessException e) {
e.printStackTrace();
}
}
第二种方案我在做ROM的时候,改过Daemons.java的源码,直接改成了60秒,全机型,未发现风险,这可能也是为什么出来的错误有的是timed out after 30 seconds的原因,但是由于MAX_FINALIZE_NANOS是final的string常量,并不能用反射,这种思路在App中行不通。
综上,我修复方案是,对于已经国内ROM已经把MAX_FINALIZE_NANOS改变大于30s的的情况,暂时不动。对于仍然是10s的,stop掉FinalizerWatchdogDaemon线程,为了风险可控,加上了云控开关,可以随时控制关闭FinalizerWatchdogDaemon线程这个功能是否打开。
最后给出个人的几个建议
1、不到万不得已,不要重写finalize方法
2、做好内存优化,减少GC的频繁调用