[sakai-dev] Re: GC memory leak with java.lang.ref.Finalizer / BaseAnnouncementService

Ashley Willis

I think you might be on to something with the Base64 image stuff. We have been having similar issues for the last 2 weeks. We had a server run out of heap this afternoon. GC seemed to be stuck in a loop of near-constant major collections that were getting longer and longer. It looked to us like a problem with the Blog tool. Looking at the heap, roughly 50% of it appears to be Strings that were probably copied and pasted from Word (they were full of styles) into the Blog tool, and they contain img tags <img alt="" src=".....

The reason this sounds really suspicious to me is that we saw something similar with a tool a year or 2 ago; I think it was Melete. The tool had a method to sanitize Strings by removing tags etc. One professor was using the tool for class photos. I don't remember all the details, but it was essentially trying to process the images as Strings and running out of heap. I wonder if something similar is going on here.

We are on:
Sakai 11.2
Java 1.8.0_144
Tomcat 8.0.32
I don't have the GC settings handy but I can get them if needed.

Ashley



On Wednesday, February 7, 2018 at 3:15:14 PM UTC-5, Austin wrote:
Hello Sakai Devs,

Yesterday we experienced an issue where GC was looping over and over causing some performance issues on one of our servers.

We're running:

Sakai 11.4
Tomcat 8.0.47
Java 1.8.0_151
MySQL 5.6
RHEL 7

with these Java opts (these aren't all of them, just the memory-related ones):

-Xmx10280m 
-Xms10280m 
-XX:MaxMetaspaceSize=1024m 
-Xmn2500m 
-XX:+UseConcMarkSweepGC 
-XX:+UseParNewGC 
-XX:CMSInitiatingOccupancyFraction=80 
-XX:+DisableExplicitGC 
-XX:+DoEscapeAnalysis 
-verbose:gc 
-XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps 
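
As a rough cross-check of how these options divide the heap, here is a minimal sketch of the arithmetic. It assumes the default SurvivorRatio of 8 (no -XX:SurvivorRatio is listed above), and the class and method names are made up for illustration. The results line up with the capacities that appear in the GC log further down (7966720K for the CMS old generation and 2304000K for the young generation).

```java
// Sketch only: derives generation sizes from -Xmx / -Xmn.
// Assumption: SurvivorRatio is the default 8 (eden is 8x one survivor space).
class HeapLayout {
    /** Old generation size in KB: whatever -Xmx leaves after -Xmn. */
    public static long oldGenKb(long xmxMb, long xmnMb) {
        return (xmxMb - xmnMb) * 1024;
    }

    /** Young capacity as the GC log reports it: eden plus ONE survivor space. */
    public static long youngUsableKb(long xmnMb, int survivorRatio) {
        long youngKb = xmnMb * 1024;
        long survivorKb = youngKb / (survivorRatio + 2); // eden + 2 survivors
        return youngKb - survivorKb;
    }

    public static void main(String[] args) {
        System.out.println("old gen:      " + oldGenKb(10280, 2500) + "K");   // 7966720K
        System.out.println("young usable: " + youngUsableKb(2500, 8) + "K"); // 2304000K
    }
}
```

So roughly 7.6 GB of the 10 GB heap is old generation, which is the space CMS is responsible for.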

The log entries that keep repeating over and over are:

47863.032: [CMS-concurrent-preclean: 5.037/5.037 secs] [Times: user=5.35 sys=0.04, real=5.04 secs] 
47863.032: [CMS-concurrent-abortable-preclean-start]
47863.032: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
47863.033: [GC (CMS Final Remark) [YG occupancy: 1542954 K (2304000 K)]47863.033: [Rescan (parallel) , 0.3890201 secs]47863.423: [weak refs processing, 0.0000592 secs]47863.423: [class unloading, 0.2006707 secs]47863.623: [scrub symbol table, 0.0485713 secs]47863.672: [scrub string table, 0.0086044 secs][1 CMS-remark: 7966719K(7966720K)] 9509674K(10270720K), 0.6472521 secs] [Times: user=2.06 sys=0.02, real=0.65 secs] 
47863.682: [CMS-concurrent-sweep-start]
47865.226: [CMS-concurrent-sweep: 1.544/1.544 secs] [Times: user=1.58 sys=0.01, real=1.54 secs] 
47865.226: [CMS-concurrent-reset-start]
47865.241: [CMS-concurrent-reset: 0.015/0.015 secs] [Times: user=0.02 sys=0.00, real=0.02 secs] 
06-Feb-2018 16:40:56.125 DEBUG [ajp-bio-8009-exec-309] edu.gatech.sakai.user.impl.GaTechUserDirectoryServiceImpl.authorizeUserLogin authorizeUserLogin; user.eid; gasconm; user.email: gas...@...
47867.243: [GC (CMS Initial Mark) [1 CMS-initial-mark: 7966695K(7966720K)] 9965851K(10270720K), 0.3508320 secs] [Times: user=0.93 sys=0.00, real=0.35 secs] 
47867.594: [CMS-concurrent-mark-start]
47871.695: [CMS-concurrent-mark: 4.099/4.101 secs] [Times: user=8.48 sys=0.03, real=4.10 secs] 
47871.695: [CMS-concurrent-preclean-start]
47871.739: [CMS-concurrent-preclean: 0.041/0.044 secs] [Times: user=0.05 sys=0.00, real=0.05 secs] 
47871.739: [CMS-concurrent-abortable-preclean-start]
47871.739: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
47871.740: [GC (CMS Final Remark) [YG occupancy: 2083498 K (2304000 K)]47871.741: [Rescan (parallel) , 0.5782041 secs]47872.319: [weak refs processing, 0.0000354 secs]47872.319: [class unloading, 0.1000239 secs]47872.419: [scrub symbol table, 0.0407029 secs]47872.460: [scrub string table, 0.0067250 secs][1 CMS-remark: 7966695K(7966720K)] 10050194K(10270720K), 0.7260444 secs] [Times: user=2.95 sys=0.01, real=0.72 secs] 
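
To make the key numbers in these entries easier to read, here is a small sketch that pulls the occupancy figures out of a "CMS-remark" line and turns them into percentages. The class and method names are hypothetical, and the regular expression only targets the exact layout shown above (Java 8 with -XX:+PrintGCDetails); other JVM versions format this line differently.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: parses "CMS-remark: usedK(capacityK)] usedK(capacityK)".
class CmsRemarkOccupancy {
    private static final Pattern REMARK = Pattern.compile(
        "CMS-remark: (\\d+)K\\((\\d+)K\\)\\] (\\d+)K\\((\\d+)K\\)");

    /** Returns {oldUsedK, oldCapacityK, heapUsedK, heapCapacityK}, or null if no match. */
    public static long[] parse(String line) {
        Matcher m = REMARK.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)), Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3)), Long.parseLong(m.group(4))
        };
    }

    public static double percent(long used, long capacity) {
        return 100.0 * used / capacity;
    }

    public static void main(String[] args) {
        String line = "[1 CMS-remark: 7966719K(7966720K)] 9509674K(10270720K), 0.6472521 secs]";
        long[] v = parse(line);
        System.out.printf(Locale.ROOT, "old gen %.4f%% full, whole heap %.2f%% full%n",
            percent(v[0], v[1]), percent(v[2], v[3]));
    }
}
```

For both remark lines above, the old generation is within a few dozen KB of its 7966720K capacity before and after the cycle, so each CMS cycle is reclaiming almost nothing and a new one starts immediately.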

While the problem was happening I took a heap dump with jmap and upon analyzing it with Eclipse Memory Analyzer, it said that there was a leak with

java.lang.ref.Finalizer

and it was taking up 61% of the heap. Digging a little deeper, the first 10 or 20 of those objects (out of 8 million?) were coming from

BaseAnnouncementService

and those 10 or 20 all seemed to be coming from the same site, which appeared to have an announcement that included one or more <img> tags with base64 image data. I'm not sure if the content of the announcement had anything to do with it, or if the instructor did something specific and the content is irrelevant.
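
One way to check whether a given announcement body carries inline image data is to look for data: URIs in its img tags and add up the size of the base64 payloads. The following is a minimal sketch, not Sakai code: the class and method names are hypothetical, and it assumes the images are embedded as src="data:image/...;base64,..." attributes, which is the usual form when content is pasted from a word processor.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: measures how much of an HTML string is inline base64 image data.
class InlineImageScan {
    // Possessive quantifier (*+) keeps matching linear on very large payloads.
    private static final Pattern DATA_URI = Pattern.compile(
        "src\\s*=\\s*[\"']data:image/[A-Za-z0-9.+-]+;base64,([A-Za-z0-9+/=\\s]*+)[\"']",
        Pattern.CASE_INSENSITIVE);

    /** Total number of base64 characters embedded in src="data:image/..." attributes. */
    public static long inlineBase64Chars(String html) {
        long total = 0;
        Matcher m = DATA_URI.matcher(html);
        while (m.find()) {
            total += m.group(1).length();
        }
        return total;
    }

    public static void main(String[] args) {
        String body = "<p>Hi</p><img alt=\"\" src=\"data:image/png;base64,iVBORw0KGgo=\">";
        System.out.println(inlineBase64Chars(body) + " base64 chars inline");
    }
}
```

Since a Java 8 String stores two bytes per character, a few megabytes of pasted images can turn into a much larger String on the heap, and every copy made while processing the body costs that much again.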

Has anyone seen anything like this?  Is it a memory leak in the Announcement tool?  Or do we need to tune our Memory settings to improve the GC?

Thanks,

Austin
