a few of you also replied by private mail, thanks. Supplying the -Xloggc option worked, although I don't understand where the info went before, and the log rotation option doesn't rotate anything but just truncates the log file on every JVM restart, which sucks. But ok, I have some data.
Below you find the log of one session. Users reported bad response times that correspond with the Full GC runs, but also very bad response times from 17:00 onward where there is often no GC activity logged at the times that users experienced waiting time. So I'm not entirely sure if this is all caused by GC.
Anyway, heap sizes of up to 60 GByte are apparently not so common, at least not with WebObjects. I found a helpful article [1] about very large JVM heaps, including the hint to the Zing Azul JVM, which features a stopless garbage collector [3], but costs $8000 per server and year. In the end, we may actually be trying that if other options fail, but first I need to understand more about what's in memory. I'm currently approaching that question via JProfiler, especially to find out if that's all snapshot cache (how would I limit that in size?) or some other homemade memory leak.
I also found one strange event today where the application appeared stuck, but continued after a minute or two, while the gc log file was truncated at that time, without the JVM restarting. I can't see how that's supposed to happen. Here's how that looked in my shell running "tail -F gc.log":
[PSYoungGen: 15256082K->3431947K(16962048K)] 25580475K->13801928K(27895808K), 1.1505415 secs] [Times: user=6.00 sys=0.03, real=1.15 secs]
2016-01-26T12:11:38.640+0100: 26136.407: [GC (Allocation Failure)
Desired survivor size 4679794688 bytes, new threshold 15 (max 15)
[PSYoungGen: 15791627K->3414349K(16958976K)] 26161608K->13935770K(27892736K), 1.0761646 secs] [Times: user=6.10 sys=0.17, real=1.08 secs]
2016-01-26T12:16:10.835+0100: 26408.603: [GC (Allocation Failure)
Desired survivor size 4657250304 bytes, new threshold 15 (max 15)
[PSYoungGen: 15774029K->3449938K(17397760K)] 26295450K->14128338K(28331520K), 1.3285101 secs] [Times: user=6.24 sys=1.39, real=1.33 secs]
2016-01-26T12:16:12.164+0100: 26409.931: [Full GC (Ergonomics) [PSYoungGen: 3449938K->2355719K(17397760K)] [ParOldGen: 10678400K->10933508K(13649408K)] 14128338K->13289227K(31047168K), [Metaspace: 84345K->84345K(94208K)], 12.1170523 secs] [Times: user=67.98 sys=1.03, real=12.11 secs]
2016-01-26T12:21:18.990+0100: 26716.757: [GC (Allocation Failure)
Desired survivor size 4644667392 bytes, new threshold 15 (max 15)
[PSYoungGen: 15218183K->2398851K(17410560K)] 26151691K->13458637K(31059968K), 1.3389149 secs] [Times: user=8.72 sys=0.17, real=1.34 secs]
tail: gc.log: file truncated
Java HotSpot(TM) 64-Bit Server VM (25.66-b17) for linux-amd64 JRE (1.8.0_66-b17), built on Oct 6 2015 17:28:34 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 99008820k(13655516k free), swap 67108860k(66990156k free)
CommandLine flags: -XX:InitialHeapSize=1584141120 -XX:+ManagementServer -XX:MaxHeapSize=90194313216 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:-UseGCLogFileRotation -XX:+UseParallelGC
2016-01-26T12:33:36.029+0100: 27453.796: [GC (Allocation Failure)
Desired survivor size 4588568576 bytes, new threshold 15 (max 15)
[PSYoungGen: 15261315K->2538940K(17720832K)] 26321101K->13670347K(31370240K), 0.7936427 secs] [Times: user=4.62 sys=0.03, real=0.80 secs]
2016-01-26T12:39:29.156+0100: 27806.923: [GC (Allocation Failure)
And here's the above-mentioned full gc log of 4 hours during the most critical usage time where users experienced waits. Notable wait times according to user's notes occurred at least at the following times: 15:07, 16:05, 16:57, 17:00-17:35, 17:59.
2016-01-25T15:06:54.344+0100: 36634.642: [GC (Allocation Failure) [PSYoungGen: 18701268K->2058545K(23079424K)] 33613120K->17079754K(41686016K), 0.8635088 secs] [Times: user=4.22 sys=0.50, real=0.86 secs]
2016-01-25T15:15:48.438+0100: 37168.736: [GC (Allocation Failure) [PSYoungGen: 19683633K->2231668K(23157248K)] 34704842K->17370085K(41763840K), 0.9898285 secs] [Times: user=4.98 sys=0.05, real=0.99 secs]
2016-01-25T15:24:54.891+0100: 37715.189: [GC (Allocation Failure) [PSYoungGen: 19856756K->2254238K(23623168K)] 34995173K->17626274K(42229760K), 1.0030974 secs] [Times: user=4.91 sys=0.30, real=1.00 secs]
2016-01-25T15:33:58.784+0100: 38259.082: [GC (Allocation Failure) [PSYoungGen: 20552606K->2502793K(23740416K)] 35924642K->17914660K(42347008K), 1.0203825 secs] [Times: user=5.32 sys=0.24, real=1.02 secs]
2016-01-25T15:43:12.959+0100: 38813.256: [GC (Allocation Failure) [PSYoungGen: 20801161K->2677482K(23608320K)] 36213028K->18138909K(42214912K), 1.2300001 secs] [Times: user=5.53 sys=0.22, real=1.23 secs]
2016-01-25T15:51:16.686+0100: 39296.983: [GC (Allocation Failure) [PSYoungGen: 21276394K->2800542K(23760384K)] 36737821K->18411609K(42366976K), 1.1514720 secs] [Times: user=6.19 sys=0.15, real=1.15 secs]
2016-01-25T15:58:35.696+0100: 39735.994: [GC (Allocation Failure) [PSYoungGen: 21399454K->2938402K(23849472K)] 37010521K->18700024K(42456064K), 2.8563665 secs] [Times: user=6.23 sys=8.99, real=2.86 secs]
2016-01-25T16:05:33.310+0100: 40153.607: [GC (System.gc()) [PSYoungGen: 20526288K->2986997K(22197760K)] 36287910K->18889778K(40804352K), 1.2626979 secs] [Times: user=6.22 sys=1.00, real=1.27 secs]
2016-01-25T16:05:34.572+0100: 40154.870: [Full GC (System.gc()) [PSYoungGen: 2986997K->0K(22197760K)] [ParOldGen: 15902781K->16935926K(18606592K)] 18889778K->16935926K(40804352K), [Metaspace: 90226K->90226K(100352K)], 15.0685505 secs] [Times: user=86.67 sys=7.55, real=15.06 secs]
2016-01-25T16:13:35.766+0100: 40636.064: [GC (Allocation Failure) [PSYoungGen: 19210752K->419289K(23559168K)] 36146678K->17355224K(42165760K), 0.5276242 secs] [Times: user=2.41 sys=0.00, real=0.53 secs]
2016-01-25T16:22:46.735+0100: 41187.033: [GC (Allocation Failure) [PSYoungGen: 19353561K->740457K(23835648K)] 36289496K->17676391K(42442240K), 0.6634034 secs] [Times: user=2.52 sys=0.00, real=0.66 secs]
2016-01-25T16:30:31.503+0100: 41651.801: [GC (Allocation Failure) [PSYoungGen: 19674729K->1023572K(24313856K)] 36610663K->17959507K(42920448K), 0.8099960 secs] [Times: user=2.72 sys=0.08, real=0.81 secs]
2016-01-25T16:37:51.095+0100: 42091.393: [GC (Allocation Failure) [PSYoungGen: 20311636K->1263580K(20551680K)] 37247571K->18199515K(39158272K), 0.7992434 secs] [Times: user=3.32 sys=0.01, real=0.80 secs]
2016-01-25T16:49:34.671+0100: 42794.969: [GC (Allocation Failure) [PSYoungGen: 20551644K->1565656K(24408576K)] 37487579K->18501598K(43015168K), 0.8773248 secs] [Times: user=3.74 sys=0.00, real=0.87 secs]
2016-01-25T16:56:34.541+0100: 43214.839: [GC (Allocation Failure) [PSYoungGen: 21011416K->1791681K(21237760K)] 37947358K->18727623K(39844352K), 0.7816826 secs] [Times: user=4.03 sys=0.00, real=0.79 secs]
2016-01-25T17:01:31.494+0100: 43511.792: [GC (Allocation Failure) [PSYoungGen: 21237441K->1997601K(24417280K)] 38173383K->18933551K(43023872K), 0.8769351 secs] [Times: user=4.45 sys=0.09, real=0.88 secs]
2016-01-25T17:07:13.856+0100: 43854.154: [GC (Allocation Failure) [PSYoungGen: 21797153K->2080272K(24207360K)] 38733103K->19016222K(42813952K), 0.8543820 secs] [Times: user=4.50 sys=0.00, real=0.86 secs]
2016-01-25T17:12:37.866+0100: 44178.164: [GC (Allocation Failure) [PSYoungGen: 21879824K->2170000K(24282112K)] 38815774K->19105950K(42888704K), 0.8236605 secs] [Times: user=4.92 sys=0.00, real=0.82 secs]
2016-01-25T17:18:05.977+0100: 44506.275: [GC (Allocation Failure) [PSYoungGen: 22576272K->2299192K(24404992K)] 39512222K->19235143K(43011584K), 0.7759199 secs] [Times: user=4.77 sys=0.00, real=0.78 secs]
2016-01-25T17:28:11.288+0100: 45111.586: [GC (Allocation Failure) [PSYoungGen: 22705464K->2550874K(24838656K)] 39641415K->19486825K(43445248K), 1.3544459 secs] [Times: user=5.52 sys=1.93, real=1.36 secs]
2016-01-25T17:35:39.350+0100: 45559.648: [GC (Allocation Failure) [PSYoungGen: 23578202K->2665756K(24871424K)] 40514153K->19601715K(43478016K), 1.0373681 secs] [Times: user=5.67 sys=0.04, real=1.04 secs]
2016-01-25T17:43:31.775+0100: 46032.073: [GC (Allocation Failure) [PSYoungGen: 23693084K->2891569K(25351168K)] 40629043K->19827535K(43957760K), 2.8873970 secs] [Times: user=5.28 sys=11.09, real=2.89 secs]
2016-01-25T17:51:22.218+0100: 46502.516: [GC (Allocation Failure) [PSYoungGen: 24390449K->3020322K(24519680K)] 41326415K->19956296K(43126272K), 1.0685985 secs] [Times: user=6.17 sys=0.00, real=1.07 secs]
2016-01-25T18:00:19.997+0100: 47040.295: [GC (Allocation Failure) [PSYoungGen: 24519202K->3177498K(25069568K)] 41455176K->20113480K(43676160K), 1.3560052 secs] [Times: user=6.39 sys=0.34, real=1.36 secs]
2016-01-25T18:09:14.251+0100: 47574.548: [GC (Allocation Failure) [PSYoungGen: 24215578K->3059878K(24098304K)] 41151560K->20219258K(42704896K), 1.5082240 secs] [Times: user=6.62 sys=2.86, real=1.50 secs]
2016-01-25T18:15:01.783+0100: 47922.080: [GC (Allocation Failure) [PSYoungGen: 24097958K->2809652K(25209856K)] 41257338K->20232628K(43816448K), 1.8871482 secs] [Times: user=6.38 sys=6.59, real=1.88 secs]
2016-01-25T18:24:10.926+0100: 48471.223: [GC (Allocation Failure) [PSYoungGen: 23806772K->2757610K(23754752K)] 41229748K->20472641K(42361344K), 1.8807114 secs] [Times: user=6.50 sys=6.05, real=1.88 secs]
2016-01-25T18:32:11.039+0100: 48951.336: [GC (Allocation Failure) [PSYoungGen: 23754730K->2664985K(25179136K)] 41469761K->20610485K(43785728K), 1.6722986 secs] [Times: user=5.99 sys=4.89, real=1.68 secs]
2016-01-25T18:32:33.693+0100: 48973.991: [GC (System.gc()) [PSYoungGen: 5342511K->2261788K(25064448K)] 23288012K->20489350K(43671040K), 1.4310250 secs] [Times: user=5.08 sys=5.69, real=1.43 secs]
2016-01-25T18:32:35.124+0100: 48975.422: [Full GC (System.gc()) [PSYoungGen: 2261788K->373549K(25064448K)] [ParOldGen: 18227562K->18606102K(18606592K)] 20489350K->18979651K(43671040K), [Metaspace: 94654K->94654K(104448K)], 17.7869839 secs] [Times: user=109.81 sys=4.10, real=17.79 secs]
2016-01-25T18:38:01.155+0100: 49301.452: [Full GC (Ergonomics) [PSYoungGen: 21485357K->0K(25064448K)] [ParOldGen: 18606102K->17803897K(22484992K)] 40091459K->17803897K(47549440K), [Metaspace: 94728K->94728K(104448K)], 11.2804123 secs] [Times: user=71.57 sys=1.84, real=11.28 secs]
2016-01-25T18:49:54.421+0100: 50014.719: [GC (Allocation Failure) [PSYoungGen: 21111808K->310417K(24994816K)] 38915705K->18114322K(47479808K), 0.4822526 secs] [Times: user=1.49 sys=0.00, real=0.49 secs]
2016-01-25T19:02:11.033+0100: 50751.331: [GC (Allocation Failure) [PSYoungGen: 21435537K->650681K(21775872K)] 39239442K->18454586K(44260864K), 0.4177362 secs] [Times: user=1.93 sys=0.01, real=0.42 secs]
2016-01-25T19:09:54.731+0100: 51215.028: [GC (Allocation Failure) [PSYoungGen: 21775801K->1405472K(25216000K)] 39579706K->19209377K(47700992K), 0.6537367 secs] [Times: user=3.92 sys=0.00, real=0.65 secs]
As always, I'm thankful for any more hints. Please reply publicly so everyone else can benefit from the answers in the future.