Problem:
With its default configuration, the Solaris 10 ZFS ARC cache can gradually degrade NetBackup performance at the memory level, forcing NBU to use large amounts of swap even when several gigabytes of RAM appear to be "available". On the following Solaris 10 server we initially see that 61% of the memory is owned by ZFS File Data (the ARC cache).
# echo ::memstat | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 1960930 15319 24%
ZFS File Data 5006389 39112 61%
Anon 746499 5832 9%
Exec and libs 37006 289 0%
Page cache 22838 178 0%
Free (cachelist) 342814 2678 4%
Free (freelist) 103593 809 1%
Total 8220069 64219
Physical 8214591 64176
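The same ARC footprint can be cross-checked without mdb. As a minimal sketch, the standard arcstats kstat reports the current ARC size, its target, and its ceiling, all in bytes (statistic names assume the stock Solaris 10 ZFS kstats):
# kstat -p zfs:0:arcstats:size
# kstat -p zfs:0:arcstats:c
# kstat -p zfs:0:arcstats:c_max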
Using the ARChits.sh script we can see how often the OS hits, or requests memory from, the ARC cache. In our sample the hit rate is essentially 100%, meaning the ARC sits as a middle man between NBU and physical memory.
# ./ARChits.sh
HITS MISSES HITRATE
2147483647 692982 99.99%
518 4 99.23%
2139 0 100.00%
2865 0 100.00%
727 0 100.00%
515 0 100.00%
700 0 100.00%
2032 0 100.00%
4529 0 100.00%
1040 0 100.00%
…
…
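The ARChits.sh script itself is not reproduced in this article. As a rough, hypothetical stand-in (an assumption, not the original script), the same numbers can be derived from the cumulative arcstats counters: the sketch below samples the hits/misses kstats once per second and prints the per-interval deltas and the resulting hit rate.
#!/bin/sh
# Hypothetical ARChits.sh equivalent: sample the cumulative ZFS ARC
# hit/miss kstat counters every second and report deltas and hit rate.
echo "HITS MISSES HITRATE"
ph=`kstat -p zfs:0:arcstats:hits | nawk '{print $2}'`
pm=`kstat -p zfs:0:arcstats:misses | nawk '{print $2}'`
while :
do
        sleep 1
        h=`kstat -p zfs:0:arcstats:hits | nawk '{print $2}'`
        m=`kstat -p zfs:0:arcstats:misses | nawk '{print $2}'`
        echo "$h $m $ph $pm" | nawk '{
                dh = $1 - $3; dm = $2 - $4
                rate = (dh + dm > 0) ? 100 * dh / (dh + dm) : 100
                printf("%d %d %.2f%%\n", dh, dm, rate)
        }'
        ph=$h
        pm=$m
done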
To identify which processes are hitting the ARC cache (requesting memory), we use DTrace to count the hits and misses per process.
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
...
nbproxy 1099
nbpem 1447
nscd 1649
bpstsinfo 1785
find 1806
fsflush 2065
bpclntcmd 2257
bpcompatd 2394
perl 2945
bpimagelist 4019
bprd 4268
avrd 8899
grep 9249
dbsrv11 20782
bpdbm 37955
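The one-liner above runs until interrupted with Ctrl-C. If a fixed sampling window is preferred, a tick probe can end the trace automatically; a sketch with an assumed 60-second window:
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() } tick-60s { exit(0); }'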
As we can see, dbsrv11 and bpdbm are the main consumers of ARC cache memory. The next step is to look at the sizes of the memory requests in order to measure the impact of the ARC cache on NBU requests, because the ARC by nature slices memory into small blocks.
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'
bytes
value ------------- Distribution ------------- count
256 | 0
512 |@@@@@ 10934
1024 | 1146
2048 | 467
4096 | 518
8192 |@@@@ 9485
16384 |@ 1506
32768 | 139
65536 | 356
131072 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 67561
262144 | 0
The majority of memory requests are 128 KB (131072-byte) blocks and a few are very small; this is the picture when there are no major requests at the NBU level. Things change when many NBU requests come in and the number of small-block requests suddenly rises. The following output shows a master pulling data while running several vmquery commands.
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @["bytes"] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'
bytes
value ------------- Distribution ------------- count
256 | 0
512 |@@@@@@@@@@@@ 78938
1024 |@ 7944
2048 | 1812
4096 |@ 3751
8192 |@@@@@@@@@@@@ 76053
16384 |@ 9030
32768 | 322
65536 | 992
131072 |@@@@@@@@@@@@ 77239
262144 | 0
vmquery dominates the memory requests, and the OS is forced to rehydrate the memory into bigger blocks in order to meet NBU's block-size requirements, impacting application performance mainly at the NBDB/EMMDB level.
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = count() }'
...
...
avrd 1210
bpimagelist 2865
dbsrv11 2970
grep 4971
bpdbm 6662
vmquery 94161
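To attribute the request sizes to the individual processes in a single pass, the two one-liners can be combined by keying the quantize aggregation on execname (same probes as above, shown as a sketch):
# dtrace -n 'sdt:zfs::arc-hit,sdt:zfs::arc-miss { @[execname] = quantize(((arc_buf_hdr_t *)arg0)->b_size); }'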
The memory rehydration forces the OS to use a lot of swap even when there is plenty of memory nominally available under "ZFS File Data".
# vmstat 1
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s1 s2 s3 s4 in sy cs us sy id
0 0 0 19244016 11342680 432 1518 566 604 596 0 0 8 -687 8 -18 8484 30088 9210 10 5 84
0 2 0 11441128 3746680 44 51 8 23 23 0 0 0 0 0 0 6822 19737 7929 9 3 88
0 1 0 11436168 3745440 14 440 8 23 23 0 0 0 0 0 0 6460 18428 7038 9 4 87
0 2 0 11440808 3746856 6 0 15 170 155 0 0 0 0 0 0 6463 18163 6996 9 4 87
0 2 0 11440808 3747000 295 822 15 147 147 0 0 0 0 0 0 7604 27577 8989 11 5 84
0 1 0 11440552 3746872 122 683 8 70 70 0 0 0 0 0 0 5926 20430 6444 9 3 88
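The swap pressure can also be confirmed directly with the standard Solaris swap utility: swap -s prints a one-line summary of reserved, allocated and available swap, and swap -l lists usage per swap device.
# swap -s
# swap -l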
In this case there are 39 GB of RAM allocated to ZFS File Data (the ARC cache) that are supposed to be released whenever an application needs them. The problem is that the ARC by nature slices that memory into small pieces, so when the OS reclaims some of it, it takes a long time to satisfy memory requests.
# echo ::memstat | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 1960930 15319 24%
ZFS File Data 5006389 39112 61%
Anon 746499 5832 9%
Exec and libs 37006 289 0%
Page cache 22838 178 0%
Free (cachelist) 342814 2678 4%
Free (freelist) 103593 809 1%
Total 8220069 64219
Physical 8214591 64176
When the master is rebooted there is initially no ZFS File Data allocation and NBU runs perfectly; performance then degrades slowly, depending on how fast the ARC cache consumes the memory.
# echo ::memstat | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 479738 3747 6%
Anon 422140 3297 5%
Exec and libs 45443 355 1%
Page cache 83530 652 1%
Free (cachelist) 2200908 17194 27%
Free (freelist) 4988310 38971 61%
Total 8220069 64219
Physical 8214603 64176
Solution:
We ran into this problem quite often on heavily loaded systems using Solaris 10 and ZFS. To address it, we limited the ZFS ARC cache on each problematic system. To determine the limit value we followed the procedure below.
NOTE: As with any changes of this nature, please bear in mind that the setting may have to be tweaked to accommodate additional load and/or memory changes. Just monitor and adjust as needed.
1. After the system is fully loaded and running backups, sample the total memory use:
Example:
prstat -s size -a
NPROC USERNAME SWAP RSS MEMORY TIME CPU
32 sybase 96G 96G 75% 42:38:04 0.2%
72 root 367M 341M 0.3% 9:38:11 0.0%
6 daemon 7144K 9160K 0.0% 0:01:01 0.0%
4 v024875 9944K 9760K 0.0% 0:00:00 0.0%
1 smmsp 2048K 6144K 0.0% 0:00:22 0.0%
2. Compare percentage of memory in use to total physical memory:
prtdiag | grep -i Memory
Memory size: 131072 Megabytes
3. In the above example, approx 75% of the physical memory is used under typical load. Add a few percent for headroom (let’s call it 80).
4. 20% of 128 GB is roughly 26 GB = 27917287424 bytes (see the conversion sketch after step 6)
5. Configure ZFS ARC Cache limit in /etc/system
set zfs:zfs_arc_max=27917287424
6. Reboot the system
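For reference, the arithmetic from step 4 and a post-reboot verification can be done from the shell. A minimal sketch (26 GB is the example value from step 4; substitute your own figure):
# echo '26 * 1024 * 1024 * 1024' | bc
27917287424
# echo 'zfs_arc_max/E' | mdb -k
# kstat -p zfs:0:arcstats:c_max
The bc command converts the chosen limit to the byte value placed in /etc/system; after the reboot, the mdb and kstat commands confirm that the kernel picked up zfs_arc_max and that the arcstats c_max ceiling matches the configured value.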
References:
https://forums.oracle.com/thread/2340011
http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache
http://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/