Houdini was a master at getting out of these things: out of a swimming pool with a straitjacket on, I mean. This week I ran into an analogy to his problem. At a customer site Brocade Network Advisor (BNA) became more and more unresponsive, up to the point where it just stopped.
BNA has had a long history of development. It is the evolved version of DCFM (Data Center Fabric Manager), which was itself a merger of McData EFCM (Enterprise Fabric Connectivity Manager) and Brocade Fabric Manager. So there are lots of different code bases, each with different dependencies. With version 12 the entire IP stack from the Foundry side, plus new developments from the converged platforms (VDX), was also included. Now, you can imagine that when you have to manage a large SAN this app has plenty to do: not only providing a pretty picture but also keeping track of events, status, thresholds, notifications and so on. This requires a pretty beefy piece of hardware in terms of CPU and memory.
BNA itself comes in a couple of flavors, from Professional-Plus to Enterprise, with SAN, IP or both combined, each with different options and capabilities. This particular customer has the SAN Enterprise version, which he needed because there were a little over 190 SAN switches under management. (Yes, a pretty large environment.) The issue was that, despite the server having 8 CPUs and 32GB of memory at its disposal, the Java Virtual Machine was limited to using only 4GB of that. The -Xmx was set to 4096MB, which more or less put BNA into a straitjacket.
It takes more than Houdini to get this sorted, because the only way (normally) to resolve this is to adjust the heap size the JVM can use, and to do that you need to go through the BNA client and select the Server -> Options menu item. Since BNA was unable to start, and was busier doing garbage collection than anything useful, connecting to it was not possible. Obviously the JVM is started via a wrapper, which launches the JRE with all sorts of options. This wrapper gets its information from a file called dcmsvc.conf located in the <install-dir>\conf directory. In that file the two entries responsible for the server memory part are:
The values of %MIN_HEAP_SIZE% and %MAX_HEAP_SIZE% are set in the Environment variable block located a bit earlier in that file.
set.MAX_HEAP_SIZE = xxxxm
set.MIN_HEAP_SIZE = yyyym
The default is 4096m (or 4GB). Since this particular environment scratched the edges of what BNA was designed to do, I adjusted this to 16384m so that 16GB could be used by the JVM. This got things kicking and moving, and the 8 CPUs had enough work to do. The only problem is that when Java garbage collection takes place BNA can become unresponsive for a while, but with some serious grunt in the form of 8 CPUs this should be doable. If you encounter the same issue in your environment you might want to play around with these values a bit to find the Java sweet spot for the maximum heap size. There are numerous articles around the web which touch on the maximum heap size, but none of them seem to provide the silver bullet.
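For those who would rather script the change than edit the file by hand, here is a minimal Python sketch of the edit described above. It assumes the heap entries look exactly like the set.MAX_HEAP_SIZE = xxxxm lines shown earlier. The helper name set_heap_size is my own invention and not part of BNA, so treat it as an illustration rather than a supported procedure.

```python
import re


def set_heap_size(conf_text: str, key: str, megabytes: int) -> str:
    """Return conf_text with the 'set.<key> = <n>m' line set to a new value.

    Hypothetical helper: it only assumes the 'set.NAME = 1234m' layout
    quoted in this post and knows nothing else about dcmsvc.conf.
    """
    pattern = re.compile(
        rf"^([ \t]*set\.{re.escape(key)}[ \t]*=[ \t]*)\d+[mM][ \t]*$",
        re.MULTILINE,
    )
    # Keep the original 'set.KEY = ' prefix and swap only the number.
    new_text, count = pattern.subn(rf"\g<1>{megabytes}m", conf_text)
    if count != 1:
        # Refuse to guess if the line is missing or appears more than once.
        raise ValueError(f"expected exactly one set.{key} line, found {count}")
    return new_text


# Illustrative input only; the values in a real dcmsvc.conf may differ.
sample = "set.MAX_HEAP_SIZE = 4096m\nset.MIN_HEAP_SIZE = 4096m\n"
updated = set_heap_size(sample, "MAX_HEAP_SIZE", 16 * 1024)  # 16GB in MB
```

The function works on a string on purpose, so you can read dcmsvc.conf, keep a backup copy, and only write the result back once you are happy with it. The wrapper presumably reads the file only at startup, so the BNA services would need a restart before the new value takes effect.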
Hope this helps a bit in sorting out sluggish BNA behaviour.
Erwin van Londen