With the transition to Broadcom, the Brocade entity now operating under that flag has released its first major release: 8.2.0.
The release notes show a somewhat strange title: “Fabric OS v8.2.0 for Brocade SAN Product Family”. As if FOS was ever intended to run on any platform other than SAN switches?! Weird.
“Brocade – A Broadcom Limited Company”, as it is now known, has with this release provided a similarly nifty piece of code as with 7.4 (see here). I was, and still am, extremely happy with that release, mainly because of the full-fledged monitoring platform in the form of Fabric Vision.
I now have the same excitement with version 8.2.0. With a major release you get all the usual support for new kit based on the same generation of chipsets. The Condor 4 is still the predominant ASIC for this release to support all 32G equipment, but the previous generation is still very much supported as well. Finally, the 8-16E, 8-32E and 8-64E blades have been removed. These were only there to support older 2G end-device kit, even on 16G chassis such as the 8510. They were never a real success, primarily due to the performance impact when these blades carried slower connections and caused back-pressure into fabrics. Anyway, different story.
With the ever-increasing feeds and speeds, the problem of congestion and latency becomes more apparent by the day. The scale of compute systems delivering hyperconverged infrastructures is constantly narrowing the room for errors. A very tiny hiccup in an operating system or application sitting in a VM alongside dozens of others may result in disastrous flow-on effects affecting large parts of entire fabrics. It is therefore imperative that administrators become aware of, and proficient in, advanced performance monitoring and management of traffic flows with MAPS and Flow Vision. With FOS 8.2.0 numerous areas have been improved and extended, so please make good use of this. As I’ve said many times before, the Fabric Vision license is the most valuable license for SAN administrators. If you have the budget, buy it; if not, make sure you get the budget. 🙂
I have been a RAS guy my entire career. If any piece of equipment or software showed a lack of any of the Reliability, Availability and Serviceability components, it was pretty much done with me right away. I think I have that from a colonel I had in the army who once had an appointment with a sales rep who sold night vision equipment. The colonel took a look at it, threw it on the ground and asked where the protective suitcase was. The sales rep, totally baffled that his F45.000 (guilders, before the Euro) night vision equipment was crushed, asked why he did that. The colonel answered that if you try to sell anything that needs to be used by combat soldiers who are under fire in the middle of the night, it had better be working, rugged and serviceable in the field. Don’t show up with equipment in a plastic bag.
To extend this story to FOS 8.2.0: numerous enhancements have been made in the troubleshooting area, where I and many of my colleagues play. One of these is the addition of a fair number of extra audit log entries in the supportsave. In previous versions numerous actions performed via Webtools were not recorded, so it was hard to see which changes could have resulted in a particular problem.
I’m not going to go further and list the entire release notes here. You can obtain them from Broadcom or any of the OEMs. Check with your sales or service rep.
There is one thing that I want to highlight, and for me it is the icing on the cake of this release. Since the dawn of time (OK, since Brocade started, then...) you had three options to configure, manage and monitor switches: CLI, Webtools or DCFM/BNA. Webtools is relatively restrictive in the sense that you actually have to sit behind a browser, annoy yourself mucking around with Java, and click your way through things; it was, and still is, not a very productive way of operating a SAN.
Then you have the CLI. If you were not born in the “character” generation and are more accustomed to buttons and windows, the CLI may be somewhat daunting. Those who do prefer something actually readable became very creative in writing scripts to execute commands remotely and parse the output to a local file for further processing.
How to kill your SAN.
To me this has always been a “how to kill your SAN” option. I can recall a company called Onaro that created a monitoring tool doing just that: kicking off a large set of CLI commands and collecting the output to present it in a graphical format. That killed many SANs due to the uncontrolled effect it has on switches. Executing a CLI command like “sfpshow -f” will actually use a very simple internal management bus and poll each individual SFP for its current status and counter values. If you have ~500 SFPs in your switch this may take a while. If you do this interactively there is not really a problem, but if you do it programmatically, with many subsequent commands kicked off over ever so many new telnet/ssh sessions, you will very quickly hit resource limits in the switch.
A third option is BNA (Brocade Network Advisor), whose purchase cost may be a financial stumbling block for some companies. It has a plethora of functions and features; you could almost get a bachelor’s degree if you mastered them all. It is very diverse but may be a bit overkill for some.
Now, there are companies who think they are smarter than the rest of us and simply use everything at their disposal and monitor their environment to death. This will for sure have a severe negative impact, not only on the switch itself but also on FC traffic and fabric operations. Be aware that the switch CPU is responsible for all non-data traffic in a switch/fabric. It executes the processes that create routing tables based on FSPF calculations, maintain the name server, distribute the zone database, and many more. If that is not possible because the CPU is busy with something else, you will see that operational FC processes like fabric logins, port logins etc. may run into trouble and simply deny host-to-storage operations. I.e. you don’t see disks.
So why did I come to this……??? Ahhh, yes REST.
For many years we’ve asked Brocade to provide an alternative method for external tools to obtain information, and to allow customers to manage and operate their SANs in a more programmable fashion. The first way to do this was via BNA, which contained a REST API. This did, however, require BNA to be licensed. When you just have a few switches you don’t really care about BNA, but if you want to write scripts against 60 switches it requires an enterprise license.
With FOS 8.2.0 there is now an internal REST API that allows you to manage a switch via any programming language you can think of. Yes, even curl and bash will do just fine. That being said, there are some limitations. The API does not provide JSON output, nor does it accept JSON as the data portion of a command; for now it is XML in and XML out. The data structure is based on YANG (RFC 7950), which is also the case in BNA. RFC 7951 describes the JSON encoding, so it may come in a future version of FOS. The REST API also provides a method of throttling sessions: you can configure it to allow no more than X simultaneous sessions and no more than Y commands in Z seconds.
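To make the throttling idea concrete, here is a minimal Python sketch of such a limit: a cap on simultaneous sessions plus a sliding window of Y requests per Z seconds. It is a toy model and not how FOS implements it. The class name `Throttle`, its parameters, and the choice to count requests across all sessions together are my own assumptions for illustration.

```python
from collections import deque


class Throttle:
    """Toy model: at most max_sessions logins, at most max_requests per window.

    Hypothetical illustration only; this is not the FOS implementation.
    """

    def __init__(self, max_sessions, max_requests, window_seconds):
        self.max_sessions = max_sessions      # the "X" from the text
        self.max_requests = max_requests      # the "Y" from the text
        self.window_seconds = window_seconds  # the "Z" from the text
        self.sessions = set()
        self.stamps = deque()  # timestamps of accepted requests (assumed global)

    def login(self, session_id):
        """Admit a session unless the session cap has been reached."""
        if session_id in self.sessions:
            return True
        if len(self.sessions) >= self.max_sessions:
            return False
        self.sessions.add(session_id)
        return True

    def logout(self, session_id):
        """Free the slot so another client can log in."""
        self.sessions.discard(session_id)

    def allow(self, session_id, now):
        """Accept a request at time `now` (seconds) if the window has room."""
        if session_id not in self.sessions:
            return False
        # Forget requests that have fallen out of the sliding window.
        while self.stamps and now - self.stamps[0] >= self.window_seconds:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_requests:
            return False
        self.stamps.append(now)
        return True
```

With `Throttle(max_sessions=2, max_requests=3, window_seconds=10)`, a third login is refused and a fourth request inside the same ten seconds is rejected until the oldest one ages out. The point is that the limit sits on the switch side, so a runaway script gets refused instead of eating the CPU.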
Brocade have published a new guide (FOS API Reference) to explain all this. If you’re interested, I would highly recommend that you use this method and abandon any CLI-based scripts you may have.
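Since the API does not speak JSON yet, a script that wants JSON has to convert the response itself. Below is a minimal Python sketch of that step, assuming the response is a plain XML document. The sample payload and its element names (`Response`, `switch`, `port` and so on) are made up for illustration and are not taken from the FOS API Reference; check the guide for the real structure.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payload; element names are invented for this example.
SAMPLE = """<Response>
  <switch>
    <name>sw01</name>
    <port><index>0</index><speed>32</speed></port>
    <port><index>1</index><speed>16</speed></port>
  </switch>
</Response>"""


def to_plain(element):
    """Turn an XML element into nested dicts, lists and strings."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = to_plain(child)
        if child.tag in result:
            # Repeated tags (YANG list entries) are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(text):
    """Parse an XML string and return it as a JSON string."""
    root = ET.fromstring(text)
    return json.dumps({root.tag: to_plain(root)}, indent=2)


if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

This is deliberately naive: all values stay strings, XML namespaces and attributes are ignored, and a list with a single entry comes out as a single object, so it is not the RFC 7951 encoding. It is good enough to feed a response into tooling that expects JSON.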