Managing Equipment Tips
Managing Equipment in PIPE Networks’ Co-location facilities
When equipment is housed at a co-location centre, there are many things customers can do to ensure that it is manageable in the true sense of the word. Because you will not be physically on hand to turn the equipment on or off when required, certain devices can be deployed to help mitigate issues as they arise.
It is important to note that all equipment fails. The methods and techniques in this document address how to recover quickly from these incidents. How much or how little of the following you implement will dictate the time between failure and recovery of the equipment in question. Equipment failure is not the only reason remote access may be required: incorrect software configuration or wider Internet problems can also restrict or remove your usual paths and means of management.
Please note that the information below is provided with NO warranty, implied or otherwise. It is the sum of the knowledge at PIPE Networks on how we deploy remote sites to achieve the level of customer satisfaction and uptime that we desire.
In-band access is where you manage a piece of equipment over an IP link, for example using Telnet, a web interface or SNMP. If the IP link fails (anywhere along the path between your main site and the co-location site), so does your management circuit to your equipment.
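As a minimal sketch of what "in-band" means in practice, the Python below simply tests whether a TCP connection to a device's management ports can be opened. The function names are illustrative assumptions rather than part of any PIPE Networks tooling, and a successful connection only shows that the IP path is up, not that the service behind it is healthy.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def in_band_status(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each management port to whether it is reachable in-band."""
    return {port: tcp_reachable(host, port, timeout) for port in ports}
```

For example, calling in_band_status with a device address and the ports 22, 23 and 80 would report which of SSH, Telnet and the web interface still answer. If every port reports False, the fault is probably in the path itself, which is exactly the case the out-of-band measures below are for.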
Out-of-band access is where you always have another way to access and manage equipment if your standard management link fails. This also comes down to selecting the correct equipment to install in a data centre. Equipment that does not have a serial console for configuration or emergency access should be considered unsuitable for deployment to a remote data centre.
Servers that have serial console ports, or that can have serial console redirection enabled in the BIOS, should be strongly considered. This gives a remote staff member much more access to the system if the server suffers some form of catastrophe. Even the enemy of serial port access, the Windows-based operating systems, has serial ‘Emergency’ access. This can be seen by searching the Microsoft knowledge base for:
Emergency Management Services console redirection
Linux, FreeBSD and other Unix-based variants have serial console support built in, but it may need to be activated.
Pick Servers with BIOS over serial
Note there is a distinction between having an operating system that supports a serial console and having the BIOS available via serial during the boot process. Dell Servers quite often have this feature, as do many others. Shop around for this feature - it could be a really handy thing to have.
All networking equipment that you choose (switches, routers etc.) should have a serial console port for management as well. This generally means getting equipment that brags about being ‘managed’.
Console port server/router
Once you have serial ports, you will need to connect them to something to help manage these devices. We recommend a new or second-hand (see eBay) Cisco 2511 16-port async router. These devices are bulletproof, have 16 serial ports for management and a 10BASE-T (half duplex) Ethernet interface. They also have two 2 Mbit/s synchronous serial interfaces, if required. If you get the version without RJ-45 ports (that is, not the 2511-RJ), you will need to ensure that you also get the octopus cables. Please see the Cisco Systems website for more details on these devices.
Connect all serial ports on your equipment back to the Cisco 2511. You now have one device from which you can manage all the others via a path other than their own network connections.
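One way to keep track of what is plugged in where is sketched below. It assumes reverse Telnet has been enabled on the async lines, in which case Cisco IOS conventionally exposes async line N on TCP port 2000 + N of the router. The device names and the cabling plan are hypothetical placeholders.

```python
REVERSE_TELNET_BASE = 2000  # IOS convention: TCP port = 2000 + line number

# Hypothetical cabling plan: async line number -> device on that line.
LINE_MAP = {
    1: "core-switch",
    2: "border-router",
    3: "web-server",
    4: "power-switch",
}


def reverse_telnet_port(line: int) -> int:
    """Return the TCP port that reaches the given async line (1-16)."""
    if not 1 <= line <= 16:
        raise ValueError("the 2511 has async lines 1 to 16")
    return REVERSE_TELNET_BASE + line


def console_targets(router_ip: str, line_map: dict) -> dict:
    """Map each device name to the (ip, port) pair for its console."""
    return {
        name: (router_ip, reverse_telnet_port(line))
        for line, name in sorted(line_map.items())
    }
```

With this plan, the console of the hypothetical "web-server" on line 3 would be reached by connecting to port 2003 on the 2511. Keeping the mapping in one place saves guessing at line numbers in the middle of an outage.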
Getting Access when the net is down
To access your equipment during a network outage, when for some reason you cannot reach it via the normal management protocols (SSH, HTTP, Telnet etc.), you should install a Telstra analogue service, sometimes called POTS (plain old telephone service). On the end of this, place a nice and crusty old 28.8/33.6k modem so you can dial in to your equipment. Use an old, reliable modem: it will most likely be supporting an out-of-band management data stream of 9600 baud VT100.
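To give a feel for why an old modem is plenty for this job, here is a rough worked example of what 9600 baud delivers. It assumes the common 8N1 framing (one start bit, eight data bits, no parity, one stop bit) and a standard 80 x 24 VT100 screen. The framing is an assumption, not something specified above.

```python
def chars_per_second(bit_rate: int, data_bits: int = 8,
                     parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Characters per second on an async serial line (one start bit)."""
    bits_per_char = 1 + data_bits + parity_bits + stop_bits
    return bit_rate / bits_per_char


def screen_redraw_seconds(bit_rate: int, cols: int = 80,
                          rows: int = 24) -> float:
    """Seconds needed to repaint a full text screen at the given rate."""
    return (cols * rows) / chars_per_second(bit_rate)


print(chars_per_second(9600))       # 960.0 characters per second
print(screen_redraw_seconds(9600))  # 2.0 seconds for a full 80x24 screen
```

At roughly 960 characters per second the console stream sits well below what a 28.8k modem can carry, so reliability matters far more than modem speed.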
We use POTS because it is the one service that is everywhere (Telstra must install it), cheap to run (about $30 per month), cheap to install ($299 or thereabouts) and can be accessed from anywhere on the planet.
Plug the modem into the AUX port of the router (attaching a modem to the console port instead raises a number of security issues) and configure it for dial-in. Once you have dialled the 2511 router via the modem, you will have out-of-band access to all equipment connected to it.
What do you do when a server, switch or router will not respond? It needs to be reset, rebooted or reloaded, and the only way to do this is to cycle the power: turn it off and then on again.
The APC MasterSwitch does just this. It is a device that has 1 power inlet and 8 power outlets. It has an Ethernet port for management via Telnet, web and SNMP, as well as a serial port for out-of-band management. From this device you can turn off, turn on and reboot any device connected to it.
This is the device that allows you to sleep really well at night!
Servers should have their soft power setting tested prior to deployment. There’s nothing worse than power-cycling a computer only to find that it stays switched off because of its BIOS settings.
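The remote power-cycle procedure, including the check that the server really does come back, can be sketched as follows. This is only an outline under stated assumptions: FakeOutlet is a hypothetical in-memory stand-in for a switched outlet and is_alive is any check you supply (a ping, a console prompt and so on). It does not use the real management interface of the APC unit.

```python
import time
from typing import Callable


class FakeOutlet:
    """Hypothetical in-memory stand-in for one switched power outlet."""

    def __init__(self) -> None:
        self.on = True
        self.history = []

    def set_power(self, on: bool) -> None:
        self.on = on
        self.history.append("on" if on else "off")


def power_cycle(outlet, is_alive: Callable[[], bool],
                off_seconds: float = 10.0, boot_timeout: float = 300.0,
                poll_seconds: float = 5.0, sleep=time.sleep) -> bool:
    """Turn the outlet off, wait, turn it on, then poll until alive.

    Returns False if the device has not answered by boot_timeout,
    for example because its BIOS leaves it off after power returns.
    """
    outlet.set_power(False)
    sleep(off_seconds)
    outlet.set_power(True)
    waited = 0.0
    while waited < boot_timeout:
        if is_alive():
            return True
        sleep(poll_seconds)
        waited += poll_seconds
    return False
```

Running this against a server on the bench before it ships is a cheap way to find out whether it powers itself back on. A False result there is far less painful than discovering the same thing at the data centre.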
Search Google for information on:
- Remote Serial Console HOWTO for configuring a Linux serial console.
- cisco modem aux port for configuring a modem on a Cisco AUX port.