www.Information-Cascade.co.uk
Société Internationale de Télécommunications Aéronautiques (SITA, www.sita.int) was founded 50 years ago as a not-for-profit cooperative, by agreement between the world's airlines, to manage their networks and provide various services at airports (weather, lost baggage, ...). The name is not familiar to Joe Public, but if you've been through an international airport, you've depended on SITA services. SITA is now moving into a more commercial portfolio.
This particular data center focuses on the clean running of the machines, with the usual issues about new projects, new problems, sizing, upgrades, contingency, targets and migrations, and on keeping the systems under "mundane" control: proactive tests to see that things are working to spec, reactive maintenance when a router fails, and follow-ups to accumulate what is learned and extrapolate that knowledge to things that haven't happened (and therefore won't).
Tape Robot

I installed a StorageTek SCSI tape robot (pre-SAN). It had several different intermittent teething problems, which required all the SCSI cards in the room to be upgraded, the floor to be lifted a few times, the SCSI multiplexers to be upgraded, and a number of system downtimes (external SCSI is not hot-pluggable). While the robot's backups were unreliable, the main system backups were done on an external DLT. This exposed the fact that HP do not support two devices connected to the same SCSI card (never mind both active), and diagnostic support tactics like: "can you try swapping the two cards over, in case it's the card ..." The project was plagued by the VAR abandoning all involvement from day 1, and by HP claiming that they didn't support that _particular_ model (so we eventually got it swapped for one they did support: remarkably similar, but with more slots). On top of that, every wiring change that touches a system requires that system to be offline (and a centralised backup touches every system ...). At some point things came together, it became reliable, and it went live. Since then the only problems have been week numbers 08 and 09 not being valid octal numbers (all others are).
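
The week 08 and 09 failure is the classic leading-zero trap: tools that follow C-style numeric conventions read a leading 0 as an octal prefix, and 8 and 9 are not octal digits. The original scripts are not shown here, so the following is only a minimal Python sketch of the effect and the usual fix. All three function names are hypothetical, and the "leading zero means base 8" rule is an assumption about how the original tooling parsed the number.

```python
def parse_week_c_style(text: str) -> int:
    """Mimic C-style parsing: a leading zero selects base 8 (assumed behaviour)."""
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)  # raises ValueError for "08" and "09"
    return int(text, 10)


def parse_week_decimal(text: str) -> int:
    """Always parse as decimal, so zero-padded week numbers are safe."""
    return int(text, 10)


def failing_weeks() -> list[str]:
    """Return the zero-padded week numbers (01..53) the C-style parser rejects."""
    bad = []
    for week in range(1, 54):
        label = f"{week:02d}"
        try:
            parse_week_c_style(label)
        except ValueError:
            bad.append(label)
    return bad
```

Weeks 01 to 07 happen to have the same value in octal and decimal, and weeks 10 onwards carry no leading zero, which is why only two weeks of the year misbehave. Forcing base 10, as in `parse_week_decimal`, removes the problem.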
OmniBack

The UNIX backups are done through OmniBack, which does individual tasks very well, but is not always ideal. For example, OmniBack more or less assumes that you will allocate many tapes into a pool and let it manage which ones are used. However, the site had a specific requirement that each day's tapes be easily identified and moved to the firesafe each day. OmniBack (in its presumed configuration) uses loose allocation of tapes, but that uses the maximum number of tapes, because each task gets its own tape: if you have 4 drives you need 4 tapes, plus another in case any tape fills up (to guarantee there is always one available), plus yet another because, if OmniBack fails for some other reason, it blames the tape. That makes 6 tapes, when 3 were more than enough. The only real way to avoid this is to use Strict Allocation, but that causes problems when two machines want the same tape: one of them fails immediately. That meant I had to script up our own set of custom backup sequencers. I also scripted up a Tk GUI monitor to show the results of all previous backups (lots of red/green boxes, with access to the messages), and some GUI screens for tape management (list all tape pools, move a tape pool to the door, move tapes beyond OmniBack's reach, recycle an old pool, etc.).
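
The idea behind a backup sequencer can be sketched roughly as follows: queue the hosts per tape and run them one after another, so that under strict allocation no two machines ever ask for the same tape at the same moment. This is not the original script, only an illustrative Python sketch. The host names, tape labels and the `backup` callable are hypothetical stand-ins, and no real OmniBack command is invoked.

```python
from collections import defaultdict
from typing import Callable, Iterable

# A job is a (host, tape_label) pair; hosts and labels are illustrative only.
Job = tuple[str, str]


def plan_by_tape(jobs: Iterable[Job]) -> dict[str, list[str]]:
    """Group hosts by the tape they need, preserving submission order."""
    queues: dict[str, list[str]] = defaultdict(list)
    for host, tape in jobs:
        queues[tape].append(host)
    return dict(queues)


def run_sequenced(
    jobs: Iterable[Job], backup: Callable[[str, str], bool]
) -> dict[Job, bool]:
    """Run every backup strictly one at a time, tape by tape.

    `backup(host, tape)` is a caller-supplied stand-in for starting the real
    backup session and must return True on success. For simplicity this
    sketch is fully serial; a real sequencer could work on several tapes in
    parallel, one per drive.
    """
    results: dict[Job, bool] = {}
    for tape, hosts in plan_by_tape(jobs).items():
        for host in hosts:
            results[(host, tape)] = backup(host, tape)
    return results
```

The per-job results dictionary is the kind of data a red/green monitor screen could be drawn from.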