Archive for July, 2008

When you touch that server you touch me

Friday, July 25th, 2008

They turned off the HPSIM and general management and alerting server this morning, or at least unplugged it, because it was causing a huge network spike at a remote site.

I know for a fact that no one besides myself knows exactly what that machine does, as it's only useful to me and what I do.

That doesn't mean it isn't explained in the server list in Sharepoint that I made and painstakingly try to keep up to date, and that no one ever bothers to look at.

And of course no one bothered to ask during the day what exactly the impact of unplugging the server would be.

I mean, who cares about hardware and remote monitoring of servers anyway. It is, after all, only the most basic part of my job.

That made me feel really appreciated.

HPSIM was reinstalled a few weeks ago by one of my colleagues. When I explained it took me 2 days to set it up last time I installed it, he was surprised.

I will admit, it doesn't need to take that long. But it was new software to me at the time, and I was careful, and ran into some awkward service account issues.

It's a very messy collection of software, basically, so you need to be careful and precise.

I read the manuals first.

I ended up needing 3 different service accounts. With different levels of rights and access.

He reinstalled HPSIM in about 1 hour. It's his way, he loves to impress with how fast he can do things.

I haven't logged on to it in the meantime, because my time was needed elsewhere for the last few weeks. Build activities that go first. Project. Bids. Money.

I warned them in a long email 2 weeks ago that no one was now doing any active systems administration. No one was keeping an eye on things. No one was cutting the grass.

Fast forward to this morning…

So, I can't dispute that HPSIM or something on that server killed that site's 2 Mbit WAN line for an hour, daily, between 10 and 11.

I went in over the ILO to have a look, after I asked them to at least plug -that- back in.

The HPSIM service wouldn't start, as it couldn't authenticate its domain service account, because it had no network. This was expected.

What wasn’t expected was the fact that it was using this colleague's domain admin account to start.

And so was the OpenSSH service.

And so was the Software update repository service.
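
A check like the one I did by hand could be scripted. The sketch below is only an illustration, not something that ran on that server: it assumes the service names and their logon accounts have already been exported into a list, and every name in it (the `EXAMPLE` domain, the `jdoe` account, the `svc-` naming convention) is made up. It flags any service that starts under an account that is neither built-in nor a dedicated service account.

```python
# Hypothetical data and naming convention, invented for illustration only.
BUILTIN_ACCOUNTS = {
    "localsystem",
    "nt authority\\localservice",
    "nt authority\\networkservice",
}


def is_service_account(account, prefix="svc-"):
    """Return True for built-in accounts or DOMAIN\\svc-* style accounts."""
    lowered = account.lower()
    if lowered in BUILTIN_ACCOUNTS:
        return True
    # Strip the domain part, if any, and look at the user name only.
    user = lowered.split("\\")[-1]
    return user.startswith(prefix)


def find_suspect_services(services, prefix="svc-"):
    """Return names of services that run under a personal (non-service) account."""
    return [
        name
        for name, account in services
        if not is_service_account(account, prefix)
    ]


# (service name, logon account) pairs, as they might look in an export.
services = [
    ("HP Systems Insight Manager", "EXAMPLE\\jdoe"),
    ("OpenSSH Server", "EXAMPLE\\jdoe"),
    ("Software Update Repository", "EXAMPLE\\jdoe"),
    ("Print Spooler", "LocalSystem"),
    ("Backup Agent", "EXAMPLE\\svc-backup"),
]

suspects = find_suspect_services(services)
print(suspects)
```

With this made-up data, the three services started under the personal account are the ones that get reported.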

I curse myself for not having reinstalled it myself, for one. And I curse myself for not having managed that server myself the past few weeks.

They ask me now, wtf was that server doing? I honestly don't know. I haven't managed it for the past few weeks, due to me being allocated to build activities, as they well know.

I hate it. I hate the fact that I don't know.

Even though I have no need to feel responsible, I so very much do. This server was mine, it did this on my watch, at least that is how it feels.

I can't be sure what caused the network spike, and I will never know because they won't let me plug the server back into the network.

This weekend I will reinstall HPSIM on a different server. A server that I had racked as a spare, for this exact kind of scenario.

It will be reinstalled slowly, carefully, with the appropriate documentation at hand, as I did last time.

It will be stable. It will be secure. It will be managed.

It will be beautiful.

And I am not gonna let anyone else on that server. If it ever misbehaves again, they can hold me personally accountable. I want them to, god knows I want them to.

There is only one person in my department with a sense of responsibility for our environment.

There is only one person in my department who actually cares that things are done correctly.

Every time I place my trust in another technical person, I am disappointed.

No one else is touching that server from now on.

Happy Sysadmin day.

Big Bang Servermove successful

Monday, July 7th, 2008

This Saturday, we switched the subnet over, and moved all the remaining critical systems over to a new server room across the country.

I didn't get to take as many pics as I would have liked, and no video, mainly because I was so busy of course. 😀

Half the pics below are courtesy of Arnold.

IMG_3656
Back of the Digital Alpha box, reference shot for recabling.

IMG_3657
Packing up the servers in special locked crates. You could see these movers were the right stuff, big burly guys, but they handled the servers like feathers.

IMG_3658
Justin felt at home at the new location.

Big Bang Photo 005
x3800 waiting to be converted.

Big Bang Photo 001
We laid out the servers in the hall in the order we were building them into the racks.

IMG_3660
Our project operation center, round the corner of the serverroom. From here the managers of the project coordinated the downtime, and the business on-site testing.

IMG_3659
Tom, our WAN guy, on the phone with Mohamed, our Unix guy, who was supporting us remotely.

IMG_3661

Big Bang Photo 004

Big Bang Photo 003
Very good food and snacks were provided by Arnold and Jan. Thank you guys! You know the best way to an engineer's heart is through his stomach!

IMG_3658

Big Bang Photo 006
Justin loves Legos, so this picture seems to make sense.

Big Bang Photo 007
Justin working on the rack conversion kit for the IBM systemx 3800

Big Bang Photo 008
Tom and me discussing the Proxy server and internet line out. Old proxy, new Internet line, and some firewall rules were needed.

Big Bang Photo 010
Arnold: “2 pizzas please!”; Mustafa is thinking: “I don't like Pizza”…

Big Bang Photo 011
Paul being technical. We hold our collective breath.

Big Bang Photo 009
Sliding a server into its new home.

IMG_3662
Bottom of racks SR3 and SR4

IMG_3666
Top Left rack, SR4

IMG_3665
A surprise box. This whitebox FTP server turned out to be running 4 essential customer FTP/EDI flows. We had space for it thankfully, resting on the IBM x3800. Hopefully it will be gone in 2 weeks, but nothing is temporary. Made this picture of it for the Visio rack diagram.

IMG_3668


IMG_3669
One of the 2 redundant FTP servers failed in transport. Justin and Paul spent 2 hours putting a new one together out of spare parts of old ones. HP Netservers here, P2 machines, a decade old.

IMG_3670
Behind locked doors and locked racks, all servers hum quietly, content in their new home.

A day later, back in the first location:

IMG_3674
Empty racks moved out of the server rooms

IMG_3675
Mail Cluster going to get sent back to UK

IMG_3677

IMG_3678
Gertjan visibly enjoying dismantling the place. 6 years of mental burden of supporting this stuff being dealt with here 😉


IMG_3680
The NAS, all the data for the Netherlands, with all volumes deleted and now unplugged, ready for storage.

Videos of “demolition”:


Mustafa and Gertjan have taken apart the entire second server room in 1 day. (click here for link if you can't see the embed above)


Oooh.. Knobs! Lovely knobs!! (click here for link if you can't see the embed above)

Meanwhile, back at the old location

Friday, July 4th, 2008

A tour of the current state of the office, and one of the server rooms, where most racks are now empty or near-empty.


An Epic moment. Gert-Jan uninstalls Citrix on the last 2 servers in the farm, effectively ending 6 years of the 2000-user, 60-server Metaframe XP Citrix farm that served our Netherlands users. I would have had Marcel do this, but he has gone on holiday.


Various IBM servers and almost all the blades lie ready to be moved to the new location. We don’t have a use for any of these currently, so they will go into storage.

IMG_0695 IMG_0684
Remember when we were young and the world (servers) was new?


Both blade centers are going into storage, we have no use for them.

IMG_0909
Blade Centers in their Heyday

The racks are becoming quite bare now.

Before:
IMG_0144

After:

Before:
IMG_0143

After:

Before:
IMG_0987

After:

Before:
IMG_0149

After:


The pile for the garbage container grows and grows.


Marcel posing with all of the DL360 G3 servers (going into storage). He built the farm all those years ago, now he bears witness to its demise. It's a little sad for all of us that spent so many years maintaining it all.


I managed to get my hands on one of my favorite servers, the IBM systemx 3650 with the 8 SAS disks in it. We reinstalled it as a new SQL2005 system that will host, amongst other things, the HPSIM, IBM Director, Websense and Sharepoint databases. All for internal IT use.

T-Minus 1 day to Big Bang serverroom move

Thursday, July 3rd, 2008

Spent today going over some cabling details mostly. It's strange how stressed we all became over these little details, while we will have larger issues to worry about on the day.


(high res)

During the move, we will be moving the VoIP servers, RF-Controllers (used by various sites to do handheld-scanning), an Authentication server, the Wyse-Terminal (thin client) management server, and 1 PDC.

We are also moving 2 very critical FTP/EDI (Electronic Data Interchange) servers that handle lots of customer-related FTP flows and internal EDI between warehouse management systems.

All the above systems (in the rack diagram in RED) rely on keeping their old IP address for now, and thus we are moving the entire subnet over to the new site. This is why we call it the big bang. After that, the old site will no longer be routable to the rest of the network, and all that remains is stripping it and storing the hardware that is left.

Also the former Domain Controllers / DNS servers are an issue. Their IP addresses have long since been used in clients all over the place, the config of many of which we can't control centrally (for example through DHCP). Therefore I am moving 1 of the old PDCs, and it's keeping 2 of the 3 legacy IP addresses we need to keep alive.
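
The bookkeeping behind this can be sketched in a few lines. This is an illustration under assumptions, not our actual tooling: the addresses come from the documentation range 192.0.2.0/24, the host name `old-pdc-1` is made up, and where the third legacy address ends up is not modelled here. The function simply lists which legacy addresses have no host assigned to answer on them.

```python
import ipaddress


def uncovered_addresses(legacy, assignments):
    """Return the legacy IPs that no host in `assignments` is set to answer on."""
    covered = {
        ipaddress.ip_address(ip)
        for ips in assignments.values()
        for ip in ips
    }
    return sorted(
        ipaddress.ip_address(ip)
        for ip in legacy
        if ipaddress.ip_address(ip) not in covered
    )


# Hypothetical legacy DNS addresses still hard-coded in clients.
legacy_dns_ips = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

# Hypothetical plan: one old PDC keeps two of the three addresses.
after_move = {"old-pdc-1": ["192.0.2.10", "192.0.2.11"]}

missing = uncovered_addresses(legacy_dns_ips, after_move)
print([str(ip) for ip in missing])
```

Any address that shows up in the result still needs a home before the old site stops being routable.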

We spent yesterday doing last cabling work and other such things in preparation for Saturday (see pics below). I still have some things to do: there are some management/reporting scripts still running on the old site that I need to migrate tomorrow, and I have yet to get round to re-installing HPSIM again. I might start on it after this post, actually.

Here are yesterday's pics:

Starting to look more like the planned diagram now. As you can see, we replaced the Sun StorageTek with the HP StorageWorks MSL9000 that we salvaged from one of the Warehouse Management Systems that was migrated to Prague last week. We don't know if we can use it though, it's kinda overkill for the amount of data we now need to back up anyway.

Mustafa is set to become my new right-hand man after his project, he certainly has got the right attitude 😉

Oh that's Justin, he has been added to this project to take some of the load off me. He really knows his stuff and is a whirlwind of highly-opinionated energy 😉

Ninja-Consultant. Implements ESX-virtualisation solutions when you are not looking!

Ser is our resident LAN guy. Here we created a patching/switchport diagram for the racks (on the laptop screen), and he is tightening up the VLAN configs on the stacked core switches. Justin and I then quickly repatched everything before anyone noticed there was a service interruption 😉 (don't worry, the really critical stuff goes in during the Big Bang)