They turned off the HPSIM and general management and alerting server this morning, or at least unplugged it, because it was causing a huge network spike at a remote site.
I know for a fact that no one besides myself knows exactly what that machine does, as it's only useful to me and what I do.
That doesn't mean it isn't explained in the server list in SharePoint that I made and painstakingly try to keep up to date, and that no one ever bothers to look at.
And of course no one bothered to ask during the day what exactly the impact of unplugging the server would be.
I mean, who cares about hardware and remote monitoring of servers anyway? It is, after all, only the most basic part of my job.
That made me feel really appreciated.
HPSIM was reinstalled a few weeks ago by one of my colleagues. When I explained that it took me 2 days to set it up the last time I installed it, he was surprised.
I will admit, it doesn't need to take that long. But it was new software to me at the time, and I was careful, and I ran into some awkward service account issues.
It's a very messy collection of software, basically, so you need to be careful and precise.
I read the manuals first.
I ended up needing 3 different service accounts. With different levels of rights and access.
He reinstalled HPSIM in about 1 hour. It's his way, he loves to impress with how fast he can do things.
I haven't logged on to it in the meantime, because my time was needed elsewhere for the last few weeks. Build activities that go first. Project. Bids. Money.
I warned them in a long email 2 weeks ago that no one was now doing any active systems administration. No one was keeping an eye on things. No one was cutting the grass.
Fast forward to this morning…
So, I can't dispute that HPSIM or something on that server killed that site's 2 Mbit WAN line for an hour, daily, between 10 and 11.
I went in over the iLO to have a look, after I asked them to at least plug -that- back in.
The HPSIM service wouldn't start, as it couldn't authenticate its domain service account, because it had no network. This was expected.
What wasn’t expected was the fact that it was using this colleague's domain admin account to start.
And so was the OpenSSH service.
And so was the Software update repository service.
I curse myself for not having reinstalled it myself, for one. And I curse myself for not having managed that server myself the past few weeks.
They ask me now, wtf was that server doing? I honestly don't know. I haven't managed it for the past few weeks, due to me being allocated to build activities, as they well know.
I hate it. I hate the fact that I don't know.
Even though I have no need to feel responsible, I so very much do. This server was mine, it did this on my watch, at least that is how it feels.
I can't be sure what caused the network spike, and I will never know because they won't let me plug the server back into the network.
This weekend I will reinstall HPSIM on a different server. A server that I had racked as a spare, for this exact kind of scenario.
It will be reinstalled slowly, carefully, with the appropriate documentation at hand, as I did last time.
It will be stable. It will be secure. It will be managed.
It will be beautiful.
And I am not gonna let anyone else on that server. If it ever misbehaves again, they can hold me personally accountable. I want them to, god knows I want them to.
There is only one person in my department with a sense of responsibility for our environment.
There is only one person in my department who actually cares that things are done correctly.
Every time I place my trust in another technical person, I am disappointed.
No one else is touching that server from now on.
Happy Sysadmin Day.