The week from hell



Today concludes what I would absolutely consider the week from hell. It is very fitting that tomorrow is Thanksgiving and we get a short break after the time that I have put in over the past week. In case you missed it, I came into work a week ago to find that our primary domain controller had failed to boot back up after a power outage.

Unfortunately, this server was not only our domain controller, but it also hosted our DHCP server, DNS, and a handful of other services. That pretty much meant that if your device didn't have a static IP address (and most in the organization don't), you had no Internet.

Wireless was down because the access points and wireless networks rely on DHCP to get addresses on the network. Phones and hard-wired machines were down for the same reason. File services were running on a separate piece of hardware, so none of our data was lost.

Virtualization is an amazing thing and it saved my butt in this case.

With the server down and everyone looking to me for a solution, I started attacking the problem from three different directions. I was hoping that at least one of them would net me some results.

The key issue with the server was that although the drives in my RAID array had been healthy two days earlier when I looked at them, two of them had failed after the power outage. With RAID 5 you are usually okay if one drive fails, but if two fail, then you are hosed.
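For anyone wondering why one dead drive is survivable and two are not, here is a minimal Python sketch of the XOR parity idea behind RAID 5. The four-drive layout and the block contents are made up for illustration, and a real array rotates the parity block across the drives, which this toy version skips.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a hypothetical four-drive array: three data blocks plus parity.
data = [b"DHCP", b"DNS!", b"AD.."]
parity = xor_blocks(data)

# One failed drive: XOR the surviving blocks with parity to rebuild the lost one.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]

# Two failed drives: the survivors only yield the XOR of the two missing blocks,
# which is not enough information to recover either block on its own.
combined = xor_blocks([data[2], parity])
assert combined == xor_blocks([data[0], data[1]])
```

With one drive gone, the parity block fills in the gap. With two gone, all you can compute is the two missing blocks mashed together, and that is where I was sitting.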

I threw a couple of spare drives into the array, but that didn't seem to help. My other two options were to start from scratch with a new virtual server or access our backup appliance that had a bare-metal backup of the server on it and spin that up in the cloud.

It took forever to coordinate with the organization where our backup appliance is hosted, and once they did get us set up, I was able to boot the server into the cloud, but I had no connectivity to it.

So really my only option at this point was to start from scratch, which I had been working on between all of the other things.

Creating a new server is pretty easy. Like I said, virtualization is amazing. Within half an hour I had a fully running Windows Server 2022 machine ready to go on one of my virtual hosts. I was able to use my backups to pull our DHCP configuration from the old server and apply it to the new one.

That got us 80% back up and running.

With the primary domain controller down, though, people still weren't able to authenticate to the network and access their files. Once again, in a stroke of luck, we had our Active Directory replicated down at the same place we host our backups. After a considerable amount of time, they were able to promote their replica to the master, and then I added my new server as a replica.

Long story short, we got back up and running. It took all day, though, and looking back, if I could have cut out all the waiting on other parties to do what I needed them to do, I think I could have had it fixed in two to three hours.

After a week from hell, we are finally back up and running 100%. Over the weekend and earlier this week, I was able to clean up the remaining services that were broken. Moving forward, I am going to have my own replica on-site running on separate hardware. I don't want to have to rely on anyone else if this happens again.

I also stored a copy of our DHCP configuration in a safe place so I can easily access it in the future should I need to spin up a new server on the fly.

Plus, I get to buy a new server, so that is pretty cool. I had been talking to my business manager about a new server anyway. She was hesitant, but after this happened, she basically said "buy whatever you need". I have a shiny new Dell PowerEdge server being custom built as we speak.

I am super excited about getting it in place and starting to set up our in-house redundancies.

After a week like this, I take comfort in the fact that tomorrow is Thanksgiving. Weeks like this past one are honestly few and far between in my job. It's relatively stress-free, and when things are working, they just work.

I think weeks like this past one remind us why all the other ones are just so special.


Sports Talk Social - @bozz.sports




All pictures/screenshots taken by myself or @mrsbozz unless otherwise sourced



15 comments

Great work with the recovery. I can imagine it was stressful, but it sounds like you did a great job there. The new hardware sounds exciting!! Man, I'm looking at some new kit myself! lol


It was one of the few times that I found myself really stressed in this job. I am excited about the server, but given the supply chain issues we are still dealing with, I am not going to have it in my office until 2023 sometime. That makes me quite sad...


Oh wow! That's a crazy lead time!! I am chasing up suppliers who are telling me things are still taking over a year to deliver. I can't understand what is going on.


Yeah, I had some projectors that took over a year to get to me. You wouldn't think it would be so long, but things are still just taking forever.


Oh no!!!
Don't you have a tech department to handle those in the school district that you work at?
Or are you in the tech department? Hahaha.
Great job, @bozz!!!


I am the tech department! :) Along with @iikrypticsii. I am the technology director for the district and he is my worker.


Hey~ that was some story you got there. I can already feel the anxiety and stress that surfaced as you were chugging through the problems. A solutions creator is what you are, my friend. :)

By the way, the thing I never understand about computers in general is that it's all good until the moment you most need it. I have had so many times where I was trying to bring up a presentation slide in PowerPoint and I just couldn't find the damn file!! And I can find files in a split second when I don't need them!!

Thanks for your work story~ I need to get back into the scene as well, and it starts today!!! I have been under the radar for the past two months because of some work-related events, and I am finally trying to make a comeback.

Cheers!!!


Live demos are the worst. It seems like everything that can go wrong does when you are dealing with one of those.


A shitty week indeed!
On the plus side, it is turkey day for you guys tomorrow (and me too by default), and you are getting that shiny new custom-built server, albeit not anytime soon!


Yeah, I am looking forward to Thanksgiving and my server! Have a great one with your family if you celebrate.


I can imagine how bad the week was when things were in trouble. Good thing that you were able to resolve it in time with a backup of the configuration in a more secure place.

I can actually relate to this, although not with the same type of device. It brought me a lot of stress and pressure, and I did not want to end the day without fixing the issues.


Yeah, it was quite stressful and frustrating. I am glad it is pretty rare that I have to deal with emergencies like this.


What matters is now, and surely it taught you many lessons, Sir. Sometimes it is through being stressed and pressured that we become more productive and resilient. I believe you are clever too, as you figured it out before the process became too complex. Once again, I am glad you made it. Have a nice day and keep safe.


Thanks! You as well!
