Do nothing for the loss

Over the last few weeks, several of us in The Matrix have noticed that one site we support has begun dumping tickets on us that they simply refuse to handle themselves. They all seem to share the same theme: financial and insurance programs.

Today was no different. This one involved purchase orders put into the system. The local techs sat on this particular ticket for an entire month, and it was a simple one, too: purchase orders needed to be moved into the appropriate section within their system. Seems simple enough, right?

Well, I looked at the ticket history, and there were 3 full pages of nothing but 12- and 24-hour notification emails, indicating nobody had so much as touched the ticket until today. That was when one of the local techs dumped it on us, asking someone in The Matrix to fix what was obviously a local issue.

I quickly turned around and reassigned the ticket back to the tech who kicked it to us, with a not-so-subtle note stating in no uncertain terms that the site had had an entire month, yet no troubleshooting of any kind had been done, nor anything else for that matter, and that if they wanted to kick it to us, they should at least “try” to troubleshoot it first.

The site didn’t kick it back to us.

Exsqueeze me?

Earlier this week, whilst jacked into The Matrix, we suddenly lost all connections to our SharePoint servers. At almost the exact same moment, a call came in from the local site hosting our SharePoint servers, saying their network had just gone down… hard.

Since I got the call, it fell to me to get the ball rolling on calling all the right people to get help for the site. I managed to get hold of both the regional and local LAN folks, and we immediately jumped on a con call with some of the other local people to figure out what the heck had caused the network to go down so suddenly. The regional LAN team did some digging and discovered that one of the VLANs was seeing an awful lot of duplicate IP addresses tracing back to a single MAC address. At first they were stymied as to why this was, and how it would take the network down the way it did, especially since this was happening on one of the Cisco router cores and not the other.

After a bit more digging, one of the local people on-site found something that made all of us cringe. He discovered an ethernet cable running from one port of the core switch to a second port on the same core switch, which just happened to be on the same VLAN. It was causing the VLAN to flap and, for some ungodly reason, causing all the duplicate addresses. Almost as soon as he unplugged it and shut both ports (neither was labeled as actually being in use), things began to improve… or so we thought.

The LAN team also took the offending core down, since it was the only one programmed with the VLAN that was flapping and causing all the duplicate IPs. Almost as soon as that happened, things went back to normal on the network, since the first Core seemed to be OK. There was one catch. Someone at the site, in their infinite wisdom, decided to have half of the wireless access points hosted on one Core, and the other half on the other Core. So as soon as the offending Core went down, half the APs went with it. The regional LAN team was scratching their heads as to who in their right mind would set up the AP controllers this way, and not dual-link them to both cores for redundancy. When they discovered this, they brought the 2nd Core back up, the gremlins came back out, and everything they had brought back up came RIGHT back down again.

They managed to get the configs from the 2nd core, since not only were the wireless APs tied there, but some of the VLANs as well. The LAN team was slowly realizing this issue was really starting to involve the words “massive” and “cluster”. They took the 2nd core down again, power cycled the first core, and injected the configs into it, and some things slowly began coming up again. A bunch of servers and other things had to be rebooted because the constant flapping and switching between cores effectively made the servers go “Bah, screw it!” and go offline. So several other teams had to be brought in to reboot the servers and get all their services started back up again. After 5 hours, my team finally got their SharePoint back, as did all the other Tier 3 teams, and they were by and large happy.

This didn’t mean the site itself was out of the woods just yet. Only 72 of the site’s more than 1,000 wireless APs were active, and the LAN team tried to figure out why. They figured out some of the scopes weren’t working, so they restarted a couple, and got about 60% of the APs back up and distributing IPs. As for the rest, it took a while, but they eventually figured out that power cycling the PoE switches the remaining down APs were attached to did the trick. Simply doing a shut/no shut didn’t do anything; power cycling seemed to be the fix, since not only did it bring the APs back up, the switches also went to the good core instead of defaulting to the bad one, which we had since taken down to let Cisco handle.

All told, this whole bit of “fun” ended up taking over 21 hours to fix and get the site back into a largely working state as far as their LAN was concerned. They’re still working off one Core, and they’re going to let Cisco go through the other with a fine-toothed comb to see why it essentially went insane when the cable was plugged into two ports on the same VLAN.
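
For the curious, the symptom the regional LAN team spotted (lots of IP addresses tracing back to a single MAC address) is the sort of thing you can check for yourself if you have an ARP table dump handy. What follows is only a rough sketch in Python, not what the LAN team actually ran: the function name, the threshold, and the sample addresses are all made up for illustration, and it assumes you have already parsed the table into (IP, MAC) pairs.

```python
from collections import defaultdict


def macs_claiming_many_ips(arp_entries, threshold=2):
    """Return MACs that show up with at least `threshold` distinct IPs.

    `arp_entries` is an iterable of (ip, mac) string pairs. How you get
    them (switch CLI output, `arp -a`, etc.) is up to you; this sketch
    assumes the parsing has already been done.
    """
    ips_by_mac = defaultdict(set)
    for ip, mac in arp_entries:
        # Normalize case so "AA:BB" and "aa:bb" count as the same MAC.
        ips_by_mac[mac.lower()].add(ip)
    return {
        mac: sorted(ips)
        for mac, ips in ips_by_mac.items()
        if len(ips) >= threshold
    }


# Hypothetical sample data, not taken from the actual outage.
sample = [
    ("10.20.30.11", "00:11:22:33:44:55"),
    ("10.20.30.12", "00:11:22:33:44:55"),
    ("10.20.30.13", "00:11:22:33:44:55"),
    ("10.20.30.40", "AA:BB:CC:DD:EE:01"),
]

suspects = macs_claiming_many_ips(sample)
for mac, ips in suspects.items():
    print(f"{mac} is answering for {len(ips)} IPs: {', '.join(ips)}")
```

Keep in mind that one MAC answering for several IPs isn’t always a fault (a router doing proxy ARP can look like this), so treat a hit as a reason to go look at the ports, not as proof of a loop.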

Gamer hell

This occurred to me when trying to play a game on a laptop. I believe that the ultimate gamer hell may be to have every game ever made available to you, but only be able to play them with a touch-pad.

Free for life is unacceptable!

One of the places I worked for many moons ago was a major university, and being such a big university, we had a lot of emeritus professors. One of the perks they continued to receive was free dialup for life, up to 7200 minutes a month (this was just prior to broadband becoming popular).

One day, one of these emeritus professors called in and said his dialup wasn’t working. He was well known as the bane of the help desk, and all the techs ran for the nearest foxhole at the mere mention of his name because of how difficult he was to work with, both in person and on the phone. It was my turn to fall on the sword and help this guy, so I looked up all the standard things. Looking at his account, it was clear that there was no activity whatsoever, so on a hunch, I asked him which service he used to dial up. His response was “MSN.”

Come to find out, he had gotten one of those MSN CDs in the mail, one of the ones that offered 6 months free, after which people would have to begin paying. What he said next suddenly made everything clear. He said that he had started using it six months prior. I tried in vain to tell him that we in no way supported commercial dialup systems, nor were we about to pay for his dialup service through MSN or any other service except our own. It was then I made note of the fact that his account with us was still valid and active since he was an emeritus professor, and he could use it for free (provided he stayed under the 7200 minute limit every month). This was unacceptable to him, and he refused to use our service, or be billed for it in any way, despite my telling him several times that up to 7200 minutes a month, his account was free. After about 35 minutes of going back and forth like this, the professor gave up and said he had to go eat dinner, and never called back that day.

And pray tell, what was the professor a PhD in? Economics…

The Good Doctor

During my time working at the Happy Hunting Grounds, one of the more “special” users I encountered was a doctor. He felt that because he was a doctor, this somehow exempted him from the normal rules concerning laptop usage.

My first interaction with him occurred soon after I took over laptop duties from my predecessor. I got a call from the doctor’s secretary, saying he was constantly getting pop-ups on his laptop and needed them taken care of. So I wrote up a ticket for bookkeeping purposes while on the phone with $secretary, and swung by his office to get the laptop. I fired up the laptop after getting back to my office, and did my usual forensics on it. Pretty quickly, I noticed several problems. First off, $doctor’s profile was HUGE (the space used was well into the double digits of GB). I also saw there were a number of toolbars, children’s games, and other unauthorized programs installed, as well as a children’s movie in the DVD drive.

So I went to work, deleting all the toolbars and unauthorized programs from the computer, and also ran several cleaners to get rid of the temp internet files in all the profiles, which only seemed to make a small dent in $doctor’s profile size. On a hunch, I went into his user folder to see what he had there, and discovered he was basically using this laptop as the family computer. There were shortcuts and favorites for the whole family, and several gigabytes of personal photos, videos and other memories from several vacations.

Continuing with my hunch, I also checked whether he had administrative rights on the laptop, since it seemed a little hinky that someone from my department would willingly install any of the programs I had just removed, and sure enough, $doctor did. I went into the laptop’s logs and discovered that my predecessor (the one who was fired for pirating DVDs at work) was the one who gave him admin rights, which is a huge security violation. I promptly took a screencap of it and sent it off in an email to $CIO and the information security officer, along with all the other information I found.

Understandably, $CIO was livid when he read the email, since he had suspected that my predecessor had done these sorts of things, but never had the proof until now. He said he wished someone had come across this sooner, but would talk to the chief of staff and $hospitalDirector regarding $doctor’s flagrant violation of the security rules, and to hold onto the laptop in the meantime.

A couple days later, I got an email from $doctor, asking when he could pick up his laptop, since he really needed it, and I replied with a couple of leading questions about what he used the laptop for. His responses to my questions almost caused my jaw to hit the floor. He admitted he allowed his children to use the laptop instead of buying a computer for them, and claimed that his wife refused to let him spend the money on a new computer for the family, so he requested a laptop from us to circumvent this. $doctor also said he (rather easily) managed to talk my predecessor into giving him admin rights to install programs for the kids and whatever else he wanted on there. I bcc’d my boss in my reply to this, and said that this was a major security violation, and per policy, I’d have to report it.

$doctor got mad, demanding that I not do so (not knowing that I already had), and insisting that he needed the laptop because his kids needed to do their homework and had nothing else to use. My response was simple: the paperwork $doctor signed explicitly stated that any equipment we gave him was to be used only for things directly relating to his job, and that he was forbidden from allowing anyone else, including family members, to have access to the device for any reason whatsoever. I even went so far as to find a scanned copy of the form he signed and highlight the sections where it spells that out, again adding $CIO to the email.

After a few more emails back and forth like that, $CIO sent me a separate email saying to just re-image the laptop, and put our stuff back on to re-encrypt it, and to only give the laptop back to $doctor if $hospitalDirector approved it, and even then, only after he took all the appropriate security classes again. $doctor also had to go to the store and purchase a computer for his kids and come into our office with the receipt.

All that took about another 10 days, but the good doctor finally got his laptop back, and we never had a problem with him again.