Archive for the 'vmware' Category

Fully implemented VMware DR solution

Thursday, June 25th, 2009

This is a pretty sight:

Off-site VM's

In 10 minutes I was able to promote a replicated LUN on our off-site DR SAN, rescan for volumes on our off-site DR ESX host, and then add the VM to the inventory and power it up. Given our replication scheduling and speeds, I should be able to bring online a fully-functional VM in a state no more than 2 hours behind what the production server was. I’ll take that :-) I’m finishing off the documentation for the infrastructure and can then move on to other areas, such as migrating more of our physical servers into the environment – I’ve already decommissioned 7, and am currently running 17 production and 5 dev VMs.
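For reference, the host-side part of those steps (rescan, add to inventory, power up) could also be driven from the ESX service console rather than the VI client. This is only a dry-run sketch that prints the commands instead of running them; the adapter name `vmhba32` and the .vmx path are made-up placeholders, and the LUN promotion itself happens on the SAN and is not covered here.

```shell
#!/bin/sh
# Dry-run sketch: print the service-console commands for bringing a
# replicated VM online on the DR host. Nothing is executed.
# The adapter name and .vmx path passed in are hypothetical placeholders.
dr_bringup_plan() {
    hba="$1"   # storage adapter to rescan (assumed name, e.g. vmhba32)
    vmx="$2"   # path to the VM's .vmx file on the promoted volume (assumed)
    echo "esxcfg-rescan ${hba}"
    echo "vmware-cmd -s register ${vmx}"
    echo "vmware-cmd ${vmx} start"
}

dr_bringup_plan vmhba32 /vmfs/volumes/dr-volume/myvm/myvm.vmx
```

Review the printed commands against your own adapter and datastore names before running any of them by hand.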

SAN replication and MacBook updates

Wednesday, June 24th, 2009

On Monday I implemented some routing and switching changes with the help of AT&T and our awesome network support guys from Integrated Logic. This finally let our Equallogic SAN units talk to each other over iSCSI between the district office and our off-site location, Gladys Jung, across a 100Mb fiber line. To say I was a happy monkey upon seeing the LUN replications firing up is an understatement! Tomorrow I plan on bringing one of the LUNs online, connecting it to the DR VMware ESX host and seeing if I can actually bring the virtual machine online. At that stage, I will definitely be beyond happy. We started planning our VMware infrastructure at the end of September, and it has been a long project to get to the finish line, but totally worth it in terms of the setup we are now running, especially for middle-of-nowhere Alaska!

Today I bit the bullet and plumped for OS X and iLife updates, taking my MacBook from OS X 10.4 to 10.5 and iLife ’06 to ’09. It was literally 3 years ago last week I got the little guy, and I have to say, after a 2GB RAM upgrade pretty much right away for under $80 and then a hard drive replacement, also around $80, after 18 months of abuse (literally – more than a dozen major airline flights, bush airlines, and a 2 month road trip!), it’s been flawless and still powers everything I need without leaving me wanting. It’s probably the most impressive piece of computer equipment I’ve ever bought (I would say electronics in general, but I do love my Canon EOS 50D!). Given I’ve seen hundreds of new machines passing through the office and workshops this past week for imaging and inventory, I know what the current crop of MacBooks run at, and for general day-to-day use, this 3 year old baby is still holding its own!

Camp Harris

Sunday, June 7th, 2009

I’m now camped out at Jeff + Angel’s dog sitting Niko and Suka for the next week :) Is a nice little change, though was weird driving them up to the airport and then coming back myself and dealing with the joys of AC store parking lot! Never imagined I’d be driving round Bethel. Still, there were a few little errands I was wanting to get done in the next few days, so will be a help having a car to use.

The last couple of weeks have slowly been easing up at work with the bulk of the teachers and associated staff winding down. I am feeling kinda like a lost puppy with so many of my play buddies gone though now! I have the last of the VMware SAN replication to do this week with a consultant on-site Tuesday + Wednesday to move our second Equallogic unit off site. I already moved an ESX host to our off site location which works nicely given we have a 100Mb fiber line out there too.

I have a couple of website projects lined up, and a few photography projects to think about over the summer. Nice to have contacts now ;-) Some of those could turn out to be quite fun and a good experience. I scanned in 4 rolls of 35mm film the last couple of days too which gives me plenty of post-production work to do this week whilst hanging out here with the dogs. There’s a right mix of photos from the last few months, so will post some when I’m happy with the results.

PowerSchool migration to VMware

Wednesday, May 27th, 2009

The last couple of days have been busy with VMware and PowerSchool. We’ve had a consultant in working with us, and it was the perfect opportunity to migrate our student information system (SIS), PowerSchool, into our VMware environment. I was planning on leaving PowerSchool as one of our last physical servers to move into the virtual environment, but given we had the experience on site to do it and I was confident in how VMware has been running, I thought we may as well give it a go. If nothing else, we could simply roll back to the single physical server.

But, it all actually went fairly smoothly. I built up a template for Win 2k3 Enterprise with the base config and software, then deployed it to 4 new virtual machines. One of these is running Oracle for the backend database, two are running the PowerSchool application node, with one being designated for general staff + parent logins and the other for teachers and grading, and a final server is dedicated to serving images, scripting and PHP reports. I have also snap-shot’d the database VM and one of the application nodes to be used for testing reports and in training sessions. Add in the new SIF ZIS which will be deployed by the state over the summer, and that gives 7 virtual servers for PowerSchool, a far cry from the reliance and strain on a single physical server.

I’m really happy with how the migration went, as it really showed the power our VMware environment provides in terms of flexibility and resources. It also takes a huge weight off my shoulders, as we’ve never been able to successfully recover from a simulated failure using the backups due to the complexity of the integration between components, so with using straight vRanger Pro snapshots of the entire virtual machines, I can recover in minutes. I can also easily duplicate entire servers for testing updates, new releases (such as the upcoming PowerSchool 6), or for training purposes. Given PowerSchool is such a core system alongside FileMaker, both of which now run in our VMware environment, my management work load and stress levels should hopefully ease up considerably!

We still have a little work to do tomorrow – I’d like to automate a snapshot of the Oracle VM to a test VM that can be used by staff for building reports or whatever, though due to the way the database is tied in to the host IP, will need a little scripting. I’d also like to duplicate one of the application nodes and set it aside for testing the upgrade to PowerSchool 6. Is all positive stuff though, and giving me a lot of confidence in systems moving forward.

APC PCNS setup for VMware

Monday, March 16th, 2009

There’s a lot of confusion about APC’s PCNS software for VMware. Basically, ignore the generic instructions provided on the CD ;) This was tested on multiple ESX 3.5 Update 4 hosts and worked perfectly connected to a pair of 2200s. From what we could tell, the pay-for version is basically just PCNS 2.2.3 but with compatible VMware components. Nothing on the CD tells you this, and the instructions just add to the confusion. These setup steps were figured out in conjunction with Kurt Bunker from GCS and are posted here for reference.

Prior to configuring the PCNS software you need to configure the SNMP card with the following (note that the admin password and passphrase need to be the same on multiple APC units):

1. IP address
2. Administrative account password
3. Administrator passphrase (minimum 15 characters, maximum 32)
4. Set APC SNMP settings for shutdown: UPS tab -> Configuration -> Shutdown -> Low Battery Duration
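Since the passphrase has to be typed into each card and again into the PCNS config, it can save a failed run to sanity-check its length first. A minimal sketch, checking only the 15 to 32 character limit from the list above; the function name and the sample passphrase are made up for illustration.

```shell
#!/bin/sh
# Check a candidate passphrase against the length limits listed above
# (minimum 15, maximum 32 characters). Length is the only thing checked.
passphrase_length_ok() {
    len=${#1}
    [ "$len" -ge 15 ] && [ "$len" -le 32 ]
}

# Example with a made-up passphrase (not a real credential):
if passphrase_length_ok "example passphrase 123"; then
    echo "length ok"
else
    echo "length out of range"
fi
```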

With the CD inserted, from the command line of your ESX host:

mount /dev/cdrom
cd /mnt/cdrom
./install.sh

During the install:

Instance '1'
Default install directory
Hit enter to automatically install JRE

Once the install has finished, open the required firewall ports *before* running PCNSConfig.sh – you can only run the script once – even if you cancel out, it won’t run again unless you run the full uninstall script. If you’re unhappy about opening all these firewall rules, browse the forums to see exactly what they’re doing. We found they all needed to be open to function correctly.

esxcfg-firewall -o 80,tcp,out,"APC PowerChute Port 80"
esxcfg-firewall -o 2161,tcp,out,"APC PowerChute Port 2161"
esxcfg-firewall -o 2161,tcp,in,"APC PowerChute Port 2161"
esxcfg-firewall -o 2161,udp,out,"APC PowerChute Port 2161"
esxcfg-firewall -o 2161,udp,in,"APC PowerChute Port 2161"
esxcfg-firewall -o 3052,tcp,out,"APC PowerChute Port 3052"
esxcfg-firewall -o 3052,tcp,in,"APC PowerChute Port 3052"
esxcfg-firewall -o 3052,udp,out,"APC PowerChute Port 3052"
esxcfg-firewall -o 3052,udp,in,"APC PowerChute Port 3052"
esxcfg-firewall -o 6547,tcp,out,"APC PowerChute Port 6547"
esxcfg-firewall -o 6547,tcp,in,"APC PowerChute Port 6547"
esxcfg-firewall -o 6547,udp,out,"APC PowerChute Port 6547"
esxcfg-firewall -o 6547,udp,in,"APC PowerChute Port 6547"
esxcfg-firewall -o 6548,tcp,out,"APC PowerChute Port 6548"
esxcfg-firewall -o 6548,tcp,in,"APC PowerChute Port 6548"
esxcfg-firewall -o 6548,udp,out,"APC PowerChute Port 6548"
esxcfg-firewall -o 6548,udp,in,"APC PowerChute Port 6548"
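If typing 17 near-identical lines on each host gets tedious, the same list can be generated with a small loop. This is just a sketch that prints the commands above in the same order (port 80 outbound TCP only, the other four ports in both directions for TCP and UDP); it does not run anything itself, and the function name is made up.

```shell
#!/bin/sh
# Print (not run) the esxcfg-firewall commands for the PowerChute ports.
# Port 80 is opened for outbound TCP only; the other ports are opened
# in both directions for both TCP and UDP, matching the list above.
pcns_firewall_rules() {
    echo 'esxcfg-firewall -o 80,tcp,out,"APC PowerChute Port 80"'
    for port in 2161 3052 6547 6548; do
        for proto in tcp udp; do
            for dir in out in; do
                echo "esxcfg-firewall -o ${port},${proto},${dir},\"APC PowerChute Port ${port}\""
            done
        done
    done
}

pcns_firewall_rules
```

Once the printed output looks right, it could be applied on the host by piping it to the shell (pcns_firewall_rules | sh).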

With the ports open, configure the PCNS software:

cd /opt/APC/PowerChute/group1
./PCNSConfig.sh

During the configuration:

Select option '3' - Configure for multiple Smart-UPS devices
Enter IP address, port 80 (default), username, password, authentication passphrase
'Yes' to register settings
Enter IP of second card - username, password + phrase set already based on previous details
'No' to register another card
'Yes' to start the PCNS service

Note: you can start, stop or check the status of the PCNS service at any time using its init script in /etc/rc.d/init.d, e.g. /etc/rc.d/init.d/PowerChute start (or status, or stop).
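To save typing the full init script path each time, a tiny wrapper function can go in your shell profile. A sketch only: the `pcns` function name and the `PCNS_INIT` override variable are made up here; the default path is the one from the note above.

```shell
#!/bin/sh
# Thin wrapper around the PowerChute init script.
# PCNS_INIT is a hypothetical override; it defaults to the path from the note.
PCNS_INIT="${PCNS_INIT:-/etc/rc.d/init.d/PowerChute}"

pcns() {
    case "$1" in
        start|stop|status)
            "$PCNS_INIT" "$1"
            ;;
        *)
            # Reject anything other than the three known actions.
            echo "usage: pcns start|stop|status" >&2
            return 2
            ;;
    esac
}
```

With that loaded, pcns status is the same as calling the init script with status.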

You can then load up http://esx-hostname:3052 in a web browser and:

Configure Events - scroll down to UPS: On Battery
Click fourth column from the end
Check box for 'Yes, I want to shut down the system' and enter 60 seconds in box below
Select 'Configure Shutdown' from sidebar
Uncheck the box to 'Turn off the UPS after shutdown finishes'

Depending on what else you’ve tried to configure, these instructions might need tweaking to remove previous components, and the JRE might be picky. But, in our test we pulled one UPS and no shutdown signal was sent, and then a shutdown was correctly initiated when the second UPS became low on battery. Oh, and make sure you set your power management options in the BIOS correctly – as the UPS initiates a clean shutdown, make sure the server is set to always power back on when the power is restored :-) On our R805s, ‘Last’ is great in the event the power just drops whilst the server is running, but since the server was shut down cleanly, it will not power on when the UPS comes back online; it needs to be set to ‘Always’.

Another Friday 13th…

Monday, March 16th, 2009

Last week was busy. I broke 60 hours by the time I got home around 7p.m Sunday. We had an engineer out with our VMware deployment which didn’t exactly go to plan due to networking issues outside of our control which was a little frustrating, but totally expected to be honest. The handling and resolution of those issues within the department weren’t ideal though, which at the end of a long week, didn’t help improve matters :( It took a lot of phone calls and soul searching to make it back in to the office this morning.

On the plus side, the VMware environment delivers everything it’s meant to – we tested power supply redundancy, NIC redundancy, controller redundancy and hard drive redundancy on all the servers and SAN units; set up APC PCNS software for a multi-master setup with dual UPSs; had replication between the two Equallogic PS5000E SANs working; got the core Cisco 3560G configured and running everything nicely; and tested that HA and DRS functioned correctly and responded to failures as expected. We now have a 3-node ESX cluster of Dell R805s running through Virtual Center on an EQL PS5000E, with another PS5000E as a replica and a single Dell 2970 as a standalone ESX host for disaster recovery. Due to our networking issues, the replica EQL box and ESX host could not be put off-site as planned, so that will be phase two (by which point a second 2970 can be re-deployed, giving us a pair of off-site servers in the event of a failure at the core). Having spent so long getting everything planned out and now up + running, it’s really quite sad to be crippled by a single 100Mb uplink for all LAN traffic to/from the virtual environment, so we can’t really use it for much (too complicated to explain the logic behind a $700 10/100 ProCurve at our backbone I wasn’t fully aware of…).

But, I finally made it to the chiropractor this afternoon, which was definitely needed after last week :) After a checkup, x-rays and review of past medical records, I was hooked up to a TENS system to relax the muscles along with hot pads before a bit of a massage, and then actually getting on to the chiropractic adjustment. After that, I then lay down on a weird setup with a rolling base to decompress the spine. All in all, just a little different to treatment back in England! Have another couple of sessions this week, and figure I may as well make the most of it with the medical insurance covering so much of it.

Friday 13th strikes

Friday, February 13th, 2009

Although this morning started out with a nice tax refund being deposited in to my account, Friday 13th hit, and hit hard. Aside from the school district’s bank not handling direct deposits overnight and so ending up without being paid (still hasn’t posted to my account), technical gremlins plagued the day.

Around 9a.m I got a call that our streaming media server wasn’t providing content. I couldn’t connect via remote desktop, so went in to the server room to be greeted by three health warning lights across the RAID array. Diagnostics claimed 3 out of 4 drives in the array had failed. The odds on that happening are pretty slim shall we say. This is the same server that was hit with errors last summer before I started, with temperatures breaking 105F in the server room on a number of occasions. I’m thinking it’s more likely the controller has failed than three simultaneous drive failures. Regardless, without any spare parts or usable servers, I quickly spun up a Win 2k3 VM, moved 60GB worth of content from the backups and brought it online. I could have been doing this on a new 3-node R805 cluster designed specifically for VMware, but am lacking support for a $10k Cisco 3750 switch given people are too used to expecting a $500 ProCurve to adequately perform as a core backbone. Right now I have $120k worth of gear doing nothing as a result. That’s another ongoing issue that would be comical if it wasn’t so serious.

I spent quite a bit of time out the office this week in other sites seeing how their set ups are functioning, and honestly, the excuse of “it’s Bethel” just doesn’t wash anymore. No disrespect at all to Joe and Dean, but if a couple of guys at KYUK with ‘Networking for Dummies’ can knock 10 bells out of the district setup, there’s something clearly wrong. It’s not even complacency in some areas, just plain laziness and being hopelessly out in the deep end.

FirstClass continues to be the bane of my life, especially as it’s not even my responsibility. Once again it had issues today. After being yelled at for questioning the validity of restarting the entire server in the middle of the afternoon once again, I gave up. The ideas that ‘bits needed to drain out of memory’ 25 years ago, or that ‘log files would fill up and lock the whole system’ even 5 years ago, may have been valid, but on a server with 16GB RAM that never breaks 3GB in use and daily log files that don’t break half a meg, slow delivery of mail and loading of inboxes is a different issue, and it gets embarrassing being involved in its management sometimes.

I ended up leaving at 4p.m, will see if I get up on Monday morning and go back in. I’ve come to realise that work is not the be-all and end-all. If it brings you to tears on a Sunday evening at the thought of going back in on a Monday morning, I can happily walk away knowing I gave everything I could and with no regrets. The most frustrating part is that outside of work I’m actually very happy + stable, looking to the future, and simply enjoying life. To leave Bethel would be sad, but I’m learning to put myself first rather than carry on in a situation I’m not happy with.

VCP in VI3

Wednesday, January 28th, 2009

After hanging around for 45 minutes this morning whilst the ‘test center’ got ready (read – a corner in the college library, was not impressed), I finally got to sit the VCP exam and passed with a score of 83 :-) So, I’m now a VMware Certified Professional (VCP) in VMware Infrastructure 3 (VI3).

Since the school board gave approval for $110k+ worth of equipment to migrate our entire datacenter to a virtual environment including full off-site SAN replication along with a pair of redundant ESX hosts in the event of a major failure as a complete disaster recovery location (without SRM), it’s all coming together nicely and should make things a whole lot easier come the summertime.

Developing K300 weekend photos

Sunday, January 25th, 2009

As part of my ongoing attempt to avoid studying for my VCP exam on Wednesday (actually, I’ve covered 8 of the 11 modules and have a 2-hour revision session via telephone tomorrow), I finished off the last few frames of film and tried developing the two rolls of 35mm from the K300 last weekend. I was worried I’d lost the first roll as the rewind froze / numb fingers, and the rear popped open exposing the film. Turns out I only lost about 6-8 frames and most of the actual K300 photos are just fine. Most of those I’ve lost were of the fiddle dance, which is probably worse as I’ve lost my blackmail photos ;-) Ah well, live + learn (i.e. find an open pick-up truck and just sit in the drivers seat changing film like I ended up doing!).

Will try to find someone heading to the post office tomorrow that I can blag a ride with to pick up my Canon 8800F film scanner to actually get them on the computer. Pretty happy with how much easier it was developing the film this time, wasn’t as nervous, probably as I wasn’t expecting anything on one roll! Still, just shows the magic of film when you actually slide it off the developing reel and see the images sitting there :-) Not quite the same as plugging a USB cable in and downloading to a computer!

Last weekend I also booked flights in to Anchorage to see the start of the Iditarod which I’m really looking forward to! It’s something I’ve wanted to get to the last couple of years, and figure I have the time + money now so may as well. There’s a few other friends going to be around for it too, so should be fun. Just making it a long weekend, flying out of Bethel on the Thursday evening after work and coming back on Sunday, but will be a nice little Spring break.

VMware course completed

Friday, January 16th, 2009

VMware training ended really well. It was nice the last day or two to set up VMotion / DRS and try out HA, as with our single-ESX-host test environment that just hasn’t been possible. I also found there’s a Vue testing center in Bethel, so I hope to sit the VCP (VMware Certified Professional) exam shortly too.

I’m now hanging out at the Anchorage airport waiting for my flight back to Bethel. The flights have been crazy the last few days with the weather with a bunch of them canceled, so hope I’m good to go. There’s a fantastic sunset building here with the breaking cloud cover, but Bethel is still pretty crappy.

The K300 races have been delayed until tomorrow due to the weather. With the wet conditions and soft snow, hopefully it will cool down and start to freeze up again by tomorrow. Will mean I have the Akiak Dash at 2p.m, K300 at 3p.m and then the Bogus Creek 150 at 4p.m to photograph :) Hopefully it’s not raining given switching between camera gear! Happy in a way I get to see the start, but with the bad weather, it won’t be much fun for the dogs or the mushers, and it means I’ll likely miss the K300 winners coming in as it won’t be until Monday afternoon probably. Will see how things pan out though. Need to make it back to Bethel first.