Entries Tagged 'Events'

CES 2010

Well, a week later and I have finally recovered from the whirlwind trip to CES 2010. This was a great experience, and one that exposed me to the insane energy in the consumer electronics space. The clear winner was 3D TVs, but they are still a gimmick in my opinion. The glasses that you need to wear to get the real experience make this a non-starter for many people, not to mention that the masses don’t want to upgrade their TV yet again. The magicJack GSM femtocell was a quiet winner, but I suspect they are going to run into all kinds of FCC issues with this device. CES 2010 had a record 120,000 attendees this year, which you noticed when trying to make your way through the amazing LG booth. CES is something every geek has to do at least once, but I am not sure I would do it more than once every few years.

A Different Kind of QA

So yesterday we were racking our brains trying to figure out where a 300% increase in requests per second was coming from on an app that was only seeing a 30% increase in page views. We started with “why is the DB so slow,” following our rules, but soon realized something else was going on. One of our engineers, while using Fiddler, noticed an error in the Flash that, on mouseover, made a call to / (the root of the app) for no reason. The way the app was laid out, this would account for a huge number of unnecessary requests, somewhere in the neighborhood of 3,000/sec at peak.
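The mismatch between the two numbers is the tell here, and it is easy to quantify. A rough sketch in Python, assuming "300% increase" means four times the old request rate and "30% increase" means 1.3 times the old page views (the function name is made up for illustration):

```python
def requests_per_view_growth(request_increase_pct, pageview_increase_pct):
    """Return how much the requests-per-page-view ratio grew.

    Assumes an "N% increase" means the new value is (1 + N/100) times the old one.
    """
    new_requests = 1 + request_increase_pct / 100
    new_views = 1 + pageview_increase_pct / 100
    return new_requests / new_views


growth = requests_per_view_growth(300, 30)
print(f"Each page view now generates about {growth:.2f}x the requests it used to")
```

Anything much above 1.0x means the extra traffic is not coming from extra visitors, which points at the pages themselves rather than the DB.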

This got me thinking: what kind of QA would find this? Is it peer review, classic code review including the design portion, or should this be part of our role? We run our shop much like a startup, as it is primarily event driven, so we don’t have the classic development cycles clearly defined. What this did show me is that designers are designers and developers are developers. While many can do both, sometimes it really is best to separate the functions.

In our org I believe we should have a technical QA team that works with the operations team, ripping apart the final product from an engineering and technical production standpoint. I think this would provide the best level of accountability between the two teams and formalize the release without sacrificing the startup feel. Of course we would need to officially work this into the timeline, but it would leave the core teams focused on building the best possible products.

Importance of DB Trending

How can you know when something is about to go wrong if you can’t see it? We finally closed the loop today on some MSSQL trending we have been missing for a very long time. Being able to watch things like table scans/sec, batch reads and writes/sec, and transactions/sec is huge during an event. As much as we drill into folks’ heads the importance of communicating changes, it is still too easy for a simple change to have an unexpected impact on something like a DB. As I noted the other day, it is almost always the DB or the file system. While we have our share of issues that are neither, we have chased our tail many times due to lack of trending on the DB, and in the end it has been something stupid like an index that got dropped.
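For anyone wondering what the trending boils down to, here is a minimal sketch in Python. It assumes the counters are sampled as ever-growing totals and turned into per-second rates, and that a "something changed" check is just the latest rate compared against the recent average. The function names, the 3x threshold, and the sample numbers are all made up for illustration; this is not our actual setup.

```python
from statistics import mean


def per_second_rate(prev_value, prev_time, curr_value, curr_time):
    """Turn two samples of a cumulative counter into a per-second rate.

    Times are in seconds. Returns None if the counter went backwards,
    which usually means the service restarted between samples.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        return None
    return delta / elapsed


def is_spike(recent_rates, latest_rate, factor=3.0):
    """Flag the latest rate if it exceeds `factor` times the recent average."""
    if not recent_rates:
        return False
    return latest_rate > factor * mean(recent_rates)


# Hypothetical table scans/sec history, then a jump after an index gets dropped.
history = [10.0, 12.0, 11.0]
latest = per_second_rate(prev_value=1000, prev_time=0, curr_value=4000, curr_time=60)
print(latest, is_spike(history, latest))
```

The point is less the math and more that a dropped index shows up as a step change in table scans/sec long before anyone files a ticket.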

“If it isn’t your DB it’s your File System”

That is a loose quote from a SXSW panel on scalable web ventures. This just hits way too close to home today not to write about. In our case it was both the DB and the file system. We were doing some last-minute load testing (you can never be too sure), and after almost the entire day, and after roping in two other teams to dig into the DB and SAN, we realized the DB had not been set up right. All files associated with the specific database were on the same storage LUN. This cost us 95% of our service’s throughput. Splitting up the t-logs, data files, etc. onto different LUNs got us back to where we expected. The bottom line is we wouldn’t even have noticed this without trending everything possible in Cacti on our hosts, and while Win2k3 disk trending has some hurdles for SAN-attached disk, it still pointed us in the right direction.
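A check like this is cheap to script once you have an inventory of which file lives on which LUN. A rough Python sketch, assuming you can export that inventory as (file name, kind, LUN) tuples; the file names, kinds, and LUN labels below are invented for the example:

```python
from collections import defaultdict


def luns_with_mixed_files(files):
    """Return LUNs that hold more than one kind of database file.

    `files` is a list of (file_name, kind, lun) tuples, where kind is
    something like "data" or "log".
    """
    kinds_by_lun = defaultdict(set)
    for _name, kind, lun in files:
        kinds_by_lun[lun].add(kind)
    return {lun: sorted(kinds) for lun, kinds in kinds_by_lun.items() if len(kinds) > 1}


# Hypothetical layout: everything on one LUN, then split out.
before = [("app.mdf", "data", "LUN0"), ("app_log.ldf", "log", "LUN0")]
after = [("app.mdf", "data", "LUN0"), ("app_log.ldf", "log", "LUN1")]

print(luns_with_mixed_files(before))
print(luns_with_mixed_files(after))
```

An empty result means data files and t-logs are no longer fighting over the same LUN.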

Terminal Velocity

Have we mentioned we work Sundays?

Well, we don’t work every Sunday, but we work whatever it takes when it comes to an event. And it just so happens it’s event time again.

Like Joe blogged, the week has been full of nearly full-time vendor management. In addition to that full-time job, we’ve also been busy coordinating with our content production teams to make sure that everything is in order. And we’ve been busy making sure all of our devices, services, monitoring, and instrumentation are buttoned up.

But now we’ve reached Terminal Velocity. There’s no stopping now. All the wheels are in motion. Between now and the end of this week, we will find out just how much our hard work and dedication will pay off. Will the site stay up? We’re pretty sure it will. Will the load balancer melt down? We don’t think so. And will we provide a killer experience for all of our users?