<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.3.2" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>

<channel>
	<title>ArmChair Operations</title>
	<link>http://www.magicbucketsofgoodness.com</link>
	<description>Web Operations, Capacity Planning, and Distributed Systems now with more "Magic Buckets of Goodness™"</description>
	<pubDate>Thu, 27 Mar 2008 14:22:12 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.2</generator>
	<language>en</language>
			<item>
		<title>Amazon flood gates open.</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/27/amazon-flood-gates-open/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/27/amazon-flood-gates-open/#comments</comments>
		<pubDate>Thu, 27 Mar 2008 14:19:52 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[The Industry]]></category>

		<category><![CDATA[amazon]]></category>

		<category><![CDATA[ec2]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/27/amazon-flood-gates-open/</guid>
		<description><![CDATA[Two huge new features were announced today for EC2. The first is Elastic IPs, which is basically the static IP solution everyone has been waiting for, but better! Elastic IP is a 1:1 NAT solution. What is so cool about this is that you can dynamically remap your static IP to different running instances, creating a [...]]]></description>
			<content:encoded><![CDATA[<p><img align="left" src='http://www.magicbucketsofgoodness.com/wp-content/uploads/2008/03/aws_logo.gif' alt='AWS Logo' />Two huge new features were announced today for EC2. The first is Elastic IPs, which is basically the static IP solution everyone has been waiting for, but better! Elastic IP is a 1:1 NAT solution. What is so cool about this is that you can dynamically remap your static IP to different running instances, creating a poor man&#8217;s HA solution. The second feature is Availability Zones. This allows you to launch instances in isolated zones that Amazon describes as &#8220;distinct locations engineered to be insulated from failures in other zones.&#8221; The next step is to allow region-specific selection as well; currently you are limited to selecting a zone within the region defined for your account. This provides a huge increase in availability and will certainly make organizations take another hard look at what Amazon has to offer to extend or augment their existing facilities.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/27/amazon-flood-gates-open/feed/</wfw:commentRss>
		</item>
		<item>
		<title>A Different Kind of QA</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/18/a-different-kind-of-qa/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/18/a-different-kind-of-qa/#comments</comments>
		<pubDate>Wed, 19 Mar 2008 03:07:54 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[Events]]></category>

		<category><![CDATA[applications]]></category>

		<category><![CDATA[db]]></category>

		<category><![CDATA[qa]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/18/a-different-kind-of-qa/</guid>
		<description><![CDATA[So yesterday we were racking our brains trying to figure out where a 300% increase in requests per second was coming from on an app seeing only a 30% increase in page views. We started with &#8220;why is the DB so slow&#8221;, following our rules, but soon realized something else was going on. One of our engineers, [...]]]></description>
			<content:encoded><![CDATA[<p>So yesterday we were racking our brains trying to figure out where a 300% increase in requests per second was coming from on an app seeing only a 30% increase in page views. We started with &#8220;why is the DB so slow&#8221;, following our rules, but soon realized something else was going on. One of our engineers, while using Fiddler, noticed an error in the Flash that, on mouse over, made a call to /, the root of the app, for no reason. The way the app was laid out, this would account for a huge number of unnecessary requests, somewhere in the neighborhood of 3000/sec at peak.</p>
<p>This got me thinking about what kind of QA would find this. Is it peer review, classic code review including the design portion, or should this be part of our role? We run our shop much like a startup, as it is primarily event driven, so we don&#8217;t have the classic development cycles clearly defined. What this did show me is that designers are designers and developers are developers; while many can do both, sometimes it really is best to separate the functions.</p>
<p>In our org I believe we should have a technical QA team that works with the operations team, ripping apart and through the final product from an engineering and technical production standpoint. I think this would provide the best level of accountability on the two teams and formalize the release without sacrificing the startup feel. Of course we would need to officially work this into the timeline, but it would leave the core teams focused on building the best possible products.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/18/a-different-kind-of-qa/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Importance of DB Trending</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/14/importance-of-db-trending/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/14/importance-of-db-trending/#comments</comments>
		<pubDate>Fri, 14 Mar 2008 16:18:11 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[Events]]></category>

		<category><![CDATA[db]]></category>

		<category><![CDATA[MSSQL]]></category>

		<category><![CDATA[TRENDING]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/14/importance-of-db-trending/</guid>
		<description><![CDATA[How can you know when something is about to go wrong if you can&#8217;t see it? We finally closed the loop today on some MSSQL trending we have been missing for a very long time. Being able to watch things like table scans/sec, batch reads &#038; writes/sec, and transactions/sec is huge during an event. [...]]]></description>
			<content:encoded><![CDATA[<p><img align="left" src="http://farm3.static.flickr.com/2065/2221727575_506aed27e7_m.jpg" alt="" border="0" />How can you know when something is about to go wrong if you can&#8217;t see it? We finally closed the loop today on some MSSQL trending we have been missing for a very long time. Being able to watch things like table scans/sec, batch reads &#038; writes/sec, and transactions/sec is huge during an event. As much as we drill into folks&#8217; heads the importance of communicating changes, it is still too easy for a simple change to have an unexpected impact on something like a DB. As I noted the other day, it is almost always the DB or the file system. While we have our share of issues that are neither, we have chased our tail many times due to a lack of trending on the DB, and in the end it has been something stupid like a dropped index.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/14/importance-of-db-trending/feed/</wfw:commentRss>
		</item>
		<item>
		<title>&#8220;If it isn&#8217;t your DB it&#8217;s your File System&#8221;</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/11/if-it-isnt-your-db-its-your-file-system/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/11/if-it-isnt-your-db-its-your-file-system/#comments</comments>
		<pubDate>Wed, 12 Mar 2008 02:22:07 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[Events]]></category>

		<category><![CDATA[cacti]]></category>

		<category><![CDATA[database]]></category>

		<category><![CDATA[problems]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/11/if-it-isnt-your-db-its-your-file-system/</guid>
		<description><![CDATA[That is a loose quote from a SXSW panel on scalable web ventures. This just hits way too close to home today not to write about. In our case it was both the DB and the File System. We were doing some last-minute load testing, you can never be too sure, and after almost [...]]]></description>
			<content:encoded><![CDATA[<p><img align="right" src="http://farm1.static.flickr.com/36/75593157_fb0c0c1e4c_m.jpg" alt="" border="0" />That is a loose quote from a SXSW panel on scalable web ventures. This just hits way too close to home today not to write about. In our case it was both the DB and the File System. We were doing some last-minute load testing, you can never be too sure, and after almost the entire day, and roping in two other teams to dig into the DB and SAN, we realized the DB had not been set up right. All files associated with the specific database were on the same storage LUN. This caused a 95% reduction in the throughput of our service. Splitting up the t-logs, data files, etc. onto different LUNs got us back to where we expected. The bottom line is we wouldn&#8217;t even have noticed this without trending everything possible in Cacti on our hosts, and while Win2k3 disk trending has some hurdles for SAN-attached disks, it still pointed us in the right direction.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/11/if-it-isnt-your-db-its-your-file-system/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Seriously Facebook?</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/11/seriously-facebook/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/11/seriously-facebook/#comments</comments>
		<pubDate>Tue, 11 Mar 2008 14:43:56 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[What?]]></category>

		<category><![CDATA[facebook]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/11/seriously-facebook/</guid>
		<description><![CDATA[In between avoiding any real specifics in an interview with GigaOM&#8217;s Stacey Higginbotham, Zuckerberg alludes to Facebook running on &#8220;tens of thousands&#8221; and approaching &#8220;hundreds of thousands&#8221; of hosts. According to comScore, Facebook supports up to 65 billion, yes that is billion with a B, page views a month, or more than 2 billion a [...]]]></description>
			<content:encoded><![CDATA[<p><img align="left" src="http://farm3.static.flickr.com/2408/2255637540_1ca48e4f10_m.jpg" alt="" border="0" />In between avoiding any real specifics in an <a href="http://gigaom.com/2008/03/10/zuckerberg-gets-geeky-but-keeps-ad-details-close/">interview</a> with GigaOM&#8217;s Stacey Higginbotham, Zuckerberg alludes to Facebook running on &#8220;tens of thousands&#8221; and approaching &#8220;hundreds of thousands&#8221; of hosts. According to comScore, Facebook supports up to 65 billion, yes that is billion with a B, page views a month, or more than 2 billion a day. Relatively speaking, that is a huge number of hosts to support the site. He does talk about how they use memcached extensively, but it sounds to me like some general re-architecture is in order. Or is this a VMware salesman&#8217;s dream?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/11/seriously-facebook/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Digg Cashing in finally?</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/07/digg-cashing-in-finally/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/07/digg-cashing-in-finally/#comments</comments>
		<pubDate>Fri, 07 Mar 2008 20:48:12 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Links]]></category>

		<category><![CDATA[What?]]></category>

		<category><![CDATA[digg buyout]]></category>

		<category><![CDATA[rumors]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/07/digg-cashing-in-finally/</guid>
		<description><![CDATA[Well, I woke up to another Digg sellout rumor, so it must be the first Friday of the month. What is interesting this time is the players couldn&#8217;t get much bigger: Google and Microsoft, with TechCrunch breaking the news. While this has to be the geekiest picture ever on the cover of a [...]]]></description>
			<content:encoded><![CDATA[<p><img align="left" src='http://www.magicbucketsofgoodness.com/wp-content/uploads/2008/03/krose.thumbnail.gif' alt='krose.gif' />Well, I woke up to another Digg sellout rumor, so it must be the first Friday of the month. What is interesting this time is the players couldn&#8217;t get much bigger: Google and Microsoft, with <a href="http://www.techcrunch.com/2008/03/07/google-microsoft-bidding-for-digg/">TechCrunch</a> breaking the news. While this has to be the geekiest picture ever on the cover of a magazine, it is damn good marketing. Kevin Rose took a suggestion he made to the Slashdot founder over lunch and turned it into the largest social news site, one that so far can&#8217;t be outdone. The Digg stack is completely LAMP, <a href="http://highscalability.com/digg-architecture">High Scalability has the details</a>, and was done on a shoestring budget, with the exception, of course, of support from Equinix, where Digg CEO and co-founder Adelson was founder and CTO. Good idea, reasonable execution, and a solid Ops team, which started with one guy.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/07/digg-cashing-in-finally/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Operations Rules</title>
		<link>http://www.magicbucketsofgoodness.com/2008/03/02/operations-rules/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/03/02/operations-rules/#comments</comments>
		<pubDate>Sun, 02 Mar 2008 21:06:06 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[Links]]></category>

		<category><![CDATA[capacity planning]]></category>

		<category><![CDATA[lists]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/03/02/operations-rules/</guid>
		<description><![CDATA[Jon Prall has a great list here of 85 Operations rules to live by. Here are my top 5 from the list.
1. Know your bottlenecks and know how to spot them - every layer - know if you are blocking on disk, RAM, or CPU.  It is usually that simple.
2. The value of a project [...]]]></description>
			<content:encoded><![CDATA[<p>Jon Prall has a great list <a href="http://jprall.vox.com/library/post/85-operations-rules-to-live-by.html">here</a> of 85 Operations rules to live by. Here are my top 5 from the list.</p>
<p><a href="http://www.flickr.com/photos/23737784@N00/1328284686/" title="" target="_blank"><img class="left" src="http://farm2.static.flickr.com/1167/1328284686_4f452b3ac9_m.jpg" alt="" border="0" /></a>1. Know your bottlenecks and know how to spot them - every layer - know if you are blocking on disk, RAM, or CPU.  It is usually that simple.</p>
<p>2. The value of a project manager, tech writer, and financial analyst in the ops organization should not be underestimated.  They will more than pay for themselves.</p>
<p>3. Monitor EVERYTHING - alert on actionable only, record other for trend information.</p>
<p>4. Assign people to be point people for every bit of technology.</p>
<p>5. Do it right the first time.  Rarely do you get the chance to go back and redo things.  If you do, it comes at a very big cost to the company.  Take the hit on work, the first go round.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/03/02/operations-rules/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Don&#8217;t reinvent the wheel twice</title>
		<link>http://www.magicbucketsofgoodness.com/2008/02/20/dont-reinvent-the-wheel-twice/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/02/20/dont-reinvent-the-wheel-twice/#comments</comments>
		<pubDate>Thu, 21 Feb 2008 01:48:05 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Software]]></category>

		<category><![CDATA[apache]]></category>

		<category><![CDATA[open-source]]></category>

		<category><![CDATA[yahoo]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/2008/02/20/dont-reinvent-the-wheel-twice/</guid>
		<description><![CDATA[
What did the team at Yahoo do when they realized they had an old proprietary back-end application running their core search business? They moved to an open source MapReduce implementation from the Apache Software Foundation named Hadoop. This is a huge move and shows that sometimes even a core piece of your business [...]]]></description>
			<content:encoded><![CDATA[<p><img align="left" src='http://www.magicbucketsofgoodness.com/wp-content/uploads/2008/02/hadoop-logo.jpg' alt='Hadoop' /><br />
What did the team at Yahoo do when they realized they had an old proprietary back-end application running their core search business? They moved to an open source MapReduce implementation from the Apache Software Foundation named Hadoop. This is a huge move and shows that sometimes even a core piece of your business is better accomplished with software that someone else has built. In Yahoo&#8217;s case they were able to accomplish the same tasks 34% faster on their new 10,000-core Linux cluster using <a href="http://hadoop.apache.org/">Hadoop</a>.</p>
<div align=center><code><object classid='clsid:d27cdb6e-ae6d-11cf-96b8-444553540000' codebase='http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=8,0,0,0' width='320' height='270' id='yfop'>
<param name='movie' value='http://d.yimg.com/cosmos.bcst.yahoo.com/up/fop/embedflv/swf/fop.swf' />
<param name='flashvars' value='id=6418984' /><embed	src='http://d.yimg.com/cosmos.bcst.yahoo.com/up/fop/embedflv/swf/fop.swf' width='320' height='270' name='yfop' type='application/x-shockwave-flash' pluginspage='http://www.macromedia.com/go/getflashplayer' flashvars='id=6418984'></embed></object></code></div>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/02/20/dont-reinvent-the-wheel-twice/feed/</wfw:commentRss>
		</item>
		<item>
		<title>What we can all learn from the Amazon outage.</title>
		<link>http://www.magicbucketsofgoodness.com/2008/02/18/what-we-can-all-learn-from-the-amazon-outage/</link>
		<comments>http://www.magicbucketsofgoodness.com/2008/02/18/what-we-can-all-learn-from-the-amazon-outage/#comments</comments>
		<pubDate>Tue, 19 Feb 2008 03:54:33 +0000</pubDate>
		<dc:creator>Joe</dc:creator>
		
		<category><![CDATA[Capacity]]></category>

		<category><![CDATA[amazon]]></category>

		<category><![CDATA[capacity planning]]></category>

		<category><![CDATA[monitoring]]></category>

		<category><![CDATA[s3]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/?p=23</guid>
		<description><![CDATA[I didn&#8217;t write about the Amazon storage service outage here before now, but I have been thinking a lot about what we all can learn from it. First, a few details: the Amazon S3 storage service had issues from 3:30am PT to 6:48am PT on 2/15. The issue manifested itself in a &#8220;large&#8221; increase in [...]]]></description>
			<content:encoded><![CDATA[<p>I didn&#8217;t write about the Amazon storage service outage here before now, but I have been thinking a lot about what we all can learn from it. First, a few details: the Amazon S3 storage service had issues from 3:30am PT to 6:48am PT on 2/15. The issue manifested itself in a &#8220;large&#8221; increase in authenticated calls to the S3 service. The real problem is the team didn&#8217;t know this was coming until it was too late. To resolve the problem, the Amazon team moved additional capacity in to handle this increase in authenticated requests.</p>
<p>I can certainly feel for the Amazon team; being caught off guard is NOT a good feeling. So what monitoring is missing from your environment? This should be an opportunity for all of us to think about the little service that everything relies on and that could cripple the environment. Monitoring, trending, and basic capacity planning are critical to the health of all our applications. We have been working much more closely with our engineering teams than ever before to instrument all parts of the applications supporting our sites via <a href="http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/">JMX</a>. Call it what you want, and I don&#8217;t like the word, but it feels like a good time for a basic monitoring audit.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2008/02/18/what-we-can-all-learn-from-the-amazon-outage/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Radical Thinking: Reply-To-All Sucks!</title>
		<link>http://www.magicbucketsofgoodness.com/2007/12/12/new-radical-thinking-methodology-reply-to-all-sucks/</link>
		<comments>http://www.magicbucketsofgoodness.com/2007/12/12/new-radical-thinking-methodology-reply-to-all-sucks/#comments</comments>
		<pubDate>Wed, 12 Dec 2007 17:00:03 +0000</pubDate>
		<dc:creator>Mistamista</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.magicbucketsofgoodness.com/?p=21</guid>
		<description><![CDATA[You heard it here first.
]]></description>
			<content:encoded><![CDATA[<p>You heard it here first.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.magicbucketsofgoodness.com/2007/12/12/new-radical-thinking-methodology-reply-to-all-sucks/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
