May 31, 2007

Beautiful Moments in Information Architecture

Two things about this short video of a remarkable TED presentation caught my attention:

  1. There is a moment about halfway through the presentation when a novel bit of information architecture dawns on the audience and they burst into spontaneous applause. This is a beautiful moment: it is both intellectual and emotional, and it springs from something fundamental that the audience discovers collectively.
  2. The future of information access and organization is going to be spectacular!

Newspaper Economics - Online vs. Print Revenues III

Instead of writing blog entries on this topic lately, I have been drafting an article summarizing the story to date. You can download an early draft of Online Newspaper Revenues: Albatross, Lifeboat or New World?  Please send your comments and criticisms.  Just leave a comment or email me directly at scott[at]drskippy.net.

The short summary is that newspaper online revenues are meaningless for smaller newspapers (~40,000 circulation) and are unlikely to save the day for larger newspapers. 

Official Bolder Boulder Results In

The official BB results are up two days after the race.  Yesterday, the BB Web team implemented most of my suggestions (not necessarily because I suggested them; they are pretty obvious, I think).  A bit after noon yesterday, they had the results up.  I downloaded my official time and splits to compare.

Mile 1:    00:07:24
Mile 2:    00:07:38
Mile 3:    00:07:59
Mile 4:    00:07:55
Mile 5:    00:07:31
Mile 6:    00:08:01
Total Time:    00:48:07

I knew before the race that I wanted to compare the official times to my own timing, so I was fairly rigorous in my lap-button pushing at each 1-mile station, yet my official time is fully 8 seconds faster than my watch time. A mystery...
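
To see where the 8 seconds went, the official splits can be set against the watch splits from my May 29 entry. The sketch below is only arithmetic on the numbers in these two posts. The official results do not list a finish segment, so it is inferred here as the total time minus the six mile splits, and the helper name is made up for illustration.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Official splits for miles 1-6 and the official total, from this post.
official = [to_seconds(t) for t in ["7:24", "7:38", "7:59", "7:55", "7:31", "8:01"]]
official_total = to_seconds("48:07")

# Watch splits for miles 1-6 plus the finish segment, from the May 29 post.
watch = [to_seconds(t) for t in ["7:27", "7:41", "7:57", "7:52", "7:35", "7:59", "1:44"]]

# Inferred (not reported): official time from the mile-6 line to the finish.
official.append(official_total - sum(official))

# Negative means the official segment was faster than the watch segment.
diffs = [o - w for o, w in zip(official, watch)]
for label, d in zip(["1", "2", "3", "4", "5", "6", "Finish"], diffs):
    print(f"{label:>6}: {d:+d} s")
print(f" Total: {sum(diffs):+d} s")
```

On these numbers the differences for miles 2 through 6 cancel out, and the whole gap sits in the first mile and the final stretch. One possible reading is that it reflects when I pressed start and stop on the watch rather than anything in between.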

May 29, 2007

Bolder Boulder Time

My "official" (i.e. Timex) time from the 29th annual Bolder Boulder...  It will be interesting to see how these compare to the actual official times produced by the new shoe-tag timing system.

MILE      TIME     SPLIT
1         7:27     7:27
2        15:07     7:41
3        23:05     7:57
4        30:57     7:52
5        38:32     7:35
6        46:30     7:59
Finish   48:15     1:44

This is about 30s off of last year's time -- on schedule to beat my age in 2012.

Bolder Boulder: Electronic timing meltdown

Fifty thousand of my friends and I ran the 29th Bolder Boulder yesterday.  At the starting line and at every mile along the way, there were two blue lines on the ground, each about a foot wide, running across the course.  And there was an incessant beeping (more like a continuous tone, since I was in a fairly dense crowd) every time I crossed one of these lines.  This was the "detection" end of a new system in which we all wore little plastic tags, zip-tied to our shoes, in the hopes of getting precise timing for every section of the course.

It is Tuesday morning and the results don't seem to be available anywhere yet.  In fact, the Bolder-Boulder Web site is only occasionally accessible.  This reminded me of some customer service guidelines that can help relieve the load when something like this happens (as it always does):

  1. Transparency.  Give your users up-to-the-minute information about what is going on. If you have 50,000 people hitting your web site multiple times in a matter of hours, you start to have additional load problems that you didn't have before the failure.  This slows the site, increases frustration and decreases satisfaction.  Instead, tell people what is going on, preferably with a temporary blog where additional traffic isn't going to multiply the problems on your main site.  Put prominent links to this timely information on a temporary, lightweight (minimal server load) homepage so your users get the information they need to make a good decision about using the web site. Consider releasing the story in such a way that a Google search will communicate the information without users hitting your site.
  2. Give an estimated date or time for the race results to be available.  Keep the projected date/time updated. Plan on a huge traffic surge at this time and get the resources together to accommodate it in the meantime.
  3. Syndicate.  Publish the content to some trusted partners to take the load off your site. Users want the information; getting it from your site is secondary.  You may have a business model around the traffic, but poor customer satisfaction may make any short-term losses due to lost traffic irrelevant.
  4. Run parallel systems.  Until you have verified that your new system can handle the load, run the old system in parallel so you have a fallback.  "If you can't run it on paper, you can't automate it" is a good rule for all business process modeling.

All of these suggestions boil down to three aspects of responsibly made commitments:
  • Clearly set expectations as to the what, when, where and how.
  • Avoid denial! During a crisis, commit only to things that your organization can do. (You can dream big on your own dime, not while you have customers lined up waiting for your basic services to be delivered.)
  • Competently meet or exceed expectations.

Making responsible commitments doesn't stop after the first glitch. In fact, that is just the time when excellent organizations get diligent about it.

May 23, 2007

Inflaming Bread?

While thinking about running 9 miles last weekend on the buttes in New Mexico I realized two things:

  1. I like covering the territory at running speed. It is better than faster (e.g. on bike or by car) because you get to see more detail. But, running is faster than walking so you get to move on to see the next thing sooner.
  2. And it is chemically addicting.  I feel good.

So, I was primed to "hear" Art's take on foods we crave:
If there is anything "you can't live without" you may actually be allergic to that food. An allergic response to a food will release adrenaline so that you feel a hit that is like a reward when you eat it. Bread is particularly allergenic if you are moderately celiac, yeast intolerant, or inflamed due to insulin resistance or obesity. These are all inflammatory processes, even obesity (which may be the worst), and are, therefore, allergenic. You are setting off your immune system and may suffer collateral damage as it begins to attack your own tissues in addition to the allergens. (link)

Maybe this doesn't quite cover the whole field of food cravings, but based on what I have felt so far, it will be interesting to start paying closer attention to my own cravings.

May 22, 2007

Chatfield Reservoir 7.5 mi Trail Run

Photos came back recently from the 21 April 2007 7.5 mile trail run put on by the guys at Run Uphill Racing.  The course was a lot of fun, with 2 water crossings (I had just gotten out of the first when the race photographer caught this picture), some fast sections and single track in the trees. I missed my goal of 1 hr by a couple of minutes, coming in 33rd.

1st Water Crossing 

May 13, 2007

Run the Rockies - 1

In an effort to stay off the asphalt as much as possible this summer, I picked up Steven Bragg's Run the Rockies a few weeks ago to find some new places on the Front Range to run. I ran the first of his 50 runs yesterday. It was a 6.9 mi loop that Bragg rated "moderate", and it took about 68 minutes. I had to walk a hundred yards or so of the steep part of the trail on the west side, but the 1.5 mi descent to the trail head is very fast. Go in cooler weather, as there is not much shade.

May 10, 2007

LiniStepper TestDrive Board Tested

Working toward a CNC machine, I decided to use the LiniStepper as my stepper motor driver.  It is inexpensive, has a good feature set, is open source, and is based on the Microchip PIC16F628A.  I have built two LiniSteppers to date (need 1, maybe 2 more).  To test the stepper drivers, 4 control lines have to be set and a step clock provided. A simple 555-based oscillator with some DIP switches will do. (You can buy a complete kit from James Newton for $19. That's what I did, but I had a small soldering incident that destroyed the board, hence this project.)

Here's the LiniStepper driving a small stepper motor at 12V.

 

Stepper motor test

 

 

And here's a close up of the Stepper Tester.

LiniStepper tester circuit (LiniTester) 

 If you want to build one of your own, here's the schematic and circuit board (600dpi) artwork.  I used a 470 ohm resistor for the LED, 10K ohm resistors for the pull-ups, two 500K pots for frequency and duty cycle adjustment and a 0.1 microfarad capacitor for the timing cap.  The 7-pin header matches the layout of the LiniStepper.  The upper 2-pin header in the photo is the power input to the regulator circuit; the lower 2-pin header is for an additional 5V supply (here I am powering the circuit directly from my bench supply).
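
For a rough idea of the step rates these parts can produce, the textbook 555 astable formulas can be evaluated in a few lines. This is only a sketch: it assumes the classic two-resistor astable wiring (Ra from the supply to the discharge pin, Rb from the discharge pin to the timing capacitor), which may not match my schematic exactly, and the function name and the example pot settings are purely illustrative.

```python
import math

def astable_555(ra_ohms, rb_ohms, c_farads):
    """Return (frequency in Hz, duty cycle) for a classic 555 astable.

    Assumes the standard wiring: the capacitor charges through Ra + Rb
    and discharges through Rb alone.
    """
    t_high = math.log(2) * (ra_ohms + rb_ohms) * c_farads
    t_low = math.log(2) * rb_ohms * c_farads
    period = t_high + t_low
    return 1.0 / period, t_high / period

# Hypothetical pot settings, all with the 0.1 microfarad timing cap.
for ra, rb in [(10e3, 10e3), (100e3, 100e3), (500e3, 500e3)]:
    freq, duty = astable_555(ra, rb, 0.1e-6)
    print(f"Ra={ra:>8.0f}  Rb={rb:>8.0f}  f={freq:8.1f} Hz  duty={duty:.2f}")
```

In this wiring the duty cycle can never drop below 50%, which is one reason testers with a separate duty cycle adjustment often use a modified circuit.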

 

May 09, 2007

Newspaper Economics - Online vs. Print Revenues II

We hear a lot about how newspaper circulation continues to decline a few percent per year. Now that many newspapers have very active Web sites, how are online and offline audiences correlated? Based on current trends (decreasing circulation, increasing Web traffic), what can we say about the future of the newspaper business?

Some data will help build intuition and point us toward a model.

The figure below shows Newspaper Daily Circulation (from Info Please) vs. Unique Web Site Visitors (stats from Nielsen) for about 3/4 of the Top 100 newspapers in the US.  The circulation numbers are daily circulation, while the traffic is given in unique Web site visitors per month. Download the combined data as a tab-delimited text file.




This correlation gives a rule of thumb for comparing the relative success of newspapers in attracting online readers (see the formula in the plot). As circulation decreases, will online readership adequately compensate?

The next step is to compare revenue generation per circulation and unique web visitor. Stay tuned...

May 08, 2007

Is the distribution of the digits 0-9 uniform?

Another way of stating this question is "Are all digits equally likely?"

It turns out, no.  For large sets of numbers resulting from measurements of nearly anything, the lower digits are more common.  In fact, they tend to follow a power law (see below).

But saying so doesn't make it so. How about some examples?

To get some quick results, I wrote a Python script to count digits.  The core counting routine is shown below (download .py, PyX required for making plots).

...

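import re

# options.filename is set by the option parsing elided above
with open(options.filename, 'r') as inf:
    buf = inf.readlines()

# match any single decimal digit
nre = re.compile('[0-9]')

# one counter per digit 0-9
hist = {0:0,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0}

for line in buf:
    # count every digit character on the line
    nlist = nre.findall(line)
    for n in nlist:
        hist[int(n)] += 1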

... 
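
For readers who want to try this without the full script, here is a self-contained sketch of the same counting idea. It is not the downloadable script: the function names and the sample text are made up for illustration, and it uses `collections.Counter` in place of the hand-built dictionary.

```python
import re
from collections import Counter

DIGIT = re.compile(r"[0-9]")

def digit_histogram(lines):
    """Count every digit character 0-9 found in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(int(d) for d in DIGIT.findall(line))
    # Report all ten digits, including any that never occurred.
    return {d: counts[d] for d in range(10)}

def as_fractions(hist):
    """Convert raw counts to fractions of all digits seen."""
    total = sum(hist.values())
    return {d: (n / total if total else 0.0) for d, n in hist.items()}

# Made-up sample text, for illustration only.
sample = ["Revenue 1,204  Units 118", "Revenue 1,377  Units 102"]
hist = digit_histogram(sample)
for digit, frac in as_fractions(hist).items():
    print(digit, hist[digit], f"{frac:.2f}")
```

Like the routine above, this counts every digit on a line rather than only the leading digit of each number.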

Next, find some data.  I started close to home by looking at a monthly report of online performance data and financial performance data for my employer.  For this data, the histogram of one month's data looks like Figure 1 below.

 

April Performance Report Histogram
 

 

Figure 1. Distribution of digits 0-9 in monthly performance data for AdPay, Inc.

 

For a quick comparison, let's find some data on the Web for rainfall and population statistics.

Rainfall Histogram

Figure 2.  Distribution of digits 0-9 in rainfall data. Is the digit '3' unusually common in rainfall data?

 

Population Histogram

Figure 3.  Distribution of digits 0-9 in population data from a combination of several countries.

 

For more information:


May 07, 2007

Newspaper Economics - Online vs. Print Revenues

I have blogged previously about the economic interplay of the news/story content creation, advertising (classified and display) and circulation (printing and distribution) silos of the newspaper business.  A Wall Street Journal article has some interesting data that seems consistent with what I have seen.  The difference between a newspaper's ability to generate revenues online and off goes a long way toward explaining newspapers' efforts to keep a kung fu grip on traditional revenue streams.  This data often looks strange to people with more experience online than off (for example, see some of the comments at Freakonomics).  The geographic monopoly and the passage of 100+ years of doing business have given this business time and space to learn to generate cash.  And the age of the business and its slowness to move to "cool" Web technologies have kept many of us from looking closely at the newspaper business.  The magnitude of the advertising revenues, the margins on advertising, the overall profitability of newspapers, average salaries of employees, etc. surprise many of us in the online world.

May 02, 2007

Craigslist Listing Growth Rates

A few days ago, I posted Craigslist listing data for selected cities. To take the analysis a little further, today I looked at the data that best fit exponential growth and calculated doubling periods for each site (doubling periods are easier to think about than exponential growth rates).

Here are the doubling periods in days and the DMA sizes associated with each Craigslist site. DMA numbers are the number of households in the area.

XXX.craigslist.org    DMA Households    Doubling Time (Days)
tallahassee                  260,194    247
minneapolis                1,739,407    320
louisville                   614,940    236
boise                        239,376    111
denver                     1,550,960    322
miami                      1,515,347    356
lasvegas                     635,356    318
memphis                      668,804    236
westpalmbeach                707,934    188
indianapolis               1,029,361    231
fortmyers                    440,261    169
orlando                    1,287,863    190
houston                    1,859,586    215
dallas                     2,201,625    203
nashville                    922,435    176
sanantonio                   753,076    165
tampa                      1,663,780    188
jacksonville                 614,068    169
Average                                 225

Below, the data are shown as a scatter plot. There doesn't seem to be much correlation between DMA size and growth rate. In fact, the fastest growing Craigslist site also has one of the lowest numbers of (DMA) households in the data set. On the Boise, ID, Craigslist site, the number of live listings per day in the for sale, services, jobs and housing categories is doubling every 111 days.

Craigslist doubling period vs. DMA size

This histogram shows the distribution of growth rates for the data set. The average doubling time for the number of live listings per day on Craigslist is 225 days for this data set.

Craigslist doubling periods

Some Caveats:

  • Ad duration: it appears that Craigslist may adjust the ad duration and/or scrub ads periodically. It may be that they control the number of live ads for a new site by increasing the ad duration when sites are young and then cutting it down later to ensure fresher content. This means that Craigslist may target a specific growth rate.
  • Categories: The categories of for sale, services, jobs and housing were chosen to match traditional newspaper categories. The volume of traffic and content in other Craigslist categories may attract online visitors and affect the overall growth rate of the listings counted in this data set.
  • Life cycle: I don't have data showing the start date for various sites so this data glosses over any differences in the life cycle of a new Craigslist site and how that may be affected by DMA size.
  • "Fits exponential growth" means that I performed linear regressions on ln(listings) vs. time and chose all the sites with a correlation coefficient > 0.9.
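
The fit described in the last caveat can be sketched as follows. This is an illustration rather than the code I ran: the function name and the synthetic listing counts are assumptions, and the least-squares fit is written out by hand so that it needs nothing beyond the standard library.

```python
import math

def doubling_time(days, listings):
    """Fit ln(listings) = a + b * day by least squares.

    Returns (doubling time in days, correlation coefficient r).
    """
    y = [math.log(v) for v in listings]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(days, y))
    slope = sxy / sxx                 # growth rate per day
    r = sxy / math.sqrt(sxx * syy)    # correlation coefficient
    return math.log(2) / slope, r

# Synthetic data (not Craigslist data): listings that double every 200 days.
days = list(range(0, 400, 50))
listings = [1000 * 2 ** (d / 200) for d in days]

t_double, r = doubling_time(days, listings)
if r > 0.9:  # same cutoff as in the caveat above
    print(f"doubling time ~ {t_double:.0f} days (r = {r:.3f})")
```

The doubling time is just ln(2) divided by the fitted slope, which is why it is an easier number to think about than the growth rate itself.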
