Archive for November, 2006

The Black Friday / Cyber Monday Fallacy

Tuesday, November 21st, 2006

Since some are speculating about shoppers’ habits this upcoming weekend, I’d like to chime in. Speaking as a person formerly* involved in e-commerce, I can tell you that Black Friday and the upcoming Cyber Monday won’t be all that. Sure, the weekend will be good for online retailers, but these won’t be their busiest days of the 2006 holiday season.

In fact, the busiest days for online retailers this year will be December 12th and 13th. The reason? Christmas is on a Monday this year, and that pushes up the last day for a package to ship via UPS Ground within the Continental US to Friday, December 15th. (You can spec out a shipment on the UPS Web site to see what I mean.) Many retailers will use the 14th as an extra day for shipping and handling and to sort out issues on their end.
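For the curious, the date math above can be sketched in a few lines of Python. This is a rough illustration only: the five-business-day transit time and the "deliver by the last business day before Christmas" rule are my assumptions, not published UPS figures, and the helper names are made up for this post.

```python
from datetime import date, timedelta

# Assumption: worst-case ground transit of 5 business days (not a UPS figure).
ASSUMED_TRANSIT_DAYS = 5

def last_business_day_before(day):
    """Return the closest Monday-Friday date strictly before `day`."""
    day -= timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day

def subtract_business_days(day, n):
    """Step back n weekdays from `day`, ignoring public holidays."""
    while n > 0:
        day -= timedelta(days=1)
        if day.weekday() < 5:
            n -= 1
    return day

christmas = date(2006, 12, 25)
delivery_deadline = last_business_day_before(christmas)
ship_by = subtract_business_days(delivery_deadline, ASSUMED_TRANSIT_DAYS)

print("Christmas falls on a", christmas.strftime("%A"))
print("Deliver by:", delivery_deadline.strftime("%A, %B %d"))
print("Ship by:", ship_by.strftime("%A, %B %d"))
```

Under those assumptions the ship-by date comes out as Friday, December 15th, which matches what the UPS site shows. Take a day off for handling and you land on orders placed the 12th and 13th.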

The bulk of e-commerce shoppers would rather not shop in brick-and-mortar stores, hence the wait until the second week of December. But they also don’t want to pay for expedited shipping or wait for a package to arrive at the last minute.

Shipping is a big deal to consumers, as we’ve seen how consumers would rather take a “free shipping” deal over a percentage-off deal - even when the percentage off would save them more. This shopper psychology is why sites like Amazon pull stunts like “free expedited shipping” in the last days before Christmas - they’re trying to convince folks to stick around and buy and not worry about shipping.
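To put numbers on that trade-off, here is a tiny sketch with made-up figures. The $100 order, $6 ground shipping charge and 10% coupon are assumptions for illustration, not data from any retailer.

```python
def savings_percent_off(order_total, percent):
    """Dollars saved by a percentage-off coupon."""
    return order_total * percent / 100

def savings_free_shipping(shipping_charge):
    """Dollars saved when the shipping charge is waived."""
    return shipping_charge

# Hypothetical order, for illustration only.
order_total = 100.00
shipping_charge = 6.00
percent_off = 10

pct_savings = savings_percent_off(order_total, percent_off)
ship_savings = savings_free_shipping(shipping_charge)
better_deal = "percentage off" if pct_savings > ship_savings else "free shipping"

print(f"Percentage off saves ${pct_savings:.2f}")
print(f"Free shipping saves ${ship_savings:.2f}")
print("Better deal on paper:", better_deal)
```

In this example the coupon wins on paper, yet plenty of shoppers would still pick the free shipping offer.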

*Formerly as in, currently advising.

Google

Tuesday, November 21st, 2006

Well, since Google stock hit $500 today, I thought I’d do a Google post.

-I had someone pulling my feed who was on Google wifi.

-My Dad just had to compare Microsoft Virtual Earth vs. Google Earth. He seemed to like the zooming action of both, but thought it was difficult to tilt the view within Google Earth.

-I am administering a Google AdWords campaign which is unfortunately showing ads on a number of those cruddy, waste-of-bandwidth Made for AdSense (MFA) sites. While I don’t want to take the ads off the content network, the only solution I can think of is to Google for the advertised URL and manually remove the MFA sites from the campaign. What a pain.

[tags]Google, Google Wifi, Google Earth, Google AdWords, Google AdSense [/tags]

Entireweb Speedy Spider

Monday, November 20th, 2006

Speedy Spider is the crawler for the Sweden-based search engine Entireweb. I have not seen any referrers from Entireweb, but the Speedy Spider User Agent includes a URL to the informative Speedy Spider FAQ. In addition, Speedy Spider is quite polite for a bot, only crawling one or two pages per visit.

Host: 62.13.25.220
/robots.txt
Http Code: 200 Date: Nov 19 06:57:54 Http Version: HTTP/1.0 Size in Bytes: 6702
Referer: -
Agent: Speedy Spider (Entireweb; Beta/1.0; http://www.entireweb.com/about/search_tech/speedyspider/)
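
If you want to pull that FAQ URL out of a User Agent string automatically, something like the following would do. It is a minimal sketch: the regular expression and the `contact_url` helper are my own illustration, not part of any log analysis package.

```python
import re

# Match an http(s) URL, stopping at whitespace, semicolons or parentheses,
# since those usually delimit fields inside a User Agent string.
URL_PATTERN = re.compile(r"https?://[^\s;()]+")

def contact_url(user_agent):
    """Return the first URL found in a User Agent string, or None."""
    match = URL_PATTERN.search(user_agent)
    return match.group(0) if match else None

speedy = ("Speedy Spider (Entireweb; Beta/1.0; "
          "http://www.entireweb.com/about/search_tech/speedyspider/)")
anonymous = "Mozilla/4.0"

print(contact_url(speedy))
print(contact_url(anonymous))
```

A bot that returns `None` here gives a Webmaster nothing to go on short of looking up the IP address.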

[tags]Search Engine, bot, crawler, spider, Entireweb, Speedy Spider [/tags]

Oh, LaDainian!

Monday, November 20th, 2006

And so, there I was yesterday, resigned to the fact that I was facing a middling Fantasy Football loss due to some non-spectacular WRs, a TE with one point, a Vikings defense that left a lot to be desired and Westbrook who didn’t get to carry the ball that much. Even Tony Romo, my very non-intuitive QB pick from a few weeks back, led the Cowboys to victory only to net me 9 points in Fantasy Football play.

And so, I watched last night and wondered - could LT even get me 41 points or more? Would I be doomed? And as it turns out, he did and I tied.

Also a shoutout to the Rutgers Scarlet Knights, who lost to Cincinnati on Saturday and so are 9-1, #14 BCS, #15 AP Poll. They face Syracuse this week.

[tags]Football, Fantasy Football, Rutgers Football [/tags]

Multitasking

Monday, November 20th, 2006

I have like 12 tabs open in Firefox, including work email, with a few other miscellaneous windows open on the PowerBook. The cell phone is right beside me.

My daughter has been sick for the past few days, with a horrendous fever and sore throat and the accompanying congestion. She’s ok when the Tylenol works but her appetite is diminished and she has less energy than usual. She’s old enough to get some amusement out of taking her own temperature with the thermometer, “Mom, it’s 101!”

After getting a continual busy signal this morning, I finally have an appointment for her to be seen by the doctor this afternoon. The current thinking is that it’s probably the flu. An upper respiratory viral illness. A certain someone will be hanging with Grandma & Pop Pop for the next few days.

The irony of all this? I was supposed to get a flu shot at work today. And my throat hurts now.

BuzzLogic

Sunday, November 19th, 2006

Oh, wait, just when you thought we were done here with research services for the Google-impaired, there is yet another one. BuzzLogic has been sending out their crawler for the past few weeks to this blog and by happenstance currently has a private beta for companies.

What is different about BuzzLogic’s crawler though is that it’s revealing a referrer which really, honestly should not be seen in the Web logs. Also, their crawler does not have any identifying information in the User Agent field. Here’s an example.

The questionable referrer, which I am seeing via Sitemeter, looks like this:

file:///data/thumbnailer/work/home-2006-11-17-17:21:16.438/2006-11-19-07:37:13.838-in.html

If I had to guess, BuzzLogic compiles the collected data into a static HTML file. I’ve seen that static HTML file change day by day, with a different time/date stamp each time it hits my Web server.

This is what I see via my Web logs.
Host: 64.34.246.44 (I was only able to connect this to BuzzLogic through a traceroute of the IP address. The BuzzLogic Web server is hosted on what seems to be a completely different hosting provider.)
/wp-content/plugins/sociable/images/reddit.png (This crawler is hitting my image files for some reason.)
Http Code: 200 Date: Nov 19 10:37:14 Http Version: HTTP/1.1 Size in Bytes: 5943
Referer: -
Agent: Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.4 (like Gecko)

[tags]bot, crawler, scraper, buzzlogic, brand monitoring services, search engine challenged PR firms [/tags]

Webclipping

Saturday, November 18th, 2006

Yet another “monitoring the Web just for you, your brand and your PR department which can’t use Google” service: a bot from Webclipping was spied hitting my RSS feeds recently. Clicking around the Webclipping site (which doesn’t look all that hot in Firefox 2.0), the service seems to be similar to other monitoring outfits including brandimensions. (A side note, brandimensions, which I’ve written about before, charmingly has a Flash-only-I-don’t-really-want-to-be-found-by-search-engines homepage. As you can see, I’m not exactly a fan of a service which compiles my content and doesn’t allow me to see the context.)

Host: 38.144.36.19
/blog/blog.rdf
Http Code: 302 Date: Nov 18 18:08:51 Http Version: HTTP/1.1 Size in Bytes: 224
Referer: -
Agent: Mozilla/4.0 (Webclipping.com)

[tags]bot, crawler, scraper, webclipping, brand monitoring services, search engine challenged PR firms [/tags]

Hoopla

Saturday, November 18th, 2006

Hoopla, a service still in private beta, purports to be “the next big portal that renders other online news and blog services obsolete.” There’s also an accompanying blog, currently with only one entry.

I found Hoopla via my usual discovery method, my Web site logs where the crawler was hitting my RSS feeds. It appears they need to be crawling the Web and blogosphere for a bit in order to collect content for their portal. I can’t tell if the folks behind Hoopla are American and/or German though. It looks like the anonymous WHOIS registration is for an American company, and the language on the Hoopla parked page is definitely colloquial American English, but the crawler is from a German IP.

Host: 82.165.243.217
/blog/blog.rdf
Http Code: 302 Date: Nov 17 12:37:08 Http Version: HTTP/1.1 Size in Bytes: 224
Referer: -
Agent: http://www.hoopla.com/; tracker@hoopla.com - Hoopla.com honors robots.txt; Hoopla.com Tracker; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050402 Firefox/1.0.2

[tags]hoopla, RSS, beta, crawler, portal, portal page, Web 2.0 [/tags]

Three RSS applications - FeedSweep, Fatcast, Wefeelfine

Thursday, November 16th, 2006

FeedSweep provides a way to display syndicated RSS content on your site. So, for example, you could show the cleverhack feed if you really, really wanted to. One gripe: I couldn’t find an Add Feed to FeedSweep button.

<script src="http://www.feedsweep.com/products/feedsweep/producer.aspx?feeds=http%3A%2F%2Fcleverhack%2Ecom%2Ffeed%2F"></script>

Fatcast is an online RSS reader similar to Bloglines. Again, I could not find an Add Feed to Fatcast button. However, the service does allow one to share their list of feeds if they wish, so of course I made one exclusively with cleverhack feeds.

My Fatcast feed list.

Last, but certainly not without emotion, is WeFeelFine. It appears to be a research project which searches the blogosphere for words or phrases about how the blogosphere feels. The applet which displays the emotional information looks quite cool (click on the We Feel Fine link on the home page) and allows one to search via demographics - age, sex, location. (Warning, the applet seems to take a bit of memory in Firefox.) Anyone up for reading about how some emo twentysomethings from Seattle feel?

And how did I find WeFeelFine? Through their crawler, which didn’t have any identifying info in the User Agent, but a lookup on the IP provided the domain name.

Host: 128.177.11.193
/2006/11/10/rutgers-9-0/
Http Code: 200 Date: Nov 10 00:05:56 Http Version: HTTP/1.1 Size in Bytes: 16477
Referer: -
Agent: Mozilla/4.0

[tags] RSS, RSS feeds, RSS Syndication, RSS Readers, RSS Research, FeedSweep, Fatcast, WeFeelFine [/tags]

Identify your (Web) spider

Wednesday, November 15th, 2006

Today Slashdot had a front page article about how to create a Web spider on Linux. Aside from the fact that the subject matter just totally excites my inner nerd, I wanted to make a point especially for those who would be writing a spider, bot or crawler for fun and profit.

I have this true story about how, not so long ago, I was a Webmaster. One very busy morning, I had a crawler that was hitting my site and it was annoying the heck out of me as it was a little too aggressive. I really wanted to ban it, but I saw a URL in the User Agent, and so I tracked down the source. The homepage for the bot at the time looked like this and the site it was crawling for wasn’t live yet. At that point, I had a choice - I could just ban the bot and be done with it or allow the bot to run and hope that the not yet live site would someday provide some benefit.

As it turns out, I held my nose and allowed the bot to run. In fact, a few weeks later, it did slow down and was friendlier - so I didn’t mind it as much. The other part to this story is that the site in question went live in April 2006 - and it did show the crawled content.

In other words, if your bot is legit, identify it or face the chance that you could be banned from the very sites you want to crawl. While the shopwiki example isn’t the best example of a parked page, at least I had some information to go on as a Webmaster.
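
For anyone writing that spider in Python, identifying yourself takes only a few lines. This is a minimal sketch, not a full crawler: the bot name, the info page URL and the contact address are placeholders I made up, so swap in your own. It uses only the standard library (`urllib.request` and `urllib.robotparser`).

```python
import urllib.request
import urllib.robotparser

# Hypothetical bot identity: replace with your real name, info page and contact.
BOT_NAME = "ExampleBot"
USER_AGENT = "ExampleBot/0.1 (+http://www.example.com/bot.html; bot@example.com)"

def build_request(url):
    """Build a request that tells the Webmaster who is crawling and why."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})

def allowed(robots_txt, url):
    """Check a URL against the text of a site's robots.txt for this bot."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(BOT_NAME, url)

# Sample robots.txt content, for illustration only.
sample_robots = "User-agent: ExampleBot\nDisallow: /private/\n"

request = build_request("http://www.example.com/blog/")
print(request.get_header("User-agent"))
print(allowed(sample_robots, "http://www.example.com/blog/"))
print(allowed(sample_robots, "http://www.example.com/private/page"))
```

A real crawler would fetch each site's robots.txt first and pause between requests, but even this much gives the person reading the logs a URL to check before reaching for the ban button.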

[tags]Spider, Bot, Crawler, User Agent, Webmaster, Web Admin [/tags]

Anti-Spam Update - KnujOn and Boxbe

Wednesday, November 15th, 2006

Some up and coming anti-spam services I’ve heard about recently…

On the technical side, KnujOn offers a method to help identify the folks sending fraudulent email. As an end user, all you do is register your email address with the service so the address can be whitelisted and then send your junk email to KnujOn. The father/son team behind KnujOn collects the data and invites Web hosts, credit card investigators and law enforcement to use the collected data during investigations. As a bonus, the service sends participants weekly progress reports as to how many fraudulent sites have been taken down.

More information about KnujOn can be found at Castle Cops and there’s even a Thunderbird extension for the service.

Aside from reporting spam, for inbox protection why not take a look at Boxbe? Boxbe is all about giving you a forwarding email address that you can share with others without the hassle of receiving spam. In order to reach your pre-existing email inbox, advertisers have to pay you a price you specify. As a value proposition, Boxbe protects your inbox and pays you for your attention.

The service is not without drawbacks, however. For example, when you set up a profile on Boxbe, you’re asked to divulge interests and other profile data, which Boxbe anonymously shares with advertisers. In addition, there could be problems with senders. If the sender doesn’t want to work with the Boxbe system (either by refusing to complete the sender test or refusing to pay to send to you), the email in question would land in your quarantine.

[tags] Email, Email Deliverability, Spam, Email, KnujOn, Boxbe [/tags]

Jesus 2.0

Tuesday, November 14th, 2006

MyCCM bills itself as social networking for Christians and it bears the hallmarks of a true Web 2.0 space - RSS feeds and RSS search capabilities, blogs, podcasts, personal profiles and the ability to join a community.

I had a referrer from the site and I had to go and click around. The site design looks fine. But I have one question about the site: it seems that you can see a great part of it without needing to log in. I was able to click around to each section of the site - myRSS, Blogs, Tags, Search, Groups, Community - and see the section pages in addition to searching for profiles. To me, it appears that this site allows way more unfettered, logged-out access than a MySpace, Facebook or LinkedIn, and some of those profiles looked young, even though the registration process doesn’t allow birthdays later than 1993.

[tags]Web 2.0, MyCCM, online communities [/tags]