Archive for the 'Tech' Category

My agent/1.0

Friday, November 10th, 2006

This user agent appears to be someone’s hand-rolled RSS feed reader or feed aggregator. The visitor came from a Comcast IP somewhere near Boston. I will have to watch to see how often he’s hitting my feed.

/wp-rss.php
Http Code: 200 Date: Nov 09 23:22:34 Http Version: HTTP/1.1 Size in Bytes: 4464
Referer: -
Agent: My agent/1.0
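
To keep an eye on how often a reader like this hits the feed, entries in the layout above could be parsed and tallied per user agent. This is only a minimal Python sketch: the function names `parse_entry` and `hits_per_agent` are made up for illustration, and the field layout is assumed from the single entry shown here rather than taken from any documentation of the stats tool.

```python
import re
from collections import Counter

# Matches the status line of an entry, e.g.
# "Http Code: 200 Date: Nov 09 23:22:34 Http Version: HTTP/1.1 Size in Bytes: 4464"
STATUS_LINE = re.compile(
    r"Http Code: (?P<code>\d{3}) "
    r"Date: (?P<date>\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"Http Version: (?P<version>\S+) "
    r"Size in Bytes: (?P<size>\d+)"
)


def parse_entry(text):
    """Parse one log entry (assumed layout) into a dict of fields."""
    entry = {}
    for line in text.strip().splitlines():
        line = line.strip()
        match = STATUS_LINE.match(line)
        if match:
            entry["code"] = int(match.group("code"))
            entry["date"] = match.group("date")
            entry["version"] = match.group("version")
            entry["size"] = int(match.group("size"))
        elif line.startswith("/"):
            # A bare line starting with "/" is the requested path.
            entry["path"] = line
        elif ": " in line:
            # "Referer: -", "Agent: ...", "Host: ..." style lines.
            key, value = line.split(": ", 1)
            entry[key.lower()] = value
    return entry


def hits_per_agent(entries):
    """Count how many parsed entries each user agent accounts for."""
    return Counter(entry.get("agent", "unknown") for entry in entries)


sample = """/wp-rss.php
Http Code: 200 Date: Nov 09 23:22:34 Http Version: HTTP/1.1 Size in Bytes: 4464
Referer: -
Agent: My agent/1.0"""

print(hits_per_agent([parse_entry(sample)]))
```

Running every feed request through `hits_per_agent` over a few days would show whether this reader polls politely or hammers the feed.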

[tags]RSS, RSS feed, RSS reader, RSS aggregator, user agent, Why don’t my readers provide some sort of technical documentation to satisfy my curiosity? [/tags]

Hacking MyBlogLog

Saturday, November 4th, 2006

I’ve been using MyBlogLog heavily for the past week and there are a few interesting things I’ve noticed about the service.

Observations

  • Since becoming active on the site nearly a week ago, I’ve had over 460 new visitors to cleverhack, with 56 viewing my about me page (which, if I think about it, is probably as many clicks as my about me page usually gets in 90 days).
  • I have seen a number of visitors who, in marketing parlance, have converted to regular readers and RSS users.
  • I’ve noticed two types of accounts on MyBlogLog: normal user accounts, and accounts representing a Web site or brand (I’m thinking Zillow, Buzz Tracker, Dogster and the like…)
  • This blog is in the Top 50 “C” Communities.
  • I’ve seen some good SEO from MyBlogLog (the cleverhack MyBlogLog page is currently #4 when searching on cleverhack.)

What I’d like to see and other issues

  • I like using the My Communities page to browse blogs. When I’m logged in and looking at my “community” of blogs, I’d like to be able to click through to the blog itself rather than to the blog profile page.
  • The user profile page should have RSS feeds of the user’s blog(s).
  • The Hot In My Communities widget on the user profile page doesn’t seem to update all that much.
  • I don’t understand why I would have to manually create a screenshot of my blog’s page to update the screenshot on the service.
  • The blog profile page should have the blog’s RSS feeds higher in the right-hand column. (Right now, the RSS feeds are shoved down at the bottom.)
  • I’d love to see some sort of random show of blogs on the MyBlogLog Community page.
  • I’d love to see some sort of random show of members on the MyBlogLog Members page.
  • I don’t know what exactly makes a member “hot” or a blog “hot”.
  • The blog stats features are kinda cool but are dependent on javascript (meaning that people using non-javascript enabled browsers won’t be counted).
  • Aside from the extra blog stats features, what is the value proposition for a Pro Membership?

How to take advantage of MyBlogLog

  • Click through to the MyBlogLog front page often; that way you show up in the Recent Readers widget. I do see traffic coming in that way.
  • Check out other people’s profiles and their blogs. They will usually reciprocate.
  • There is some traffic value in belonging to the hot communities, although I’d expect the advantage to diminish as more people belong to those same communities.
  • Be female. (Whoops, my bad, said that out loud.)
  • Use a photo of yourself as your avatar. The better the photo, the better the response.
  • Join communities but avoid becoming the dude who joins everything.
  • Feel free to ignore contact requests if they aren’t a good fit.

Oh, and googling for mybloglog brought up this 10 Hottest Clicks map.

Yahoo gets snarky with postmasters

Saturday, November 4th, 2006

So, this morning I was clicking around on the Web, trying to find more information about the Yahoo mail server problems (i.e. the 451 Message temporarily deferred errors). There’s been some talk about it on NANOG and other sources, and I was hoping to see some more information about it.

As it turns out, I visited the Yahoo Mail Postmasters Help page this morning to find some newly updated information about the deliverability problems. This updated Postmaster’s page could be a resource - if the main links weren’t all broken. I kid you not — the links in the body of the page all have quotes…for example, http://help.yahoo.com/help/us/mail/defer/”defer-06.html”
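
Repairing a link like that by hand is just a matter of deleting the stray quote characters. As a small illustration in Python (the helper name `strip_stray_quotes` is hypothetical, and it assumes the only damage is embedded straight or curly double quotes):

```python
def strip_stray_quotes(url):
    """Remove straight and curly double quotes embedded in a URL."""
    for quote in ('"', "\u201c", "\u201d"):
        url = url.replace(quote, "")
    return url


broken = "http://help.yahoo.com/help/us/mail/defer/\u201ddefer-06.html\u201d"
print(strip_stray_quotes(broken))
```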

But, there’s more… if you’re clever enough to remove the extraneous quotes on the “Does Yahoo! use “greylisting” to reject messages?” link…this is what you see as of 10:17am EST on 11/4/06.

Yahoo! Mail Help
Yahoo! Mail > Yahoo! Mail Help > Yahoo! Mail Postmasters Help >

Does Yahoo! use “greylisting” to reject messages?
No.

The most commonly understood form of “greylisting” is where an SMTP server will reject every message the first time it is attempted, and then accept it if the sending server retries later. The theory is that spammers won’t retry messages, while legitimate senders will.

Yahoo! does not utilize this method, and we have no intention of doing so in the future — no matter what you may read on some random blog.

Nice. Not only does Yahoo continue to have problems with email deliverability and broken links on their main postmaster page, but now they’ve got some snark in their corporate voice when communicating with outside postmasters. Good going, guys.

Oh, and another thing… I am more than happy to submit the URL of the page with the broken links to someone at Yahoo, but their Postmaster page’s “Contact Us” button links to their form for submitting technical feedback for mail. Not very helpful.

Previous posts about Yahoo mail deliverability issues: Tuesday and Wednesday.

Update 11/4/06, 4:02pm EST: A Yahoo Postmaster contact just stated the following on NANOG:

The issue some of you are seeing is that your mailserver IPs are being grey-listed after a certain number of emails and being traffic shaped. To have your legitimate mailservers added to a white list, please refer to the following info.

http://help.yahoo.com/help/us/mail/defer/defer-06.html

Thanks!

So, a postmaster contact is directly contradicting what is on the official Yahoo postmaster pages. Nice.

Update 11/6/06: Yahoo has fixed the broken links on their postmaster page and has edited their “Does Yahoo! use “greylisting” to reject messages?” page. Yahoo now says that they do not “greylist” in the sense of rejecting every message initially and then accepting it later. So basically, their postmaster contact’s statement still stands.
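
For anyone unfamiliar with the term, the textbook form of greylisting described on that help page (reject the first attempt, accept a later retry) can be sketched in a few lines of Python. This is emphatically not Yahoo's system, which by their own account works differently. The `Greylist` class, the (client IP, sender, recipient) key and the 300-second minimum delay are all assumptions chosen for illustration.

```python
import time


class Greylist:
    """Toy greylist: defer the first attempt, accept a sufficiently late retry."""

    def __init__(self, min_delay_seconds=300, clock=time.time):
        self.min_delay = min_delay_seconds  # assumed value, not from any real server
        self.clock = clock                  # injectable so the sketch can be tested
        self.first_seen = {}                # (ip, sender, recipient) -> first attempt time

    def check(self, client_ip, sender, recipient):
        """Return an SMTP-style reply code: 451 to defer, 250 to accept."""
        key = (client_ip, sender, recipient)
        now = self.clock()
        if key not in self.first_seen:
            # First time this combination is seen: remember it and defer.
            self.first_seen[key] = now
            return 451
        if now - self.first_seen[key] < self.min_delay:
            # Retried too quickly: keep deferring.
            return 451
        # A patient retry: accept the message.
        return 250
```

A legitimate mail server queues the deferred message and tries again later, at which point `check` returns 250. The theory, as the help page puts it, is that spammers won’t bother to retry.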

Yahoo Email Deliverability (update)

Wednesday, November 1st, 2006

Chuq blogged about the Yahoo problems, and happened to elicit a good comment from an ISP admin who is active on NANOG.

With some testing today, a Yahoo account I have was accepting email from a domain that doesn’t mail all that often. Some mail that I receive on a daily basis which usually gets routed to the spam folder made it to the spam folder at the usual time.

However, mail that I sent from a domain which has significant volume…well, I sent the mail at 9am this morning and it still hadn’t made it to Yahoo by the time I left work this evening.

From what I understand, it seems that part of the deliverability issue concerns how Yahoo mail handles messages sent from a particular mail server. From the mail admin’s side, the outgoing messages are held back in queue for hours at a time and are only accepted intermittently by Yahoo.

For background, read yesterday’s post.

[tags]Email deliverability, Yahoo, Yahoo mail, spam filtering[/tags]

it’s not you, it’s Yahoo

Tuesday, October 31st, 2006

Those of you who are not on the email deliverability front lines may have been wondering why your email to Yahoo addresses hasn’t been getting through all that reliably recently.

As it turns out, the folks at Yahoo Mail apparently changed their spam filtering system sometime mid-October. Here’s a great blog entry detailing the Yahoo issues.

Of course, I’ve been hearing that mail admins haven’t been getting helpful responses from Yahoo about this, in addition to the complete lack of documentation about the problem.

Personally, I’ve seen emails sent during the past few days take around 24 hours or so to reach the Yahoo mailbox I was sending to.

[tags]Email deliverability, Yahoo, Yahoo mail, spam filtering [/tags]

Firefox 2.0 for Mac OS X - Can’t import bookmarks from file

Saturday, October 28th, 2006

Well, I got all fired up about using Firefox 2.0 as my main Web browser on my PowerBook. I went and downloaded it, and when I tried to import my bookmarks from Camino (after exporting those bookmarks into a nifty html file), I couldn’t.

Why? Because the File > Import menu on Firefox 2.0 for Mac OS X does not have a file import capability. It only imports bookmarks from specified browsers such as Safari, Internet Explorer and older versions of Mozilla-based browsers.

[tags]Mozilla, Firefox, Firefox 2.0, bookmarks, browser, Web, annoyances [/tags]

on search engines and trust certification authorities

Tuesday, October 24th, 2006

First, Didier Stevens has posted an update on the bogus SERP pages serving badware (Didier calls this method spamdexing). In this update, he reveals the overall number of spamdexing sites and, after analyzing the AOL search data, finds that roughly 1% of AOL users clicked on these spamdexing sites.

Previous cleverhack posts about these sites can be found here and here.

Not too long ago, Ben Edelman posted an article examining the trustworthiness of sites certified by site certification authorities, most notably TRUSTe. His methodology included cross-referencing TRUSTe’s ratings with the findings of SiteAdvisor, and he found that TRUSTe certified some sites that SiteAdvisor did not rate as trustworthy. These findings should be considered when thinking about purchasing a site certification seal for your site or using a site with a certification seal. (And yes, I will admit my bias against these seals: I have been on the Web long enough to dislike anything that tries to show “trust”…)

[tags]SERPS, Google, search engines, traffic, spyware, adware, badware, viruses, trojan, dialers, spamdexing, site certification, site trustworthiness, TRUSTe, SiteAdvisor [/tags]

two unknown search tools

Sunday, October 22nd, 2006

An easy peasy post to start off the morning about two unknown search tools I’ve been seeing in the logs.

First is scroogle.org, which basically acts as a proxy between you and Google. The cachet is that Google can’t track you if you use the scroogle.org service. I see fewer than 10 referrers from there every few days. Also, scroogle dot com, uh, redirects you to a very not safe for work search engine. You have been warned.

Gridwell is a beta search engine based in the UK. The search engine appears to be a consultancy project and it looks like they’re just getting their feet wet trying out site SERP design ideas such as showing site favicons with results. Their results, as of this writing, are from Yahoo. Here’s the about page for the gridwell search service.

[tags]scroogle.org, gridwell, Google, Yahoo, search engines, SERP [/tags]

ancestry.com bot

Saturday, October 21st, 2006

I thought this was clever. The granddaddy of all genealogy sites, Ancestry.com, has a crawler which searches for biographical information on the Web. Here’s the bot information page.

Host: 66.43.16.184
/
Http Code: 200 Date: Oct 09 17:28:47 Http Version: HTTP/1.1 Size in Bytes: 50133
Referer: -
Agent: Mozilla/4.0 (compatible; MyFamilyBot/1.0; http://www.ancestry.com/learn/bot.aspx)

A special bonus, my genealogy course notes from my days as a computer instructor.

For the incoming search engine visitors, if you want to get into the genealogy blogosphere, start with the well-written and sharp-witted genealogue.blogspot.com.

[tags]genealogy, ancestry.com, robots, bots, crawlers[/tags]

Siphoned Google traffic used to install badware

Saturday, October 7th, 2006

A few months ago I wrote up a post about how I found some very shifty results on Google search engine result pages. At the time, I found some pages that were essentially mockups of Google SERPs. These pages were all on a .info TLD, had an iframe linking to a website called cleansearch.info, and had numerous links at the bottom of each mocked-up page linking to other mocked-up .info pages. I couldn’t figure out why someone would go through all of that trouble.

Well, as it turns out, Didier Stevens found out why.

Didier examined Google SERPs on Google.be and found the Google mockup .info pages were infecting folks with spyware, adware, dialers and other badware. As of this writing, most virus scanners can’t find these infected files. Here’s video of the infection and cleanup.

Didier has also tried to determine the probability of seeing one of these drive-by download sites on a Google search query, and the figure is roughly 1 out of 1000.

Of note, these mocked up sites are rife with misspelled words, so if you’re a chronic misspeller I dare say your chances would be higher.
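
To put the roughly 1-in-1000 figure in perspective, here is a back-of-the-envelope Python sketch of how the odds add up over many searches. It assumes every search is an independent trial with the same probability, which is my simplification and not part of Didier’s analysis; the function name is made up for illustration.

```python
def chance_of_at_least_one(per_query_probability, queries):
    """Probability of seeing at least one bad site across `queries` searches,
    assuming each search is an independent trial with the same odds."""
    return 1 - (1 - per_query_probability) ** queries


for n in (10, 100, 1000):
    print(n, round(chance_of_at_least_one(1 / 1000, n), 3))
```

Under that assumption, someone running 100 searches has about a 9.5% chance of running into at least one of these sites.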

[tags]SERPS, Google, search engines, traffic, spyware, adware, badware, viruses, trojan, dialers [/tags]

MSRBOT update

Thursday, October 5th, 2006

Remember the earlier speculation about a bot called MSRBOT and whether it was Microsoft related?

As it turns out, MSRBOT does belong to Microsoft Research in San Jose. The researchers at Microsoft finally put an identifying URL in the user agent string.

Host: 209.133.64.213
/blog/archives/001251.html
Http Code: 200 Date: Oct 05 05:22:51 Http Version: HTTP/1.1 Size in Bytes: 7743
Referer: -
Agent: MSRBOT (http://research.microsoft.com/research/sv/msrbot/)
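
Well-behaved bots tend to put an information URL in the user agent string like this, which makes it easy to pull out of the logs automatically. A minimal Python sketch follows; the helper name `bot_info_url` is hypothetical, and the pattern is only as good as the handful of agent strings I have seen.

```python
import re

# A URL inside a user agent string ends at whitespace, ";" or ")".
URL_IN_AGENT = re.compile(r"https?://[^\s;)]+")


def bot_info_url(agent):
    """Return the first URL embedded in a user agent string, or None."""
    match = URL_IN_AGENT.search(agent)
    return match.group(0) if match else None


print(bot_info_url("MSRBOT (http://research.microsoft.com/research/sv/msrbot/)"))
print(bot_info_url("My agent/1.0"))
```

Agents that return `None` here are the ones that leave me guessing.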

[tags]Web, robot, crawler, Microsoft[/tags]

Sleipnir

Sunday, October 1st, 2006

Sleipnir is a Web browser based on Internet Explorer and designed for Japanese users. There is an unofficial Yahoo group for Sleipnir users.

Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) Sleipnir/2.47

[tags]Sleipnir, Web browser, Japanese language Web browser[/tags]