Archive for the 'General' Category

R.I.P. My Personal Blog (1999-2006)

Saturday, December 30th, 2006

I’ve decided to stop personal blogging, at least in the style I have been doing here at PeterCooper.co.uk. I’ve been blogging in this way for 7 years (since September 22, 1999 - yes, you can definitely tell I was a teenager by that writing) and it just hasn’t clicked for me. I’ve never become known as a blogger, rarely get links from anyone else, and have always relied on Google for most of my traffic (and that party is now totally over given the latest updates). Some people manage to pull off personal blogs with a large degree of success, such as Jason Kottke, Robert Scoble, and Hugh McLeod, but they still have a focus in one area or another, which is something I, perhaps, have lacked.

This is not a sob-story though. Ruby Inside has grown beyond my wildest dreams, and it proves I can put a great, popular blog together within a particular niche. At heart, I’m a generalist rather than a specialist, but I’m beginning to feel I need to start putting down roots and become a specialist, at least on a blog-by-blog basis! Therefore.. the eclectic, personal blog is over. Ruby Inside will continue, and, I am sure, more new blogs will follow, but they will each have a focus and a specific audience.

As of this post, this blog is becoming an ‘announcement’ style blog, in the same vein as Nick Denton’s. He posts once a month, if that, just to make a point or announce a new site he’s launching. I’m going to redesign the front page of PeterCooper.co.uk to emphasize my bookmarks and Twitter feed a bit more (I update those anyway) and simply have my small ‘announcement’ blog poking out at the bottom or the side or whatever. So don’t unsubscribe, but don’t panic if you don’t see any posts from me for a while.

Signing off, and not for the last time,
Pete.

Google indexes trash and “supplementals” above original content

Thursday, December 28th, 2006

Google continues its ongoing trend to suck ass by ranking verbatim reprints of my feed by ‘myFeedz’ higher than the original posts at this site. In fact, Google is now even ranking pages with only links to my posts higher than my actual posts, even when I search for the post title as a phrase.

First step, I thought, would be to send a nice letter to myFeedz:

How can I have my feed removed from myFeedz? Or.. rather, how can I stop my articles being reproduced verbatim on your site?

I have no problem with it personally, but Google does, and now all my pages are disappearing, only to be replaced with your reprints of them which is obviously “not good” ;-)

You can e-mail me at XXXX@gmail.com

Second step, though, is to record my thoughts that the Google search engine is rapidly becoming a piece of useless crap only good for indexing ripped off content. Google has indexed my posts up until about the last week, yet when I do a verbatim search for “The Markov Manifesto” (the title of a post I made almost a month ago), I get these results:

[Screenshot: Google search results for “The Markov Manifesto”]

All but the last two results are complete trash, and even the last two aren’t the original post. This is pretty crazy considering my site is a PR 7 and all the ones above me are Supplementals! I can repeat this experiment for most of my posts, even those where the original post actually comes up. In some situations my posts come first; quite why it varies this way is a mystery to me.

All respect to the wonderful people at Google who, I am sure, are mostly pretty great engineers, but I think those swimming pools, pool tables, and dogs running around the workplace are screwing with their management’s decision-making processes by making fine engineers work on total crap rather than improving their core business of search.

Is blocking HTML e-mail an answer?

Monday, December 25th, 2006

Supposedly the US Department of Defense are blocking all “HTML-based e-mail”. I’m thinking this might be a good solution, depending on who you generally converse with.

Looking through my e-mail, all of the spam comes in HTML e-mails (image spam, junk mail, graphic ads) and all my legitimate e-mail is text (except a few people who insist on using HTML stationery, but who I could white-list or give ultimatums to). Put it this way, I don’t get any legitimate mail from unknown sources that’s both HTML and valid. Combining a weak white-list with blocking HTML mails from unknown addresses could probably eradicate 99% of my spam. Might be worth me writing a quick POP3 filter just to try out the theory..
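
The filtering rule itself is simple enough to sketch. This is only a rough illustration, assuming Python's standard `email` module; the whitelist address and the `should_block` helper are invented for the example, and the actual POP3 fetching is left out entirely:

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical whitelist of senders who are allowed to send HTML mail.
WHITELIST = {"friend@example.com"}


def has_html_part(msg):
    """Return True if the message, or any part of it, is text/html."""
    return any(part.get_content_type() == "text/html" for part in msg.walk())


def should_block(raw_message, whitelist=WHITELIST):
    """Block HTML mail unless the sender is on the whitelist."""
    msg = message_from_string(raw_message)
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in whitelist:
        return False
    return has_html_part(msg)
```

Plain-text mail always gets through, and so does HTML mail from whitelisted addresses. Image spam sent as a bare attachment with no HTML part would slip past this check, so it is a test of the theory rather than a complete filter.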

Life In The Googleplex

Sunday, December 24th, 2006

Check out this great photoessay by Time Magazine that shows just what lucky bastards the folks at Google are.

Saddam Execution Video

Sunday, December 24th, 2006

Since I have nothing to lose, this is bait. Once such a video is posted on the Web, I’ll post a link to it from here.

UPDATE: December 31.. someone has finally uploaded a real ‘cameraphone’ version of the event.

It’s here.

Microsoft attempts to patent feed processing technology

Saturday, December 23rd, 2006

I can’t believe it. Dave Winer reports that Microsoft are rather specifically attempting to patent a system that acts and sounds rather like what Feed Digest does. All of these excerpts from the patent application are almost word for word descriptions of significant aspects of what Feed Digest does or how it operates. It also covers significant aspects of applications such as FeedBurner.

The ability of a central system to receive feeds and allow others to retrieve data related to those feeds:

[…] the platform can acquire and organize web content, and make such content available for consumption by many different types of applications. These applications may or may not necessarily understand the particular syndication format. Thus, in the implementation example, applications that do not understand the RSS format can nonetheless, through the platform, acquire and consume content, such as enclosures, acquired by the platform through an RSS feed […]

There are cases, however, when an application that uses the platform does not wish to be subscribed to a particular feed. Rather, the application just wants to use the functionality of the platform to access data from a feed. In this case, in this particular embodiment, subscriptions object 202 supports a method that allows a feed to be downloaded without subscribing to the feed. In this particular example, the application calls the method and provides it with a URL associated with the feed. The platform then utilizes the URL to fetch the data of interest to the application. In this manner, the application can acquire data associated with a feed in an adhoc fashion without ever having to subscribe to the feed.

The ability to tailor data within the system for each feed:

On the other hand, there is data that is treated as read/write data, such as the name of a particular feed. That is, the user may wish to personalize a particular feed for their particular user interface. In this case, the object model has properties that are read/write. For example, a user may wish to change the name of a feed from “New York Times” to “NYT”. In this situation, the name property may be readable and writable.

Centralized synchronization:

In the illustrated and described embodiment, feed synchronization engine 108 (FIG. 1) is responsible for downloading RSS feeds from a source. A source can comprise any suitable source for a feed, such as a web site, a feed publishing site and the like. In at least one embodiment, any suitable valid URL or resource identifier can comprise the source of a feed. The synchronization engine receives feeds and processes the various feed formats, takes care of scheduling, handles content and enclosure downloads, as well as organizes archiving activities.

Feed normalization:

In the illustrated and described embodiment, feeds are capable of being received in a number of different feed formats. By way of example and not limitation, these feed formats can include RSS 1.0, 1.1, 0.9x, 2.0, Atom 0.3, and so on. The synchronization engine, via the feed format module, receives these feeds in the various formats, parses the format and transforms the format into a normalized format referred to as the common format.

To Amar S. Ghandi, Edward J. Praitis, Jane T. Kim, Sean O. Lyndersay, Walter V. von Kock, William Gould, Bruce A. Morgan, and Cindy Kwan.. did you really collectively invent all of this stuff? Shame on those backing this pathetic attempt to trample over technology that has, so far, not necessitated the use of software patents.

Mac-compatible, USB 3G Vodafone Modem

Thursday, December 21st, 2006

After all the experimenting with Sidekicks and Nokia Communicators, perhaps I’ve finally stumbled across what I really need? It’s an external USB modem that uses the Vodafone 3G network for on-the-go broadband! Even better, it works on the Mac, so I won’t need to buy a PC laptop with a PCMCIA slot or anything silly like that.

I’m locked into my T-Mobile plan for several more months yet, but perhaps it’ll be worth it since I’m barely using the T-Mobile one due to the crappy Sidekick. Vodafone have two quite different deals. You can get the modem for free and have an ‘unlimited’ (really 1 gigabyte per month) data plan for £45 per month, or you can pay £49 for the modem then £25 per month for 250MB of transfer (and use the Internet ‘pay as you go’ for £1 per megabyte after that).
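
To compare the two deals, here is a quick sketch of the monthly cost of the cheaper plan at a given usage. It uses only the prices quoted above and ignores the one-off £49 modem charge and any VAT; the function name is made up for the example:

```python
UNLIMITED_MONTHLY = 45.0  # the 'unlimited' plan (really 1 gigabyte per month)


def monthly_cost_250mb_plan(usage_mb, base=25.0, included_mb=250, per_mb=1.0):
    """Monthly cost of the 250MB plan: base fee plus £1 per megabyte over 250MB."""
    overage_mb = max(0, usage_mb - included_mb)
    return base + overage_mb * per_mb


def cheaper_plan(usage_mb):
    """Name the plan with the lower monthly bill at this usage."""
    if monthly_cost_250mb_plan(usage_mb) <= UNLIMITED_MONTHLY:
        return "250MB plan"
    return "unlimited plan"
```

On these numbers the two plans cost the same at 270MB a month, so the cheaper plan only wins if usage stays close to the 250MB allowance.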

I think I might go for it. I’m already paying £40 a month to T-Mobile, but with feature-less Sidekicks and broken Nokia Communicators.. perhaps I’ll break loose from that and try a third time lucky? I might even be able to find a way to weasel out of my T-Mobile. Any hints and tips on that front would be appreciated.

(Update: Just called T-Mobile and they say it’ll be £184.42 to escape the next 6 months of my contract. Since it’s only £220 in bills to run that long, I might as well milk the minutes and keep it till then..)

(Update 2: Okay, I’m signed up. It’ll cost £29 per month - VAT was extra - and I can upgrade to the ‘unlimited’ plan if I need to at any time. Now let’s see if this solution will work..)

How Many People Click On Which Search Results?

Thursday, December 21st, 2006

[Graph: percentage of clicks by search result position, P1 to P10]
Yes, I know I need to pull my finger out and draw proper graphs ;-)

I’ve been doing a little research with the infamous AOL search data. I wanted to first find out the percentage of searches that result in a click (it’s about 55%), and then to find out what results get the most clicks.

First I produced some cumulative stats, and found that 87% of people who ultimately click on a result do so from the top 10 results. 92% click from the top 20. 97% from the top 50. Only an adventurous 3% bother to go any further (that the percentage is that high surprises me).

Finally I decided to just focus on a per-position basis, as shown in the basic graph above. P1 through P10 refer to positions 1 to 10 in the search results, and the data points refer to the percentage of ultimate clickers who visited the site shown in that position. Position 1 picks up an enviable 42% of visitors, with a constant slope downwards from there. Only position 10 gets a slight uptick, due to its position at the bottom of most search pages (the same cannot be said of positions 20, 30, and so forth).

Note: The sample size for all of the above is 3420811 searches / 1889761 clicks.
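
As a sanity check, 1889761 / 3420811 comes to roughly 55%, which matches the click rate quoted above. For anyone wanting to run similar numbers, here is a minimal sketch of the per-position counting. It is not the script used for the figures above; it assumes the tab-separated layout of the released AOL files (a header row of AnonID, Query, QueryTime, ItemRank, ClickURL, with ItemRank empty when nothing was clicked), and the function names are made up for the example:

```python
import csv
from collections import Counter


def click_shares(lines):
    """Return (total_clicks, {rank: share of clicks}) from tab-separated log lines."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    ranks = Counter()
    for row in reader:
        rank = row.get("ItemRank")
        if rank:  # rows without a click have no ItemRank
            ranks[int(rank)] += 1
    total = sum(ranks.values())
    return total, {rank: count / total for rank, count in ranks.items()}


def cumulative_share(shares, top_n):
    """Share of all clicks that landed on positions 1 to top_n."""
    return sum(share for rank, share in shares.items() if rank <= top_n)
```

With a file open as `f`, `click_shares(f)` gives the per-position shares, and `cumulative_share(shares, 10)` gives the top 10 figure. Counting distinct searches, as opposed to clicks, takes more care, because the log repeats a query once per click.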

Any Good Alternatives to Google?

Tuesday, December 19th, 2006

I’m looking for an alternative to Google to use in my Web browsers. Any suggestions? It doesn’t have to be a particularly popular engine either, as long as the design is clean, the sites are quick, and the results good enough.

Google’s results tend to be getting worse by the month and their constant waves of re-ranking are becoming tiring (especially when it seems to be valid sites rather than spam and splogs getting punished). I’m not the only one.. many people on WebmasterWorld are howling for Google’s blood. I personally feel Google can do whatever they want, but I want to find an alternative until they improve their search engine and stop resting on their laurels. It’s almost enough to drive me into creating a search startup ;-)

It’s also good to have an alternative so that you don’t become reliant on Google. I think a lot of us have nowadays, and that’s unhealthy for the industry. While Google isn’t, technically, a monopoly, it certainly has a form of ‘cultural’ monopoly over the Internet.. everyone assumes you’ll use Google to search for something, and “Google” has become a verb. We need alternatives, even if we continue to use Google for the areas where it shines (its blog search engine, for example, is far better than Technorati’s).

(Update: In my investigations, I’ve found Yahoo probably has the best results so far, but a poor design.. Icerocket has poor results but a great design.. MSN has okay results and an okay design.. shall keep looking :) )

New Digg Design Just Launched

Monday, December 18th, 2006

I wasn’t expecting this. Seems Digg just launched a new design. Must be pretty new as I go there a few times each day :)