2003 July 02 Wednesday
Search Engines Reduce Repeat Visitor Traffic To Web Sites

Jakob Nielsen has an interesting Alertbox column entitled Information Foraging: Why Google Makes People Leave Your Site Faster.

The easier it is to find places with good information, the less time users will spend visiting any individual website. This is one of many conclusions that follow from analyzing how people optimize their behavior in online information systems.

This argument makes a lot of intuitive sense. I've personally been going to Google News a lot more to look for news stories and as a consequence do not go to the front pages of news sites as often as I used to.

Search engines still have a long way to go though. In some topic areas the sales sites for particular products will dominate over technical discussions of the performance of products. In other topic areas amateur sites with little useful knowledge will show up high in a search result list while pages with the most insightful analyses get low ranking.

One feature I'd like to see Google support is a date-sorted return in regular pages like it does on news pages. Sometimes one is looking for the latest that has been said on a topic and while one can restrict Google searches to recent months in the Advanced Search facility it is not the same thing. What would be nice would be more knobs to turn to tell Google the kind of page one is looking for. Rather than simply sort by date it would be nice to just say "assign more weight to page date in the index algorithm" or "assign more weight to academic papers in the index algorithm".
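The "knobs" idea can be sketched as a re-ranking step applied to results the engine has already scored. This is only an illustration: the `Result` fields, the `rerank` function, the weights, and the half-life are all hypothetical names and numbers of my own, and nothing here reflects how Google actually ranks pages.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    url: str
    relevance: float   # base score from the engine, assumed to lie in [0, 1]
    page_date: date    # assumed last-modified date of the page
    is_academic: bool  # assumed flag, e.g. set for academic papers


def rerank(results, today, date_weight=0.0, academic_weight=0.0, half_life_days=180):
    """Sort by relevance plus user-tunable bonuses (the hypothetical knobs)."""
    def score(r):
        age_days = max((today - r.page_date).days, 0)
        # Freshness is 1.0 for a page dated today and halves every half-life.
        freshness = 0.5 ** (age_days / half_life_days)
        academic = 1.0 if r.is_academic else 0.0
        return r.relevance + date_weight * freshness + academic_weight * academic
    return sorted(results, key=score, reverse=True)


# Made-up results for illustration only.
results = [
    Result("http://example.com/old-classic", 0.9, date(2001, 1, 1), False),
    Result("http://example.com/fresh-post", 0.7, date(2003, 6, 25), False),
    Result("http://example.edu/paper", 0.6, date(2002, 7, 2), True),
]
today = date(2003, 7, 2)

by_relevance = [r.url for r in rerank(results, today)]
by_date = [r.url for r in rerank(results, today, date_weight=1.0)]
by_academic = [r.url for r in rerank(results, today, academic_weight=1.0)]
```

With both knobs at zero the order is plain relevance. Turning up `date_weight` lifts the recent page to the top, and turning up `academic_weight` lifts the paper, without either knob acting as a hard filter.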

By Randall Parker    2003 July 02 02:18 PM   Entry Permalink | Comments ( 0 )
2003 June 23 Monday
On Web Sites And Forums Companies Have A "Word Of Mouse" Reputation

The New York Times has an interesting article about the trend of companies taking a greater interest in monitoring and defending their reputations on internet consumer opinion survey sites and in discussion groups. Intuit, bowing to the power of the web, will discuss major new features in online forums before springing features that might upset their customer base.

"I think that, now, the power of the Internet is captured in the ability of everyday Americans to give their opinion on any product or event that they want," Mr. Gulbransen said. Next year, he added, before Intuit releases a new product, it will discuss possible changes with users of important online forums. The company will also eliminate the features that customers complained about angrily.

The Times article cites a long list of sites that report on and influence consumer opinions. Being somewhat backward as web sites go, the Times did not make most of those URLs into clickable links. So if you want to click thru to them, here are most of the ones mentioned: Extremetech.com, CNET.com, Slashdot.org, Amazon.com, and Epinions.com. As an internet deal search site they mention DealTime.com. But they should have mentioned Google Froogle too. The article also mentions that Paul Resnick of the University of Michigan runs a website that reports on research on online reputations.

I think a lot of companies are missing a big opportunity by not making a more concerted effort to systematically collect far more information from their customers, ex-customers, and potential customers on what customers like and do not like about the various products on the market and what customers want but cannot find. Companies ought to offer questionnaires on their sites that have detailed lists of products and aspects of products to solicit feedback about what ought to be changed and why and how. Companies that send emailings to convince customers to buy ought to include sections in such emails that ask for feedback and that provide links to places to provide the feedback.

In a nutshell: there are more excellent minds outside of most companies with great ideas for product planning than there are working on the inside. Those minds that are on the outside are a great resource that could be tapped in a variety of ways to get better ideas to make better products.

By Randall Parker    2003 June 23 09:35 PM   Entry Permalink | Comments ( 0 )
2003 January 14 Tuesday
Dweb.blogspot.com Singled Out By Chinese Govt

The Chinese government has blocked access to all of blogspot.

One explanation for the blockade suggested by a number of reports is that the entire site may have been blocked to prevent Chinese internet users reaching one blog in particular: dweb.blogspot.com. This site is has published lists of proxy servers that can be used to gain access to restricted web sites from within China.

Dweb presents evidence that their subdomain was hijacked.

Besides IP blocking, China also hijacked the domain name "dweb.blogspot.com". (see http://www.dit-inc.us/hj-09-02.html for more details of the DNS hijacking.) However, other subdomains are not hijacked. This can be verified from outside China. Here is a screen output you can reproduce:
C:\>nslookup dweb.blogspot.com ns4.bta.net.cn
Server: ns4.bta.net.cn
Address: 202.106.0.20

Non-authoritative answer:
Name: dweb.blogspot.com
Address: 64.33.88.161

C:\>nslookup bill.blogspot.com ns4.bta.net.cn
Server: ns4.bta.net.cn
Address: 202.106.0.20

Name: bill.blogspot.com
Address: 64.41.146.221

C:\>nslookup blogspot.com ns4.bta.net.cn
Server: ns4.bta.net.cn
Address: 202.106.0.20

Name: blogspot.com
Address: 64.41.146.221

Here you can see that dweb.blogspot.com is the one Communist party pick up to be the most subversive among all bloggers to be on the DNS hijacking list.
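The comparison Dweb makes by eye can be sketched in a few lines of Python. This is a minimal sketch that only parses the transcripts quoted above (it does not query any DNS server), and it assumes, as the transcripts suggest, that an untampered subdomain resolves to the same address as blogspot.com itself. The function name `parse_nslookup` is my own. A mismatch is a hint, not proof, since a legitimate subdomain could be hosted elsewhere.

```python
import re

# The three transcripts quoted above, trimmed to the lines that matter.
TRANSCRIPTS = [
    """Server: ns4.bta.net.cn
Address: 202.106.0.20
Non-authoritative answer:
Name: dweb.blogspot.com
Address: 64.33.88.161""",
    """Server: ns4.bta.net.cn
Address: 202.106.0.20
Name: bill.blogspot.com
Address: 64.41.146.221""",
    """Server: ns4.bta.net.cn
Address: 202.106.0.20
Name: blogspot.com
Address: 64.41.146.221""",
]


def parse_nslookup(output):
    """Return (queried name, answer address) from one nslookup transcript."""
    names = re.findall(r"^\s*Name:\s*(\S+)", output, flags=re.MULTILINE)
    addresses = re.findall(r"^\s*Address:\s*(\S+)", output, flags=re.MULTILINE)
    # The first Address line is the DNS server itself; the last is the answer.
    return names[-1], addresses[-1]


answers = dict(parse_nslookup(t) for t in TRANSCRIPTS)
baseline = answers["blogspot.com"]  # assumed to be the untampered answer
suspicious = [name for name, addr in answers.items() if addr != baseline]
print(suspicious)
```

Run against the transcripts above, only dweb.blogspot.com comes back with an address that differs from the parent domain.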

In addition to blocking access to various sites on the web China even blocks the use of certain search terms on Google.

"If you enter one of these keywords, such as the Chinese president's name, you loose all IP connectivity for five minutes," Edelman told New Scientist. "I suspect they may have had this system in the wings all along."

Update: John Jay Ray reports that the Chinese block on Blogspot has been lifted.

By Randall Parker    2003 January 14 08:58 AM   Entry Permalink | Comments ( 0 )
2002 December 31 Tuesday
Google Glossary and Google Sets on Google Labs

Google has a new experimental feature on Google Labs called Google Glossary. It finds and extracts definitions of words and phrases from web sites. I had success with therapeutic cloning, Java Beans, temperature inversion, and multiple inheritance but not with reproductive cloning. Where dictionaries rarely list two or three word combinations that have some specific technical or cultural meaning it appears that Google Glossary can come back with decent definitions at least some of the time.

Google also has something called Google Sets that is very cool. Give it names of things from a set and it tries to predict other words that also belong in the same set. For instance, Buffy, Willow and Tara successfully yield a list of other characters from Buffy The Vampire Slayer. Kirk and McCoy successfully return a list of Star Trek TOS crew members. I mean, how neat is that? Sex, drugs, rock successfully predicts "roll". You get the idea. Some of the results are, well, curious. Chopin, Bach, and Mozart return a list of mostly classical composers. But Elvis is on the list and so is the word "Introduction". If you come up with any interesting word combinations please come back and post them in the comments to this post.

By Randall Parker    2002 December 31 10:24 PM   Entry Permalink | Comments ( 2 )
2002 December 16 Monday
Google Zeitgeist 2002 Released

Google has released their 2002 Zeitgeist of the web. Some of the results undermine some national pretensions. In France Britney Spears is number 8 in the overall most popular query list. This from a country that prides itself on its disdain for American popular culture? France also ranked Vanessa Demouy (who I'd never heard of before) ahead of Alyssa Milano. France also has that quintessential pop culture figure Pamela Anderson on their top celebrity list. Though so did the world as a whole.

The most curious thing about all the lists is that the top 20 gaining query list has no contemporary political topic. Sports, celebrities, cartoon characters, and video games are the most popular entries. Also, World Cup beat out Iraq as top news story. "Las Ketchup" (whatever that is - I want to remain in blissful ignorance) is considered to be a news story but it sounds like a pop culture fad. Also why is Canada in second place on the popular destinations list?

By Randall Parker    2002 December 16 11:28 AM   Entry Permalink | Comments ( 0 )
Email Viruses Growing In Incidence

At least this isn't as big a problem as spam. The most successful virus is Klez.H, followed by Yaha.E.

During 2002, one in every 212 emails passing through the company's filtering system was a virus. This is nearly double the rate of one in every 380 recorded for 2001. And in 2000 the ratio was one in every 790 email messages.
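To see how "nearly double" holds up, here is the arithmetic on the three ratios quoted above. The dictionary and function names are my own; the numbers come straight from the excerpt.

```python
# One virus per N emails, as quoted above.
one_in = {2000: 790, 2001: 380, 2002: 212}


def growth(prev_year, cur_year):
    """How many times more frequent viruses were in cur_year than in prev_year."""
    return one_in[prev_year] / one_in[cur_year]


for year, n in sorted(one_in.items()):
    print(f"{year}: {100 / n:.2f}% of emails carried a virus")
print(f"2000 -> 2001: x{growth(2000, 2001):.2f}")
print(f"2001 -> 2002: x{growth(2001, 2002):.2f}")
```

The rate grew by a factor of about 2.08 from 2000 to 2001 and about 1.79 from 2001 to 2002, so the incidence is still climbing but the year-over-year growth slowed a little.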

By Randall Parker    2002 December 16 09:46 AM   Entry Permalink | Comments ( 0 )
2002 December 13 Friday
Google Labs With Neat Scrolling Preview Mode

I saw this Google Labs URL in my referral logs. If it is still working then try this.

If you don't want it to scroll as quickly then try this.

By Randall Parker    2002 December 13 10:40 AM   Entry Permalink | Comments ( 0 )
2002 December 12 Thursday
Google's Frugal Froogle Shopping Searcher

Those Google people are at it again. Want to search for deals on stuff you want to buy? Try Froogle.

About Froogle.

Found the link posted by Razib on Gene Expression blog.

By Randall Parker    2002 December 12 10:05 PM   Entry Permalink | Comments ( 1 )
2002 December 01 Sunday
Google Zeitgeist, The Borg Mind, AI Blog Assistant

The Google Zeitgeist page shows what queries are moving up and down in popularity generally and in assorted categories. Periodically tune in to this page to watch one aspect of the changing thinking of the world's collection of minds. This ability of humans throughout the world to search on and find many of the same articles about any given topic ought to contribute to developing more commonality of outlook in heretofore fairly isolated sub-groups. Though just as significant differences of opinion remain in closely connected societies due to divergent personal interests, experiences, and innate personality characteristics so there will remain differences between groups around the world.

The New York Times reports, not surprisingly, that sex is a recurringly popular topic for searches. But Google also detects important events just after they happen:

On Feb. 28, 2001, for example, an earthquake began near Seattle at 10:54 a.m. local time. Within two minutes, earthquake-related searches jumped to 250 a minute from almost none, with a concentration in the Pacific Northwest. On Sept. 11, searches for the World Trade Center, Pentagon and CNN shot up immediately after the attacks. Over the next few days, Nostradamus became the top search query, fueled by a rumor that Nostradamus had predicted the trade center's destruction.

This ability to detect unfolding events might be useful for detecting a bioterrorism attack. There are plans afoot to automate the collection of data about symptom reports for doctors' office visits and pharmacy drug sales in order to detect a bioweapons attack before any of the victims are properly diagnosed. Well, if there are patterns of Google searches that people make for health information when family members come down with various categories of symptoms, then the combination of originating IP addresses (since IP addresses usually can be assigned to geographic areas - though perhaps that isn't true for all ISPs) and disease information searches could be tracked as another way to detect the early stages of symptoms from a bioweapons attack.
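The tracking idea can be sketched as a per-region spike detector. This is a toy under stated assumptions: each query has already been mapped from its IP address to a region, the log is a list of (minute, region, query) tuples, and a spike is any minute whose count of symptom-related queries is both above a floor and several times the region's earlier baseline. The function name, the thresholds, and the sample data are all made up for illustration.

```python
from collections import Counter, defaultdict


def find_spikes(events, keywords, baseline_minutes, min_count=5, factor=5.0):
    """Flag (region, minute) pairs where keyword queries jump above baseline.

    events: iterable of (minute, region, query) tuples; region is assumed
    to have been derived from the originating IP address beforehand.
    """
    counts = Counter()
    for minute, region, query in events:
        if any(k in query.lower() for k in keywords):
            counts[(region, minute)] += 1

    per_region = defaultdict(dict)
    for (region, minute), n in counts.items():
        per_region[region][minute] = n

    spikes = []
    for region, by_minute in per_region.items():
        # Average matching queries per minute over the baseline window.
        baseline = sum(by_minute.get(m, 0) for m in range(baseline_minutes)) / baseline_minutes
        for minute, n in sorted(by_minute.items()):
            if minute >= baseline_minutes and n >= min_count and n > factor * max(baseline, 1.0):
                spikes.append((region, minute))
    return spikes


# Made-up query log for illustration only.
events = (
    [(3, "seattle", "fever symptoms")]
    + [(12, "seattle", "high fever and rash")] * 8
    + [(12, "boston", "fever remedy")] * 2
    + [(12, "seattle", "britney spears")] * 20
)
spikes = find_spikes(events, keywords=("fever", "rash"), baseline_minutes=10)
print(spikes)
```

In this sample only Seattle at minute 12 is flagged. The two Boston queries stay under the floor, and the unrelated celebrity searches are ignored no matter how many there are.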

I recently read the assertion (by John Derbyshire who also once again pointed to the important role played by Google) that cultural changes happen later in Canada than in the US. Well, Britney Spears is at the top of the Canadian search list at the moment and yet Spears has peaked in popularity on Google as a whole. It would be interesting to see the popularity of Spears and other major celebrities tracked by nation to see which nations jump on new celebrity icons the fastest and slowest. It would also be interesting to know whether local favoritism makes someone like Avril Lavigne a bigger search topic in Canada than in other developed English language countries and ditto for other artists that come from lower population countries who make it big.

Writing on Slate, Michael Kinsley sees Google starting to perform some of the functions historically done by editors:

Google concedes that its choices of stories and news sources are "occasionally unusual and contradictory" but insists with uncharacteristic pomposity, "it is exactly this variety that makes Google News a valuable source of information on the important issues of the day."

Which is humbug. People still do it better. But not by much. The day is clearly approaching when editors can be replaced by computers. This requires some urgent rethinking.

He's writing somewhat tongue-in-cheek here about his fears that editors and other mental workers will be replaced by computers in an increasing number of categories. But it's actually true. In some cases the computers will automate just part of a mental worker's job. Take blogging for example. I bet a neural net combined with some other types of algorithms could do a decent job of some of the article selection that a web logger performs. The history of what a popular web logger posts (e.g. Glenn Reynolds of Instapundit) could be used to help make search queries to identify articles to post about. Google News could be searched for patterns that match the posting history of a successful blogger (said posting history would be analyzed by software perhaps using Bayesian algorithms of some sort). Also, other blogs could be watched for breaking interest stories by use of Daypop.com and MIT Blogdex. Daypop and Blogdex are already serving the function of meta-weblogs.

Of course bloggers also provide commentary and select portions of articles to excerpt. Until full artificial intelligence is achieved the earlier versions of the Blog Assistant AI software I envision could provide a list of proposed articles to blog about and a real human blogger could select from this list. The Blog Assistant could even select a proposed excerpt to use for the blog post. The blogger could then accept or overrule the Blog Assistant's choice. The Blog Assistant could be a learning system that gradually refines its algorithms based on choices that the blogger makes while using the Blog Assistant.
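As a rough sketch of the learning loop described above, here is a tiny naive Bayes classifier over headline words, one of the simplest "Bayesian algorithms of some sort". The class name, the sample headlines, and the choice of naive Bayes are my own assumptions for illustration. A real assistant would need far more history and better features than bare words.

```python
import math
from collections import Counter


class HeadlineClassifier:
    """Naive Bayes over headline words: would this blogger post about it?"""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def learn(self, headline, posted):
        """Record one decision: the blogger posted (True) or skipped (False)."""
        self.doc_counts[posted] += 1
        self.word_counts[posted].update(headline.lower().split())

    def log_odds(self, headline):
        """Positive means 'likely to post', negative means 'likely to skip'."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        total_docs = sum(self.doc_counts.values())
        score = 0.0
        for label, sign in ((True, 1), (False, -1)):
            total_words = sum(self.word_counts[label].values())
            # Laplace-smoothed prior and word likelihoods.
            logp = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in headline.lower().split():
                logp += math.log(
                    (self.word_counts[label][word] + 1) / (total_words + len(vocab) + 1)
                )
            score += sign * logp
        return score


# Made-up posting history for illustration only.
history = [
    ("google adds new search feature", True),
    ("google labs tests keyboard shortcuts", True),
    ("search engine ranking changes", True),
    ("celebrity wedding photos released", False),
    ("football scores from sunday", False),
    ("celebrity diet tips", False),
]
assistant = HeadlineClassifier()
for headline, posted in history:
    assistant.learn(headline, posted)

candidates = ["celebrity football match", "google tests search ranking"]
proposals = sorted(candidates, key=assistant.log_odds, reverse=True)

# The blogger accepts the top proposal, and the assistant learns from it.
assistant.learn(proposals[0], posted=True)
```

Each accept or overrule is fed back through `learn`, so the ranking drifts toward whatever the blogger actually chooses to post.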

Of course, a Blog Assistant would be a lot smarter if it could somehow know what readers are thinking about. A really popular blogger (not me) gets a lot of e-mail from readers. A Blog Assistant could read the e-mail and look for patterns of reader interest. That Blog Assistant could even look at articles when the readers send links to articles and then propose to the human Blogger that particular articles submitted by readers match the blogger's interests. Also, Google search engine patterns for people who come to the blog site could be tracked and the Blog Assistant could make suggestions for popular topics to write about. Similarly, the Blog Assistant could track which posted articles get the most views as links to just those posts and then again adjust its preferences for which new articles the blogger should post about.

By Randall Parker    2002 December 01 12:08 PM   Entry Permalink | Comments ( 0 )
2002 November 22 Friday
Romania Big Source Of Criminal Hacking

StrategyPage.com has an interesting article on Romanian web hackers:

With only .3 percent of the world's population, a quarter of the attacks on Honeynet sites in the first six months of this year came from Romania.

The Honeynet they refer to is Honeynet.org.

It's a shame the Romanians don't have anything more constructive to do with their time.

By Randall Parker    2002 November 22 08:36 PM   Entry Permalink | Comments ( 37 )
2002 November 21 Thursday
Google Labs And Google Keyboard Shortcuts

I found this interesting URL http://labs.google.com/cgi-bin/keys?q=101+mozilla&hl=en&lr=&ie=UTF-8&start=0&sa=N&fromkey=1 in my TechiePundit referral logs. Note the labs.google.com. Okay, so what is it? It appears to be a site where Google lets people try out new Google features. One thing they are trying out is support for browser keyboard shortcuts on Google results pages. So if you bring up that URL above you can hit N to go to the next page of results and then hit P to go back to the previous page. If you hit the '?' key it will pop down a list of all the keyboard shortcut keys. This works for me with Mozilla 1.2b. I see in the keyboard shortcuts discussion group (see below) that someone has found it works on Opera 7 beta as well. I like the way one can hit keys 1 thru 9 to choose any one of the first 9 results to go to.

Also, the Google folks have discussion groups (am I like the only one who doesn't know this? - yeah I know about their Usenet groups support) where users discuss their experiences trying out new Google features. For instance, there is a discussion group for Google keyboard shortcuts.

By Randall Parker    2002 November 21 09:08 PM   Entry Permalink | Comments ( 0 )
2002 November 14 Thursday
John Derbyshire: Google Makes Everyone Seem Learned

John Derbyshire explores how Google is changing punditry:

And not everything is yet online in any form. A month or so ago I quoted a line from a John Betjeman poem. Several readers wanted to know where they could read the whole poem. Not on the Web, is the answer — at any rate, Google couldn't find it. Those of us who have actually read and memorized a lot of stuff still have an edge, though probably not for much longer. I feel a bit like the guys who knew how to manipulate slide rules must have felt when pocket calculators came in. I have a head full of junk, crammed with odd and arcane facts, which I can sprinkle through my writing to add charm and seasoning to it. That head full of junk used to be my working capital. But now, anyone else can get the same effect, just by googling.

Another big step will come when mind-machine interfaces allow one to think a query in one's mind and then get the answer back so quickly that one can use it in a real-time conversation without any listeners knowing that the information didn't come from one's own mind.

By Randall Parker    2002 November 14 10:16 AM   Entry Permalink | Comments ( 0 )
2002 November 10 Sunday
Java Google Relationship Graphing Applet

You need at least Java 1.3 to see this. If you have it (or want to go grab it from the Sun site) then go try out the TouchGraph GoogleBrowser.

Is it useful? Maybe. But is it fun? Definitely.

Thanks to Adam Flinton for the heads-up on it.

By Randall Parker    2002 November 10 02:07 PM   Entry Permalink | Comments ( 0 )