May19
Suppose you need to evaluate a person. Not just for a job or something like that, but you have to see whether he would make a good friend. One that is reliable and subscribes to the same high values you do: honesty, integrity, friendliness, caring, etc. Now the only problem is that you can neither see nor hear the person. The only clue you have is which people he visits and which people visit his house. Take a moment and think about this problem.
I think most of you would agree that the only way we could guess whether the person would be a good friend or not is to look at what the people he visits and receives are like, and how often those visits happen. If a drug dealer comes by every week, that would be better than having one visit him every day. On the other hand, it would still be worse than seeing one of your best friends visit him twice a year. But if your best friend visits him almost every day, that would be far better. Or if all of your friends pay him a regular visit, you would think there’s a reasonably good chance the guy/girl would make a nice friend. And if no one visited this guy, and the guy visited no one himself, you would be able to tell very little about him.
I think this must be somewhat like the task of a search engine. Part of that task is determining whether a page is a reliable candidate to deliver good info to the user. It needs to do this without being able to actually read the page the way a person would. It only knows what words are on it. I can fill an entire page with the search phrase, and this page would not be your best friend. Most probably it would not have many links pointing to it. It wouldn’t have a lot of friends. Or if it does have a lot of links pointing to it, they are most likely spam links, the drug dealers of the not-so-best friend.
If you are that guy who needs to be evaluated, you best avoid visiting drug dealers or letting them into your house. If you have a website, you best avoid linking to pages with dubious intentions, and avoid becoming a target of their links.
The problem is, a search engine has to see at least some people initially to determine whether they are good people or bad people, and go on from there, inferring things about the other people who visit them or get a visit from them. What pages could a search engine take as a starting point? What pages are most likely to be informative and not to contain spam? If I were to choose, those would be government sites and education sites. Some people think links from these domains get you more points (in Google) than links from other sites. From my perspective, this makes sense.
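To make the idea a bit more concrete, here is a toy sketch in Python of trust flowing outward from a handful of trusted starting pages. This is only my illustration of the reasoning above, not how Google or any real search engine works. The function name `propagate_trust`, the `damping` value, the number of rounds and all the `.example` page names are assumptions I made up for the example.

```python
def propagate_trust(links, seeds, damping=0.85, rounds=20):
    """Spread trust from hand-picked seed pages along outgoing links.

    links: dict mapping a page to the list of pages it links to.
    seeds: set of pages we decide to trust up front (an assumption).
    """
    # Collect every page that appears as a source or a target of a link.
    pages = set(links) | {t for targets in links.values() for t in targets}

    # Only seed pages start with trust, split evenly among them.
    seed_share = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_share)

    for _ in range(rounds):
        # Each round, seeds get a small amount of fresh trust back.
        new_trust = {p: (1 - damping) * seed_share[p] for p in pages}
        for page, targets in links.items():
            if not targets:
                continue
            # A page passes its trust on, divided over the pages it links to.
            share = damping * trust[page] / len(targets)
            for target in targets:
                new_trust[target] += share
        trust = new_trust
    return trust


# Hypothetical mini web: one government seed site and a small spam ring.
links = {
    "gov-site.example": ["good-page.example"],
    "good-page.example": ["friend-page.example"],
    "spam-a.example": ["spam-b.example", "stuffed-page.example"],
    "spam-b.example": ["stuffed-page.example"],
}
seeds = {"gov-site.example"}

trust = propagate_trust(links, seeds)
for page in sorted(trust, key=trust.get, reverse=True):
    print(f"{page}: {trust[page]:.4f}")
```

In this sketch the keyword-stuffed page has two links pointing to it, but both come from pages nobody trusted visits, so it ends up with no trust at all. The page linked from the government site gets some, and passes a smaller amount on to its own friend.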
Feb19
One of the strangest phenomena in the Google world is the sandbox. Say you create a new site and are able to get it indexed in Google. A link to it from the right place is sometimes all you need for a page. So you see in your stats that Google has visited your site, and a few days later you check for your site in the Google index. You carefully type in http://www.yoursite.com/, hit enter and yup, there’s your puppy! Awesome, in just about the blink of an eye you made it into Google’s index. Naturally, you are also eager to know what place you hold in the search engine results. Take as an example that you optimised the site for a three keyword phrase “word1 word2 word3”. So you type those three words in and let the results display. It appears you are not on the first page. That’s OK, since there is some strong competition going on for this keyword combo. You hit the next results page. Again, your site isn’t there. In fact, it appears not to be in the first 100 results. Guess things will be a little tougher than you thought. But you keep on searching and finally hit the last page. Your site isn’t there either. In fact, it didn’t come up in the search results at all. Why is that?
The answer is: you’ve been sandboxed by Google. For some undetermined period, at least a couple of months, your site won’t appear in the search engine results for the more competitive keywords. Mind you, it is there. It is indexed and Google knows about your site. But its algorithm has decided not to show you for some terms, while for others (the least interesting ones) you do show up.
Am I making this up? Not at all. Matt Cutts acknowledged at a PubCon conference that something like the sandbox effect exists. Brett Tabke, webmaster of WebmasterWorld, put the question directly to Matt there. His answer was (and I quote from a thread at WebmasterWorld)
that there wasn’t a sandbox, but the algorithm might affect some sites, under some circumstances, in a way that a webmaster would perceive as being sandboxed.
I know from experience that it doesn’t have to be a whole site that is affected; it can be just one page. I have done some testing where I put up a bunch of articles: a couple of genuine reviews I wrote, together with some scraper reviews. I got a decent link pointing at the site, making sure Google would visit it often and all pages would be indexed. All of the articles were indexed, but one genuine review got sandboxed. This review was in a really competitive niche and about buying something. I used the keyword “buy” in another review too, but as that was more or less the only real review of that product to be found on the internet, it takes first place in the results. Clearly it was not the specific keyword that triggered this.
Nobody can really tell whether your site will be sandboxed or not. There is speculation, however, such as that you will avoid the sandbox when a webpage with high PageRank links to your site. I can’t tell whether this is actually true.
And why does it exist? There is debate about that too. Clearly, the Google folks are happy with this effect. Most webmasters aren’t, and I am one of them. It is a lot less fun working on a website when you know no one will look at it for the next year or so.
Feb09
I’ve found a site that provided me with a download of IE6: evolt.org has archives of lots of IE downloads. I downloaded the ugly beast (76MB!) and found I could do nothing with it. First I tried unchecking IE in installed programs -> Windows components. But that only removed the visible shortcuts and didn’t actually uninstall IE. Then, after the download, I tried to install IE6. It said I had a newer version and refused to install. I got the same message when IE was unchecked in my installed Windows components.
Right before I was going to search on how to get multiple instances of IE running, I googled for “uninstall IE7”. That’s where I found the simple solution to check “updated” in the installed programs list. When you do that, it shows IE7beta and you can uninstall it. Sweet, though these are the things that make you feel stupid for not having found it yourself. ;-)
Some people might wonder why I didn’t just try to install IE6 and IE7 side by side, so that I could check sites out in both of them. The reason is that I care a great deal about being able to sleep and not worry about yet another browser that fucks up CSS. I’ve heard too many things about how IE7beta still messes up, so I won’t test sites in it before a more final version appears. Vista is reported to be released somewhere between September and December, so I still have some months of baby-sleep before a release candidate hits the road.
Feb08
For those of you who are going to install IE7 for some reason, be warned that it will overwrite your current IE6 installation.
I knew it and still went ahead with it. Now I need IE6 and it’s gone. And I need it because I’m developing another site. I need to check this site out. I need to see how IE6 screws up and what I need to do to adjust it.
Luckily, I remember having read somewhere that there is a standalone version of IE6 available. Unluckily, I wasn’t able to find it on my first try, but I’ll give it another try tomorrow.
PS: For those of you interested, there’s an interesting thread at WebmasterWorld about what’s fixed and what’s not fixed with regard to CSS in IE7. Take a look at it here. The first thing that struck me about IE7 was that fonts are rendered a lot better than in Firefox. No, actually that was the second thing that struck me. The first was how ugly I found its design.
Feb06
I’ve recently discovered Portable Thunderbird. It is an installation of the well-known Mozilla Thunderbird but completely portable, so that you can install it on any media that is writeable. If, for example, you have a USB key, you can take it to any other computer with a USB port and check your mail there.
The website of Portable Thunderbird is a bit behind in version but I can assure you that I run 1.5 here.
It would be nice if more portable projects were released. I believe OpenOffice.org has a portable version and recently I saw a thread about installing Ubuntu on a USB stick. That’s pretty cool.