Issues using Iframes

Does Google penalize sites using iframes?

Google does not penalize sites for using iframes. Plenty of legitimate sites use iframes, so it wouldn’t make sense to penalize for it. Some search engine spiders may have trouble with iframes, just as some spiders have trouble with frames, but iframes aren’t expected to cause any penalties.

Strategies for improving link exposure to search engines

Try to use absolute links instead of relative links, because there’s less chance for a spider (not just Google’s, but any spider) to get confused. In the same fashion, be consistent with your internal linking. Once you’ve picked a root page and decided on www vs. non-www, make sure that all your links follow the same convention and point to the root page you picked. I would also use a 301 redirect or rewrite rule so that your root page doesn’t appear twice. For example, if you select http://www.yourdomain.com/ as your root page and a spider tries to fetch http://yourdomain.com/ (without the www), your web server should do a permanent (301) redirect to your root page at http://www.yourdomain.com/ (a minimal redirect sketch follows below).

So the high-order bits to bear in mind are:

1. Make it as easy as possible for search engines and spiders; save them work by giving absolute instead of relative links.
2. Be consistent. Make a decision on www vs. non-www and follow the same convention consistently for all the links on your site.
3. Use permanent redirects to keep spiders fetching the correct page.

Those rules of thumb will serve you well with every search engine, not just Google. Of course, the vast majority of the time a search engine will handle a situation correctly, but anything you can do to reduce the chance of a problem is a good idea. If you don’t see any problems with your existing site, you don’t need to go back and rewrite your links, but it’s worth bearing in mind when you build new sites, for example.
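As a minimal sketch of the canonical-host redirect described above, here is an illustration using Python’s standard library. The hostnames are the placeholder ones from the example, and in practice this is normally handled in the web server configuration (Apache, nginx, etc.) rather than in application code.

```python
# Minimal sketch of a canonical-host 301 redirect, assuming the site has
# standardised on http://www.yourdomain.com/ as its root page.
from http.server import BaseHTTPRequestHandler, HTTPServer

CANONICAL_HOST = "www.yourdomain.com"  # the root host chosen for the site

class CanonicalRedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        host = self.headers.get("Host", "")
        if host != CANONICAL_HOST:
            # Permanent redirect so spiders learn the preferred URL.
            self.send_response(301)
            self.send_header("Location", f"http://{CANONICAL_HOST}{self.path}")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Canonical root page.")

if __name__ == "__main__":
    HTTPServer(("", 8080), CanonicalRedirectHandler).serve_forever()
```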

Yahoo fixes the 1bu “bug”: only 1bu’s front page shows now, everything else is gone

The 1bu.com proxy was creating a lot of mirror pages, which was a big menace for Google and other search engines. It seems Yahoo has banned 1bu.com from its index; see what this poster says on webmasterworld.com:

(they had tons of indexed pages last time I checked) This is amazing; Yahoo once again outdoes Google. First they fixed the 302 bug and now 1bu.com. I find it hard to believe that G doesn’t know about this; either way they should know, since the web has been buzzing about it.

It’s their job to find things like this out. Most likely they filed this in the same cabinet as the years-old 302 page-hijacking problem. Kudos to Y! and another huge thumbs down to the new Microsoft, at least when it comes to security responses.

Does the rel=nofollow attribute make linking to bad neighbourhoods easier?

It seems Google has introduced a new attribute for the href element: rel=”nofollow”. This attribute tells bots like Googlebot, Yahoo Slurp, and MSNBot not to follow that link and not to pass any credit to the page the link points to.

Though this attribute is very useful for search engines in combating comment spam in blogs, guestbooks, message boards, referral logs, etc., it has several negative points:

1. Spammers will now be more comfortable linking to bad neighbourhoods. For example, an aggressive affiliate site is always looking for ways to prevent bots from following its links and detecting the affiliate links. They used to hide them in PHP scripts, robots.txt files, JavaScript, CSS, Perl scripts, and so on. Now they don’t need to; they can just put rel=nofollow in their href and the bots will ignore those links (see the sketch below). It looks easy, doesn’t it? Affiliates will be happier now that they don’t need to find ways to hide links from visiting crawlers.
2. Hoarding PageRank, which has been a long-standing debate. Two years ago people avoided outbound links as much as possible to prevent so-called PR leakage. Even now people are so concerned about PageRank that they don’t want to leak it through outbound links, and this nofollow attribute is a big gift for them. They can just add the attribute and search engines will happily skip the link, which benefits the site.

In conclusion, nofollow is not a great solution for combating comment spam.
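To make the mechanics concrete, here is a small sketch of how a crawler might skip rel="nofollow" links when extracting URLs from a page. It uses Python’s standard html.parser and is only an illustration of the convention, not how any particular search engine’s bot actually works.

```python
# Sketch: extract outbound links from HTML, ignoring rel="nofollow" ones,
# the way a well-behaved crawler is expected to.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.followed, self.ignored = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.ignored.append(href)   # not followed, passes no credit
        else:
            self.followed.append(href)

html = '''<a href="http://example.com/">normal link</a>
<a rel="nofollow" href="http://affiliate.example/offer">affiliate link</a>'''

parser = LinkExtractor()
parser.feed(html)
print("followed:", parser.followed)
print("ignored:", parser.ignored)
```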

Spammers ordered to pay $1 billion – best lesson for all spammers

This is the best lesson for all spammers: a recent news article says spammers have been ordered to pay more than $1 billion. An extract: Robert Kramer, whose company provides e-mail service for about 5,000 subscribers in eastern Iowa, filed suit against 300 spammers after his inbound mail servers received up to 10 million spam e-mails a day in 2000, according to court documents. U.S. District Judge Charles R. Wolle filed default judgments Friday against three of the defendants under the federal Racketeer Influenced and Corrupt Organizations Act and the Iowa Ongoing Criminal Conduct Act.
AMP Dollar Savings Inc. of Mesa, Arizona, was ordered to pay $720 million and Cash Link Systems Inc. of Miami, Florida, was ordered to pay $360 million. The third company, Florida-based TEI Marketing Group, was ordered to pay $140,000.

Detecting duplicate and near-duplicate files: Google patent page

Check this page; it is the main page for Google’s duplicate-content patent, “Detecting duplicate and near-duplicate files”. From that page: This web page describes research I did for Google from 2000 through 2003, although mostly in 2000.

This work resulted in US Patent 6658423, by William Pugh and Monika Henzinger, assigned to Google. The information here does not reflect any information about Google business practices or technology, other than that described in the patent. I have no knowledge as to whether or how Google is currently applying the techniques described in the patent. This information is not approved or sanctioned by Google, other than by giving me permission to discuss the research I did for them that is described in the patent.

The patent describes techniques to find near-duplicate documents in a collection. Google is obviously considering applying these techniques to web pages, but they could be applied to other documents as well. It might even be possible to apply them to sequences that are not documents (such as DNA sequences), although that raises some questions that aren’t covered here.

I’ll get more information up shortly, but for now, more information on Google’s patent for detecting duplicate files on the web is available at www.cs.umd.edu/~pugh/google/
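The patent describes its own fingerprinting scheme, which is not reproduced here. As a general illustration of near-duplicate detection, here is a simple word-shingling sketch with Jaccard similarity, one common family of techniques and not necessarily the one in the patent; the shingle size and the example texts are arbitrary assumptions.

```python
# Sketch: near-duplicate detection via word shingles and Jaccard similarity.
# This is a generic illustration of the problem space, not the specific
# fingerprinting scheme described in US Patent 6658423.

def shingles(text, k=4):
    """Return the set of k-word shingles of a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 1.0

doc1 = "the quick brown fox jumps over the lazy dog near the river bank"
doc2 = "the quick brown fox jumps over the lazy dog by the river bank"

similarity = jaccard(shingles(doc1), shingles(doc2))
print(f"similarity: {similarity:.2f}")  # values near 1.0 suggest near-duplicates
```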

Is Continuous PageRank Updating a Farce? A nice posting on webmasterworld.com

A poster on webmasterworld.com asks whether continuous PageRank updating is real or just an act put on by Google. Here is his posting (a small sketch of the iterative calculation he describes appears after the quote):

I would appreciate it if someone could shed some light on this “continuous updating” that Google is supposed to be doing. First, let me explain a few concepts the way I understand them:

1) Google has 8 billion pages in their index.

2) To calculate PageRank, Google must do several “iterations” through the data. On the first iteration (of 8 billion pages) Google has to see all the outbound links from ALL of the pages. On the second iteration some pages gain rank because they have incoming links. But this is not enough; several more iterations must be completed in order to get a good reading and establish a rank.

3) The computing power required to do numerous iterations across 8 billion pages must be enormous.

4) Google uses “supplemental results” in addition to the main index, alluding to the idea that PageRank may only be established for the first 4 billion or so pages, and the rest is just used to “fill in”.

5) Before Google moved to only doing (allegedly visible) updates once per quarter, there were numerous problems with Google keeping to their monthly schedule. People would become alarmed.

6) Even before the quarterly updates, Google was using “Freshbot” to help bring in new data between monthly updates. Please check me on this: Freshbot results did not have PageRank.

7) We have been told that even though there is no update to what we see in the little green bar, there is actually a “continuous PageRank update”.
I find continuous updating of PageRank implausible. In order to get a true calculation, it requires calculations across the entire dataset of 8 billion pages multiple times. We have already seen signs that there were issues in the past (missed updates), attempts to remedy the problem (Freshbot), and use of additional measures to hide what is really going on (quarterly updates). Most likely, we are now in an age of “PageRank Lite”.

And here is the kicker: we have this mysterious “sandbox effect” (aka the “March filter”) that seems to place a penalty on new links and/or new sites. Could it be a result of Google’s failure to calculate PageRank across the entire index? IMHO, yes!

Quietly, Google has been building more datacenters. Recently, they opened a new one in Atlanta, GA, but there was no public announcement. There is not even a sign on the building. If you go up to the door, a voice on the intercom also won’t tell you that you are at a Google facility (source: Atlanta Journal-Constitution). With the number of datacenters Google already has, the main reason for adding more is probably not uptime and dependability. Though those things are important, they certainly have plenty of datacenters, and you rarely see problems with downtime or server speed. The reason for adding these datacenters (quietly) must be that they need more computing power to calculate PageRank.

I believe I have provided many examples to support the idea that continuous updating of PageRank is indeed a farce. I also feel that this “sandbox effect” is a result of the inability to do complete calculations across the entire dataset (especially new additions to the data).

I look forward to hearing what others think.

more here www.webmasterworld.com/forum3/27065.htm
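To put the poster’s point about iterations in concrete terms, here is a toy PageRank power-iteration sketch in Python. The four-page link graph, the 0.85 damping factor, and the fixed iteration count are illustrative assumptions; the sketch is only meant to show why every pass has to touch the whole link graph, not to describe how Google actually computes PageRank at scale.

```python
# Toy PageRank power iteration, illustrating why multiple passes over the
# whole link graph are needed before the ranks settle. The graph and the
# 0.85 damping factor are illustrative assumptions only.

def pagerank(links, damping=0.85, iterations=20):
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start from a uniform rank
    for _ in range(iterations):                 # each pass touches every page
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:                    # dangling page: spread evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
for page, score in sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
```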

Google introduces Google Deskbar Plug-in Development Kit

This is what Google has to say about this new tool:
What’s a Deskbar plug-in? The Google Deskbar plug-in is a simple extension mechanism for customizing your Google Deskbar. When you enter a search term and choose your plug-in from the menu, Deskbar passes your search term to your plug-in code, which can then return a specific URL to be displayed in a browser or mini-viewer window, or return text to be displayed directly in the Deskbar’s text box.

Users need to install the latest version of the Google Deskbar and version 1.0 or higher of the Microsoft® .NET Framework to use plug-ins.

The easiest way for users to get the .NET Framework is to visit the Windows Update website. It’s a good idea to remind your users to install the latest versions of the Deskbar and the .NET Framework before they install your plug-in.
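The actual plug-in API is a .NET interface documented at the link below, and its types are not reproduced here. Purely as a conceptual sketch of the contract Google describes (a search term goes in, and either a URL to open or text for the Deskbar’s text box comes out), here is an illustration in Python; the handler name and the weather example are hypothetical.

```python
# Conceptual sketch only: the real Deskbar plug-in API is a .NET interface
# (see the link below). This just illustrates the contract described above:
# the plug-in receives the search term and returns either a URL to open or
# text to show in the Deskbar's text box. The handler name is hypothetical.
from urllib.parse import quote_plus

def handle_query(search_term: str):
    """Return ('url', <url>) to open a page, or ('text', <text>) for inline text."""
    term = search_term.strip()
    if term.startswith("weather "):
        city = term[len("weather "):]
        # Pretend lookup: return text to be shown directly in the Deskbar.
        return ("text", f"(no live data in this sketch) weather for {city}")
    # Default: hand the term to a web search and open the result page.
    return ("url", f"http://www.google.com/search?q={quote_plus(term)}")

if __name__ == "__main__":
    print(handle_query("weather Atlanta"))
    print(handle_query("deskbar plug-in development kit"))
```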
more here, http://deskbar.google.com/help/api/index.html
