New tools to help publishers maximize their revenue

What do a celebrity blog, a video interview on a newspaper site and a cable channel’s smartphone app have in common? They’re all supported by advertising…and they’re all examples of how the lines between media formats are blurring.

These increasingly blurry lines are not only resulting in highly engaging forms of content for users, but also creating many new revenue opportunities for publishers. A wave of innovation and investment over the past several years has also created better performing ads, a larger pool of online advertisers, and new technologies to sell and manage ad space. Together, these trends are helping to spur increased investment in online advertising. We’ve seen this in our own Google Display Network: our publisher partners have seen spending from our largest 1,000 advertisers more than double in the last 12 months.

With all these new opportunities in mind, we’re introducing new tools for our publisher partners—in our ad serving technology (DoubleClick for Publishers) and in our ad exchange (DoubleClick Ad Exchange).

Video and mobile in DoubleClick for Publishers
Given the changes in the media landscape, it’s not surprising that we’ve seen incredible growth for both mobile and video ad formats over the past year: the number of video ads on the Google Display Network has increased 350 percent in the past 12 months, while AdMob, our mobile network, has grown by more than 200 percent.

Before now, it’s been difficult for publishers to manage all their video and mobile ad space from a single ad server—the platform publishers use to schedule, measure and run the ads they’ve sold on their sites. To solve this challenge, we’re rolling out new tools in our latest version of DoubleClick for Publishers that enable publishers to better manage video and mobile inventory. Publishers will be able to manage all of the ads they’re running—across all of their webpages, videos and mobile devices—from a single dashboard, and see which formats and channels are performing best for them.

A handful of publishers have already begun using the video feature and it appears to be performing well for them: we’ve seen 55 percent month-over-month growth in video ad volume in the last quarter. In other words, publishers are now able not only to produce more video content, but to make more money from it as well.

Direct Deals on the DoubleClick Ad Exchange
Another way publishers make money is to sell their advertising via online exchanges, like the DoubleClick Ad Exchange, where they can offer their ad space to a wide pool of competing ad buyers. This has already proven to generate substantially more revenue for publishers, and as a result we’ve seen significant growth in the number of trades on our exchange (158 percent year over year).

However, publishers have told us that they’d also like the option of making some of their ad space available only to certain buyers at a certain price—similar to how an art dealer might want to offer a painting first to certain clients before giving it to an auction house to sell. So we’re introducing Direct Deals on the DoubleClick Ad Exchange, which gives publishers the ability to make these “first look” offers. For example, using Direct Deals, a news publisher could set aside all of the ad space on their sports page and offer it first to a select group of buyers at a specific price, and then if those buyers pass on the offer, automatically place that inventory into the Ad Exchange’s auction.

Looking back at that blog, news site and app, we’d like them to have one more thing in common—being able to take advantage of new opportunities to grow their businesses even further. These new tools, together with the other solutions we’re continuing to develop, are designed to help businesses like them—and all our publisher partners—do just that, and get the most out of today’s advertising landscape.

Reference link: http://googleblog.blogspot.com/2011/09/new-tools-to-help-publishers-maximize.html

Time, technology and leaping seconds

Have you ever had a watch that ran slow or fast, and that you’d correct every morning off your bedside clock? Computers have that same problem. Many computers, including some desktop and laptop computers, use a service called the “Network Time Protocol” (NTP), which does something very similar—it periodically checks the computer’s time against a more accurate server, which may be connected to an external source of time, such as an atomic clock. When setting the time on your computer to the second or better, NTP also takes into account variable factors like how long the NTP server takes to reply and the speed of the network between you and the server.
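
To make that concrete, here is a minimal sketch of an SNTP query (the simplest form of NTP) in Python: it sends a 48-byte client packet to a time server and reads back the server’s transmit timestamp. The server name, timeout and the halved round-trip correction are illustrative assumptions, not a production-grade client.

    import socket
    import struct
    import time

    NTP_SERVER = "pool.ntp.org"    # placeholder public server pool
    NTP_TO_UNIX = 2208988800       # seconds between the NTP epoch (1900) and the Unix epoch (1970)

    def sntp_time(server=NTP_SERVER):
        # First byte 0x1B = leap indicator 0, version 3, mode 3 (client); rest zeros.
        packet = b"\x1b" + 47 * b"\x00"
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(5)
            t_sent = time.time()
            sock.sendto(packet, (server, 123))
            data, _ = sock.recvfrom(512)
            t_recv = time.time()
        # The transmit timestamp's integer seconds live at bytes 40-43.
        secs = struct.unpack("!I", data[40:44])[0] - NTP_TO_UNIX
        # Crude compensation for network delay, as described above:
        # assume the reply spent half the round trip in flight.
        return secs + (t_recv - t_sent) / 2.0

    print(time.ctime(sntp_time()))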

Soon after the advent of ticking clocks, scientists observed that the time told by them (and now, much more accurate clocks), and the time told by the Earth’s position were rarely exactly the same. It turns out that being on a revolving imperfect sphere floating in space, being reshaped by earthquakes and volcanic eruptions, and being dragged around by gravitational forces makes your rotation somewhat irregular. Who knew?

These fluctuations in Earth’s rotational speed mean that even very accurate clocks, like the atomic clocks used by global timekeeping services, occasionally have to be adjusted slightly to bring them in line with “solar time.” There have been 24 such adjustments, called “leap seconds,” since they were introduced in 1972. Their effect on technology has become more and more profound as people come to rely on fast, accurate and reliable technology.

Why time matters at Google

Having accurate time is critical to everything we do at Google. Keeping replicas of data up to date, correctly reporting the order of searches and clicks, and determining which data-affecting operation came last are all examples of why accurate time is crucial to our products and to our ability to keep your data safe.

Very large-scale distributed systems, like ours, demand that time be well-synchronized and expect that time always moves forwards. Computers traditionally accommodate leap seconds by setting their clock backwards by one second at the very end of the day. But this “repeated” second can be a problem. For example, what happens to write operations that happen during that second? Does email that comes in during that second get stored correctly? What about all the unforeseen problems that may come up with the massive number of systems and servers that we run? Our systems are engineered for data integrity, and some will refuse to work if their time is sufficiently “wrong.” We saw some of our clustered systems stop accepting work on a small scale during the leap second in 2005, and while it didn’t affect the site or any of our data, we wanted to fix such issues once and for all.

This was the problem that a group of our engineers identified during 2008, with a leap second scheduled for December 31. Given our observations in 2005, we wanted to be ready this time, and in the future. How could we make sure everything at Google stays running as if nothing happened, when all our server clocks suddenly see the same second happening twice? Also, how could we make this solution scale? Would we need to audit every line of code that cares about the time? (That’s a lot of code!)

The solution we came up with came to be known as the “leap smear.” We modified our internal NTP servers to gradually add a couple of milliseconds to every update, varying over a time window before the moment when the leap second actually happens. This meant that when it became time to add an extra second at midnight, our clocks had already taken this into account, by skewing the time over the course of the day. All of our servers were then able to continue as normal with the new year, blissfully unaware that a leap second had just occurred. We plan to use this “leap smear” technique again in the future, when new leap seconds are announced by the IERS.

Here’s the science bit

Usually when a leap second is almost due, the NTP protocol says a server must indicate this to its clients by setting the “Leap Indicator” (LI) field in its response. This indicates that the last minute of that day will have 61 seconds, or 59 seconds. (Leap seconds can, in theory, be used to shorten a day too, although that hasn’t happened to date.) Rather than doing this, we applied a patch to the NTP server software on our internal Stratum 2 NTP servers to not set LI, and tell a small “lie” about the time, modulating this “lie” over a time window w before midnight:

lie(t) = (1.0 - cos(pi * t / w)) / 2.0

What this did was make sure that the “lie” we were telling our servers about the time wouldn’t trigger any undesirable behavior in the NTP clients, such as causing them to suspect the time servers were wrong and to apply local corrections themselves. It also made sure the updates were sufficiently small that any software running on the servers that was performing synchronization actions or holding Chubby locks wouldn’t lose those locks or abandon any operations. It also meant this software didn’t necessarily have to be aware of, or resilient to, the leap second.
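
To make the formula concrete, the short Python sketch below evaluates the skew at a few points across the window; the 20-hour window is an assumed value for illustration, not the production setting:

    import math

    def lie(t, w):
        # Fraction of the leap second to add at time t into the window [0, w].
        return (1.0 - math.cos(math.pi * t / w)) / 2.0

    w = 20 * 3600.0  # assumed 20-hour smear window, for illustration only
    for hours in (0, 5, 10, 15, 20):
        t = hours * 3600.0
        print("%2dh into window: clocks skewed by %.3f s" % (hours, lie(t, w)))
    # lie(0) = 0 and lie(w) = 1: by midnight the smeared clocks have already
    # absorbed the whole extra second, so nothing jumps when it occurs.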

In an experiment, we performed two smears—one negative, then one positive—and tested this setup using about 10,000 servers. We’d previously added monitoring to plot the skew between atomic time, our Stratum 2 servers and all those NTP clients, allowing us to constantly evaluate the performance of our time infrastructure. We were excited to see the monitoring plots show those servers’ clocks tracking our model’s predictions, and to see that we continued to serve users’ requests without errors.

Following the successful test, we reconfigured all our production Stratum 2 NTP servers with details of the actual leap second, ready for New Year’s Eve, when they would automatically activate the smear for all production machines, without any further human intervention required. We had a “big red button” opt-out that allowed us to stop the smear in case anything went wrong.

What we learned

The leap smear is talked about internally in the Site Reliability Engineering group as one of our coolest workarounds. It took a lot of experimentation and verification, but it paid off by ultimately saving us massive amounts of time and energy inspecting and refactoring code. It meant that we didn’t have to sweep our entire (large) codebase, and that Google engineers developing code don’t have to worry about leap seconds. The team involved was a handful of people, distributed around the world, who were able to work together without restriction to solve this problem.

The solution to this challenge drove a lot of thinking about better ways to implement locking and consistency, and to synchronize units of work between servers across the world. It also made us think more about the precision of our time systems, which has a knock-on effect on our ability to minimize resource wastage and run greener data centers by reducing the time we spend waiting for responses and doing excess work.

By anticipating potential problems and developing solutions like these, the Site Reliability Engineering group informs and inspires the development of new technology for distributed systems—the systems that you use every day in Google’s products.

Reference link: http://googleblog.blogspot.com/2011/09/time-technology-and-leaping-seconds.html

PDFs in Google search results

Our mission is to organize the world’s information and make it universally accessible and useful. During this ambitious quest, we sometimes encounter non-HTML files such as PDFs, spreadsheets, and presentations. Our algorithms don’t let different filetypes slow them down; we work hard to extract the relevant content and to index it appropriately for our search results. But how do we actually index these filetypes, and—since they often differ so much from standard HTML—what guidelines apply to these files? What if a webmaster doesn’t want us to index them?

Google first started indexing PDF files in 2001 and currently has hundreds of millions of PDF files indexed. We’ve collected the most often-asked questions about PDF indexing; here are the answers:

Q: Can Google index any type of PDF file?
A: Generally we can index textual content (written in any language) from PDF files that use various kinds of character encodings, provided they’re not password protected or encrypted. If the text is embedded as images, we may process the images with OCR algorithms to extract the text. The general rule of thumb is that if you can copy and paste the text from a PDF document into a standard text document, we should be able to index that text.

Q: What happens with the images in PDF files?
A: Currently the images are not indexed. In order for us to index your images, you should create HTML pages for them. To increase the likelihood of us returning your images in our search results, please read the tips in our Help Center.

Q: How are links treated in PDF documents?
A: Generally links in PDF files are treated similarly to links in HTML: they can pass PageRank and other indexing signals, and we may follow them after we have crawled the PDF file. It’s currently not possible to “nofollow” links within a PDF document.

Q: How can I prevent my PDF files from appearing in search results; or if they already do, how can I remove them?
A: The simplest way to prevent PDF documents from appearing in search results is to add an X-Robots-Tag: noindex directive in the HTTP headers used to serve the file. If they’re already indexed, they’ll drop out over time if you use the X-Robots-Tag with the noindex directive. For faster removals, you can use the URL removal tool in Google Webmaster Tools.
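
As an illustration, here is a minimal sketch of serving a PDF with that header, using Python’s standard library; the file name and port are placeholders:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PDFHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            with open("report.pdf", "rb") as f:  # hypothetical file
                body = f.read()
            self.send_response(200)
            self.send_header("Content-Type", "application/pdf")
            # The directive that keeps this document out of search results:
            self.send_header("X-Robots-Tag", "noindex")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8000), PDFHandler).serve_forever()

Equivalently, the same header can be added in your web server’s configuration so it applies to all PDF responses at once.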

Q: Can PDF files rank highly in the search results?
A: Sure! They’ll generally rank similarly to other webpages. For example, at the time of this post, [mortgage market review], [irs form 2011] or [paracetamol expert report] all return PDF documents that manage to rank highly in our search results, thanks to their content and the way they’re embedded and linked from other webpages.

Q: Is it considered duplicate content if I have a copy of my pages in both HTML and PDF?
A: Whenever possible, we recommend serving a single copy of your content. If this isn’t possible, make sure you indicate your preferred version by, for example, including the preferred URL in your Sitemap or by specifying the canonical version in the HTML or in the HTTP headers of the PDF resource. For more tips, read our Help Center article about canonicalization.
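
For example, a PDF response could declare its preferred HTML version with a canonical Link HTTP header along these lines (a sketch; the URL is a placeholder):

    Link: <http://www.example.com/white-paper.html>; rel="canonical"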

Q: How can I influence the title shown in search results for my PDF document?
A: We use two main elements to determine the title shown: the title metadata within the file, and the anchor text of links pointing to the PDF file. To give our algorithms a strong signal about the proper title to use, we recommend updating both.
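
As a sketch of the first element, the title metadata inside a PDF can be edited with any PDF tool. For instance, with the third-party pypdf library in Python (an assumed tool choice; the file names and title are placeholders):

    from pypdf import PdfReader, PdfWriter

    reader = PdfReader("report.pdf")            # hypothetical input file
    writer = PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    # Set the document title that can appear in search results.
    writer.add_metadata({"/Title": "Mortgage Market Review 2011"})
    with open("report-titled.pdf", "wb") as f:
        writer.write(f)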

If you want to learn more, watch Matt Cutts’s video about optimizing PDF files for search, and visit our Help Center for information about the content types we’re able to index. If you have feedback or suggestions, please let us know in the Webmaster Help Forum.

Reference link: http://googlewebmastercentral.blogspot.com/2011/09/pdfs-in-google-search-results.html

Ten years later

The events of September 11, 2001 changed the lives of so many people around the world. In the years since that day, thoughtful online efforts have provided an outlet for grief, for learning and a means for healing. Virtual spaces have helped us to remember the victims and honor the courage of those who risked their lives to save others.

On this 10th anniversary, we wanted to note a few of these virtual places:

9/11 Memorial

  • On Monday September 12, the 9/11 Memorial will open to the public within the original footprint of the twin towers. Our relationship with the 9/11 Memorial team dates back to 2009, when we collaborated to build their Make History site. This web archive lets people place and share their photos and videos in geographical context, collectively piecing together the history that was witnessed, one photo or video at a time.
  • The 9/11 Memorial has also produced a commemorative album called Ten Years On, a musical tribute featuring well-known musicians and performers. The album has inspired a video archive project on YouTube of the same name which encourages people to submit video tributes to those affected by the events of 9/11.

The New York Times

  • YouTube also worked with The New York Times on a YouTube Channel featuring archived news broadcasts and personal stories and reflections from the public.

Mountain Lakes (NJ) Volunteer Fire Department

  • John Reilly, a software executive and Deputy Chief of the Mountain Lakes (NJ) Volunteer Fire Department, built First-Responder to help community organizations like fire departments and EMS corps increase their emergency preparedness and respond more effectively to crises. This open source application uses freely available web tools to map critical resources and contingency plans, dispatch and track first responders, and interoperate with mutual aid organizations during emergencies.

It’s been an honor to see these tools being built using our platforms and products—and humbling to see them come to life.

Reference link: http://googleblog.blogspot.com/2011/09/ten-years-later.html

Google just got ZAGAT Rated!

“Did you know there’s a place in Menlo Park near the Safeway that has a 27 food rating?” One of my friends asked me that about two years ago, and I was struck because I immediately knew what it meant. Food rating… 30-point scale… Zagat. And the place… had to be good. With no other context, I instantly recognized and trusted Zagat’s review and recommendation.

So, today, I’m thrilled that Google has acquired Zagat. Moving forward, Zagat will be a cornerstone of our local offering—delighting people with their impressive array of reviews, ratings and insights, while enabling people everywhere to find extraordinary (and ordinary) experiences around the corner and around the world.

With Zagat, we gain a world-class team that has more experience in consumer-based surveys, recommendations and reviews than anyone else in the industry. Founded by Tim and Nina Zagat more than 32 years ago, Zagat has established a trusted and well-loved brand the world over, operating in 13 categories and more than 100 cities. The Zagats have demonstrated their ability to innovate and to do so with tremendous insight. Their surveys may be one of the earliest forms of UGC (user-generated content)—gathering restaurant recommendations from friends, computing and distributing ratings before the Internet as we know it today even existed. Their iconic pocket-sized guides with paragraphs summarizing and “snippeting” sentiment were “mobile” before “mobile” involved electronics. Today, Zagat provides people with a democratized, authentic and comprehensive view of where to eat, drink, stay, shop and play worldwide based on millions of reviews and ratings.

For all of these reasons, I’m incredibly excited to collaborate with Zagat to bring the power of Google search and Google Maps to their products and users, and to bring their innovation, trusted reputation and wealth of experience to our users.

Reference link: http://googleblog.blogspot.com/2011/09/google-just-got-zagat-rated.html

A fall spring-clean

Technology improves, people’s needs change, some bets pay off and others don’t. So, as Larry previewed on our last earnings call, today we’re having a fall spring-clean at Google.

Over the next few months we’ll be shutting down a number of products and merging others into existing products as features. The list is below. This will make things much simpler for our users, improving the overall Google experience. It will also mean we can devote more resources to high impact products—the ones that improve the lives of billions of people. All the Googlers working on these projects will be moved over to higher-impact products. As for our users, we’ll communicate directly with them as we make these changes, giving them sufficient time to make the transition and enabling them to take their data with them.

Here’s a quick overview of where a number of products and features are headed:

• Aardvark: Aardvark was a start-up we acquired in 2010. An experiment in a new kind of social search, it helped people answer each other’s questions. While Aardvark will be closing, we’ll continue to work on tools that enable people to connect and discover richer knowledge about the world.

• Desktop: In the last few years, there’s been a huge shift from local to cloud-based storage and computing, as well as the integration of search and gadget functionality into most modern operating systems. People now have instant access to their data, whether online or offline. As this was the goal of Google Desktop, the product will be discontinued on September 14, including all the associated APIs, services, plugins, gadgets and support.

• Fast Flip: Fast Flip was started to help pioneer news content browsing and reading experiences for the web and mobile devices. For the past two years, in collaboration with publishers, the Fast Flip experiment has fueled a new approach to faster, richer content display on the web. This approach will live on in our other display and delivery tools.

• Google Maps API for Flash: The Google Maps API for Flash was launched to provide ActionScript developers a way to integrate Google Maps into their applications. Although we’re deprecating the API, we’ll keep supporting existing Google Maps API Premier customers using the Google Maps API for Flash and we’ll focus our attention on the JavaScript Maps API v3 going forward.

• Google Pack: Due to the rapidly decreasing demand for downloadable software in favor of web apps, we will discontinue Google Pack today. People will still be able to access Google’s and our partners’ software quickly and easily through direct links on the Google Pack website.

• Google Web Security: Google Web Security came to Google as part of the Postini acquisition in 2007, and since then we’ve integrated much of the web security functionality directly into existing Google products, such as safe browsing in Chrome. Although we will discontinue new sales of Google Web Security, we’ll continue to support our existing customers.

• Image Labeler: We began Google Image Labeler as a fun game to help people explore and label the images on the web. Although it will be discontinued, a wide variety of online games from Google are still available.

• Notebook: Google Notebook enabled people to combine clipped URLs from the web and free-form notes into documents they could share and publish. We’ll be shutting down Google Notebook in the coming months, but we’ll automatically export all notebook data to Google Docs.

• Sidewiki: Over the past few years, we’ve seen extraordinary innovation in terms of making the web collaborative. So we’ve decided to discontinue Sidewiki and focus instead on our broader social initiatives. Sidewiki authors will be given more details about this closure in the weeks ahead, and they’ll have a number of months to download their content.

• Subscribed Links: Subscribed Links enabled developers to create specialized search results that were added to the normal Google search results on relevant queries for subscribed users. Although we’ll be discontinuing Subscribed Links, developers will be able to access and download their data until September 15, at which point subscribed links will no longer appear in people’s search results.

We’ve never been afraid to try big, bold things, and that won’t change. We’ll continue to take risks on interesting new technologies with a lot of potential. But by targeting our resources more effectively, we can focus on building world-changing products with a truly beautiful user experience.

Reference link: http://googleblog.blogspot.com/2011/09/fall-spring-clean.html