Google click quality 'investigation' - part 2

Following up on my previous post about the Google click quality investigation:

Google took another, apparently superficial, look at their data. They still insist that they "did not find evidence of automated activity or unethical behavior. Our data indicates that the clicks arrived from a variety of users and IP addresses."

I have repeatedly sent them the following arguments, which they have consistently avoided responding to directly:
"I still don't understand how you do not find the correlation between blank referrers and poor performance suspicious. This is the strongest correlation evident from our logs, and it holds not only on the Saturday and Sunday in February when this activity was highest, but throughout a 3-month range of dates, and it holds regardless of where the blank referrer clicks predominantly originate from. That Sunday for which you analyzed the logs, the blank referrer clicks were coming from Brazil, but on 2006-12-10, for instance, they were coming from China, the US and Turkey, and the correlation is the same.

"Likewise, the proportion of blank referrers falls to zero once we stop advertising on the content network, and it's not like we don't get any search ad clicks from China, Turkey or Brazil. But we don't get search ad clicks with no referrers.

"You conclude from the fact that the clicks arrived from a variety of users and IP addresses that these clicks must be kosher, but this ignores that this is exactly the type of attack that a savvy fraudster will mount. Who is to say that those clicks aren't coming from malware bundled with certain Brazilian programs? Who is to say that such malware was not designed to elicit exactly the pattern of clicks that is seen? You seem to ignore that fraudsters have had several years to adapt to all of your detection mechanisms, and that by now the skillful ones would be engaging in attacks of sophistication quite similar to what we see here.

"If you CANNOT account for the strong, ongoing correlation between blank referrer clicks and the poor effects we see, then you CANNOT claim that these clicks are legitimate."
I can interpret their refusal to provide a reasonable explanation for this correlation in two ways - either they don't understand, in which case they are incompetent, or they know they have a major problem with AdWords and their strategy is to avoid admitting it at all times and at all cost, in which case they are themselves participants in this fraud.

After all my insistence, they have now credited our account with a one-time $500 "courtesy credit", which seems to be on the order of 10% of the amount we've lost due to this click activity.

So, yes, it does look like Google's 'click investigation' is a charade which they operate largely as part of their public relations strategy rather than as an attempt to actually do the right thing and figure out what's going wrong.

No wonder they only ever 'find' 0.02% of clicks as being charged incorrectly. Once they've charged for a click, it is in their very strongest interest to insist that the charge was correct.

Ever heard how disk manufacturers advertise their disks as having a mean time between failures in excess of 1 million hours, which translates to a less than 1% chance of failure per year? And then an independent in-field investigation finds out that the real failure rates are more like 2-4% and higher?
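
For anyone who wants to check the disk arithmetic, here is a rough sketch in Python. It assumes a constant failure rate (an exponential lifetime model), which is my simplifying assumption and not something the manufacturers state; the function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760 hours of continuous operation


def annual_failure_probability(mtbf_hours):
    """Chance that a drive fails within one year of continuous use,
    assuming a constant failure rate of 1/MTBF (exponential model)."""
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def implied_mtbf_hours(annual_failure_rate):
    """Inverse of the above: the MTBF that a given yearly failure
    probability would correspond to under the same assumption."""
    return -HOURS_PER_YEAR / math.log(1.0 - annual_failure_rate)


if __name__ == "__main__":
    advertised = annual_failure_probability(1_000_000)
    print(f"Advertised 1M-hour MTBF -> {advertised:.2%} per year")
    for observed in (0.02, 0.04):
        print(f"Observed {observed:.0%} per year -> "
              f"MTBF of about {implied_mtbf_hours(observed):,.0f} hours")
```

Under that assumption, the advertised figure works out to a bit under 0.9% per year, while field failure rates of 2-4% would correspond to an MTBF of very roughly 434,000 down to 215,000 hours, that is, well under half of what is advertised.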

Meanwhile, Google founders Larry Page and Sergey Brin buy a 767 party plane and then argue about what kind of bed Brin can have in it.


The doghouse: Windows Sidebar

Now, I think Windows Vista is the shiniest and nicest operating system to come out of Redmond so far. I have used it for several months and I'm quite content with the OS overall. I like the infrastructural improvements and wouldn't want to move back to using XP.


Windows Vista comes with this little utility called Sidebar, which is a fancy, largely transparent dock that places itself on the left or right side of your desktop. The sidebar is essentially a platform for smaller programs called gadgets, and it comes with some built-in ones. The built-in gadgets I've tried include a calendar, a to-do note, a weather indicator, a stock indicator, RSS headlines and a CPU meter.

I can confidently say that, with the sole exception of the Calendar, all of the gadgets I've tried suck. Here is why.

Notes. The Notes gadget looks nice and all, the way it imitates a post-it note, but:
  • It only fits about two words horizontally and three lines.
  • It looks like it fits a fourth line, but no - that space is for some normally invisible buttons. Try entering a fourth line and it scrolls the first line out of visibility.
  • It's not resizable.
  • You can't change the font either. There is a dialog to change the font, but instead of all the fonts available on the system, you have only three options: Segoe, Segoe, and Segoe.
  • The only font sizes available are 9-18, and they're fixed. You cannot choose any other size than that. And the two words and three lines limitation on the note - that's with size 9.
  • Sometimes when I log in, the Notes gadgets don't even show properly. I have to lock the computer and unlock it again for the notes to appear.
  • Sometimes one of the Notes gadgets simply disappears in the time between a logoff and login, with no way to recall it or its contents.

Weather. It doesn't work. It says "Service not available". But the service is available - it works on my other machine, which is in the same location. It just doesn't work on my laptop. Online posts suggest that you might need to have a very specific configuration of region and locale settings in order for this gadget to work. I tried, but apparently was unable to hit all the right settings.

Stocks. Shows only three stocks at a time. Want more than three? It's not resizable. You have to manually scroll up and down to see them - which kind of defeats the purpose of a passively observable stocks indicator.

Feeds. Displays only 4 headlines at a time. Not resizable. Cycles through the headlines - which means it distracts you with screen changes when nothing has changed. Can only display 10, 20, 40, or 100 headlines. Want 25? You can't pick any number in between.

CPU Meter. I eventually removed it because it was itself consuming too much CPU.

I don't know who signed off on a Sidebar of this quality going into the Vista release, but whoever it was, it must have been the kind of B category employee that Bill Gates supposedly said Microsoft ought not to employ.

The whole Sidebar thing looks like it's been polished by Microsoft, but developed by students. It's just not good enough. Not by a wide margin.


Google click quality 'investigation' is bunk

Following up on my previous article where I discussed the extent of Google content ad click fraud, I can now confirm that the Google "click quality team" that's ostensibly supposed to investigate click fraud is bunk.

First, they made me wait for a reply two weeks after I filed the initial click quality report. I guess two weeks doesn't look too long for an "investigation", and in this time some proportion of advertisers tend to forget anyway what it was they were complaining about.

From this reply, sent by one "Sachan" (no surname), it was apparent that they ran only a superficial check in their own database; they were entirely unwilling to even consider looking at my logs.

Here's what they have to say about the awful proportion of content ad clicks that were coming to us with no referrer, and the awful quality of those clicks (35% of them left within 10 seconds and without even waiting for the page to fully load - compare this with 2% for clicks coming from Google search, and 5% for content ad clicks coming to us with a non-blank referrer):
"In general, referrer headers may not be present for all visits for your AdWords ads. There are multiple mechanisms which can prevent the transmission of this header, including firewall hardware/software, proxy IP configurations, and browser settings."
This is a total cop-out. They conveniently ignore that:
  • the quality of blank referrer clicks was impossibly low compared to other clicks;
  • we have virtually no blank referrer clicks coming to us from search; somehow only content ad clicks were coming to us massively with blank referrers.
I find it very hard to believe that, over a period of 3 months, we would consistently see such poor performance in clicks that have a blank referrer purely by accident.
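
The comparison behind those percentages is simple enough to reproduce from web server logs. Here is a minimal sketch in Python of how I'd compute it; the record fields (`network`, `referrer`, `seconds_on_site`) are hypothetical names for whatever your own log processing produces, and the sample data is invented purely to show the shape of the input.

```python
from collections import defaultdict

QUICK_EXIT_SECONDS = 10  # "left within 10 seconds", as in the figures above


def classify(click):
    """Put a click into one of the three groups compared in this post."""
    if click["network"] == "search":
        return "search"
    if not click["referrer"]:
        return "content, blank referrer"
    return "content, with referrer"


def quick_exit_rates(clicks):
    """Return, per group, the share of clicks that left quickly."""
    totals = defaultdict(int)
    quick = defaultdict(int)
    for click in clicks:
        group = classify(click)
        totals[group] += 1
        if click["seconds_on_site"] < QUICK_EXIT_SECONDS:
            quick[group] += 1
    return {group: quick[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Invented sample records, not our real logs.
    sample = [
        {"network": "search", "referrer": "http://www.google.com/search", "seconds_on_site": 95},
        {"network": "content", "referrer": "http://example.com/page", "seconds_on_site": 40},
        {"network": "content", "referrer": "", "seconds_on_site": 3},
        {"network": "content", "referrer": "", "seconds_on_site": 61},
    ]
    for group, rate in sorted(quick_exit_rates(sample).items()):
        print(f"{group}: {rate:.0%} left within {QUICK_EXIT_SECONDS} seconds")
```

If the blank-referrer group keeps coming out many times worse than the other two across months of data, that is the correlation I keep asking Google to explain.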

I think it's kind of obvious what those blank referrers are for. They prevent the website owner from looking at their logs, seeing where bad clicks are coming from, and disabling content ads on the sites these clicks come from. The only way to learn which sites such clicks are coming from is to ask Google - and they are not exactly eager to help.

Several days have passed, and I have since sent several replies to Sachan, employing everything from logical arguments to begging to threats of publicity and legal action, and nothing works. Altogether, I received exactly two emails containing boilerplate wording such as this:
"Our investigation did not find evidence that suggests automated or unethical clicks were charged to your account, but I understand that you feel this traffic may not have performed well on your site. As such, you may consider adding domains of concern to your site exclusion list for this campaign."
Right. And how exactly shall I do that, given that all the bad clicks are coming to our website with a blank referrer?!

In summary - the Google click quality investigation team is bunk. It seems Google created it as a false comfort to advertisers, in order to make it seem like they are doing something about click fraud, when in reality they're wasting money on chefs and perks and entertaining their employees' pet projects. The company is run like a kindergarten, with all the (lack of) responsibility implied.

If I were an investor, I would short Google's stock.

Oh, and if anyone is contemplating a class action lawsuit against Google, please count us in. We lost several thousand dollars to this over a span of several months, and Google isn't even so much as blinking.


The real extent of Google content ad click fraud

Google has recently claimed a ridiculous figure of 0.02% as the proportion of what they euphemistically call "invalid clicks" that gets charged to advertisers on Google AdWords.

In my experience as an advertiser, Google's claims are bollocks.

One weekend this February, our website got a huge spike of worthless, blank-referrer, Google content ad clicks. This caused me to turn off content ads entirely and spend the following several days analyzing content ad data. My conclusion is that about 50% (fifty percent) of the Google content ad clicks we got from December to February were fraudulent, while during the final spike the proportion of fraudulent clicks was 2/3. Most of the fraudulent clicks had a blank referrer and they were coming from all kinds of IP addresses, indicating that fraudsters are using large networks of zombie machines, or equivalent.

I have complained to Google and lodged a request for an invalid clicks investigation, which they say should get a response within 3-5 days. It's been 9 days and I still haven't heard back from them. They haven't even requested any of our logs, which I think would be a necessary precursor to an investigation.

At this point it seems likely that we'll have to pay for all of the fraud if we want to continue our Google search advertising.

Even supposing that Google makes an effort at investigation, how objective is this really likely to be when the investigators work for a company that would highly prefer that the investigation finds mostly nothing?

Preventing lower-integrity processes from reading higher-integrity data on Windows Vista

Joanna Rutkowska has posted an article about how to use Mark Minasi's chml tool to improve the security of your sensitive data from potential IE exploits when running IE in Protected Mode on Windows Vista.

Vista introduces the concept of Integrity Controls, where a program that runs at a lower integrity level is unable to access data that are associated with a higher integrity level, even if the program would otherwise have all the necessary permissions.

Internet Explorer is currently the only browser I know of that runs on Vista at a low integrity level by default. This means that any exploits against IE will find it more difficult to install themselves permanently into the system - the easiest way right now might be for them to trick you into absent-mindedly allowing them to run at a higher integrity level.

However, by default, Vista only prevents low-integrity processes from changing your medium-integrity data. What it does not do is prevent low-integrity processes from reading your medium-integrity data. This means that any IE exploit can still scan your system for sensitive data and passwords and silently transmit them somewhere without your knowledge.

So here is where the chml tool comes in. It allows you to apply an SACL (System Access Control List) to your files which tells Windows to prevent lower-integrity processes not only from writing, but also from reading or executing any of the files protected with such an SACL.
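
To make the difference between the default behavior and the stricter label concrete, here is a toy model in Python of just the integrity check. It is a sketch of the rule as described above, not Windows code: the numeric levels are illustrative, the policy names are modeled on the mandatory label policy flags (no-write-up, no-read-up, no-execute-up), and ordinary permission checks are left out entirely.

```python
from enum import IntEnum, IntFlag


class Integrity(IntEnum):
    LOW = 1      # e.g. IE in Protected Mode
    MEDIUM = 2   # ordinary user data and programs
    HIGH = 3     # elevated processes


class Policy(IntFlag):
    NO_WRITE_UP = 0x1
    NO_READ_UP = 0x2
    NO_EXECUTE_UP = 0x4


# Which policy bit blocks which kind of access from a lower level.
_BLOCKING_FLAG = {
    "write": Policy.NO_WRITE_UP,
    "read": Policy.NO_READ_UP,
    "execute": Policy.NO_EXECUTE_UP,
}

# What an unlabeled file effectively gets, per the description above.
DEFAULT_POLICY = Policy.NO_WRITE_UP
# What "-nr -nx -nw" asks for.
STRICT_POLICY = Policy.NO_WRITE_UP | Policy.NO_READ_UP | Policy.NO_EXECUTE_UP


def integrity_allows(process_level, object_level, object_policy, access):
    """Integrity part of the access check only (ordinary permissions not modeled)."""
    if process_level >= object_level:
        return True  # same or higher integrity: the label does not interfere
    return not (object_policy & _BLOCKING_FLAG[access])


if __name__ == "__main__":
    for name, policy in (("default", DEFAULT_POLICY), ("strict", STRICT_POLICY)):
        for access in ("read", "write", "execute"):
            ok = integrity_allows(Integrity.LOW, Integrity.MEDIUM, policy, access)
            print(f"{name:7} label, low-integrity {access:7}: {'allowed' if ok else 'denied'}")
```

The point the model makes is the one in the paragraph above: with the default label a low-integrity process is only stopped from writing, and it takes the extra policy bits to stop it from reading as well.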

If you use Vista, I would definitely recommend using a browser that runs well as a low integrity process. Then I would further recommend downloading chml and applying "chml FolderName -i:m -nr -nx -nw" to all of your data folders. I think it is sensible to leave the Program Files and Windows directories readable though, because after all, those low integrity processes do have to load DLLs.
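
If you have more than a couple of data folders, typing that command for each one gets tedious. Here is a small Python sketch that builds the same command line for a list of folders. The flags are exactly the ones quoted above; everything else is my assumption: that chml is on your PATH, that you run this from a prompt with sufficient rights, and the folder names, which are placeholders. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Flags copied from the command quoted above.
CHML_FLAGS = ["-i:m", "-nr", "-nx", "-nw"]


def chml_command(folder, chml_path="chml"):
    """Build the argument list for one folder (chml_path is assumed to be on PATH)."""
    return [chml_path, folder, *CHML_FLAGS]


def protect_folders(folders, dry_run=True):
    """Print (dry run) or execute the chml command for each folder."""
    commands = [chml_command(folder) for folder in folders]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first folder where chml reports failure.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder folder names; substitute your own data folders.
    protect_folders([r"C:\Data\Documents", r"C:\Data\Projects"])
```

I'd run it once in dry-run mode, look over the printed commands, and only then call `protect_folders(..., dry_run=False)`.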


Copyright Royalty Board kills Internet radio

Those fine politicians, along with the unelected officials they hire, are improving everyone's lives yet again. Yes - this time, it is by killing effectively all forms of Internet radio.

The Copyright Royalty Board of the U.S. Library of Congress has seen fit to set per-listener performance rates so high that, for someone who listens to Internet radio 8 hours a day for 20 days a month, the Internet radio company will have to pay $3 per month just in performance fees. With all the other costs an Internet radio company has to pay, this means they would have to collect at least $6 per month from every listener.

It is unlikely that any Internet radio company can survive on these terms. Why would anyone pay $6 per month, when you can just pay nothing and turn on the regular FM radio? Additionally, the schedule is such that the rates are going to double by 2010 anyway.
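
For the curious, here is the back-of-the-envelope arithmetic as a short Python sketch. It uses only the figures above (8 hours a day, 20 days a month, $3 per listener per month, rates doubling by 2010); the per-hour figure is derived from those numbers and is not an official rate, and the function names are mine.

```python
def monthly_listening_hours(hours_per_day, days_per_month):
    """Total hours one listener tunes in per month."""
    return hours_per_day * days_per_month


def implied_fee_per_listener_hour(monthly_fee, hours_per_day, days_per_month):
    """Performance fee per listener-hour implied by a monthly fee."""
    return monthly_fee / monthly_listening_hours(hours_per_day, days_per_month)


if __name__ == "__main__":
    hours = monthly_listening_hours(8, 20)
    per_hour = implied_fee_per_listener_hour(3.00, 8, 20)
    doubled_monthly_fee = 2 * per_hour * hours

    print(f"Listening hours per month: {hours}")
    print(f"Implied fee per listener-hour: ${per_hour:.4f}")
    print(f"Monthly fee per listener once rates double: ${doubled_monthly_fee:.2f}")
```

In other words, that is 160 listening hours a month at a bit under two cents per hour, and once the rates double, the performance fee by itself reaches the $6 that a station would already struggle to collect today.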

And, yeah, here's the kicker: just in case any Internet radio company managed to survive these new rates in some freakish way, the Board decided to apply these rates retroactively, back to 2006!

This means that my favorite radio station would have to pay about $500,000 in back fees for past performance alone - due immediately. This means they will be forced to close unless - and this is iffy - they can strike a more palatable deal with the RIAA.

So, let's sing one big hip-hip-hurray for the government and politicians! Oh, where would we be without them?


Copilot sucks

For all the hype given to Copilot by Joel Spolsky (of Joel on Software fame, and also one of Copilot's creators), it really kind of sucks. The first time I tried connecting with someone, the other person reported that it crashed. In the meantime, on my machine, it became the first application ever that managed to steadily consume 100% of both cores on my dual-core CPU while waiting for the connection. The other person then figured out that the Copilot crash might be because he himself was accessing the machine through VNC, and that perhaps there was a conflict. (If there was a conflict there should have been an error message, not a crash.) So he accessed the machine locally and Copilot did work, but oh my god how slowly. Yeah, the firewalls were closed, but the latency was still incredible - click, wait 15 seconds, a window starts to appear, wait another 15 seconds, the window finally appears. Meanwhile, Copilot was steadily consuming 100% of one core on my machine. From that alone it really looks like this software was done by students.

Not to mention the invitation code fiasco. I first sent the other person an invitation code the day before, just to get prepared. Then today it turned out that the invitation code didn't work any more and he had to close everything, re-login with the new code and re-download the application. Then when the thing crashed I closed Copilot and the browser and subsequently when we tried again we had to exchange yet another invitation code. So much for simplicity and ease of use - this invitation code thing really needs to be worked out; at least it should be possible to reuse a previously issued invitation code, especially if someone's already online with that code and is waiting for you to help them.

Overall I've had a better experience with the Remote Assistance functionality that's built into Windows Live Messenger - it is faster, easier, less flaky and it also works through firewalls. And as opposed to Copilot, it's free. The only disadvantage of Remote Assistance is that, well, both parties need to be logged into Messenger. If it weren't for that, I'd recommend Messenger's Remote Assistance over Copilot any day. And in fact, if you and the other party both have Messenger accounts, I do.