DV Fraud Lab Report

May 1, 2013
DoubleVerify has uncovered elaborate advertising Fraud activity on over 1,200 suspected websites that together are defrauding
online advertisers of an estimated $6.8 million per month [1]. This massive Fraud originates from user traffic on websites classified as
copyright infringement. The ad impressions from this traffic are ‘laundered’ through a complex series of redirects that make the
ads appear as though they originated on legitimate sites containing advertiser-friendly content. Further, the Fraudulent ads use
code that prevents the ad creative from being displayed in the user’s browser, so advertisers pay for impressions that are
never seen.
DoubleVerify’s Fraud Detection Lab has released new technology to uncover and protect against this new wave of impression
laundering Fraud. This new technology is fully deployed into DoubleVerify’s current MRC-accredited Fraud identification and
blocking platform in order to give customers the most powerful verification protection available in the market today. The DV
Fraud Lab is releasing this report with detailed information about this Fraud so that the industry can better understand the
methodology employed by the sites involved, and we can work cooperatively to combat this type of activity on an ongoing basis.
[Graphic: 10 Billion Impressions; 3% Network/RTB; .1% Publisher; = $6.8M/Month]
Fraud on a Massive Scale
This activity is by far the largest group of sites and amount of impressions our Fraud Lab has uncovered from a single
identification instance since our company was founded five years ago. DoubleVerify data scientists analyzed over 1,000
advertiser campaigns delivering more than 10 billion impressions on 3 million plus websites in a recent 30-day period to reveal
the following results:
• Over 1,200 websites in copyright infringement categories
were laundering ad impressions through a set of seemingly
legitimate sites with advertiser-friendly content.
• 3.0% of all Network/RTB-purchased impressions were
delivered via sites participating in this Fraudulent activity.
• 0.1% of Publisher-purchased impressions were delivered via
sites participating in this Fraudulent activity. This likely occurs
when publishers purchase third-party inventory for audience
extension or other off-site fulfillment programs.
• DoubleVerify estimates this is costing advertisers $6.8 million
per month based on the following analysis:
– Recent IAB studies estimate online display ads generate
about $642M per month. DoubleVerify estimates one
third of the revenue ($214M) is spent on Networks/RTB
platforms and two thirds of the revenue ($428M) is spent
on Publisher buys.
– 3.0% of $214M + 0.1% of $428M = Estimated $6.8M per
month in revenue spent on sites identified as participating
in this Fraud.
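The estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures stated in the report; the function and parameter names are illustrative and are not part of any DoubleVerify tooling.

```python
def estimated_monthly_loss(total_spend, network_share, network_rate, publisher_rate):
    """Spend reaching Fraudulent sites, split by buying channel."""
    network_spend = total_spend * network_share            # Networks/RTB buys
    publisher_spend = total_spend * (1 - network_share)    # Publisher buys
    return network_spend * network_rate + publisher_spend * publisher_rate


# Inputs as stated in the report: $642M monthly display spend, one third on
# Networks/RTB, 3.0% and 0.1% of impressions delivered via Fraudulent sites.
loss = estimated_monthly_loss(642e6, 1 / 3, 0.03, 0.001)
print(f"${loss / 1e6:.1f}M per month")
```

Like the analysis above, this treats the share of impressions as the share of spend, which is a simplification.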
How Fraud Damages the Industry
Traffic fraud has become a hot topic following a recent spate of industry attention to Botnet Fraud. For DoubleVerify,
combatting online advertising Fraud and protecting our clients’ advertising integrity and performance have been top priorities
since the company’s founding in 2008. Our capabilities were further enhanced with the establishment of the DV Fraud Lab,
where our engineers develop new technologies to uncover the continuously changing methods used to perpetrate Fraud.
The most obvious impact to the online advertising ecosystem is on the buy side, where advertisers lose millions of dollars
on impressions that are never delivered to an end user. A second damaging effect is that many types of Fraud, especially the
impression laundering described in this article, can funnel ad dollars to parties who use it to fund activities that advertisers are
actively combatting. For example, an entertainment company would be doubly harmed if they are first bilked out of media
spend for creative that was never exposed and then those ill-gotten gains are used to support illegal distribution of their
copyrighted material to consumers -- further damaging their core revenue stream. Finally, fighting online ad Fraud complicates
the media planning and delivery process. It costs advertisers time and money both to identify Fraud and to eliminate its impact
from their analysis and media plans. Repurposing the money that is wasted on Fraudulent ads and the operational cost to
combat it could have a significant benefit to the performance realized by the buy side.
Fraud also significantly harms sellers and the overall economics of the ad market. Every media dollar wasted on Fraudulent
impressions directly impacts the revenue earned by legitimate sellers that have unsold or under-monetized inventory. Less
obvious, but equally concerning, are the indirect ripple effects that the billions of Fraudulent impressions have on market
economics. For example, billions of Fraudulent impressions add supply to an ecosystem where over-supply conditions exist
and CPMs are being driven down. Additionally, because legitimate and Fraudulent ad impressions are pooled together in
marketplaces, the overall ROI of this segment of impressions will suffer (when compared to the performance of legitimate
impressions alone). This reduction in ROI drives down the CPM in these marketplaces, which further damages legitimate sellers.
For these reasons, online ad Fraud damages the very credibility of the ecosystem in which we all are participants and
beneficiaries. Given these significant impacts across the industry, DoubleVerify believes that fighting Fraud is not a
problem to be solved by the buy side or the sell side alone. Nor do we believe that the industry should pivot
away from the existing, well-understood metrics that have facilitated significant online ad revenue growth and try to solve Fraud by
introducing new ad measures that reward only good behavior. Instead, all parties (buyers, sellers, intermediaries, and measurement
and verification companies) have a responsibility to fight Fraud across multiple fronts so that we can build a better industry that
benefits all of the parties that participate in it.
Hidden Ads and Impression Laundering Working Together
Hiding online ads is a common technique used by
nefarious actors to increase served impression counts
by serving them in conditions under which they are not
visible to the user. Sites participating in Fraud utilize many
methods of ad serving manipulation including placing the
ads in tiny iframes (1 pixel wide and tall), creating off-page
DIV elements in which to serve ads or stacking ad creative
behind content or other advertisements.
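The three hiding methods listed above can be expressed as simple geometry checks. The following is a minimal sketch, assuming a page is described by rectangles with a stacking order; the Box type, the thresholds and the reason labels are hypothetical and do not represent DoubleVerify's detection logic.

```python
from dataclasses import dataclass


@dataclass
class Box:
    left: float
    top: float
    width: float
    height: float
    z_index: int = 0


def overlap_area(a, b):
    """Area shared by two rectangles (0 if they do not intersect)."""
    w = min(a.left + a.width, b.left + b.width) - max(a.left, b.left)
    h = min(a.top + a.height, b.top + b.height) - max(a.top, b.top)
    return max(w, 0) * max(h, 0)


def hidden_reasons(ad, page, others=()):
    """Return the hiding methods that apply to an ad slot."""
    reasons = []
    if ad.width <= 1 and ad.height <= 1:
        reasons.append("tiny frame")        # e.g. a 1x1 iframe
    if overlap_area(ad, page) == 0:
        reasons.append("off-page")          # positioned outside the page area
    ad_area = ad.width * ad.height
    for other in others:
        # Fully covered by an element stacked above the ad
        if ad_area > 0 and other.z_index > ad.z_index and overlap_area(ad, other) >= ad_area:
            reasons.append("covered")
            break
    return reasons


page = Box(0, 0, 1280, 800)
print(hidden_reasons(Box(10, 10, 1, 1), page))
print(hidden_reasons(Box(-2000, 0, 728, 90), page))
print(hidden_reasons(Box(500, 600, 728, 90), page, [Box(500, 600, 728, 90, z_index=5)]))
```

A real browser measurement would also need to account for scrolling, transparency and nested frames, which this sketch ignores.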
[Diagram labels: Fraud Site; 1x1 iFrame; Hidden Ad; Ad Call; Sell-side Ad Server; Buy-side Ad Server; Advertiser Ad Server; Ad]
Figure 1 - Illustration of Hidden Ad Fraud
Hidden ad Fraud by itself clearly damages brand advertisers, who pay for impressions that are never seen by the user.
This type of Fraud also reaches beyond brand advertisers to performance marketers. For example, CPA advertisers that give
credit and pay for impression/view-based conversions may be compensating Fraudulent sites. When a hidden ad is delivered to
the user’s browser, that impression is recorded as being served (via an anonymous user cookie) to that browser. If the user later
visits the advertiser’s website, it may appear to attribution systems and ad servers as though the hidden impression influenced
the conversion, and they may credit the site where the ad ‘ran.’ Worse yet, if the user saw a legitimate ad impression, then was served a
Fraudulent ad impression, then converted on the advertiser site, the attribution systems and ad servers may give some or all of
the credit to the Fraudulent impression, damaging the performance of the seller of the legitimate impression.
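The attribution problem described above can be illustrated with a last-touch model, which is only one of several attribution models in use and is assumed here for simplicity. The site names, timestamps and lookback window are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    site: str
    served_at: int   # seconds since an arbitrary reference time
    visible: bool    # not known to the ad server; used here only to label the outcome


def last_touch_credit(impressions, conversion_at, lookback=7 * 24 * 3600):
    """Credit the most recent impression served within the lookback window."""
    eligible = [i for i in impressions if 0 <= conversion_at - i.served_at <= lookback]
    if not eligible:
        return None
    return max(eligible, key=lambda i: i.served_at)


impressions = [
    Impression("legitimate-publisher.example", served_at=1_000, visible=True),
    Impression("laundering-site.example", served_at=5_000, visible=False),
]
credited = last_touch_credit(impressions, conversion_at=9_000)
print(credited.site, credited.visible)
```

Because the ad server records only that an impression was served, the hidden impression is eligible for credit and, being the most recent, displaces the legitimate one.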
Sites that have user traffic but are undesirable to advertisers (such as those classified as Copyright Infringement) could
not effectively use hidden ads as a standalone mechanism for conducting Fraud. Advertisers using verification services,
contextualization and targeting services, blacklists or other techniques to prevent spending money on the sites they find
objectionable would easily identify the placement of these ads even if they did not recognize that they were hidden. As a result,
these sites add in another technical layer to their Fraud through impression laundering.
Impression laundering is almost identical to the well-understood concept of money laundering. Both impression laundering and
money laundering use obfuscation processes and seemingly legitimate sources in an attempt to conceal the origin of
the ad impression or money. Specifically:
Impression laundering is an obfuscation process that conceals the originating website which initiated the advertising
impression call and replaces it with a website more appealing to the advertiser.
Impression laundering often works by using a series of technically complex, highly nested ad calls through iframes. We have
found cases where individual ad calls have been routed through over 20 iframes as part of this obfuscation process. This
process makes it difficult for decisioning systems (Ad Servers, DSPs and Bidders) to accurately identify the website
that actually initiated the ad call. Additionally, this process can fool verification techniques that advertisers rely on
to protect their ad spend. For clarity, we will use the following additional definitions when discussing the impression laundering
process:
Originating Site: The website the user visited that is the originating source of traffic that created the ad impression.
Laundering Site: The website that identifies itself as the source of the traffic primarily as a mechanism to legitimize
the ad impression.
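The two definitions above can be made concrete with a small model of a nested frame chain. This is a sketch under the assumption that the ad call identifies only the innermost frame that issued it; the URLs are hypothetical placeholders.

```python
from urllib.parse import urlsplit


def describe_chain(frame_chain):
    """frame_chain lists document URLs from the top-level page the user
    visited down to the frame that finally issues the ad call."""
    if not frame_chain:
        raise ValueError("frame_chain must contain at least the top-level page")
    originating = urlsplit(frame_chain[0]).hostname   # Originating Site
    apparent = urlsplit(frame_chain[-1]).hostname     # what the ad call reports
    return {
        "originating_site": originating,
        "apparent_site": apparent,
        "nesting_depth": len(frame_chain) - 1,
        "laundered": originating != apparent,
    }


chain = [
    "http://originating-site.example/watch?v=1",
    "http://laundering-site.example/ads/page-a",
    "http://laundering-site.example/ads/page-b",
]
result = describe_chain(chain)
print(result)
```

A system that sees only the last entry of the chain would attribute the impression to the Laundering Site, which is the effect the definitions describe.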
[Diagram labels: Originating Site (with Copyright Infringing Streaming Content); Ad Page on Laundering Site (with a dummy image hiding redirect); Ad Page on Laundering Site (with ad call); 1x1 iFrame; Ad Call; Sell-side Ad Server; Buy-side Ad Server; Advertiser Ad Server; Ad]
Figure 2 - Basic Impression Laundering Fraud Chain
Some may wonder, if Originating Sites can launder the ad impression, why bother hiding the ad creative from the user? When
the laundering process is successful, the advertiser believes their ads are running on the Laundering Site (which they perceive as
legitimate). However, if the ad creative were not hidden, it would appear alongside the Originating Site content (the same content
that advertisers find objectionable). Originating Sites such as copyright infringement websites are both highly visited
and highly scrutinized, for example in the Ad Transparency Report published by the Annenberg Innovation Lab at the University
of Southern California, which highlights advertisers and sellers of inventory on copyright infringement websites. It is therefore
likely that someone would notice the ad next to the content and alert the advertiser that they are appearing
somewhere undesirable.
As a result, combining impression laundering with hidden ads is a sophisticated Fraud technique for copyright infringement
sites. It makes the Fraud difficult to identify either by human observation (on the Originating Site or Laundering Site) or by
machine analytics designed to identify the location in which the ad is being served, and it allows Fraud perpetrators to divert
money from both brand and performance marketers.
Why Impression Laundering is used by Sites classified as Copyright Infringement
DoubleVerify has found that sites typically have two characteristics that drive them to participate in impression
laundering Fraud:
• Significant consumer interest that drives substantial site traffic, combined with
• Low advertiser interest in appearing on the site because of objectionable content
Copyright infringement sites clearly exhibit the characteristics above. Since the origination of Napster in 1999 through the
Megaupload saga it has been clear that a significant percent of Internet usage will revolve around the sharing of copyrighted
material. Most advertisers have chosen to avoid appearing on sites identified as participating in copyright infringement and
utilize verification companies or other media planning techniques to identify and avoid these sites. The high traffic generated by
consumer demand and low advertiser acceptance of appearing on copyright infringement sites sets up the perfect motivation
for impression laundering.
How did this Impression Laundering Fraud activity work?
The websites involved in this impression laundering Fraud use highly complex operations and multiple technical methods
that made the activity difficult to uncover. New technology developed in the DoubleVerify Fraud Lab was key in peeling back the
layers of this most recent Fraud activity. We believe it is important to detail a specific example of the laundering process
(albeit one of the simplest obfuscation chains we found) and the sites involved so that the industry is educated about how this
occurs and can work together to combat it.
In this Fraud activity, every laundered ad impression began with an actual user visiting an Originating Site. In our example, we
used the Originating Site www.icefilms.info, a popular site for downloading or streaming copyrighted movies & TV shows.
DoubleVerify had previously classified this site as “Copyright Infringement: Downloads” and would block ads for our customers
that appeared directly on the site. Upon arriving at the site, we searched for the most recent episode of “Breaking Bad” and
arrived at a page on which it can be downloaded or streamed (www.icefilms.info/ip.php?v=157447&). Figure 3 represents how
the page looked when we visited it:
Figure 3 - icefilms.info page with Breaking Bad episode
Despite the suspected copyright infringing content on the site, from an advertising perspective the page looks relatively normal
and doesn’t raise immediate concern that Fraud is taking place. However, upon closer examination of the code we identified
abnormalities in the “Play Now / Download Now” image in the lower right-hand corner of the page and determined that it is not
an actual advertisement. It is an image that overlays and hides the ads, preventing the user from viewing them. Figure 4, below,
examines the code behind the ad and provides a technical explanation of how the Fraud is perpetrated and the traffic laundered
through the seemingly legitimate site, diychef.com.
Figure 4 - Coding on icefilms.info page demonstrating the hiding of ads behind an image.
• (Figure 4 – Step 1) www.icefilms.info opens an iframe (via a series of
redirects) to www.diychef.com, which appears to be a legitimate video
recipe site. In this example, diychef.com is the Laundering Site involved in
the Fraud, which prior to this breakthrough, had been classified as a site
about Food & Drink. Figure 5 shows the home page of diychef.com with the
ads redacted.
It is important to note that the code opening the iframe (Figure 4 – Step 1) to
the Laundering Site points to a very specific page on the site, diychef.com/ads/o1/728.php?mn=18,
which has special characteristics that minimize detection and maximize Fraudulent revenue:
• (Figure 4 – Step 2) The code instructs search engine crawlers not to index
the page or follow its links, minimizing exposure, by using the noindex/nofollow
meta tag: <meta name="robots" content="noindex, nofollow">.
• (Figure 4 – Step 3) The code refreshes the page every 30 seconds
using a second meta tag: <meta content="30;url=728.php?mn=18" http-equiv="refresh">.
This ensures a new ad is called every 30 seconds, allowing
the Fraudster to rack up ad impressions and generate more income while
the user is downloading or viewing the pirated content.
Figure 5 - Home page of diychef.com (ads redacted)
• (Figure 4 – Step 4) The page HTML further obfuscates the process by overlaying another image designed to look like a 728x90
ad (the “Play Now / Download Now” image). This is not an ad; it is a static image overlaying and hiding the ads behind it. This image is
coded <img src="../728.gif"> and the full URL is diychef.com/ads/728.gif.
• (Figure 4 – Step 5) Finally, the code opens the ads (behind the dummy image) by creating an iframe to a different URL on
diychef.com. This iframe is coded <iframe src="728.htm"…</iframe> and the full URL is diychef.com/ads/o1/728.html.
The ads are being served in the last iframe (Figure 4 – Step 5): within a transparent iframe, behind the dummy image, within the
Laundering Site (diychef.com) that appears legitimate, rather than directly on the Originating Site (icefilms.info) that the user
was visiting in their browser when the ad serving process was initiated.
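The page traits described in Steps 2 through 5 can be collected mechanically from a page's markup. The following is a minimal sketch using Python's standard html.parser module; the sample markup is a reconstruction based on the tags quoted in this report, not the actual page source, and the class name is hypothetical. It is not a description of DoubleVerify's detection technology.

```python
from html.parser import HTMLParser


class AdPageTraits(HTMLParser):
    """Collects the page traits described in Steps 2 to 5."""

    def __init__(self):
        super().__init__()
        self.noindex = False          # Step 2: robots meta tag
        self.refresh_seconds = None   # Step 3: meta refresh delay
        self.images = []              # Step 4: candidate overlay images
        self.iframes = []             # Step 5: nested frames that may hold ads

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta":
            content = attr.get("content", "")
            if attr.get("name", "").lower() == "robots" and "noindex" in content.lower():
                self.noindex = True
            if attr.get("http-equiv", "").lower() == "refresh":
                delay = content.split(";")[0].strip()
                if delay.isdigit():
                    self.refresh_seconds = int(delay)
        elif tag == "img":
            self.images.append(attr.get("src", ""))
        elif tag == "iframe":
            self.iframes.append(attr.get("src", ""))


# Reconstructed sample, assembled from the tags quoted in the report.
SAMPLE = """
<html><head>
<meta name="robots" content="noindex, nofollow">
<meta content="30;url=728.php?mn=18" http-equiv="refresh">
</head><body>
<img src="../728.gif">
<iframe src="728.htm"></iframe>
</body></html>
"""

traits = AdPageTraits()
traits.feed(SAMPLE)
print(traits.noindex, traits.refresh_seconds, traits.images, traits.iframes)
```

None of these traits is conclusive on its own; each also appears on legitimate pages. It is the combination, on a page with almost no content, that stands out.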
Figure 6 - Details of page coding for ad serving page
When examining the html coding (Figure 6) of the URL displaying the ads (728.html), note how the code includes expansive
keywords, title and description of the page despite a notable lack of content. This data presents the page as content-rich and can
be consumed by targeting and contextualization services that will use it to match a content-relevant, high-value advertisement
to this page on the Laundering Site. An advertiser that is targeting recipes or cooking content may get matched to this page
because of this coding and likely would not question appearing on a site with the diychef.com domain. However, in reality, their
ads are served to the user visiting icefilms.info behind a dummy image that hides their creative from view.
[Diagram labels: Originating Site (with Copyright Infringing Streaming Content); Ad Page on Laundering Site (with a dummy image hiding redirect); Ad Page on Laundering Site (with ad call); 1x1 iFrame; Ad Call; 05-30s; Sell-side Ad Server; Buy-side Ad Server; Advertiser Ad Server; Ad]
The Originating Site opens an iframe to a Laundering Site, which routes the call through several dummy ad pages before sending
the ad call to a Sell-side Ad Server, from which a Buy-side Ad Server will purchase the ad, leading to an ad call to the advertiser
ad server and causing an ad creative to be “displayed.” To maximize Fraudulent income, ad pages may refresh every 30 seconds,
generating additional Fraudulent ad impressions again and again.
Figure 7 - Overview of the entire laundering obfuscation process
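To give a sense of how the 30-second refresh multiplies impressions, the sketch below counts page loads over a single viewing session. Only the 30-second interval comes from this report; the 45-minute viewing time and the number of ads per page are hypothetical inputs.

```python
def impressions_per_visit(viewing_seconds, refresh_seconds=30, ads_per_page=1):
    """Ad impressions generated by one hidden ad page during one visit."""
    # One load when the page opens, plus one per completed refresh interval.
    page_loads = 1 + viewing_seconds // refresh_seconds
    return page_loads * ads_per_page


# A hypothetical 45-minute streaming session
print(impressions_per_visit(45 * 60))
print(impressions_per_visit(45 * 60, ads_per_page=3))
```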
Could a Laundering Site simply claim that in the process of buying
cheap traffic for their site they were unwittingly participating in this
Fraud activity? After all, sites like diychef.com appear to be legitimate
businesses, have real content, and generally appear to unsuspecting
eyes to be running a legitimate ad-driven website.
While an error could be possible, the examples we have examined
demonstrate that it is unlikely this obfuscation and ad hiding is simply
a mistake. The pages receiving traffic and showing ads contradict best
practices in consumer-oriented design. They maximize ad impressions
by showing multiple or rotating ads while having little to no content
outside of the advertisements. It is our opinion that Laundering Sites
likely know that pages receiving redirected traffic from the Originating
Sites will not be visible to the user and therefore they are coded to
maximize advertising revenue derived from Fraud.
How did DoubleVerify identify the Impression Laundering Fraud?
DoubleVerify’s leading ad verification solution relies on our dual-verification (tandem pixel and crawler) technology to penetrate
the transparency gap created by the complex online advertising ecosystem. In many cases, the complexity that creates the
transparency gap exists for completely legitimate reasons, such as to facilitate inventory fluidity and transfer through a variety of
intermediaries that sit between impression buyer and seller. However, as the ecosystem has gotten more complex and
the intermediary platforms have become more open, these traits have been utilized to initiate increasing amounts of Fraud.
DoubleVerify has recently made significant proprietary advances in our dual-verification method, allowing us to
more accurately identify all the Originating and Laundering Sites in this impression laundering Fraud and how traffic and ad
impression serving chains flow between them. These advances, combined with new algorithms created by our data scientists,
enabled DoubleVerify to rapidly identify this series of websites, linked together and executing the large-scale impression
laundering Fraud described herein.
Protecting Against Fraud
Finding solutions that protect against all types of Fraud is top of mind for many advertisers. DoubleVerify’s verification solutions
are leaders in the two critical elements necessary for protection: penetrating the transparency gap that exists in Fraudulent
impressions and identifying Fraud participants.
As evidenced by the discovery outlined in this report, DoubleVerify
has invested significant resources in identifying Fraud participants
and classifying those sites accordingly. This identification of Fraud
activity added over 1,200 sites to DoubleVerify’s Fraud categories
in our classification taxonomy. It also enables us to easily identify
new sites participating in impression laundering in a timely manner
– especially important given how easy it is for Fraudulent parties
to simply reroute their traffic through new Laundering Sites. These
technology advances have also led us to identify other promising
leads that we are actively researching to ensure we maintain
protection into the future and continue to proactively identify all
types of online display advertising Fraud and the participants
engaged in it. Our track record and the level of resources deployed in this
area are strong because DoubleVerify is aware that persistent effort
is needed to combat the bad actors in the industry. It is easy for them
to establish new websites and techniques to try to perpetuate their
revenue stream. Customers that work with DoubleVerify can be
confident we will leverage all of our resources, expertise and
systems to continually stay ahead of and identify sites and methods
that perpetrate Fraud.
However, identifying sites participating in online advertising Fraud
is only half of the battle to prevent it from impacting display
campaigns and media spend. In order to effectively combat it, the
technology being used must be able to penetrate the transparency
gap caused by nested iframes that is both inherent in the complex
ecosystem and exacerbated by laundering activity that deploys highly
obfuscated ad serving chains. No matter how accurate and lengthy
the list of Fraudulent sites may be, accurate determination of the
URL on which the ad appears is necessary to have an effective Fraud
prevention solution.
Consider for a moment the limitation of Fraud protection services
that operate as a pre-bid targeting style solution in the RTB
environment and how they address the transparency gap. Pre-bid Fraud identification relies on the URL identified by the RTB
platform to determine whether Fraud is occurring or not. Our data
scientists recently examined impressions running through major
online ad exchanges and in that data set we discovered that the RTB
transparency gap on the pre-bid URL was nearly 30%. The RTB transparency gap occurs when the URL the ad is displayed
on differs from the URL utilized by the exchange for pre-bid targeting, Fraud prevention, blacklist compliance, or any other
URL-based data decisions. A primary legitimate reason for this gap is the fact that impressions are passed through so many
intermediaries that the original authentic site data simply gets lost in the transactions. A secondary valid reason for this gap
is that some exchanges receive blinded traffic from their suppliers to protect the inventory source. In Fraudulent transactions
the Fraudster could deliberately misidentify the URL with limited or no ability for the exchange to authenticate its veracity. As
a result, pre-bid solutions using the most accurate list of sites engaged in Fraud or a complicated predictive Fraud score simply
won’t work – they have not overcome the RTB transparency gap and therefore are often making a flawed decision and failing to
protect the advertiser.
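The RTB transparency gap defined above is a ratio that can be computed once both URLs are known for each impression. The sketch below is a minimal illustration, assuming each impression is available as a pair of pre-bid URL and rendered URL and comparing hostnames only; the sample data is invented to show a 30% gap and is not the data set examined by our data scientists.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Hostname with a leading 'www.' removed (a deliberate simplification)."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def transparency_gap(pairs):
    """Share of impressions whose pre-bid URL and rendered URL name different sites.

    pairs: iterable of (pre_bid_url, rendered_url) tuples.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    mismatched = sum(1 for declared, actual in pairs if site_of(declared) != site_of(actual))
    return mismatched / len(pairs)


# Invented sample: 7 matching impressions and 3 mismatched ones
sample = (
    [("http://www.recipes.example/a", "http://recipes.example/a")] * 7
    + [("http://recipes.example/ads", "http://streaming.example/watch")] * 3
)
gap = transparency_gap(sample)
print(f"{gap:.0%}")
```

The difficulty in practice lies in obtaining the rendered URL at all, which is what the nested iframes obscure; the arithmetic itself is simple.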
The only effective method for Fraud prevention is employing a solution that combines accurate Fraud identification with
technology that eliminates the transparency gap on each impression served in any complex serving environment. DoubleVerify
offers these solutions to marketers through our verification blocking (BrandShield) and reporting (BrandAssure) products. Our
real-time platform integration solution, BrandShield Connect, provides ad networks and exchanges the technology necessary
to penetrate the transparency gap and correctly identify the true URL and safety of the ad impression they are selling. All of the
DoubleVerify solutions combine our industry-leading Fraud identification with our superior ability to penetrate the transparency
gap – the two critical elements that provide a complete and effective solution against online advertising Fraud.
Moving the Industry Forward
A discovery of Fraud as large as this has significant impact for our customers and it reinforces the significant negative effects
that this Fraudulent activity has on the legitimate online display advertising ecosystem. It also confirms the value of vigilance
via independent third-party verification continually driving for transparency on ad campaigns. DoubleVerify wants to thank
our agency partners that helped us aggressively validate this Fraud activity and supported the rapid deployment of this
new technology into our systems in order to protect our joint advertising customers. For advertisers and suppliers using our
technology, we added all 1,220 sites into our recently enhanced content categories, and our systems are currently reporting and
blocking impressions from these sites as determined by the campaign settings.
DoubleVerify remains committed to overcoming the transparency gap in online display advertising and eliminating all types of
Fraud that diminish the value and credibility of its participants so that together we can build a better industry.
[1] Fraud Definition and Clarifications
• DoubleVerify’s MRC-accredited Verification solutions utilize two
content categories to classify sites as Fraud and provide clarity for
our customers:
• Confirmed Ad Impression Fraud: Sites with significant direct
evidence confirming that Ad Impression Fraud is taking place.
Ad Impression Fraud manipulates ad serving, ad display or traffic
activity such that ad impression measurements are incremented
inappropriately because the ads cannot be seen by a user, are not
served within operationally viewable parameters or were displayed
as a result of machine-generated traffic.
• Suspected Ad Impression Fraud: Sites with significant circumstantial
evidence that Ad Impression Fraud is taking place such as site
data, traffic patterns, viewability data and relationships to sites with
Confirmed Ad Impression Fraud.
• For the purposes of this report, any description of Fraud means Ad Impression Fraud. Sites described as Fraudulent means
those in this study that we have classified into the “Confirmed Ad Impression Fraud” or “Suspected Ad Impression Fraud”
contextual categories. Fraudulent activity or Fraudulent impressions means the advertising impressions on Fraudulent
Sites.
• As used by the IAB Verification Guidelines, DoubleVerify contextual category definitions, this release, and related posts
and reports, the use of the words Fraud, Fraud Site, Ad Impression Fraud, Fraud activity, Laundering, Laundering Site and
other related terms are not intended to represent Fraud or Laundering as defined in various laws, statutes and ordinances
or as conventionally used in U.S. Court or other legal proceedings, but rather a custom definition strictly for advertising
measurement purposes.
• The statements made in this report regarding Fraud all relate to DoubleVerify’s observations and data collection during
April 2013. These statements shall not be construed to imply information or conclusions regarding activity that took place
outside of the studied period.
Let’s build a better industry.
Contact info@doubleverify.com.
Visit us at doubleverify.com to learn more.