rabbithole

Chrome Extension for publishers to investigate third-party ads

As a publisher on the web today, you may have a love-hate relationship with ad networks. On one hand, those ad networks help to pay the bills. But on the other hand, their questionable coding abilities, poor security, and dubious business practices expose you to all sorts of problems.

Our team has spent countless hours dissecting the DOM to try to track down offending ads. This is tedious work, especially when an ad call gets bounced from network to network, creating an extremely complex DOM of nested IFRAMEs.

rabbithole is designed to streamline this process. Using a CSS selector that you provide, it identifies the top-level DOM elements of your ad units. It then recursively descends into those elements, building a simpler object model for you to analyze.

  • simplifies the identification and interpretation of the ad-related portion of the DOM
  • "compresses" the tree by skipping past DOM elements that aren't really important and only tracking select properties
  • makes off-site objects easily identifiable
  • tries to identify pixel trackers
  • tries to identify the DOM elements that are most likely the ad creative itself
  • attempts to keep track of a "network path" so you can see the whole chain of networks an ad passed through (this is a work in progress)
  • computes statistics like the number of scripts and iframes loaded beneath each node in the tree
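
The "compression" and statistics steps listed above can be sketched roughly as follows. This is a minimal illustration, not rabbithole's actual code: the node shape ({ tag, id, src, children }), the set of tags treated as important, and the function names are all assumptions made for the example.

```javascript
// Hypothetical simplified model; rabbithole's real node shape may differ.
// Tags assumed to be worth keeping in the compressed tree.
const INTERESTING_TAGS = new Set(["IFRAME", "SCRIPT", "IMG", "A", "OBJECT", "EMBED"]);

// Build a compressed copy of `node`: uninteresting wrappers (DIV, SPAN, ...)
// are skipped and their interesting descendants are hoisted up one level.
function compress(node) {
  const children = node.children.flatMap(compress);
  if (!INTERESTING_TAGS.has(node.tag)) {
    return children; // skip this element, keep what lies beneath it
  }
  return [{ tag: node.tag, src: node.src || null, children }];
}

// Annotate every node with the number of scripts and iframes beneath it.
function addStats(node) {
  let scripts = 0;
  let iframes = 0;
  for (const child of node.children) {
    addStats(child);
    scripts += child.stats.scripts + (child.tag === "SCRIPT" ? 1 : 0);
    iframes += child.stats.iframes + (child.tag === "IFRAME" ? 1 : 0);
  }
  node.stats = { scripts, iframes };
  return node;
}

// Summarize one ad unit: the root element is always kept, whatever its tag.
function summarizeAdUnit(root) {
  const children = root.children.flatMap(compress);
  return addStats({ tag: root.tag, id: root.id || null, children });
}
```

Calling summarizeAdUnit on the top-level element of an ad unit returns a smaller tree in which every node carries a stats object with the script and iframe counts for its descendants.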

Configuration

In order for rabbithole to work with your site, you need to give it a CSS selector that will capture all the top-level DOM elements of your ad units (and is specific enough to not capture non-ad-unit elements).

For example, if you look at the ad tags on http://slashdot.org/, you will see that they all have IDs like div-gpt-ad-728x90_a and div-gpt-ad-300x250_a. You can capture them all with a selector like:

[id*='div-gpt-ad']
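
One way to check a candidate selector before saving it is to try it in the DevTools console of your page. The helper below is only a sketch for that purpose; the function name is made up for this example and is not part of rabbithole.

```javascript
// List the ids of the elements matched by a candidate ad-unit selector.
// `doc` is any object with querySelectorAll, such as the page's `document`.
function listAdUnitIds(doc, selector) {
  return Array.from(doc.querySelectorAll(selector), (el) => el.id);
}

// In the DevTools console of the page you would call:
//   listAdUnitIds(document, "[id*='div-gpt-ad']");
```

If the returned list contains every ad unit and nothing else, the selector is specific enough.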

Once you have your selector, use the options dialog to configure rabbithole. You can open it either from the extensions page in Chrome or by clicking the gear icon in the rabbithole popup window. If you have a rabbithole window open when you change the CSS selector, you'll need to close and reopen it so that it rescans your page's DOM.

Installing from source

If you enable developer mode in Chrome's extension manager, you can use "Load unpacked extension". Just point Chrome at the rabbithole directory.

Notes

The code is interesting on a few levels:

Iframe traversal

rabbithole builds a single unified tree representing your page's DOM along with the DOMs of all nested IFRAMEs, including those from other domains. JavaScript running within a page cannot access the DOM of an iframe from another origin. A Chrome extension can get around this limitation by injecting a script into the page and into every iframe. It can then pass a message to each frame, gather the information the frames return via callbacks, and merge the sub-trees into the main tree.
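
The merge step could look something like the sketch below. How rabbithole actually matches an IFRAME element to the frame it hosts is not described here, so the frameId and childFrameId fields, the report shape, and the function name are assumptions made purely for illustration.

```javascript
// Hypothetical shapes: each frame reports { frameId, tree }, and an IFRAME
// node in a parent tree carries `childFrameId` naming the frame it hosts.
function mergeFrameTrees(reports, topFrameId) {
  const treeById = new Map(reports.map((r) => [r.frameId, r.tree]));

  // Return a copy of `node` with nested frame trees grafted in.
  function attach(node) {
    const children = node.children.map(attach);
    if (node.tag === "IFRAME" && treeById.has(node.childFrameId)) {
      // Hang the nested frame's tree beneath the IFRAME element.
      children.push(attach(treeById.get(node.childFrameId)));
    }
    return { ...node, children };
  }

  return attach(treeById.get(topFrameId));
}
```

The function copies nodes instead of mutating them, so the per-frame reports stay intact if the page needs to be rescanned.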

Message passing

Rather than broadcasting one message to all frames, rabbithole sends a separate message to each frame. This is important: if we sent a single message to all frames, we would only be able to process one callback (the first one that happens to be called by one of the frames).
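
A minimal sketch of the per-frame approach is shown below. The sendToFrame wrapper, the message shape, and the function name are assumptions for this example. Inside an extension, such a wrapper could be built on chrome.tabs.sendMessage with its frameId option, but whether rabbithole does exactly that is not stated here.

```javascript
// Ask every frame for its sub-tree, one message per frame, and call `done`
// once all frames have answered. Assumes the frame ids are unique.
// `sendToFrame(frameId, message, callback)` is a hypothetical wrapper; in an
// extension it could be
//   (id, msg, cb) => chrome.tabs.sendMessage(tabId, msg, { frameId: id }, cb)
function collectFromFrames(frameIds, sendToFrame, done) {
  const replies = new Map();
  if (frameIds.length === 0) {
    done(replies);
    return;
  }
  for (const frameId of frameIds) {
    // Each frame gets its own message, so each frame gets its own callback.
    sendToFrame(frameId, { type: "scan" }, (reply) => {
      replies.set(frameId, reply);
      if (replies.size === frameIds.length) done(replies);
    });
  }
}
```

Because every message has its own callback, no frame's reply is dropped, and the caller learns when the last one has arrived.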

Standalone window

I originally started with the rabbithole UI inside a regular extension popup. The problem is that you can't move the popup, so it can be hard to see the ads in your page while you are inspecting them. To create a standalone window that can be moved, I changed the code to use an event page instead of a default_popup. When the user clicks the extension button, we inject the code into the frames from background.js (we can't do it in popup.js), then we send a message to the newly created window to have it begin messaging the frames and gathering the data.
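
The click flow described above might be sketched as follows, using extension APIs from the Manifest V2 era. The Chrome API object is passed in as a parameter so the flow can be exercised outside the browser. The file names content.js and window.html, the message shape, the window size, and the function name are hypothetical.

```javascript
// background.js sketch. `api` stands in for the global `chrome` object.
function onButtonClicked(api, tab) {
  // 1. Inject the scanner into the top document and every nested frame.
  api.tabs.executeScript(tab.id, { file: "content.js", allFrames: true }, () => {
    // 2. Open a movable window instead of relying on a default_popup.
    api.windows.create(
      { url: "window.html", type: "popup", width: 600, height: 800 },
      () => {
        // 3. Tell the new window which tab to inspect.
        // A real extension may need to wait until the window's page has
        // registered its onMessage listener before this message can arrive.
        api.runtime.sendMessage({ type: "start", tabId: tab.id });
      }
    );
  });
}

// In the extension itself this could be wired up as:
//   chrome.browserAction.onClicked.addListener((tab) => onButtonClicked(chrome, tab));
```

Keeping the injection in the background script means the scan can start as soon as the button is clicked, independent of any popup page.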

Credits

Thanks to Renzo Trigoso (https://github.com/rtrigoso/) for a lot of the UI work.
