It is well known that digital ads violate privacy, yet we know little about their content. Digital ads are of increasingly low quality and often contain manipulative and deceptive components that lure users into viewing potentially harmful content. This harmful content ranges from disinformation and hyper-partisan websites to products with questionable claims or value, such as health products and payday loans. Many such ads are directed at vulnerable populations like children, older adults, and low-income individuals, all of which calls for more public scrutiny of their content. Yet owing to their fleeting nature, digital ads are neither archived nor systematically studied.
The Princeton Digital Ad Observatory fills this gap. It is a large-scale, automatically updated repository of millions of digital ads that appear on the web. The observatory is powered by an ad crawler that visits websites, scrapes the ads they display, and stores them. These ads are made available through a search engine, which organizes them by brand, size, disclosures, location, and color.
I aim to use this repository to study the characteristics of digital ads, draw public attention to problematic ads, and hold the publishers, platforms, and networks that host these ads accountable. As a first line of analysis, I will use this data to identify and quantify dark patterns, clickbait, and other manipulative practices. I will also open-source the collection tool so that other researchers can continue this work and extend it to other settings, such as Facebook and disinformation websites. More broadly, because digital ads are an online reflection of the zeitgeist, this repository will also be useful to researchers in other disciplines, such as communication and marketing scholars, who are interested in studying how digital ads evolve.
If you would like to learn more about the project or use the data, please reach out.