Private search engines and traditional advertising platforms: uncovering the privacy risks
by Salim Chouaki and Oana Goga (CNRS)
Is it possible for private search engines to preserve user privacy while depending on traditional advertising platforms?
Several researchers and advocacy groups have condemned traditional search engines such as Google or Bing for the privacy-violating techniques they employ to deliver search results and ads to users. For example, these search engines use data collected by tracking users to estimate the relevance of results and to serve personalised ads. In response to these concerns, a number of private search engines such as DuckDuckGo, Startpage, and Qwant have emerged in the market. These private search engines promote a strategy of respecting user privacy and promise not to track users’ search and browsing activities, all while delivering relevant search results.
However, these private search engines rely on advertising for funding and can only maintain their commitment to privacy if they continue to deliver ads to their users. Alarmingly, they rely on traditional advertising systems that have been criticised for their lack of respect for users’ privacy: DuckDuckGo and Qwant use Microsoft’s advertising system, while Startpage uses Google’s advertising system. Furthermore, when examining the privacy policies of these private search engines, we find that they are either silent or ambiguous on the privacy of the ads they deliver to users.
Our objective was to investigate the privacy properties of the advertising systems employed by these three major privacy-focused search engines, and to compare them with those of two more widely used search engines: Bing and Google. We implement an automated measurement methodology to measure whether and how users can be re-identified (and hence have their privacy compromised) when clicking on ads on each search engine. We build an open-source implementation of this methodology in the form of a Puppeteer-based pipeline that simulates search queries and ad clicks. We apply this crawling methodology to the five search engines, providing a full dataset with visited websites, first-party storage, and web requests to search engines’ servers and/or other third parties when clicking ads.
We investigate the privacy properties of these search engines at three stages: (i) when they present search ads to users, (ii) when a user clicks on an ad, and (iii) when the user lands on the advertiser’s page.
- Users’ privacy is not harmed on private search engines before clicking on ads. Private search engines do not appear to attempt to re-identify users across visits and do not include resources from, or make network requests to, known trackers.
- Users’ privacy is compromised when clicking on ads on private search engines. All search engines engage in navigation-based user tracking. Navigation-based tracking refers to techniques that route users through one or more redirectors when they navigate from one website to another, in order to share user information across sites. Navigation-based tracking occurred on 4% of ad clicks on Bing, 100% of ad clicks on Google, 100% of ad clicks on DuckDuckGo, 86% of ad clicks on Qwant, and 100% of ad clicks on Startpage.
- Users’ privacy is compromised after clicking on ads on private search engines. We check whether the search engines require advertisers to abide by privacy-respecting practices by measuring whether advertisers include trackers or other known privacy-harming resources. We found that 93% of advertisers’ websites (across all search engines) included trackers and privacy-harming resources. Moreover, we check whether search engines or redirectors aid advertisers in profiling visitors by measuring the data they receive in the form of user-describing query parameters. We find that advertisers receive user identifiers as query parameters in 68%, 92%, and 53% of cases for DuckDuckGo, Startpage, and Qwant, respectively. This practice, known as UID smuggling, enables redirectors to aggregate more user behaviour data if they have scripts on the ads’ destination websites and store the user-identifying parameters they receive. Notably, in the case of private search engines, the user-identifying parameters are not set by the search engine but by the redirectors encountered between the search engine’s and the advertiser’s sites.
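The two techniques named in the findings above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the example URLs, the parameter name `clkid`, and the detection heuristic are all assumptions made for illustration. The sketch takes the chain of URLs a browser passes through after an ad click, lists the intermediate sites (the redirectors), and then flags query parameters on the landing URL whose values differ between two independent browser profiles, which is one plausible sign that a parameter carries a user identifier.

```python
from urllib.parse import urlsplit, parse_qsl


def site(url: str) -> str:
    """Approximate a URL's site by its last two host labels.

    Simplification: a real analysis would use the Public Suffix List.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def redirectors(chain: list[str]) -> list[str]:
    """Return the distinct sites visited strictly between the first URL
    (the search engine) and the last URL (the advertiser's page)."""
    seen: list[str] = []
    for url in chain[1:-1]:
        s = site(url)
        if s not in seen:
            seen.append(s)
    return seen


def candidate_uid_params(url_a: str, url_b: str, min_len: int = 8) -> set[str]:
    """Return query parameters present in both landing URLs whose values
    differ between two profiles and are at least min_len characters long.

    Assumed heuristic: identifiers vary per user, while campaign tags
    (for example a traffic-source label) stay constant.
    """
    a = dict(parse_qsl(urlsplit(url_a).query))
    b = dict(parse_qsl(urlsplit(url_b).query))
    return {
        key
        for key in a.keys() & b.keys()
        if a[key] != b[key] and len(a[key]) >= min_len and len(b[key]) >= min_len
    }


# Hypothetical navigation chain recorded after one ad click.
chain = [
    "https://search.example/?q=shoes",
    "https://ads.tracker-one.example/click?id=1",
    "https://r.tracker-two.example/go",
    "https://shop.example/landing?clkid=a1b2c3d4e5f6&utm_source=search",
]
# The same ad clicked from a second, independent browser profile.
other_landing = "https://shop.example/landing?clkid=ffee99887766&utm_source=search"

print(redirectors(chain))                               # ['tracker-one.example', 'tracker-two.example']
print(candidate_uid_params(chain[-1], other_landing))   # {'clkid'}
```

In this toy chain, the click passes through two redirector sites before reaching the advertiser, and only `clkid` is flagged, because `utm_source` has the same value in both profiles. A length threshold and a two-profile comparison are deliberately crude; they can miss short identifiers and can flag per-click values that are not tied to a user.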
Our results indicate that privacy-focused search engines’ privacy protections do not sufficiently cover their advertising systems. Although these search engines refrain from identifying and tracking users and their ad clicks, the presence of ads from Google or Microsoft subjects users to the privacy-invasive practices performed by these two advertising platforms. When users click on ads on private search engines, they are often identified and tracked by Google, Microsoft, or other third parties through navigation-based (bounce) tracking and UID smuggling.
Salim Chouaki, Oana Goga, Hamed Haddadi, and Peter Snyder. 2023. Understanding the Privacy Risks of Popular Search Engine Advertising Systems. In Proceedings of the 2023 ACM Internet Measurement Conference (IMC ’23), October 24–26, 2023, Montreal, QC, Canada. ACM, New York, NY, USA, 13 pages.