Google’s Contracts Harm Competition In The General Search Services Market

13 Aug 2024

United States of America v. Google LLC, Court Filing, retrieved on April 30, 2024, is part of HackerNoon’s Legal PDF Series. You can jump to any part of this filing here. This part is 27 of 37.

A. Google’s Contracts Prevent General Search Services From Accessing Scale And Reduced Scale Directly Reduces The Quality Of Rivals To The Detriment Of Consumers And Advertisers

1. Google Has Significantly Greater Scale Than Its Rivals

978. “Scale” refers to the amount of user-side data a search engine accumulates. User-side data includes: (1) the user query, (2) the ranked results returned by the search engine, (3) corresponding data generated from the user’s interactions with the results of a query, and (4) information, such as location and device type, about the user issuing a query. Supra ¶¶ 159–162.

979. Google receives nine times more queries in a day than all its rivals combined. Tr. 4761:4–24 (Whinston (Pls. Expert)) (endorsing UPXD102 at 47). For mobile queries, Google’s scale advantage is even starker. Google receives 19 times more mobile queries in a day than all its rivals combined. Id. 4762:19–4763:2 (endorsing UPXD102 at 49); Tr. 2662:20–2663:3 (Parakhin (Microsoft)) (In the United States, Bing’s mobile share “is immaterial, it’s probably around 3[%].”). Google’s scale advantage on mobile is particularly meaningful because that is where the search market has been growing. Tr. 5798:17–5799:5 (Whinston (Pls. Expert)).

980. Google has a meaningful scale advantage not just in the volume of queries, but in the breadth of queries it sees. Tr. 5785:11–5786:23 (Whinston (Pls. Expert)) (discussing UPXD104 at 44). That is, Google sees a wide variety of queries that other search engines never see. Prof. Whinston analyzed 3,708 million unique query phrases issued on Google and Bing during the week of February 10–16, 2020 (e.g., “Facebook,” “what’s the weather in DC,” or “the restaurant with the green awning on State Street in Madison, Wisconsin,” would all be considered unique query phrases).

Prof. Whinston found that for desktop and mobile combined, 93% of all unique search phrases were seen only by Google, 4.8% were seen only by Bing, and 2.2% were seen by both search engines.

For mobile phones, 98.4% were seen only by Google, 1% were seen only by Bing, and 0.7% were seen by both. Id. 5785:5–5789:13 (referencing UPXD104 at 44). Because the unique search phrases that were seen on both Bing and Google tended to be frequently issued queries (like “Facebook”), Prof. Whinston found that about half of all queries issued during this period were phrases that only Google saw. Id. 5789:14–5790:23 (referencing UPXD104 at 44).
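
The mechanics of this kind of overlap analysis can be illustrated with a minimal sketch; the toy logs below are invented for illustration and are not Prof. Whinston’s data or code.

```python
# Illustrative sketch of a phrase-overlap analysis over two query logs.
# The logs here are toy data, not the record evidence.
from collections import Counter

google_log = ["facebook", "facebook", "facebook", "weather dc", "weather dc",
              "green awning restaurant madison wi", "norton pub hours"]
bing_log = ["facebook", "facebook", "weather dc"]

g_counts, b_counts = Counter(google_log), Counter(bing_log)
g_only = set(g_counts) - set(b_counts)          # phrases seen only by Google
b_only = set(b_counts) - set(g_counts)          # phrases seen only by Bing
both = set(g_counts) & set(b_counts)            # phrases seen by both engines

total = len(set(g_counts) | set(b_counts))
print(f"Google-only phrases: {len(g_only)/total:.0%}, "
      f"Bing-only: {len(b_only)/total:.0%}, both: {len(both)/total:.0%}")

# Volume-weighted view: overlapping phrases tend to be frequent head queries,
# so the share of *query volume* on Google-only phrases differs from the
# share of unique phrases.
g_volume = sum(g_counts.values())
g_only_volume = sum(g_counts[p] for p in g_only)
print(f"share of Google query volume on Google-only phrases: "
      f"{g_only_volume/g_volume:.0%}")
```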

981. Google’s scale advantage extends to tail queries. Tail queries refer to queries that individually do not occur many times in the query stream. Tr. 1811:4–25 (Lehman (Google)) (A long-tail query is a “rare query,” and these are “masses of . . . queries that are extremely rare individually, but collectively, they make up a significant part of the query stream.”); DX0678 at -030 (Microsoft document identifying head, torso, and tail queries as each around one-third of the query stream); UPX1079 at -996 (Google document identifying the same); id. (noting that the first “1/3 of traffic,” head queries, are only “0.1% of distinct queries” and the last third of traffic (tail queries) are “90% of distinct” queries).

The definition of a tail query is relative to the volume and composition of a GSE’s query stream—the same query can be a tail query on one search engine and a head query on another. Tr. 2676:5–11 (Parakhin (Microsoft)) (“For example, if tomorrow on my home machine I . . . create [a] search engine, every single query will be [a] tail query for it, right. But for Google, of course, many queries would be very much head queries.”); Tr. 10341:23–10342:24 (Oard (Pls. Expert)) (“If you had 5 percent of [Google’s] data, that amount of user-side data, and you had something that was occurring once or twice a month, it would occur, at most, once a year. So something that looks to Google as something they can model, something they can work on, is invisible when you only have 5 percent of the user-side data, because your long tail gets to zero, whereas somebody that has 20 times as much user-side data will see 10 or 20.”).
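
Prof. Oard’s 5-percent example reduces to simple proportional arithmetic; a short sketch with the illustrative numbers from his testimony:

```python
# If observations are proportional to traffic share, a pattern that Google
# sees "once or twice a month" nearly vanishes for a rival with 5% of the
# user-side data. Numbers are illustrative.
google_sightings_per_month = 1.5      # "once or twice a month" at Google scale
rival_share = 0.05                    # rival with 5% of the user-side data
rival_sightings_per_year = google_sightings_per_month * 12 * rival_share
print(f"rival sightings per year: {rival_sightings_per_year:.1f}")  # ~0.9
```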

982. Google maintains a significant advantage over its competitors for tail queries. For example, Prof. Whinston’s query analysis found that for query phrases seen 1 to 4 times on Google (a proxy for a tail query), 99.8% were not seen by Bing at all. Tr. 5789:14–5790:23 (Whinston (Pls. Expert)) (referencing UPXD104 at 44).

983. Google also has an advantage in fresh queries. The term “fresh” describes the recency of queries and query data. Tr. 2251:10–23 (Giannandrea (Apple)) (Freshness means that the answer is more up to date.). Due to Google’s query volume, the company has a significant scale advantage in fresh queries. UPX0227 at -134 (noting “billions of times each day” Google searches “gives us another example, another bit of training data”).

984. Google continues to accumulate scale. From 2010 to 2021, the total number of queries Google received per year quadrupled from approximately 200 billion in 2010 to approximately 800 billion in 2021. Tr. 5829:23–5830:12 (Whinston (Pls. Expert)) (discussing UPXD104 at 56).

2. By Depriving Rivals Of Scale, Google’s Contracts Harm Competition In The General Search Services Market

985. Google’s contracts deprive rivals and potential entrants of the scale necessary to compete with Google’s search quality. First, scale is a crucial ingredient to the iterative cycle of search engine improvement.

The central way GSEs learn how to return better results is by observing users. Because of Google’s contracts, rivals are unable to access valuable data—especially in key query segments like tail, mobile, and fresh queries. Second, Google further weakens rivals’ ability to compete by depriving them of sufficient scale to fuel their development cycle. A crucial method of GSE development is through live experimentation.

Without sufficient scale, rivals cannot quickly and accurately identify areas for improvement and test their product development ideas on real users. Thus, without sufficient scale rivals cannot effectively compete with Google’s search quality.

a) Scale Is Important To Compete In General Search

986. Scale is vital for improving search quality. Supra ¶¶ 163–195 (§ III.E.1).

987. In search, increased scale fuels a feedback loop. Tr. 2644:20–2646:2 (Parakhin (Microsoft)); Tr. 1761:4–24 (Lehman (Google)) (discussing UPX0228 at -503). One step in the feedback loop is that more searches provide the GSE with more data, which improves search quality.

The more users a GSE observes, the more it understands what good and bad results are. Tr. 2644:20–2646:2 (Parakhin (Microsoft)) (“Simply if you’ve seen -- if this query was issued previously and people already clicked on certain results and read them, and some results they click-click-click back, it gives you a lot of information which results are actually good or not, and you can memorize them.”); UPX0228 at -503 (“The source of Google’s magic is this two-way dialogue with users. With every query, we give a [sic] some knowledge, and get a little back. . . . After a few hundred billion rounds, we start lookin’ pretty smart!”).

988. Of note, Google can train key algorithms with an amount of data not available to its rivals. Supra ¶¶ 163–166. For example, one of Google’s key algorithms, Navboost, makes use of all user-side data Google receives in 13 months. Tr. 1805:6–13 (Lehman (Google)); UPX0182 at -438 (“Navboost is one of the most successful algorithms in search quality. Without Navboost our quality will be fairly close to that of Yahoo. Navboost has single-handedly given us the quality lead we have over Yahoo.”); Tr. 6433:15–6434:2 (Nayak (Google)) (13 months means “quite literally all of the data that Google has collected over 13 months.”). To put this number into perspective, in 2020 Google’s share of the general search market was 89.2%, whereas Bing’s share was 5.5%; the ratio of those two numbers is 16.2. Supra ¶ 522.

If the amount of user-side data Google receives is proportional to the number of queries it receives, it would take Bing 17 years and 7 months to collect the data Google sees in 13 months. Tr. 5792:15–5793:23 (Whinston (Pls. Expert)); Tr. 10350:8–10351:8 (Oard (Pls. Expert)) (using a market share number provided by Google’s expert and calculating “more than 13 years of data, two decades of data”).
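
That parity figure follows mechanically from the paragraph’s stated assumption; a short sketch reproduces the arithmetic:

```python
# Reproducing the ¶ 988 arithmetic under its stated assumption that
# user-side data is proportional to query share.
google_share, bing_share = 0.892, 0.055   # 2020 general search shares (supra ¶ 522)
ratio = google_share / bing_share         # ≈ 16.2
months_needed = ratio * 13                # Bing-months to match 13 Google-months
years, months = divmod(round(months_needed), 12)
print(f"share ratio: {ratio:.1f}; Bing needs ~{years} years, {months} months")
```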

989. The greater volume of user-side data also allows Google to increase its search quality by helping identify areas for potential improvement and development. Des. Tr. 153:4–154:13 (Google-PN 30(b)(6) Dep.) (“[O]ften we look at queries that have [a low quality metric score] to try and understand what is going on, what are we missing . . . . So that’s a way of figuring out how we can improve our algorithms.”); Tr. 2257:11–15 (Giannandrea (Apple)) (The more queries a search engine sees, “the more opportunities the engineers have to look for patterns and improve the algorithm.”). Supra ¶¶ 196–199 (§ III.E.2).

990. Prof. Whinston’s empirical analysis supports the conclusion that greater scale improves search quality. Both Google and Bing have higher quality results for head queries (queries seen more frequently), as compared to tail queries. Tr. 5795:4–5796:15 (Whinston (Pls. Expert)) (referencing UPXD104 at 50). Prof. Whinston’s analysis showed that the gap between Google’s and Bing’s respective quality scores is larger for tail queries than for head queries. Id.

991. Prof. Whinston also analyzed how long a user stayed on results pages after clicking; this also supported the conclusion that greater scale improves search quality. Tr. 5797:3–5798:15 (Whinston (Pls. Expert)). One measure Google uses for a search result page’s quality is the length of time that a user spends on a search-result website. UPX0007 at -433. Google measures this by tracking the “click split,” which is the ratio of long clicks to short clicks. UPX0007 at -433 (“‘Click split’ is . . . a proxy lately for goodness of outcomes per query.”).

A short click, i.e., when the user clicks on a search result link and quickly returns to Google, indicates bad quality, and a long click indicates high quality. Tr. 5797:3–5798:15 (Whinston (Pls. Expert)); UPX0007 at -433. A comparison of the click split for Google’s responses to head, torso, and tail queries found that head queries had a better click split (more long clicks) than torso queries, and torso queries had a better click split than tail queries. Tr. 5797:3–5798:15 (Whinston (Pls. Expert)). This analysis shows that Google serves better search results for queries that it sees more frequently, as compared to queries it sees less often. Id.
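
A minimal sketch of a click-split computation as described above; the 60-second long-click threshold is an assumption for illustration, since the record does not specify Google’s cutoff:

```python
# Click split: the ratio of long clicks to short clicks on a results page.
def click_split(dwell_times_seconds, long_click_threshold=60):
    long_clicks = sum(1 for t in dwell_times_seconds if t >= long_click_threshold)
    short_clicks = len(dwell_times_seconds) - long_clicks
    return long_clicks / short_clicks if short_clicks else float("inf")

head_query_dwells = [120, 95, 300, 15, 240]   # hypothetical dwell times (seconds)
tail_query_dwells = [10, 8, 75, 12, 20]
print(click_split(head_query_dwells))   # 4.0  -> mostly long clicks, good results
print(click_split(tail_query_dwells))   # 0.25 -> mostly short clicks, poor results
```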

992. Increased search quality will attract users and advertisers (who follow users), resulting in increased advertising revenue, which increases resources available for investment. Increased investment, in turn, indirectly and directly leads to greater scale (through revenue used for distribution) and search quality (through investments in search). This in turn attracts more users and leads to greater scale, in a flywheel that continuously spins. Tr. 2652:2–14, 2653:2–2654:5, 2654:20–2655:13 (Parakhin (Microsoft)); infra ¶ 1054.

993. Because of this scale-driven feedback loop, GSEs are greatly affected by how much scale they have compared to their rivals. Tr. 2646:7–22 (Parakhin (Microsoft)) (Relative traffic, for example, functions such that “if I have more traffic than my competitors, that participates in multiple feedback loops driving quality and driving index completeness, which in effect is driving quality. And not unimportant, it is very impactful for revenue.”); Tr. 2652:2–2654:5, 2654:20–2655:13, 2681:25–2682:9 (Parakhin (Microsoft)) (Differences in relative scale create an asymmetry that also has a self-reinforcing component. If a GSE’s “relative scale is larger, [its] quality is better so people are more likely to prefer [its] results, and advertisers are more willing to come to [it] so [it will] have more revenue, and so [it will] have more money to invest” and with more money to invest it can buy more distribution that increases its scale, and spend more on infrastructure and engineers that make its quality better.).

994. Consistent with the importance of scale, where a rival has sufficient scale, it may narrow the search-quality gap. For example, on desktop, with its ownership of the Windows operating system, Bing (including partners) has recently served about 13–25% of the desktop search market. Tr. 2662:20–2663:1 (Parakhin (Microsoft)); Tr. 3495:16–20 (Nadella (Microsoft)). As a result of its scale on desktop, Bing has narrowed the search quality gap on desktop. UPX0238 at -667 (“Overall, Google leads other search engines. However, Bing is comparable on desktop.”). Google’s and Apple’s evaluations found that in certain areas, Bing outperforms Google on desktop. UPX0238 at -679–80; UPX0260 at -681 (Apple evaluation found that “Bing in English (US) Desktop is actually preferred” to Google.); UPX0187 at -713 (Because Bing does not “have scale in queries they are not good in the long tail yet” even though “[f]or the head part of mainstream queries they have narrowed the gap.”).

i. Access To Tail Scale Is Necessary To Compete In General Search

995. Scale depth cannot substitute for scale breadth. Search quality depends on having a sufficient volume of queries and a sufficient variety of queries rather than more instances of the same query. Supra ¶¶ 980–983.

For example, having more head queries does not allow a search engine to optimize tail queries. Des. Tr. 151:20–152:16 (Google-PN 30(b)(6) Dep.) (more queries generally do not help address tail queries if the additional queries are all head queries); Des. Tr. 93:18–25 (van der Kooi (Microsoft) Dep.) (Microsoft has not reached parity with Google for tail queries because they do not show up with the same frequency.); Tr. 1902:8–19 (Lehman (Google)) (“[F]or these head queries that we’ve seen many times and for which we have [click data], we’re more confident in that boost data . . . as for queries that are in the longtail where we have little scraps of data, it’s ambiguous, it’s harder to figure out.”).

996. Tail queries particularly benefit from additional scale because these queries are not frequently observed by GSEs. Tr. 2675:14–24 (Parakhin (Microsoft)) (less frequent queries like tail or location-specific queries tend to benefit more from scale); UPX1079 at -996 (“[T]he vast number of queries we see rarely or even just once [are] the tail [queries].”); Tr. 10343:2–10345:9 (Oard (Pls. Expert)) (“And so it follows exactly what you would expect, that the long tail queries are where user-side data can be particularly valuable, because if I have a head query, a query that’s occurring very often . . . then I don’t have to have a whole lot of user-side data before I’ve seen a lot of [that head query]. And if I see a lot more [of that head query] I’m not probably going to get a whole lot better. But if I’m seeing zero or 20, there’s a big difference.”). Google’s scale at the tail allows it to improve its tail quality as compared to rivals “because at [Google’s] scale, even the most obscure choice would have been exercised by thousands of people.” UPX0205 at -202.

997. A GSE’s ability to respond to tail queries is important to its ability to attract and retain users. Tr. 2251:24–2252:13 (Giannandrea (Apple)) (“[T]he tail requirement is pretty onerous” because users “would become suspicious if they knew something existed and they couldn’t find it . . . .”); Des. Tr. 249:12–250:15 (van der Kooi (Microsoft) Dep.) (Tail quality is important because a “consumer . . . is most loyal to a product where all their needs are being met. So they care about the head and the torso, but they certainly care about the quality on tail queries as well because that is where they have local leads, et cetera.”).

ii. Access To Mobile Scale Is Necessary To Compete In General Search

998. Mobile scale is necessary to improve the quality of mobile search and compete in general search. Tr. 3495:23–3496:16 (Nadella (Microsoft)) (User quality for search requires participation in both desktop and mobile.); Tr. 2260:22–25 (Giannandrea (Apple)) (differences between mobile and desktop make access to mobile queries at scale important to search quality on mobile); Tr. 2765:5–22 (Parakhin (Microsoft)) (Scale on a particular form factor allows you to optimize to that form factor.).

999. The user-side data GSEs obtain from mobile users is markedly different from the user-side data the GSEs obtain from desktop users. Tr. 2663:7–2664:6 (Parakhin (Microsoft)); UPX0262 at -990 (“[W]e found user search intent and interaction patterns on mobile are substantially different from the patterns on desktop . . . .”).

1000. First, the query mix between mobile and desktop is different. Tr. 2260:19–21 (Giannandrea (Apple)). For example, desktop and laptop users tend to research things (like a new mortgage) that take more time. Tr. 2650:3–19 (Parakhin (Microsoft)). In contrast, mobile users are more likely to quickly search for things needed in the moment, like restaurants. Id.

Another important difference between desktop and mobile users is that mobile users tend to issue more local queries. Des. Tr. 80:18–82:15 (Google-PN 30(b)(6) Dep.); Tr. 2661:5–8 (Parakhin (Microsoft)); Des. Tr. 249:10–250:15 (van der Kooi (Microsoft) Dep.) (“[O]n a mobile device there is a larger percentage of those local and tail queries.”). A local query is defined as one where the results vary depending on the location of the user. Tr. 2660:18–2661:4 (Parakhin (Microsoft)) (“So, for example, querying President of the United States is the same -- the result will be [the] same no matter where you are in the world whereas best restaurant near me would be very different depending on where you are in the world.”); Tr. 222:14–15 (Varian (Google)) (“A local query would be a query of a local store or merchant, geographically local.”).

1001. Moreover, because of the on-the-go nature of mobile devices, local queries on mobile tend to be more varied and fine-grained. Tr. 2660:1–17 (Parakhin (Microsoft)) (“On desktop, the location information [is] either unavailable or is much coarser or granular, right, because most laptops or desktops don’t have GPS or fine-grained location information.”); Tr. 3506:22–3507:12 (Nadella (Microsoft)) (Without mobile scale, a GSE would not “get the local restaurant” that it “didn’t see on the desktop.”).

Because local queries seek very specific location-dependent information, they tend to be tail queries. Tr. 2771:14–2772:15 (Parakhin (Microsoft)) (“[F]or certain locations, popular locations . . . like New York City, it would be [a] head query . . . . [I]f you’re querying it from less popular locations somewhere in the field, like, nobody ever might have queried in that area ever” it would be a tail query.); id. 2661:13–16 (“It’s much more likely that the local query will be [a] tail query.”); Tr. 10346:20–10349:14 (Oard (Pls. Expert)).

1002. Second, user intent on mobile and desktop differs even when the user issues that same query. Tr. 6315:24–6317:16 (Nayak (Google)) (for the same queries, mobile users sometimes have different intents than desktop users); Des. Tr. 80:18–82:15 (Google-PN 30(b)(6) Dep.). For example, a Google-conducted study observed that for the query “norton pub,” “desktop users meant ‘Norton Publishing’ company, whereas mobile users wanted to find a pub named ‘norton.’” UPX1087 at -723.

Within mobile, GSEs observe a wider variety of user intents based on the user’s specific location. Tr. 2772:20–2773:16 (Parakhin (Microsoft)) (“You know, the word Eiffel Tower, if you’re in Paris would mean Eiffel Tower. If you’re somewhere else, it might mean like local attraction, Las Vegas Eiffel Tower or, again, [a] restaurant with the same name . . . .”).

1003. Third, users interact with mobile and desktop results differently, producing different data. UPX0201 at -211 (“Mobile gets a different aspect of user behavior.”); Des. Tr. 80:18–82:15 (Google-PN 30(b)(6) Dep.). As compared to desktop users, mobile users tend to interact with results using “non-click” type feedback (i.e., hovers, scrolls, quickly “abandoning” results they are satisfied with). UPX1087 at -720, -733; UPX0262 at -991 (With the dawn of the “mobile era” it became important to track “abandonment (query event which is followed by neither clicks nor manual query refinement) and attention signals measuring how long the user spent on a result,” so that they “could be employed as important non-click signals that can be positive indicators for results providing answers.”).
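
A hedged sketch of the non-click signal taxonomy quoted above; the function signature and the 30-second attention cutoff are illustrative assumptions, not Google’s actual schema:

```python
# Classify a query event as a click, a manual refinement, or an abandonment,
# treating long on-SERP attention as a positive "good abandonment" signal
# (the user found the answer on the results page itself).
def classify_query_event(clicked, refined_query, seconds_on_results_page):
    if clicked:
        return "click"
    if refined_query:
        return "refinement"
    # Neither a click nor a refinement: an abandonment. Long attention can be
    # a positive indicator for results that provide answers directly.
    return "good_abandonment" if seconds_on_results_page >= 30 else "abandonment"

print(classify_query_event(clicked=False, refined_query=False,
                           seconds_on_results_page=45))   # good_abandonment
```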

1004. GSEs, which rely on patterns derived from user-side data to improve search quality, derive different conclusions from observing mobile versus desktop user-side data. Mobile scale is particularly important to be able to serve high quality results in response to the local queries that are only seen on mobile devices.

This is because, by observing user behavior in relation to location, GSEs learn when users seek different information. Tr. 2660:1–17 (Parakhin (Microsoft)) (“So it’s very important to have as much mobile traffic as possible to be able to answer queries that are very location-specific.”); Tr. 10346:20–10349:14 (Oard (Pls. Expert)) (“But fine-grain location has value for serving user needs if I know what other people in that region have looked for.”); Tr. 10419:1–19 (Oard (Pls. Expert)) (For “finer grained location[s]” GSEs “would need even more user-side data to train features of this type.”); Des. Tr. 202:13–203:6 (Edwards (Google) Dep.); Des. Tr. 133:17–134:12 (Google-PN 30(b)(6) Dep.) (“[W]e use your location if you should share it with us to retrieve results that are nearby. That proves to be incredibly valuable, also.”).

Thus, scale from desktop is not a substitute for mobile scale. Tr. 2663:4–2664:6 (Parakhin (Microsoft)) (“[Y]ou cannot easily sort of leverage data in one form factor to easily improve quality in another.”); UPX0259 at .004 (“We can’t always port over products as is from mobile [to desktop].”).

1005. Consistent with the importance of scale, Google has a quality advantage in mobile where it has a marked scale advantage as compared to its competitors. UPX0268 at -132–33 (finding based on a Google-conducted 2020 comparative analysis that “[a]cross the board, Google outperforms more on mobile than desktop”). Supra ¶ 979.

iii. Rivals Are Deprived Of Sufficient Fresh Scale To Compete

1006. [Intentionally Left Blank]

1007. Fresh queries (at scale) are important for a GSE to provide useful responses to queries, as the meaning of search queries and search results changes over time. Tr. 10337:12–10339:6 (Oard (Pls. Expert)) (By observing users, Google learns that words have new meanings based on new events.); Tr. 1899:25–1902:4 (Lehman (Google)) (“[O]ld school techniques” that train on fresh user-side data are used to keep up with current events.); Tr. 2369:2–7 (Giannandrea (Apple)).

1008. As such, GSEs require fresh data to respond to fresh-seeking queries. For example, Google deploys “instant” systems that (1) log fresh user-side data and (2) improve search quality by promoting fresh results for fresh-seeking queries. DX0116 at -.027 (“Instant Glue will suppress” stale results); UPX1006 at -192 (“Instant Navboost” accounts “[f]or popular queries over the last 24 hours [redacted].”); Tr. 10336:18–10337:11 (Oard (Pls. Expert)) (“Instant Glue is only looking at the last 24 hours of logs. And because of that, the processing can be faster. And so that allows them to get updates available to the search engine in something on the order of 10 minutes.”). Google’s “fast response system . . . tries to make use of [new clicks] as quickly as possible.” Tr. 1901:12–1902:4 (Lehman (Google)).
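
A minimal sketch of a rolling 24-hour click store of the kind the testimony describes; the class and field names are assumptions, since Instant Glue’s internals are not in the public record:

```python
# Keep only the last 24 hours of click events so that fresh results are
# promoted and stale ones stop receiving a boost.
from collections import deque, Counter

WINDOW_SECONDS = 24 * 60 * 60

class RollingClickCounts:
    def __init__(self):
        self.events = deque()       # (timestamp, (query, result)) in arrival order
        self.counts = Counter()     # (query, result) -> clicks inside the window

    def record_click(self, query, result, ts):
        self.events.append((ts, (query, result)))
        self.counts[(query, result)] += 1
        self._expire(ts)

    def _expire(self, now):
        # Drop events older than 24 hours.
        while self.events and now - self.events[0][0] > WINDOW_SECONDS:
            _, key = self.events.popleft()
            self.counts[key] -= 1

log = RollingClickCounts()
log.record_click("election results", "news.example/live", ts=0)
log.record_click("election results", "news.example/live", ts=3_600)
print(log.counts[("election results", "news.example/live")])   # 2
```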

1009. Similarly, to serve fresh results, Google must regularly (in 2–3 month intervals) retrain deep-learning systems using fresh data. Tr. 6432:8–25 (Nayak (Google)) (Google trains RankBrain with fresh data at a regular cadence, because otherwise it would be blind to new events); Tr. 6448:20–6449:3 (Nayak (Google)) (Google must retrain RankEmbedBERT so that the training data reflects fresh events).

Failure to retrain Google’s deep-learning systems would result in a degradation of the quality of those systems because they would become “stale,” decreasing Google’s search quality. Des. Tr. 119:20–121:6 (Google-PN 30(b)(6) Dep.) (It is important to retrain deep-learning models at a regular cadence because fresh user-side data improves the performance of those models.); id. (“But once you retrain and you get this thing when you get a bump up in quality back to where it was before, because now you have got the fresher data and then, again, you see this steady decline.”).

1010. A search engine’s ability to accumulate data over time does not replace the need for fresh data at scale. Tr. 10350:8–10351:8 (Oard (Pls. Expert)) (Google ranking systems use as much as 13 months of user-side data. It would take Bing years to get the same quantity of data, and by that point, the data’s lack of freshness would make it not useful for training ranking systems.).

b) Rivals Lack Access To Sufficient Scale To Fuel Their Development Cycle

1011. GSEs run experiments to ensure system changes improve system quality; these experiments thus facilitate improvements. Supra ¶¶ 197–198. GSEs use various metrics to evaluate search results, including metrics based on human-rater evaluations and live evaluations. Human-rater evaluations are experiments done with a pool of hired reviewers. UPX0872 at -848.

Live experiments are conducted on real users of the live GSE. Id. at -849. Using multiple metrics allows the GSE to assess different aspects of search quality. Tr. 1786:11–16 (Lehman (Google)) (Human rater and live experiment metrics “provide two different perspectives on search quality . . . .”); UPX0204 at -220 (discussing difference between rater-based and user-based evaluation).

1012. Scale substantially increases a GSE’s ability to perform live experiments. Des. Tr. 92:13–93:10 (Jain (Google) Dep.) (“Google has such a high volume of users, you can get to statistical significance very quickly.”). Google uses its scale to continuously run live experiments and improve its search engine. UPX0870 at -.015 (“Every time you use Google Search, you participate in a number of experiments to test how users interact with new features or algorithms. Every search request you make usually hits experimental code.”); Tr. 2315:15–2316:1 (Giannandrea (Apple)) (agreeing that Google typically runs “thousands” of “live experiments” for “proposed improvements to its search product”). Google runs so many experiments that even at its scale, it “can run out [of] data easily.” UPX1059 at -304.
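
The statistical-significance point can be made concrete with the standard two-proportion sample-size formula; the parameters below are illustrative, not taken from the record:

```python
# Queries needed per experiment arm to detect a small click-through-rate
# change with a two-sided 5% test at 80% power.
import math

def samples_per_arm(p_base, p_treat):
    z_alpha, z_beta = 1.96, 0.84
    variance = p_base * (1 - p_base) + p_treat * (1 - p_treat)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_base - p_treat) ** 2)

# Detecting a 0.1-point move in a 30% click-through rate:
print(f"{samples_per_arm(0.300, 0.301):,} queries per arm")   # ~3.3 million
# A GSE serving billions of queries a day collects that in minutes; a rival
# with a small fraction of the traffic may wait days or weeks.
```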

1013. Because human-rater evaluations are blind to some aspects of search quality, they do not replace the need for scale to run live experiments. Tr. 1779:13–20 (Lehman (Google)) (Human-rater evaluations capture many aspects of search quality, “but it can’t get quite all of them.”); UPX0204 at -223–25 (listing shortcomings of human rater evaluations including “[r]aters may not understand technical queries” and “[r]aters cannot accurately judge the popularity of anything”); UPX0872 at -848–49 (“[H]uman eval metrics have many limitations: raters are not users and may not be able to represent the user & rating data quality is a concern.”); Tr. 10326:8–10328:2 (Oard (Pls. Expert)) (describing aspects of search quality that human raters are not good at measuring).

1014. Because of the limitations of human rater evaluations, Google regularly uses live experiments as a measurement of “ground truth” for how real users are responding to changes. UPX0213 at -720 (“Limits on rater-based evaluation force us to take decisions based heavily on live experiments. . .”); Tr. 9039:24–9040:6 (Fitzpatrick (Google)) (“We often can learn a lot more once a product or feature is out in the wild, seeing usage at scale, than we can just when we’re testing in a lab or with a handful of people.”).

Because live experiments provide an important and unique measurement of search quality, Google typically does not launch a change to its systems without running a live experiment. Tr. 2315:15–2316:4 (Giannandrea (Apple)) (Google did not “typically” make a change to its algorithm without a live experiment.); DX0080 at -743 (“All Ranking experiments run [Live Experiments] (if possible).”).

1015. Without access to sufficient mobile traffic, Google’s rivals cannot run live experiments (or are unable to run as many live experiments as would be optimal) to improve their search products. Des. Tr. 146:23–149:3 (Ribas (Microsoft) Dep.) (Bing cannot run as many experiments on mobile because it does not have the necessary scale.).

3. The Effects Of Scale On Search Quality Are Uniquely Important

a) Investments In Non-Scale Methods Of Improving Search Quality Do Not Mitigate The Need For Scale

1016. There are both scale and non-scale dependent ways to improve a GSE’s search quality. Tr. 2664:11–2664:18 (Parakhin (Microsoft)) (“[Q]uality is an aggregate term. It, of course, requires certain components that are scale-dependent, and, of course, also requires certain components that are not scale-dependent . . . .”); UPX1058 at -311 (2009 Microsoft document detailing the scale and non-scale gap between Bing and Google). Scale, however, can affect the search engine’s performance in unique ways—even aspects of search quality that might not initially seem scale dependent. Tr. 2664:19–2665:11 (Parakhin (Microsoft)).

For example, the speed of serving results—i.e., latency—depends on scale “because the higher the scale you have, the more likely it is that [a] query was issued that isn’t in [the] cache. And so in the cache somewhere closer to the user, you will have [a] higher density of end points.” Id.

1017. The effect of scale on search quality cannot be mitigated by non-scale factors such as additional investments in engineering headcount and machine learning. For example, engineering headcount is important “up to a point” but then around 2,000 to 3,000 engineers “it hits diminishing returns.” Tr. 2665:12–23 (Parakhin (Microsoft)) (explaining that this is why “companies like Yandex, [Naver] in Korea -- successful Korean search company or Bing all have roughly the same sized teams.”); id. 2666:21–2667:12 (“But without scale, even the best engineering has proven, at least empirically, to be virtually powerless.”).

Similarly, further investment in machine learning can help but “empirically, even significant improvement in algorithms does not tend to outweigh [the] importance of scale.” Tr. 2665:24–2666:11 (Parakhin (Microsoft)) (“If you don’t have scale, you can to a certain degree try to mitigate it by trying to be smarter and running more sophisticated machine learning algorithms. It will give you some way forward, which is why Bing very quickly embraced machine learning and was fully machine learning-based even in early -- or late 2000s. It’s not a substitution or a solution, it can be mitigation.”).

1018. Using publicly available data does not replace the need for scale from user-side data. Tr. 2763:9–24 (Parakhin (Microsoft)) (“[U]sing open data, using deep models, using better search algorithms, you can mitigate effects of scale to certain degree. We haven’t seen like them being able to reverse effects of scale.”).

1019. Further, click-based signals (including the absence of clicks) tend to be more important than non-click-based signals in terms of their effect on search quality. Tr. 2652:15–24 (Parakhin (Microsoft)); UPX0213 at -717, -722–23 (There are three primary signals used in ranking: body (the text of a document), anchors (links between documents) and clicks, which are by far the most important of the three); id. at -723 (“Exploiting user feedback, principally clicks, has been the major theme of ranking work in the past decade.”).

1020. Scale, not techniques, is a key differentiator in search quality, as Google recognizes and has recognized for a long time. UPX0856 at -345 (“Google’s chief scientist Peter Norvig shared his view: We don’t have better algorithms than anyone else. We just have more data.”); UPX1055 at -621 (“[W]e tend to overestimate how great [Google’s] techniques are, and under-estimate the effect of [our competitors] only seeing a fraction of [Google’s] overall query stream, compounded over decades.”).

In a 2009 PowerPoint, Alan Eustace, then VP of Engineering at Google, wrote in a slide titled “The Power of Data” that “[a] ton of data is better than a[n] ounce of algorithm” and that “[s]ome simple methods that appear to fail with limited data can work well with enormous amounts of data.” UPX0186 at .026. Dr. Nayak, Google’s VP of Search, recognized the importance of data to serving results. Des. Tr. 179:6–180:11 (Google-PN 30(b)(6) Dep.) (if someone had all of Google’s algorithms, but none of the data, the algorithms would not work).

1021. Data is also the key differentiator for deep-learning systems like Google’s RankBrain. UPX0861 at -827 (“In reality, machine learning practitioners spend the most time investigating sources of training data, processing that data, cleaning it up, and so on. Training data is more important than architecture. Generally speaking, the more training data, the better. So, you should always look to exploit the largest sources of training data that you can find. Even if the training data is noisy, as is the case with user click data used for RankBrain, the large volume of it . . . can help us extract some signal despite the noise.”).

b) Technological Improvements, Like Generative AI, Have Not Replaced The Need For Scale

1022. Generative AI (including large language model chatbots) performs a different function than traditional GSEs. “AI [lets] you do . . . things like summarization, presenting a single answer in ways that, honestly, search engines of old could not do.” Tr. 3696:15–3697:21 (Ramaswamy (Neeva)). “[F]iguring out what are the most relevant pages for a given query in a given context still benefits enormously from query click information. And it’s absolutely not the case that AI models eliminate that need or supplant that need.” Tr. 3696:15–3697:21 (Ramaswamy (Neeva)); Tr. 8287:3–5 (Reid (Google)) (“[V]ery much agree[s]” “that Bard is not the same as Google search.”); Tr. 8288:14–19 (Reid (Google)) (agreeing that “Bard is really separate from search”).

1023. Google and Bing’s Generative AI tools, like Bard, Search Generative Experience, and Bing Chat, synthesize the results of their respective traditional search systems. Tr. 2670:19–2671:9 (Parakhin (Microsoft)) (“The large language model [in Bing Chat] is used for reasoning and for providing the answer, but the base information is coming from search.”); id. 2670:10–18 (Bing Chat marries the functionality of ChatGPT and Bing); Tr. 8331:18–24 (Reid (Google)) (Google relies on the search index to verify or confirm Search Generative Experience (AI) responses against the results that it gets from Google search).
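
The grounding pattern the testimony describes can be sketched in a few lines; every name here is hypothetical, and this is a stylized illustration rather than Bing Chat’s or Search Generative Experience’s actual architecture:

```python
# The language model reasons over and phrases the answer, but the base
# information comes from a traditional search system.
def grounded_answer(question, search_engine, llm, k=5):
    results = search_engine.search(question)                 # traditional retrieval
    evidence = "\n".join(r.snippet for r in results[:k])     # top-k results as context
    prompt = (f"Answer using only the sources below.\n"
              f"Sources:\n{evidence}\n\nQuestion: {question}")
    return llm.generate(prompt)                              # synthesis, not retrieval
```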

1024. Thus, because they rely on traditional search results, these systems do not eliminate the need for scale. Tr. 2669:20–2670:2 (Parakhin (Microsoft)).

1025. Generative AI is still a nascent technology. Supra ¶ 393. It is still error prone and can produce inaccurate, out-of-date information. Tr. 8285:17–19, 8276:10–24 (Reid (Google)) (“[T]he technology is very nascent. It makes mistakes.”); UPX2068 at -454 (“There are known limitations with generative AI and [large language models], and search, even today, will not always get it right.”); supra ¶ 394.

1026. Generative AI large language models are expensive to train, both in cost and time, as compared to traditional search systems. Tr. 8278:12–18 (Reid (Google)) (Large language models are “definitely expensive to retrain”); id. 8281:17–24 (Language models require a large amount of computer power to “build a base foundation model,” making them “expensive to train.”); Tr. 6452:1–6 (Nayak (Google)) (Training and running AI models can be energy consumptive because of the computation required.); Tr. 6452:9–24 (Nayak (Google)) (AI models such as MUM can be more expensive than core systems.).

1027. Generative AI models will not replace traditional search. Tr. 7528:25–7530:8 (Raghavan (Google)) (does not believe that in 10 years people will be doing everything through chatbots and large language models); id. 7530:25–7531:8 (discussing UPX2040 at -299: chatbots and large language models have not created a whole new world and caused the old world to go away).

1028. [INTENTIONALLY LEFT BLANK]

1029. As with self-driving cars, which may perform acceptably under controlled conditions but substantially worse in real-world conditions, a search engine based solely on large language models (without access to a traditional search system or user-side data) may be useful and perform well for certain categories of queries but could not be used to build a competitive fully functioning search engine. Tr. 2763:25–2765:2 (Parakhin (Microsoft)).

4. Scale Impacts Advertiser Participation And Search And Text Ads Relevance

1030. Scale affects the advertiser side of search in two ways: (1) scale drives advertiser participation on a GSE’s platform and (2) scale improves a GSE’s ability to show relevant ads and increase ad click-through rates. Tr. 5828:1–20 (Whinston (Pls. Expert)) (referencing UPXD104 at 55).

1031. As with general search, Google enjoys a massive scale advantage in Search Ads—particularly Text Ads, which Google targets using keywords and prices through a standalone auction and matching system. Google’s market share is overwhelming in both Text Ads—over 80% since 2016 and 88% in 2020—and in Search Ads as a whole—a little below 65% since 2012 and 74% in 2020. Tr. 4777:24–4778:15, 4779:7–15 (Whinston (Pls. Expert)) (referencing UPXD102 at 62–63). This advantage becomes even greater on mobile, where Google has an even greater share of Search Ads, especially in Text Ads, due to its 94.9% market share in mobile general search queries. Supra ¶ 525.

1032. Advertisers allocate the vast majority of their paid search spend to Google. Tr. 5141:19–5142:13 (Booth (The Home Depot)) (The Home Depot allocates an “industry standard” 90% of its paid search spend to Google versus about 10% to Bing because “Google has more search volume” and more auctions.); UPX0841 at -460 (Microsoft’s analysis in 2018 noted that Bing does not have “about [redacted]% of the advertiser domains that [it] observes in Google’s ad clicks.”); supra ¶ 588.

1033. Improving the quality of a GSE’s Search Ads (including Text Ads) increases the overall search experience, meaning Google’s scale advantage in Search Ads reinforces its scale advantage in general search. Tr. 4194:14–4195:25, 4234:11–4235:4 (Juda (Google)) (“[I]mproving the quality of the ads that a user sees is more likely to help them achieve whatever their search goal is, sort of more quickly, more effectively . . . .”); Tr. 1328:14–1329:6 (Dischler (Google)) (“[W]e believe that it’s an actually worse user experience to not have ads on the page.”).

1034. For Search Ads, to train its machine learning models and improve the accuracy of matching Search Ads to queries, Google relies on both observed behavior from actual users as well as data from paid raters. Tr. 4199:17–22 (Juda (Google)) (Google’s machine learning models for predicted click-through rate are “grounded on actual user activity.”); Tr. 1789:1–3 (Lehman (Google)) (Technologies developed in search with user data benefit other parts of Google, including Search Ads; referring to UPX0219 at -426); UPX0454 at -644 (“Today we rely on observed user behavior (e.g. whether a user clicked on an ad and stayed on the page for a long time) as well as ad evaluations from a paid pool of trained raters to train our ML models and evaluate our experiments”).

a) Google And Other GSEs Rely On User Data From Scale To Increase Ad Clicks

1035. As described in more detail, supra ¶¶ 140–145, when a user performs a search, Google and other GSEs run multiple auctions between relevant ads to determine (1) which ads (if any) will appear on the SERP, (2) the order they will appear, and (3) the CPC of each ad. Tr. 4010:17–4012:5 (Juda (Google)) (discussing UPX0842 at -000); UPX0010 at -054–57, -064–66 (describing the series of auctions to fill each ad slot on the SERP and what inputs they use).

For every ad in every auction, Google calculates an LTV, which relies not just on the bid but also on a complex algorithm assessing the ad’s quality, i.e., long-term value to Google. Tr. 4248:12–4249:4 (Juda (Google)).

As described in greater detail supra ¶ 641, to calculate LTV (also known as Ad Rank), Google’s systems produce three predictions for each ad: the predicted click-through rate (pCTR), the predicted quality of the ad’s landing page (pLQ), and the predicted quality of the ad copy itself (pCQ). UPX0010 at -054–57 (explaining components of LTV); UPX6027 at -567 (written 30(b)(6) response: identifying “primary predictions” used in ad auction—likelihood of a click, quality of the ad copy, and quality of the advertiser’s landing page); UPX6058 at -002–04 (Google Ads Help: “About ad position and Ad Rank”).
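
For illustration only, a toy score built from the components named above (bid, pCTR, pLQ, pCQ); the record does not disclose how Google actually combines them, so the weighting here is an assumption:

```python
# A quality-weighted expected-value score in the spirit of LTV / Ad Rank.
def ad_rank(bid_usd, p_ctr, p_landing_quality, p_creative_quality):
    expected_value = bid_usd * p_ctr                    # expected revenue if shown
    quality = (p_landing_quality + p_creative_quality) / 2
    return expected_value * quality                     # long-term value proxy

ads = {"ad_a": ad_rank(2.00, 0.05, 0.9, 0.8),   # lower bid, high quality
       "ad_b": ad_rank(3.00, 0.02, 0.5, 0.4)}   # higher bid, low quality
print(max(ads, key=ads.get))                    # ad_a wins despite the lower bid
```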

1036. Google runs billions of auctions and displays billions of Search Ads per day. Tr. 1198:24–1199:5 (Dischler (Google)). The company relies on this massive scale to train the components of its LTV algorithms, which benefit from Google’s scale advantage.

1037. Google describes pCTR as the “most important quality metric” within its LTV algorithm. UPX0010 at -059–60. Google trains its systems to predict click-through rates by observing its users’ reactions to the ads it displays.

Thus, Google’s pCTR algorithm relies on user-side data. UPX6027 at -566–67 (written 30(b)(6) response: “Google’s predicted clickthrough rate (pCTR) machine-learning model uses query and click data.”); Tr. 8878:17–8879:1 (Israel (Def. Expert)) (“[T]he PCTR score rel[ies] on clicks.”); Tr. 4199:17–22 (Juda (Google)) (Google’s pCTR models “are grounded on actual user activity.”); UPX0231 at -978 (Internal Google email discussing serving ads to DuckDuckGo users: Click data is “a huge quality signal” to ensure a good user experience, and it is “rather important” because Google can “see to what extent in practice [DuckDuckGo’s] CTRs are much lower than what we would expect” by comparing them to other search partners.); UPX0010 at -059 (“Users vote with their clicks, and the more users click on a particular ad in response to particular queries, the more we learn the high-quality nature of the ad.”).

1038. Components of Google’s pCTR algorithm train on quantities of data that far exceed that possessed by Google’s rivals. Tr. 8879:9–8880:24 (Israel (Def. Expert)) (At least one component of pCTR model uses 12 months of data.); id. 8881:2–9 (According to Dr. Israel, Google’s volume of queries compared to its rivals show how much more data it has compared to its rivals.).

1039. pLQ is another critical component of Google’s Ad Rank algorithm. It predicts the quality of an ad’s landing page, which is the page a user arrives at after clicking on the Search Ad (usually a related page on the advertiser’s website). Tr. 4248:12–4249:4 (Juda (Google)); UPX0010 at -061–62.

1040. GSEs rely on user-side data in assessing landing page quality by evaluating whether a user’s pattern of interactions with the GSE before and after an ad click is consistent with high- or low-quality ads. Tr. 10281:14–10282:8 (Oard (Pls. Expert)) (User-side data is used to estimate landing page quality.); UPX0021 at -376.003–06. Google is no different: its pLQ algorithm predicts landing page quality using a model trained on user log data. UPX0021 at -376.003–06 (adopting pLQ measure based on sampled click data from user logs to estimate quality based on “the pattern of user interactions before and after the ad click”).

1041. Significantly, Google made the decision to train its current pLQ model on user log data only after determining alternatives were inadequate to model pLQ. In 2017, Google implemented the “Eagle Lunar Module” launch, which (1) abandoned training pLQ models using human-rater analysis of scraped ad/click data and (2) adopted, instead, a model relying on sampled ad clicks. UPX0021 at -376.003, -376.006–07. With the Eagle Lunar Module launch, Google identified multiple benefits from switching to user logs data, including that “logs data has large scale and richness” and that actual user logs data enables directly measuring serving accuracy at scale. UPX0021 at -376.007; UPX0248 at -279. Thus, Eagle Lunar Module opened the door to a “[h]uge amount of training data,” increasing the data “from [redacted] (query, ad) to potentially [redacted] over a year.” UPX0021 at -376.011.

b) Empirical Evidence Shows The Value Of Scale To Ad Predictions

1042. Empirical analysis shows click-through rates (CTRs) are higher for ads accompanying more popular queries, indicating scale helps search engines serve ads likely to receive clicks.

1043. Prof. Whinston analyzed Bing’s and Google’s CTRs for top-slot, first-position Text Ads on both mobile phones and desktop and reached two conclusions based on how CTRs changed with the frequency of the query and the comparison of Bing’s and Google’s CTRs. Tr. 5831:15–5834:16 (Whinston (Pls. Expert)) (referencing UPXD104 at 58).

Looking at CTRs based on the frequency of the query, the analysis showed that, for more frequently seen queries, both Google’s and Bing’s CTRs are higher. Id.; UPX1058 at -328 (In a review of search monetization, Microsoft noted, “[m]any more phrases may lack sufficient scale on Yahoo and Microsoft than Google.”).

Comparing CTRs on desktop and mobile, the analysis showed that Bing’s CTRs were higher than Google’s on desktop, but lower than Google’s on mobile phones. Tr. 5831:15–5834:16 (Whinston (Pls. Expert)) (referencing UPXD104 at 58). Bing has significantly more scale on desktop than on mobile, and this analysis shows that where Bing has decent scale on desktop it does well. Id. “[W]here Bing has no scale at all”—on mobile—it is “much worse.” Id.

1044. In a second analysis, Prof. Whinston compared users’ click splits (i.e., the number of users quickly returning to Google after clicking on an ad compared to those that do not return quickly, supra ¶ 991) for Search Ads served in response to head, torso, and tail queries. Tr. 5797:3–5798:15 (Whinston (Pls. Expert)).

Prof. Whinston observed worse click splits (meaning the user returned more quickly) for ads on tail queries than for those on torso queries, and worse click splits for ads on torso queries than for those on head queries. Id. Google commonly uses click splits as a measure of quality for both ads and organic results. Id.; supra ¶ 991.

Prof. Whinston’s analysis shows that the quality of the Search Ads Google returns is worse for queries Google sees less often and better for those it sees more often.

1045. These three conclusions from these two analyses mirror the results of Prof. Whinston’s analyses of search quality. Supra ¶¶ 990–991.

c) Greater Relative Scale Benefits Monetization And Ad Quality

1046. A GSE’s relative scale—its scale relative to its competitors—increases its ability to monetize its advertising inventory in at least two ways: by increasing the pool of available relevant ads and by increasing auction pressure.

1047. First, the greater a search engine’s scale, the greater the number of advertisers interested in placing ads on the search engine. UPX0244 at .004 (“[S]mall increases in scale meaningfully improve advertiser participation, making it possible to offset fixed cost of investing in another platform.”); UPX1058 at -328 (“Volume scale drives long-run advertiser behavior.”). There are costs associated with participating on an ad platform and if a platform has fewer ad opportunities, the benefits of participating may not outweigh the costs. Tr. 5828:21–5829:7 (Whinston (Pls. Expert)); UPX0244 at .004; UPX1058 at -328 (“Advertisers maximize ROI through volume by comparing profits on each platform to fixed costs of entering a platform and ongoing costs of monitoring a campaign.”). Advertisers thus focus on the market leader when placing Search Ads. Supra ¶¶ 587–588.

1048. An analysis of queries and advertisers on Google (from 2010–2021) found that as the number of queries went up, the number of advertisers also went up. Tr. 5829:23–5830:12 (Whinston (Pls. Expert)) (referencing UPXD104 at 56).

1049. Greater advertiser participation increases the pool of potentially relevant ads for each query, which increases the chance the GSE can present a relevant ad to a query. Tr. 2653:2–13 (Parakhin (Microsoft)) (With more scale, “you’ll have more advertisers, so you’ll have a high selection of ads, so you’ll have more to choose from, so you will pick up better ads.”); Des. Tr. 50:6–51:2 (van der Kooi (Microsoft) Dep.) (“[S]cale drives not only the scale of users, [it] drives how many advertisers are willing to participate in the market.”); Tr. 5828:1–20 (Whinston (Pls. Expert)) (“[S]cale drives advertiser participation . . . on a general search engine platform and that’s going to be important for monetization. . . . And then second, scale improves a general search engine’s ability to show relevant ads and increase ad click-through rates . . .”).

1050. The greater pool of ads conferred by relative scale is particularly important for tail and local ads. Des. Tr. 55:20–56:23 (van der Kooi (Microsoft) Dep.) (“[O]f the very important tail [of advertisers], we have only a small minority of advertisers. And that makes -- that is the biggest scale gap that we refer to, is indeed with tail advertisers that are crucially important when users type in queries that are more local.”); DX0504 at -871 (“Google has more ad coverage and depth in the lower torso and tail because of their demand strength.”).

1051. Some advertisers prefer to have their Search Ads appear only on desktop and some prefer to have their Search Ads appear only on mobile; greater scale on desktop does not attract advertisers interested only in advertising on mobile and vice versa. Tr. 2686:14–19 (Parakhin (Microsoft)); id. 2650:3–19 (Because user queries vary depending on the form factor (device), different advertisers prefer to advertise on different form factors, “whether it’s a desktop ad or mobile ad is very much significant for advertisers.”); UPX0117 at -001 (Search Ad dollars are shifting to mobile and Microsoft’s lack of mobile presence will put PC monetization at risk.); UPX1005 at -183 (“Mobile and Audience targeting represent 2 of the largest growth opportunities in Search Ads today.”).

ii. Greater Relative Scale Increases Auction Pressure And Monetization

1052. Second, increases in the pool of ads increase the number of ads competing for individual ad slots, thickening auctions. This increases revenue: as Google acknowledges, the more bidders in an auction, the greater the prices generated by that auction. Supra ¶ 622.
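
Why thicker auctions raise prices can be seen in a small simulation; this is a stylized second-price auction with uniform random bids, far simpler than a real ad auction:

```python
# In a second-price auction the clearing price is the second-highest bid,
# whose expected value rises with the number of bidders
# (for n i.i.d. Uniform(0,1) bids it equals (n - 1) / (n + 1)).
import random

def avg_second_price(n_bidders, trials=100_000):
    total = 0.0
    for _ in range(trials):
        bids = sorted(random.random() for _ in range(n_bidders))
        total += bids[-2]               # second-highest bid sets the price
    return total / trials

for n in (2, 5, 10):
    print(n, round(avg_second_price(n), 3))   # ≈ 0.333, 0.667, 0.818
```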

1053. The increases in auction pressure conferred by relative scale have a more-than-linear effect on revenue, generating increasing returns. A search engine with twice the scale of its competitors will generate more than twice as much revenue as the competitor. Tr. 2646:7–22 (Parakhin (Microsoft)) (Relative traffic “is very impactful for revenue. Revenue in search -- in advertising in general is nonlinear: If you’re twice as big as your opponent, you will make four times as much money. Not exactly these numbers, but I’m just trying to illustrate the concept of nonlinearity.”); id. 2683:25–2684:18 (The relationship between relative scale and a search engine’s ability to generate revenue through Search Ads is “strongly nonlinear” because relative scale both increases the number of clicks and makes each click more expensive.).
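
A stylized way to write down the nonlinearity Mr. Parakhin describes (an illustrative model, not a formula from the record): if click volume grows linearly with scale and auction pressure makes price per click grow with scale as well, revenue grows superlinearly.

```latex
\[
R(s) = \underbrace{c\,s}_{\text{clicks}} \cdot \underbrace{p(s)}_{\text{price per click}},
\qquad p \text{ increasing} \;\Longrightarrow\; \frac{R(2s)}{R(s)} = 2\,\frac{p(2s)}{p(s)} > 2 .
\]
```

In the special case where p(s) is proportional to s, doubling scale quadruples revenue, matching the “twice as big, four times the money” intuition.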

d) Scale Is Critical For The Search Ad Development Process, Including Conducting Experiments On Potential Launches

1054. Having more users and queries attracts more advertisers, which increases the speed and ability to make improvements to an advertising platform. Des. Tr. 50:6–51:2 (van der Kooi (Microsoft) Dep.) (“[I]f there is more users, more advertisers will follow. If there is more users . . . the data and the ability to make improvements in the platform, the speed at which that happens happens faster and, thereby, the product improves at a faster clip.”); id. 90:14–22 (describing “the flywheel”: “[M]ore users bring more advertisers, bring more product improvements, more quickly that, then, in turn, bring in more users.”); Tr. 2648:24–2649:10, 2650:20–24, 2652:2–14 (Parakhin (Microsoft)) (The advertiser flywheel shows that more users bring more advertisers and better search quality, which leads to more relevant ads.).

1055. Google’s process for ad launches involves multiple A/B experiments on steadily increasing levels of traffic. Supra ¶¶ 144–145 (§ III.C.3.b); UPX0889 at -787 (Google analyzes logs and advertiser experiments to evaluate pricing launches.). Google’s massive scale enables it to get to statistical significance very quickly when conducting experiments on potential Search Ads launches. Des. Tr. 92:3–93:10 (Jain (Google) Dep.); id. 52:8–53:2 (describing Google’s use of long-term live experiments used to measure ad “blindness,” a metric for implementing Search Ads launches).

1056. These launches do more than just refine Google’s existing technologies and algorithms; they also enable Google to develop entirely new technologies. Thus, broader advertiser participation helps firms to improve not just matches, but the underlying matching technology itself. UPX0234 at -120 (“[I]t is true that we make most of our money from a bunch of head advertisers, but the additional data we do need in order to, for example, improve our matching algorithms”). As with search, these improvements compound over time, exacerbating the effect of scale differences. Supra ¶ 199.

e) Google And Other GSEs Recognize The Benefits Of Scale To Search Ads

1057. Google and other GSEs recognize the benefits scale confers for Search Ads, including Text Ads. Des. Tr. 110:5–6, 110:8–11, 110:13–17 (Jain (Google) Dep.) (Increasing data relating to consumer activity is always valuable to Google, leading to better Search Ads and better organic results.); id. 66:18–67:15 (Google can “infer, based on pattern recognition” what ads are likely to be effective “across cohorts and large groups of people.”); Tr. 2653:2–13 (Parakhin (Microsoft)) (More user interaction data improves ad relevance because “basically if you have more traffic and know more, then your ads will be better.”); Tr. 2367:24–2368:8 (Giannandrea (Apple)) (User data helps with ad monetization because “[k]nowing what ads people want and which ones they click on is essential.”); Des. Tr. 275:12–276:4 (Stein (IAC) Dep.) (Additional clicks and query data would allow Ask.com to better monetize its queries.); Des. Tr. 180:17–25 (Ribas (Microsoft) Dep.) (“[T]he ads quality was generally lower than the quality of the other elements of the page . . . because of the scale problem. If we didn’t have scale, advertisers wouldn’t come. If they wouldn’t give us more ads, we couldn’t have ads that were as relevant.”).

1058. And Google recognizes the benefits of scale for Search Ads, even as it has publicly tried to suggest otherwise. In a 2009 email, Dr. Manber forwarded to Diane Tang a proposed blog post by Dr. Varian; Dr. Manber wrote “he seems to be saying Ads Quality does not make good use of its data and does not need more. Is that true?” UPX0234 at -120. Ms. Tang responded to Dr. Manber, “no, it’s not really true. hal doesn’t really understand the ads serving system. ugh ugh ugh. it is true that we make most of our money from a bunch of head advertisers, but the additional data we do need in order to, for example, improve our matching algorithms. it’s also a matter of what traffic we get vs. what yahoo gets. it’s a massive oversimplification.” Id. Ms. Tang separately forwarded the chain to Dr. Ramaswamy, stating “is it just me, or is what hal wrote a bunch of hooey?” UPX0233 at -828. Ramaswamy responded “it is probably factually incorrect. the more advertisers bid on things, the more chance that things like smartass [Smart Ad Selection System] will learn what works and what doesn’t.” Id.

1059. Google recognizes the importance of scale for ad quality and ad relevance. Tr. 5830:13–5831:1 (Whinston (Pls. Expert)) (“[A]n extract from an article -- scholarly article published by 16 Google employees that was titled, ‘Ad Click Prediction, a View From the Trenches,’ published in 2013 and it’s just talking about . . . how difficult it is to predict what ad is a good, relevant ad for . . . a query. And . . . in bold you see here training data sets are enormous.” (referring to UPXD104 at 57)); UPX0023 at -719 (“[S]martass machine learning looks at historical data for how ads are clicked and uses that – learning all the time; model that makes these prediction[s] almost instantaneous and refreshed hourly.”); UPX0195 at -678 (“[I]n advertising the clicks of users continually generate more training data, and we end up with tens of petabytes of logs over a year or so so it’d be a huge waste of resources to train from scratch every time.”).

1060. No alternate means exist for rivals to acquire the scale necessary to exceed Google’s ad relevance. Des. Tr. 53:10–54:2 (van der Kooi (Microsoft) Dep.) (“[T]he ability to close the gap by bringing in ads from other ad platforms, . . . as a way to improve the advertising platform[,] is a very coarse and inadequate way of improving the advertising quality and advertising volume on the search engine. There is no substitute for actually having advertisers participate in the marketplace where they, as many advertisers do on an ongoing basis, optimize the keywords that they bid on, the ad copy that they create, [and] all of the ways that they tweak and manage search advertising on a daily basis.”).

5. The Importance Of Scale Is Confirmed Through The Actions Of Market Participants

1061. Getting access to scale is particularly difficult on mobile because Apple and Android—both of which have committed all default scale to Google through exclusive arrangements—are the only two significant mobile distribution channels and effectively act as “gatekeepers” to mobile scale. Tr. 3102:12–3104:25, 3112:23–3113:20, 3276:16–3277:23 (Tinter (Microsoft)) (“it’s very, very, very hard to achieve any quality scale”); Des. Tr. 96:16–23, 97:1–11 (Ramalingam (Yahoo) Dep.) (gaining distribution for Yahoo on mobile was “near impossible” because “there’s Apple and Android” and both were tied up with Google contracts).

1062. Microsoft has pursued a partnership with Apple because such a deal would provide Bing mobile scale and allow Bing to “make additional fixed cost investments on . . . search relevance or search scale.” Tr. 3502:3–20, 3503:9–16, 3508:6–18 (Nadella (Microsoft)); Tr. 2723:4–6 (Parakhin (Microsoft)) (a deal that gave Microsoft access to Apple’s mobile traffic would be “game changing” and Microsoft would “invest more into mobile”); Des. Tr. 94:11–19 (Ribas (Microsoft) Dep.) (a distribution deal with Apple would increase Microsoft’s incentives to invest in search); UPX0117 at -001 (“Scale benefits from bringing the Apple search volume into our marketplace are significant. . . . As search advertising dollars shift to mobile, lack of mobile presence will put PC monetization at risk. Apple partnership provides a foothold in the key mobile battleground.”); DX0435 at -741 (“a strategic partnership with Apple, IMO, is the only viable option for us at this point” to gain share on mobile devices).

1063. For example, when Bing sought the Safari default in 2015 and 2016, infra ¶¶ 1263–1272, Bing’s quality was a frequent topic of conversation with Apple; Microsoft told Apple that if Microsoft had more scale, “[Microsoft] could use that scale to rapidly improve the quality of Bing.” Tr. 3252:13–3255:03 (Tinter (Microsoft)); Tr. 2509:23–2510:11 (Cue (Apple)); UPX0613 and UPX0614 at -122 (2015 cover email and presentation sent from Mr. Nadella to Mr. Cook stating “[t]he combination of the large search volumes on Apple devices with Microsoft’s global search platform enables a high-quality search platform”).

1064. Microsoft sent Apple a document “explain[ing] the economic model built by the Microsoft team to model the impact of increased scale in the Bing Ads marketplace from a search partnership with Apple,” and stating that “[s]cale is crucial for delivering superior end user experience, publisher revenue, and advertiser ROI,” “[m]ore scale leads to greater investments, and enables innovations at a faster pace,” and “enables search engine[s] to offer a competitive product in all sides of the marketplace.” UPX0246 at -259.

1065. Again, during negotiations in 2018, discussions between Microsoft and Apple included exploring the investments that Microsoft would need to make “to take advantage of the scale” it would receive if it entered a deal with Apple. Tr. 3255:09–3256:16 (Tinter (Microsoft)); UPX0797 at -022, -024–27 (In a presentation Microsoft sent to Apple, Microsoft explains how it would invest to improve its international search quality with added scale from Apple.).

1066. In 2018, Microsoft was willing to consider all the options when it came to Apple, even selling Bing to Apple, because, “in a world where scale matters tremendously to [a GSE’s] ability to compete in mobile,” Microsoft considered a potential deal with Apple to be the most compelling and “the one that justified the most creative thinking.” Tr. 3276:16–3277:23 (Tinter (Microsoft)). According to Mr. Nadella, Microsoft was prepared to take billions in losses each year to obtain a distribution deal with Apple because gaining access to Apple’s mobile scale had the long-term potential to make Bing, specifically, and search, generally, more competitive. Tr. 3501:5–3504:17 (Nadella (Microsoft)).

Continue Reading Here.


About HackerNoon Legal PDF Series: We bring you the most important technical and insightful public domain court case filings.

This court case, retrieved on April 30, 2024, from storage.courtlistener, is part of the public domain. The court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.