United States of America v. Google LLC., Court Filing, retrieved on April 30, 2024, is part of HackerNoon’s Legal PDF Series. This part is 10 of 37.
E. Scale Is Vital For General Search Engines
159. “Scale” refers to the amount of user-side data a search engine can accumulate. Tr. 3695:6–3696:10 (Ramaswamy (Neeva)) (“[S]cale here refers to how much query click information is one able to collect.”).
User-side data is a term that includes many types of data GSEs can obtain from users—including a query and its ranked results—who interact or engage with their search engines. UPX0262 at -989 (“User interaction signals include clicks as well as all other interactions from users with search results or search result pages, which can be mined from session logs.”); UPX0212 at -122 (“To work around our inability to understand documents, we observe and recall human reactions to those documents. The form of reaction we rely upon most heavily in web ranking is clicks on search results.”).
The most basic and important interaction is a click—such as when a user clicks on a link. Tr. 2650:25–2651:7 (Parakhin (Microsoft)). GSEs track many types of user-side data, including when a user reads, pays attention to, scrolls, or hovers on a result. Tr. 1767:21–1771:14 (Lehman (Google)) (discussing UPX0004); Tr. 2255:9–18 (Giannandrea (Apple)).
160. GSEs track the nuances of these interactions. For example, GSEs track how much time users spend on a page after clicking, how quickly users click back, whether users scroll down, or even which results the users didn’t click on. Tr. 2650:25–2651:7 (Parakhin (Microsoft)); id. 2651:8–2652:1 (abandoning a SERP is an example of user-side data showing a bad result); Tr. 1767:21–1771:22 (Lehman (Google)) (discussing UPX0004 at .004).
161. GSEs also track other important information about the user, such as the time of day a search was issued, where it was issued from, and the device type used. Tr. 2256:11–2257:10 (Giannandrea (Apple)) (location and time of day are also useful search signals); Tr. 2661:17–20 (Parakhin (Microsoft)) (Microsoft tracks search traffic by device type or form factor); Tr. 6416:24–6417:4 (Nayak (Google)) (Google tracks the type of device from which each query is issued).
162. Users interact with Google billions of times a day, and Google logs this user data. Tr. 1761:16–21 (Lehman (Google)) (discussing UPX0228 at -503); UPX0870 at .016 (“[Google] logs data about every search result that appears on the SRP.”). In fact, Google retains anonymized user-side data indefinitely. Tr. 6395:19–6396:9 (Nayak (Google)) (stating that “to the extent that it is anonymized and de-identified,” Google will “keep it” and did not know Google “would delete it”).
1. Scale Enables General Search Engines To Return Better Results
163. Scale is vital for improving many aspects of search, including both quality and revenue. Tr. 2644:13–14 (Parakhin (Microsoft)) (explaining that scale “affects greatly many aspects of both quality and revenue”); id. 2644:20–2646:2 (explaining that more scale results in more clicks and more user behavior, which will “very directly influence search quality,” and causes websites to “optimize for the most popular search engine,” ultimately providing “better results”). Google initially held the opinion that scale was integral to search. Tr. 238:12–239:5 (Varian (Google)) (in one year Dr. Varian went from saying data is integral to scale is bogus).
164. The more search queries a GSE sees, “the better search quality you’re going to have by definition.” Tr. 3496:12–16 (Nadella (Microsoft)); Des. Tr. 225:6–13 (Ribas (Microsoft) Dep.) (“In this business, data is everything . . . . [S]cale is so critical. And so every additional data that we can get . . . . It’s really going to be helpful for us to improve the quality of our results.”).
The value of user-side data continues to accrue beyond the point of diminishing returns. Tr. 10078:3–12 (Murphy (Def. Expert)) (“[T]here’s pretty much always diminishing returns, but that doesn’t mean they’re not valuable even after some diminishing returns have set in.”); Tr. 6337:6–6338:6 (Nayak (Google)) (acknowledging that Google deploys user-side data beyond the point of diminishing returns so long as its value outweighs the cost); Tr. 10349:15–10351:7 (Oard (Pls. Expert)) (Google values user-side data beyond its competitors’ shares because, despite the costs, Google deploys 13 months’ worth of user-side data to train its algorithms).
The amount of data a GSE has affects where that search engine is on the diminishing returns curve; GSEs with less data receive higher returns to additional data than GSEs with more data. Tr. 10346:23–10350:7 (Oard (Pls. Expert)) (“[W]hen you have very little [data], then not only do you get better, but you keep getting better at a faster and faster rate . . . .”).
165. Google deploys user-side data throughout its systems. Tr. 1789:4–16 (Lehman (Google)) (“Not one system but a great many within ranking are built on logs. This isn’t just traditional systems . . . but also the most cutting-edge machine learning systems.”) (quoting UPX0219 at -426); Tr. 1762:23–1763:3 (Lehman (Google)) (“[F]or whatever volume of data we have . . . we make copies of it, and each copy . . . goes to a different ranking component.”). Crawling, indexing, retrieval, query refinement, web ranking, search features, and wholepage ranking are all improved through scale, allowing Google to return better search results. Infra ¶¶ 167–195 (§§ III.E.1.a–g).
166. Google concedes that scale gives Google a “competitive advantage” and that scale is the “magic” behind Google’s success. UPX0203 at -906 (“We look at people. If a document gets a positive reaction, we figure it is good. If the reaction is negative, it is probably bad. Grossly simplified, this is the source of Google’s magic.”); UPX0228 at -501 (“[M]ost of the knowledge that powers Google, that makes it magical, originates in the minds of users. Users are the founts of knowledge--not us.”); id. at -503 (“After a few hundred billion rounds we start lookin’ pretty smart! This isn’t the only way we learn, but the most effective.”); UPX0189 at -218 (“But *sessions logs* are our unique competitive advantage.”); Tr. 2313:24–2315:11 (Giannandrea (Apple)) (stating that when he was head of Google Search he was “very much” against sharing click data with Apple because it was Google’s “secret sauce”) (discussing UPX0235 at -391).
a) Crawling Benefits From User-Side Data
167. Crawling benefits from user-side data. The order and frequency in which a GSE crawls the web is “one of the most important problems” in building a GSE, and user data helps determine which places to crawl more or less frequently. Tr. 2207:1–9 (Giannandrea (Apple)). Crawling websites creates a cost to the websites’ owners. Tr. 2656:19–2658:24 (Parakhin (Microsoft)).
Because of these costs, website owners give search engines with greater scale more latitude to crawl their sites since websites will get a higher return in the form of traffic. Tr. 2656:19–2658:24 (Parakhin (Microsoft)).
Similarly, websites will optimize to allow crawling by search engines with greater scale. Id. 2656:19–2658:24. For smaller search engines, the crawling costs to the website owners outweigh the benefits, so the website owners often prohibit such crawling. Id. 2656:19–2658:24.
b) Indexing Benefits From User-Side Data
168. Indexing benefits from user-side data. Tr. 2210:22–2211:4 (Giannandrea (Apple)). User-side data enables a GSE to know which pages must continue to be maintained in the index. Tr. 6310:6–20 (Nayak (Google)). Also, for efficiency, an index must be broken up into smaller pieces and organized in tiers based on the likelihood the information will be retrieved. UPX0870 at .013 (“[W]e actually divide the index into several smaller indexes called tiers. Each page is assigned to a tier based on how fresh it needs to be and the fresher tiers are built more frequently.” (emphasis in original)). User-side data is used by GSEs to determine how to best organize their indexes. Tr. 2211:2–17 (Giannandrea (Apple)) (knowing which queries are popular is an important part of indexing “[b]ecause you would want to make sure that you had covered queries that you see frequently.”); Tr. 10274:3–10275:13 (Oard (Pls. Expert)) (Google deploys user-side data to improve its index).
c) Query Understanding And Refinement
169. To return relevant results, GSEs begin by attempting to understand the user’s query and its intent. UPX0213 at -715 (“Understanding the search query is a preliminary step in ranking.”); UPX0870 at .004 (“[W]e do our best to determine the possible intents of the query and use that to break the query into multiple forms that get passed to different systems, depending on that intention, for further processing.”); UPX0194 at -556 (“Understanding the meaning of a query is crucial to returning good answers.”).
170. Query understanding includes refining the issued query to match its intent. UPX0870 at .016 (“If we sent the raw query…to the Search servers and did a keyword search of the index, we would come up with some search results, but they might not match the user’s intent.”); UPX0194 at -556–57 (Google uses signals to “identify user intent and match it to relevant documents”). Among the important aspects of query understanding and refinement is correcting for spelling errors or word choices (i.e., adding synonyms). UPX0870 at .016–17.
171. Spelling and synonym systems used for query understanding and refinement have a significant effect on search quality. UPX0196 at -158 (“Spelling errors in search queries are very common, so correction is vital.”); UPX0194 at -556 (“Our testing shows that understanding synonyms significantly improves results in over 30% of searches . . . .”); UPX0870 at .016 (“Sometimes including synonyms for a query term in a keyword search improves results.”).
172. Google relies heavily on user-side data to improve its various synonym and spelling-based systems. Tr. 8088:21–24 (Gomes (Google)) (Google has benefited from and continues to benefit from user data to improve spelling correction, auto-complete, and synonym matching); Tr. 2272:10–2273:10 (Giannandrea (Apple)) (between 2010 and 2018, one of the ways that Google became better at spelling was through user engagement); UPX0184 at -912 (“The key issue here as I see it is that you do get better as you have more users -- that’s why we have the best spell check, the best personalized search, the best refinements, etc.”); Tr. 227:13–228:11 (Varian (Google)) (explaining that “[t]he Google spell checker is based on our analysis of user searches compiled from our logs -- not a dictionary,” discussing UPX0862 at -707).
173. For example, Google autocomplete and “Did you mean?” systems rely on user-side data to help users formulate a query or suggest a replacement for a misspecified query. UPX0224 at -914 (“1 in 10 search queries are misspelled - but it doesn’t matter, because our ‘Did you mean’ feature is there to help. We’ve been building this spelling technology for 18 years. How? By looking for mistakes. We look at all the ways in which people mis-spell words in queries and text all over the web, and use that to predict what you actually mean.”); UPX0857 at -015 (autocomplete predicted queries are determined based on objective factors, including popularity of search terms; the data is updated frequently to offer fresh and rising search queries); UPX0863 at -531, -553 (Google’s system for suppressing or promoting spelling suggestions is “[t]rained on user clicks for queries with suggestions from session logs”).
174. By helping users better articulate their needs, these systems help Google return more relevant results and thus improve their search quality. UPX0870 at .016 (“If we sent the raw query. . . to the Search servers and did a keyword search of the index, we would come up with some search results, but they might not match the user’s intent. . . . Using just keywords . . . you will lose some of the intent the user originally had when they entered the query”); UPX0194 at -556–57 (“Understanding the meaning of a query . . . involves more than just finding the pages that contain the keywords in your query.”); id. at -556 (Google’s synonym system “helps bridge the gap between query and document vocabulary.”).
d) Retrieval Benefits From User-Side Data
175. Retrieval is a preliminary stage of scoring documents that may be responsive to a user query. UPX0213 at -729. Ideally, the retrieval process will not gather all the responsive documents but will gather all the best documents from the index. Id.; Tr. 6330:25–6332:11 (Nayak (Google)) (“A typical query might have millions of documents on the web that match it, but there’s no way that in the fraction of a second that we need to do all this in we can look at a million or millions of documents and retrieve them. So instead, what we do is we have a retrieval process that gets us of the order of tens of thousands of documents from the index that you can actually look at.”).
176. One important system Google uses for retrieval is a deep-learning system, called RankEmbedBERT. Tr. 6451:4–6 (Nayak (Google)). On top of traditional retrieval systems, RankEmbedBERT retrieves a few more documents to be scored in the ranking phase. Id. 6451:4–6. Google trains RankEmbedBERT on click and query data. Id. 6448:20–25; UPX0868 at -610 (“We train [RankEmbedBERT] on [redacted] queries, randomly sampled from [redacted] of Qsessions.”).
e) Web Ranking Benefits From User-Side Data
177. Web ranking is the process of determining which web results provide the most useful responses for a query and organizing them in order of usefulness. UPX0870 at .002 (“At a high level, Search consists of two fundamental stages: 1. Organizing information from webpages into a search index. 2. Serving query results by sorting through the search index and finding the most relevant, useful results.”).
178. Obtaining user-side data at scale is critical to a GSE’s ability to improve its web ranking. Tr. 1801:21–1802:6 (Lehman (Google)) (acknowledging authorship and agreeing with “Exploiting user feedback, principally clicks, has been the major theme of ranking work for the past decade.” (quoting UPX0213 at -723)); id. 1777:15–1778:4 (“[W]hen thinking about . . . the value of the search results for a query, relevance is the most important consideration,” and “having user data is useful to Google in identifying relevant results for a search query.”); Tr. 3695:6–3696:10 (Ramaswamy (Neeva)) (“[O]ne of the biggest signals that all search engines have relied on for the past 20 years is this thing that you talk about, which is query click information”); Tr. 2271:3–8 (Giannandrea (Apple)) (agreeing that “engagement signals” are the “most powerful signals”); UPX0226 at -483 (“Learning from this user feedback is perhaps the central way that web ranking has improved for 15 years.”).
179. GSEs derive patterns from user-side data to gauge user satisfaction and improve the ranking of their search results. Tr. 2652:2–14 (Parakhin (Microsoft)) (“The more data of this nature we have, the more we can train algorithms to be better in predicting what is good and what is bad.”); Tr. 8090:24–8091:2 (Gomes (Google)) (agreeing that clicks are useful for understanding user needs and evaluating search results); UPX0005 at -806 (“All of these [user interactions] help [Google] understand user preference, user information consumption patterns, query intent, and more.”).
180. For example, if a user clicks on a lower-level result, that provides a “signal” to a GSE that, next time, the result should be moved up in the SERP. UPX0266 at -984 (“If you show the right answer at position @3 and people click on it more than @1 then you know that you should be ranking it right and you can learn from this.”); UPX0228 at -514 (“A click at 1 is a pat on the back. A click at 8 tells us how to improve.”); UPX0225 at -285 (“If the document gets a positive reaction, we figure it is good. If the reaction is negative, it is probably bad. We understand documents by proxy.”); UPX0228 at -502 (“As people interact with search, their actions teach us about the world. For example, a click might tell us that an image was better than a web result. Or a long look might mean a KP [knowledge panel] was interesting. We log these actions, and then scoring teams extract both narrow and general patterns.”); UPX0196 at -175 (“One can regard each [SERP] as a massive multiple-choice test. Each day, we get to ask humanity a billion questions of the form, ‘Which of these 10 documents is most relevant to your query?’”).
i. Google Deploys User-Side Data At Scale To Train Traditional Systems That Improve Its Web Ranking
181. Google uses traditional ranking systems to narrow the document set that the search engine retrieves from its index. Tr. 6399:10–22 (Nayak (Google)) (Google uses “core algorithms” or traditional ranking systems to sift the documents down from tens of thousands to several hundred and score the documents); Des. Tr. 64:24–65:14 (Google-PN 30(b)(6) Dep.) (Once a user puts in a query, the core ranking algorithm identifies a subset of results that might be relevant); UPX0192 at -770 (“[C]lick prediction now underlies many scoring systems: navboost . . . rankbrain . . . QBST . . . .”).
182. Generally, for Google search, traditional systems that count and tabulate results are “by far the most impactful techniques in web ranking.” UPX0191 at -184. Two examples of important traditional systems that are used in ranking are Navboost and Query-Based Salient Terms (QBST). Id.; Tr. 1837:22–1839:4 (Lehman (Google)) (Navboost and QBST are memorization systems that have “become very good at memorizing little facts about the world”); id. 1806:2–15 (“Navboost records clicks on search results for queries. . . .”); UPX0196 at -175 (“Originally, Navboost was almost a pure memorization system; that is, it remembered that document D got a click for query Q. Based on this, we would show document D to the next person who issued query Q. Over time, Navboost has acquired some cross-query generalization; that is, we conclude that document D is also relevant for some related queries Q’, Q’’, etc.”).
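The "pure memorization" behavior described in UPX0196 (remembering that document D got a click for query Q, then favoring D the next time Q is issued) can be illustrated with a minimal sketch. This is a hypothetical toy model, not Google's implementation: the class name `ClickMemory` and its methods are assumptions made for illustration, and the sketch does not model the 13-month window or the cross-query generalization the record describes.

```python
from collections import Counter, defaultdict


class ClickMemory:
    """Hypothetical toy click-memorization model (illustration only)."""

    def __init__(self):
        # Maps each query to a counter of clicks per document.
        self._clicks = defaultdict(Counter)

    def record_click(self, query, doc):
        # Remember that `doc` received a click for `query`.
        self._clicks[query][doc] += 1

    def rank(self, query, candidates):
        # Order candidates by remembered clicks for this exact query.
        # sorted() is stable, so unclicked documents keep their input order.
        counts = self._clicks.get(query, Counter())
        return sorted(candidates, key=lambda doc: -counts[doc])


memory = ClickMemory()
memory.record_click("q", "D")
memory.record_click("q", "D")
memory.record_click("q", "B")
reordered = memory.rank("q", ["A", "B", "D"])
untouched = memory.rank("never seen", ["A", "B"])
```

In this sketch a query with no recorded clicks returns its candidates unchanged, which is one way to see why a counting system of this kind depends on the volume of queries and clicks it has observed.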
183. NavBoost is one of the most, if not the most, impactful systems on Google’s search quality. Tr. 2214:22–2215:4 (Giannandrea (Apple)) (Navboost is “very important” at Google); UPX0197 at -214 (“I’m pretty sure that NavBoost alone was/is more positive on clicks (and likely even on precision/utility metrics) by itself than the rest of ranking.”); UPX0190 at -740 (“Navboost remains one of the most power ranking components historically....”); UPX0196 at -175 (“Navboost is the original click exploitation system and still the most potent.”). QBST also has a substantial effect on Google’s search quality. UPX0887 at -110 (“As a ranking signal, [QBST] is one of the most positive components of Web Search in identifying relevant docs for queries.”); Tr. 1808:11–13 (Lehman (Google)) (confirming that “QBST helps identify relevant documents to respond to queries”); id. 1837:22–1839:4 (Navboost and QBST are memorization systems that have “become very good at memorizing little facts about the world”).
184. Traditional ranking systems such as Navboost and QBST are trained on enormous amounts of user-side data. Tr. 6405:15–22 (Nayak (Google)) (Navboost memorizes all clicks that have been issued for all queries received in the last 13 months); Tr. 1805:6–13 (Lehman (Google)); UPX1007 at -371 (showing that QBST “[a]ggregate[s] the data over 13 months”); Tr. 1757:25–1758:3 (Lehman (Google)) (“Some of the systems that Google uses in ranking search results are machine learning systems that train on user data.”).
ii. Google Deploys User-Side Data To Train Deep-Learning Systems That Improve Web Ranking
185. Google uses newer “deep-learning” ranking systems to fine-tune the search ranking. Tr. 6399:23–25 (Nayak (Google)). The primary deep-learning systems Google uses in ranking are RankBrain and DeepRank. UPX0255 at .010.
186. RankBrain and DeepRank are trained on user-side data. Tr. 6433:9–13 (Nayak (Google)); UPX0255 at .010–.011 (showing that DeepRank uses on the order of [redacted] training examples); UPX0003 at -762 (“RankBrain . . . [t]rained on [redacted] of pairwise click preferences of titles and documents.”). Like traditional ranking systems, deep-learning systems derive patterns from user-side data to deliver better results. UPX0213 at -724 (“RankBrain uses deep learning to extract and exploit patterns in click data.”); UPX0237 at -879–80 (“Here is how RankBrain works at a high level. First you gotta train it: You go through search logs and effectively make a giant stack of index cards. Each card lists a user query and two search results -- but just TITLES and URLs, nothing else. The search result that got a user click gets a check mark. Then you hand Brain a [redacted] of these index cards, and it thinks things over for about a week. . . . Then RankBrain is ready to go! You can show it a query and two results (again, just the title and URL), and RankBrain figures out which result a user would prefer. Then you can rank a whole set of search results by aggregating a bunch of these pairwise preferences.”).
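The last step quoted from UPX0237, ranking a set of results by aggregating pairwise preferences, can be sketched as follows. This is an illustration under stated assumptions, not the method the exhibit documents: the aggregation rule (counting pairwise wins) is one simple choice among many, and `toy_prefer` with its invented click counts is a hypothetical stand-in for a trained model that picks the result a user would prefer.

```python
from itertools import combinations


def rank_by_pairwise_wins(query, results, prefer):
    """Rank unique results by how many pairwise comparisons each one wins.

    `prefer(query, a, b)` must return whichever of `a` or `b` is preferred.
    """
    wins = {result: 0 for result in results}
    for a, b in combinations(results, 2):
        wins[prefer(query, a, b)] += 1
    # Most wins first; ties keep their input order because sorted() is stable.
    return sorted(results, key=lambda result: -wins[result])


# Hypothetical stand-in for a learned preference model: invented click counts.
toy_clicks = {"a.example": 5, "b.example": 9, "c.example": 1}


def toy_prefer(query, a, b):
    return a if toy_clicks[a] >= toy_clicks[b] else b


ranked = rank_by_pairwise_wins("some query", list(toy_clicks), toy_prefer)
```

The sketch compares every pair, so its cost grows quadratically with the number of results; it is meant only to make the "index card" description concrete.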
187. Although deep-learning systems can be trained on data available from the open web (open data) or data obtained from paid human evaluators (human rater data), those systems would not perform as well as deep-learning systems trained on user-side data. Tr. 2762:23–2763:24 (Parakhin (Microsoft)) (as compared to training on open data, “[T]raining using data -- user interaction data is one of the most powerful things, and that’s what we observe consistently.”); UPX0197 at -214 (“BERT itself performs much better when trained on webanswers specific data, compared to public data (wikipedia & co).”).
188. Google’s deep-learning systems, such as RankBrain and DeepRank, do not replace traditional systems in web ranking; instead, they are complementary to traditional systems. Tr. 6440:13–18 (Nayak (Google)) (“[BERT] does not subsume big memorization systems, navboost, QBST, etc.” (discussing UPX0860)); id. 6430:18–22 (classifying traditional systems and machine learning systems as “additive”); UPX0256 at -188 (“And again, DeepRank doesn’t replace the prod ranking. Instead, it is ensemble with the other signals we use in Web Ranking.”); UPX0191 at -222 (“RankBrain, which is the foremost ML system in search, is still only one ranking component among many.”); id. at -223 (“NavBoost (a glorified counting-based system for memorizing clicks) is still by far the most important component in search.”).
189. In many important aspects, like fresh and longtail queries, traditional systems still outperform deep-learning models. UPX0255 at .014 (“Can we scale up ML models to be better than NavBoost? . . . [A]s far as I can tell none of these deep-learning models are as powerful as NavBoost. To some degree, this is not surprising. The Navboost Glue data is close to [redacted] in size. In contrast, models like RankBrain and RankEmbed are [redacted] in size, with DeepRank and RankBERT being significantly smaller. This likely makes it hard for the models to learn or memorize truly long-tail information on relevance or user preferences.”).
Traditional ranking systems such as Navboost are also better at handling fresh (where the responses may have recently changed) and popular queries (where many duplicate queries are seen). UPX0256 at -185 (“RankBrain could not be refreshed fast enough, compared to simple counting pipelines like Instant NavBoost. That means, for very fresh and popular queries, NavBoost would predict things better, and so we’d back-off from RankBrain to NavBoost in those cases.”); UPX0214 at -696 (identifying “freshness” as one of several areas in which “deep models are making limited or no inroads” and stating that for freshness “[v]ery fast counting systems like instant navboost beat learning approaches”).
190. Further, as deep-learning models grow in size and capability, so does their computational cost. UPX2029 at -075 (“Training is also more computationally expensive, forcing us to use 100–1000x less training data than RankBrain.”).
f) Search Features Benefit From User-Side Data
191. Search features are an important part of providing responsive results to a query. Tr. 1788:18–21 (Lehman (Google)) (“[W]eb ranking is only a part of search. . . .” (quoting UPX0219 at -426)); UPX1114 at -168 (“Small fraction of SERP is web results for many queries”). Many search features benefit from user-side data, including image and local search. UPX0219 at -426 (“[M]any search features use web results to understand what a query is about and trigger accordingly.”).
192. Like user interactions with web results, user interactions with search features allow GSEs to derive patterns in the data to improve their search feature results. UPX0251 at -882 (“Image search implicitly poses a . . . multiple-choice question -- which do you like best? Thumbnails inform the user response, the users answer is logged as a hover, click, and clickthru.”); UPX0862 at -707 (“[W]e’ve had a lot of success in using query data to improve our information about geographic locations, enabling us to provide better local search.”); Tr. 228:12–19 (Varian (Google)) (query data improves local service by providing more information about geographic locations (discussing UPX0862 at -707)).
g) Wholepage Ranking (Ranking Of Web Results And Search Features) Substantially Benefits From User-Side Data
193. Wholepage ranking refers to the organization of the entire search results page beyond web results or “ten blue links.” UPX0213 at -727. A SERP contains many components: web results, maps, answers, news blocks, knowledge cards, and much more. Wholepage ranking defines how Google arranges these various components into a single page. UPX0213 at -727; UPX0196 at -179. Wholepage ranking benefits from user-side data. Tr. 2307:13–22 (Giannandrea (Apple)) (stating that scale is “relevant” for determining what search features to “prioritize” on a search engine results page).
i. Google Deploys User-Side Data To Train Systems Used To Rank Whole Page Results
194. Google uses a system called Tangram (formerly, Tetris) to rank and then organize whole page results. Tr. 6408:8–18 (Nayak (Google)); UPX0004 at .059 (“Tetris: Goals: Rank all results using common signals for globally optimal ranking, common place to balance IS, recall and precision”); id. at .060 (Tetris “[o]ptimally rank[s] web and non-web results using a common set of signals”); UPX0003 at -763 (illustrating that Tetris is used to score “everything on the page” including Web Answers, Video Universal, Image Universal, and web documents). Tetris/Tangram have a substantial impact on search quality. UPX0190 at -740 (“Tetris definitely helped a lot here by ranking all the features on the page better, even though we don’t have a measurement on the cumulative IS impact for Tetris alone.” (emphasis in original)); UPX1120 at -517 (“>95% of all results go through Tetris”).
195. A critical input to Tetris/Tangram is a system called “Glue.” Tr. 6408:8–18 (Nayak (Google)) (Glue is a signal within Tetris/Tangram that triggers search features in the results alongside the web result); UPX0262 at -989 (Glue “is one of the critical signals in Tetris”). Glue records all forms of user interactions (beyond just clicks) on all results (beyond web results) over a 13-month period. Tr. 1806:2–15 (Lehman (Google)) (“[T]here are other types of interactions with a search page, and there are other things on a search page besides just web search results. . . . Glue attempts to record all those other interaction types on all those other elements of the search page for different queries.”); UPX0004 at .006 (The Glue pipeline “capture[s] user-interactions”); UPX0005 at -811 (showing Glue cache is 13 months).
The user interactions Glue records over 13 months include varied user-side data like clicks, attention, hovers, scrolls etc. UPX0262 at -992 (“Glue is interested in all query events regardless of clicks, because it needs to compute abandonment and attention signals.”); UPX0005 at -806 (listing clicks, attention, refinement, and swipes as among the user interactions on a SERP).
2. Scale Is Critical To The Development Process
196. Scale is a vital element for improving a GSE. Scale enables a GSE to observe more failures and use that information to identify ways to improve search results. Des. Tr. 154:5–13, 156:9–15, 177:4–178:10 (Google-PN 30(b)(6) Dep.) (Google looks for patterns in failed-query reports and finds ways to improve queries.); Tr. 2257:11–15 (Giannandrea (Apple)) (“[T]he more queries a search engine sees the more opportunities . . . the engineers have to look for patterns and improve the algorithm.”); UPX0870 at .016 (“Analysis of logs data figures into launch decisions, ranking changes, and machine-learning data.”).
197. GSEs also run experiments to ensure system changes result in quality improvements. UPX0265 at -476 (showing experiments undertaken for 665 search launches in 2015, including 118,812 precision evaluations, 10,391 side-by-side experiments, and 7,018 live-traffic experiments). Experiments are critical for GSEs to improve their search services. Des. Tr. 148:16–149:3 (Ribas (Microsoft) Dep.) (“Experiments are very critical for both improving the quality and improving the ability to grow. . . .”); UPX0213 at -714 (“Evaluation is the foundation of ranking.”).
198. Scale allows search engines to run better experiments, in terms of accuracy and speed, to test potential system changes. Tr. 2646:7–22 (Parakhin (Microsoft)) (“If I have enough . . . traffic, I can quicker understand whether my changes are good or not or run more experiments at the same time.”); Des. Tr. 62:6–63:18 (Ribas (Microsoft) Dep.) (Testing on a broader set of queries would allow Bing to improve more.); Des. Tr. 276:19–277:2 (Stein (IAC) Dep.) (agreeing that having additional clicks and query data would allow Ask.com to run more accurate experiments); Tr. 5793:24–5795:3 (Whinston (Pls. Expert)) (“[W]hen you don’t have a lot of scale, you can’t do a lot of these experiments. And moreover, the experiments that you do will tend to have smaller samples. So it’s either going to be less precise, if you let the experiment go for the same amount of time, or it’s going to have to go a lot longer. That’s just a basic property of statistics: The bigger the sample, the more precise the results.”); UPX1059 at -304 (For experiments, “[t]he more data you collect, the narrower the confidence interval,” i.e., the more precisely the effects are measured.).
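The statistical point in the testimony and in UPX1059 (the more data collected, the narrower the confidence interval) can be made concrete with a short sketch. It uses the standard normal-approximation interval for a proportion; the click-through rate of 0.3 and the two sample sizes are hypothetical values chosen only for illustration and do not come from the record.

```python
import math


def ci_half_width(rate, sample_size, z=1.96):
    """Half-width of the normal-approximation 95% confidence interval
    for a proportion (for example, a click-through rate) measured on
    `sample_size` independent observations."""
    return z * math.sqrt(rate * (1 - rate) / sample_size)


# Hypothetical experiment: the same underlying rate measured at two scales.
small_scale = ci_half_width(0.3, 10_000)
large_scale = ci_half_width(0.3, 1_000_000)
narrowing = small_scale / large_scale
```

Because the half-width shrinks with the square root of the sample size, 100 times more traffic yields an interval about 10 times narrower. Equivalently, an engine with less traffic must run the same experiment far longer to reach the same precision.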
199. The scale benefits in development compound over time because of the iterative nature of the development cycle. Tr. 1791:16–1796:15 (Lehman (Google)) (better results lead to more informed user interaction, which leads to better training data, which leads to better models, which again leads to better results, and thus creates a “virtuous cycle” of improvement (discussing UPX1115 at -529)); UPX1120 at -532 (depicting the iterative nature of the cycle of product development); Tr. 10318:9–24 (Oard (Pls. Expert)) (ranking signals themselves are based upon user-side data that has been used over the years to develop those systems).
3. Scale Enables Better Targeted Ads And Improves Monetization
200. Scale also improves a search engine’s Search Ads products by improving the targeting of the ads, increasing the pool of available ads, and improving monetization of ads. Des. Tr. 110:5–17 (Jain (Google) Dep.) (Providing larger amounts of data to Google would, in turn, result in “[b]etter [s]earch ads, better organic results.”); infra ¶¶ 1030–1060 (§ VIII.A.4).
201. Google uses user data in two of the three major components of the auction that selects, ranks, and prices Search Ads. Infra ¶¶ 638–646 (§ V.C.5.b). First, Google trains its systems to predict ad click-through rates by relying on click and query data. UPX6027 at -567 (written 30(b)(6) response: “Google’s predicted click-through rate (pCTR) machine learning model uses query and click data.”).
Components of its pCTR algorithm train on quantities of data greatly exceeding that possessed by any of Google’s rivals. Tr. 8880:11–8881:9 (Israel (Def. Expert)) (acknowledging at least one component of pCTR model uses 12 months of data). Second, Google relies on click and query data to predict the quality of a Search Ad’s landing page, UPX0021 at -376.006 (2017 launch adopting pLQ model trained on logs); Google made the decision to do this after determining user data produced better predictions than alternate methods. UPX0021 at -376.003, -376.006–07.
4. Scale Is Critical To Developing Ad Improvements
202. Google tests each change it makes to its ad systems—pricing, targeting, appearance, and so on—using a series of live, A/B experiments on user traffic. Supra ¶¶ 144–145 (§ III.C.3.b); UPX0889 at -787 (describing how Google uses logs analysis and advertiser experiments to evaluate pricing launches). Because of its massive scale, Google can get to statistical significance very quickly when conducting experiments on potential Search Ads launches. Des. Tr. 92:3–93:10 (Jain (Google) Dep.).
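A rough sketch of why traffic volume determines how quickly an A/B experiment reaches statistical significance follows. It uses the textbook sample-size formula for comparing two proportions. Every number and name here (daily query volumes, traffic share, baseline rate, lift, and the function names) is a hypothetical assumption for illustration, not a description of Google's actual experiment system.

```python
import math

Z_ALPHA = 1.96    # two-sided 5% significance level
Z_BETA = 0.8416   # 80% statistical power


def samples_per_arm(p_control: float, relative_lift: float) -> int:
    """Observations needed in each arm to detect a relative lift in a
    click-through rate, using the two-proportion normal approximation."""
    p_treat = p_control * (1.0 + relative_lift)
    variance = p_control * (1.0 - p_control) + p_treat * (1.0 - p_treat)
    effect = p_treat - p_control
    return math.ceil((Z_ALPHA + Z_BETA) ** 2 * variance / effect ** 2)


def days_to_significance(daily_queries: float, traffic_share: float,
                         p_control: float, relative_lift: float) -> float:
    """Days of traffic needed when traffic_share of all queries is split
    evenly between the control arm and the treatment arm."""
    per_arm_per_day = daily_queries * traffic_share / 2.0
    return samples_per_arm(p_control, relative_lift) / per_arm_per_day


# Hypothetical engines that differ only in daily query volume.
for name, volume in (("large engine", 1_000_000_000), ("small engine", 10_000_000)):
    days = days_to_significance(volume, traffic_share=0.01,
                                p_control=0.05, relative_lift=0.01)
    print(f"{name}: about {days:.1f} days")
```

Under these assumptions the required sample size is the same for both engines, so the time needed scales inversely with traffic: one hundredth of the query volume means a test that runs one hundred times longer.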
5. Contrary To Its Public Position, Internally Google Has Long Acknowledged The Importance Of Scale For General Search Engines
203. Google executives have long recognized the importance of user-side data at scale. For example, in 2008, Dr. Varian acknowledged the importance of scale. UPX0862 at -706–07 (concluding that “using data is integral to making Google web search valuable to our users” and that “the data in our search logs will certainly be a critical component of future breakthrough.”). But by 2009, in public comments, Dr. Varian began calling scale effects “bogus.” UPX0178 at -433; UPX0884 at -604.
204. Internally Google Search executives knew better and responded negatively to Dr. Varian’s public statements minimizing the effects of scale. UPX0183 at -250 (Dr. Varian’s public statements about scale are “bogus and misleading.”). In an August 21, 2009 email chain between Udi Manber and Marissa Mayer regarding a TIME Magazine piece in which Dr. Varian sought to minimize the value of scale, Dr. Manber wrote “I wish we [could] find a way to downplay Hal’s comments, as he was just plain wrong. I know it reads well, but unfortunately it’s factually wrong.” Ms. Mayer responded that “[t]he key issue here as I see it is that you do get better as you have more users -- that’s why we have the best spell check, the best personalized search, the best refinements, etc. Most people who understand AI or machine learning as well as the size/scale of data would question his assertion/know that it’s unlikely.” UPX0177 at -419.
205. In an August 2009 email exchange with Dr. Varian, Dr. Manber stated that “it’s absolutely not true that scale is not important. We make very good use of everything we get. [User Interface] experiments are done on a small percentage but ranking is using a lot more.” UPX0179 at -435. Dr. Manber explained that “The bottom line is this. If Microsoft had the same traffic we have their quality will improve *significantly*, and if we had the same traffic they have, ours will drop significantly. That’s a fact. Is it the only factor? Of course not. Nothing is. Do we have other advantages? Of course we do. Is it very significant? Yes . . . . Your comments suggest very clearly that scale is not a significant factor for search, and that’s factually wrong.” UPX0179 at -435 (emphasis added).
206. In an August 2009 email to Bill Coughran, Dr. Manber addressed a statement from Dr. Varian implying scale was not important to search; Dr. Manber wrote “Scale always makes a difference in search. I am not sure what made him decide suddenly to talk about search, something he knows nothing about.” Mr. Coughran replied by stating “I plan to raise this at Monday’s OC meeting.” UPX0875 at -025.
207. Google instructs its employees to not publicly discuss or acknowledge the use of “clicks” in search. UPX0204 at -208 (“Do not discuss the use of clicks in search, except on a need to know basis with people who understand not to talk about this topic externally. Google has a public position. It is debatable. But please don’t craft your own.”); UPX1066 at -880 (instructing in Antitrust Basics for Search Team to avoid discussions of “scale”); UPX0222 at -700 (“[O]ur heavy dependence [on] user feedback signals (aka ‘clicks’) in web ranking . . . . is an area where we aim to constantly sow confusion. . . . . RankBrain seems to have helped divert people from the idea that our primary use of user feedback is actually outside of RankBrain, which is nice.”).
This court case, retrieved on April 30, 2024, from storage.courtlistener, is part of the public domain. The court-created documents are works of the federal government and, under copyright law, are automatically placed in the public domain and may be shared without legal restriction.