This is part 2 of a series of posts I am writing to discover more about what we can learn from the information in the recently discovered Google documentation on API references.

attributes

Yesterday we started to learn more about the files that were found that possibly discuss important factors in Google’s ranking systems.

We learned that these files are not telling us the specifics of Google’s ranking system, but rather, they list attributes that possibly could be used in ranking. These files are documentation to help developers who are working with Google’s Cloud Platform API. I speculated that the attributes mentioned were pieces of information that could be used in Google’s machine learning systems, which is what we will hopefully learn a lot more about by the time this series is done!

Attributes

Attributes are a way of structuring information in a consistent form that can be used in different programs. In our case, each attribute is information that can be used in some way via Google's APIs. These APIs are tools to allow developers to programmatically access the resources available on Google’s Cloud platform. For example, if you were building an app that uses the Gemini model to chat with users, you’d interact with the Gemini API.

These APIs use a common language and structure to describe data that can be used across different applications. Each piece of information, called an attribute, can be used across multiple APIs and machine learning models accessible via Cloud. 

For example, if I was developing using Google’s APIs and wanted to use the attribute called location, it has a clearly defined structure which includes a string of characters. In the documentation, this attribute defaults to “The world”, but could be set by the user to indicate a particular locality.

location attributes

Some of the attributes listed in these APIs could be used in Search algorithms

It is quite likely that the attributes mentioned in these developer APIs are attributes that can be used by Google’s search systems. 

For any of the attributes I mention below, what we do know is that this is a piece of information that Google has the capability of storing. What we don’t know, is whether it is actually being used in ranking, and if so, how much of a role it plays. 

Here are a few examples of attributes that, if used in Google’s Search systems, are quite interesting.

  • ContentAttributions: Stores content attribution to give credit for content. “This information is used during ranking to promote the attributed page.”
  • QualityTravelGoodSitesData: Stores data about good travel sites.
  • IndexingMobileInterstitialsProDesktopInterstitials: a signal related to interstitials.
  • SpamBrainData: “This holds SpamBrain values.” also, SpamBrainScore
  • QualityTimebasedLastSignificantUpdateAdjustments (although there is a note in the code that says this has been deprecated.)
  • FatcatCompactTaxonomicClassificationCategory - The probability that a document belongs to a specific category.
  • CompressedQualitySignals - These look interesting! This attribute contains information that can be used for Mustang and TeraGoogle (both topics I may dig up more on for this blog post series - here’s an interesting article that mentions TeraGoogle as a massive search index launched in 2006. Gemini told me Mustang is Google’s primary web search index - the workhorse behind our everyday Google Searches.)

How would attributes be used in Search?

Attributes represent specific characteristics. These characteristics can be used on their own as something called signals, or they can be used in algorithms that produce a signal

What is a signal? Signals are incredibly important in Google’s algorithms. 

In Google’s documentation on How Search Works they tell us that “Search algorithms look at many factors and signals.” Some of those signals include the words of your query, relevance and usability of pages, expertise of sources and your location and settings. They say, “The weight applied to each factor varies depending on the nature of your query.”

A signal here is something that can be used in the algorithms and systems that make calculations to help determine rankings.

Some signals that Google uses for ranking come from using what they call "aggregated and anonymized interaction data to assess whether search results are relevant to queries." They say they transform that data into signals for their machine learning systems to use to better estimate relevance.

how search works - signals

I suspect that signals gleaned from aggregated and anonymized interaction data are likely used by the Navboost system.

For now, let's end here with this:

Attributes are pieces of information. These can be used in programs that communicate with Google's resources. Some or all of these attributes may be used in the systems that determine rankings.

In the next blog post of this series, we will talk more about Navboost and look at the attributes listed in these docs such as NavboostCrapsCrapsClickSignals, and NavboostGlueVoterTokenBitmapMessage

Here's the first episode of this series: What is this leaked Google code? Digging into the API Docs.

Stay tuned for more!

Marie

Newsletter

(follow along as I document everything I find interesting and important on this and other topics related to ranking and AI in Marie’s Notes.)

Join my community, stay up to date, and get excited about the future of AI in the Search Bar.

Brainstorm with Marie