SEOSEO News

A Complete List of Google’s Autocomplete Signals in Chrome


Google Chrome uses a machine learning model for address-bar autocomplete. This model, likely a Multilayer Perceptron (MLP), processes numerous input signals to predict and rank suggestions.


Here’s a breakdown of these signals:

Input Features:

User Browsing History:

  • log_visit_count: (float32[-1,1]) Logarithmic count of user visits to the URL.
  • log_typed_count: (float32[-1,1]) Logarithmic count of the URL being typed in the address bar.
  • log_shortcut_visit_count: (float32[-1,1]) Logarithmic count of user visits to the URL via a desktop shortcut.
  • elapsed_time_last_visit_days: (float32[-1,1]) Days elapsed since the user last visited the URL.
  • log_elapsed_time_last_visit_secs: (float32[-1,1]) Logarithmic seconds elapsed since the user last visited the URL.
  • elapsed_time_last_shortcut_visit_days: (float32[-1,1]) Days elapsed since the user last visited the URL via a desktop shortcut.
  • log_elapsed_time_last_shortcut_visit_sec: (float32[-1,1]) Logarithmic seconds elapsed since the user last visited the URL via a desktop shortcut.
  • num_bookmarks_of_url: (float32[-1,1]) Count of bookmarks associated with the URL.
  • shortest_shortcut_len: (float32[-1,1]) Length of the shortest desktop shortcut for the URL.
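The logarithmic counts above compress large visit histories so a heavily revisited URL does not drown out everything else. The exact transform Chrome applies is not specified here; the sketch below assumes the common `log1p` choice and a plain seconds-to-days conversion, purely for illustration.

```python
import math

def log_scaled(count: float) -> float:
    """Compress a raw count with log1p so large histories don't dominate."""
    return math.log1p(count)

def days_since(elapsed_secs: float) -> float:
    """Convert elapsed seconds since the last visit into days."""
    return elapsed_secs / 86400.0

# Example: a URL visited 100 times, last seen 12 hours ago.
features = {
    "log_visit_count": log_scaled(100),
    "elapsed_time_last_visit_days": days_since(12 * 3600),
    "log_elapsed_time_last_visit_secs": log_scaled(12 * 3600),
}
```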

Website Characteristics:

  • length_of_url: (float32[-1,1]) Length of the URL string.

Match Characteristics:

  • total_title_match_length: (float32[-1,1]) Total length of matches between the user’s input and the website title.
  • total_bookmark_title_match_length: (float32[-1,1]) Total length of matches between the user’s input and the bookmark titles for the URL.
  • total_host_match_length: (float32[-1,1]) Total length of matches between the user’s input and the URL host.
  • total_path_match_length: (float32[-1,1]) Total length of matches between the user’s input and the URL path.
  • total_query_or_ref_match_length: (float32[-1,1]) Total length of matches between the user’s input and the URL’s query and fragment (“ref”) components.
  • first_url_match_position: (float32[-1,1]) Position of the first match between the user’s input and the URL.
  • first_bookmark_title_match_position: (float32[-1,1]) Position of the first match between the user’s input and the bookmark titles for the URL.
  • host_match_at_word_boundary: (float32[-1,1]) Boolean indicator of whether the host match occurs at a word boundary.
  • has_non_scheme_www_match: (float32[-1,1]) Boolean indicator of whether a match occurs without considering the scheme (http/https) or “www” prefix.
  • is_host_only: (float32[-1,1]) Boolean indicator of whether the user’s input matches the host only.
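Taken together, the three groups above supply the model’s 20 inputs. A minimal sketch of assembling them into a fixed-order feature vector might look like this; the ordering and the zero-default for missing features are assumptions for illustration, not Chrome’s actual layout.

```python
# Feature names mirroring the lists above; the order is illustrative only.
FEATURE_NAMES = [
    "log_visit_count", "log_typed_count", "log_shortcut_visit_count",
    "elapsed_time_last_visit_days", "log_elapsed_time_last_visit_secs",
    "elapsed_time_last_shortcut_visit_days",
    "log_elapsed_time_last_shortcut_visit_sec",
    "num_bookmarks_of_url", "shortest_shortcut_len",
    "length_of_url",
    "total_title_match_length", "total_bookmark_title_match_length",
    "total_host_match_length", "total_path_match_length",
    "total_query_or_ref_match_length",
    "first_url_match_position", "first_bookmark_title_match_position",
    "host_match_at_word_boundary", "has_non_scheme_www_match",
    "is_host_only",
]

def to_vector(features: dict) -> list:
    """Build the fixed-order 20-element input vector; absent features default to 0."""
    return [float(features.get(name, 0.0)) for name in FEATURE_NAMES]
```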

Model Processing:

These features are fed into the neural network. The network architecture, including specific layers and weights, is defined within the model file.

Output:

The model outputs a prediction score (float32[-1,1]) representing the relevance of each potential autocomplete suggestion. This score is used to rank suggestions, with higher scores appearing higher in the address bar dropdown.
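The ranking step described above reduces to a simple sort over scored candidates. This is a sketch of that idea with made-up URLs and scores, not Chrome’s actual ranking code:

```python
def rank_suggestions(scored):
    """Sort (url, score) pairs by model score, highest first."""
    return [url for url, score in sorted(scored, key=lambda p: p[1], reverse=True)]

candidates = [
    ("https://example.com/a", 0.31),
    ("https://example.com/b", 0.87),
    ("https://example.com/c", 0.55),
]
ranked = rank_suggestions(candidates)  # b, then c, then a
```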

Model Architecture:

  1. Input Layer: 20 input features, each represented by a separate node (e.g., elapsed_time_last_shortcut_visit_days, log_visit_count, total_title_match_length).
  2. Concatenation Layer: All 20 input features are concatenated along axis 1, resulting in a single tensor of shape ? x 20. The “?” indicates a variable batch size.
  3. Dense Layer (FullyConnected): A fully connected layer with:
    • Weights: Shape 64 x 20, i.e., 64 neurons in this layer. The weights are quantized to int8 for efficiency.
    • Bias: Shape 64, one bias term per neuron.
    • Activation Function: ReLU (Rectified Linear Unit).
    • Quantization: Asymmetric quantization is applied to the inputs.
  4. Dense Layer (FullyConnected): Another fully connected layer with:
    • Weights: Shape 1 x 64, leading to a single output neuron.
    • Bias: Shape 1, a bias term for the output neuron.
  5. Logistic Layer: Likely a sigmoid activation applied to the output of the previous dense layer, producing a value between 0 and 1.
  6. Output Layer: A single output node (“sigmoid”) representing the predicted score.

Key Observations:

  • Simple Architecture: The model consists of just two dense layers: a 64-neuron hidden layer with ReLU activation and a single-neuron output layer followed by a sigmoid.
  • Quantization: The model employs quantization to reduce size and improve performance, using int8 weights for the first dense layer.
  • Feature Engineering: The input features are a combination of raw values and engineered features (e.g., logarithmic transformations, match lengths, boolean indicators).
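For the quantization point above, int8 storage is recovered to approximate floats at inference time. The scale and zero-point below are invented example values; real models ship their own quantization parameters alongside the weights.

```python
import numpy as np

# Hypothetical quantization parameters, for illustration only.
scale, zero_point = 0.02, 5

def dequantize(q: np.ndarray) -> np.ndarray:
    """Recover approximate float32 weights from int8 storage (asymmetric scheme)."""
    return scale * (q.astype(np.float32) - zero_point)

q_weights = np.array([[-128, 0, 127]], dtype=np.int8)
w = dequantize(q_weights)  # approx. [[-2.66, -0.1, 2.44]]
```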


