{"section":"tutorials","requestedLocale":"en","requestedSlug":"search-data-pipeline","locale":"en","slug":"search-data-pipeline","path":"docs/en/tutorials/beta/vtex-data-pipeline-beta/search-data-pipeline.md","branch":"main","content":"The `search` data model contains comprehensive information about search queries, results, and user interactions with the Intelligent Search platform. This data allows analyzing search performance, product discovery, click-through rates, and search-to-purchase conversion rates.\n\nThis section includes the following information:\n\n- [Table types and relationships](#table-types-and-relationships)\n- [Search data characteristics](#search-data-characteristics)\n- [Table: request](#table-request)\n- [Table: response](#table-response)\n- [Table: response_product](#table-response-product)\n- [Table: click](#table-click)\n- [Table: impression](#table-impression)\n- [Table: impression_click](#table-impression-click)\n- [Table: impression_order_group](#table-impression-order-group)\n- [Table: request_white_label_seller](#table-request-white-label-seller)\n- [Table: request_merchandising_rule](#table-request-merchandising-rule)\n- [Table: request_field_query](#table-request-field-query)\n- [Table: request_text_filter](#table-request-text-filter)\n- [Table: request_number_filter](#table-request-number-filter)\n- [Table: request_relevance_rule](#table-request-relevance-rule)\n- [Table: request_hybrid_search](#table-request-hybrid-search)\n- [Table: request_setting](#table-request-setting)\n- [Table: request_dp_shipping](#table-request-dp-shipping)\n- [Table: request_dp_dynamic_estimate](#table-request-dp-dynamic-estimate)\n- [Table: request_dp_delivery_options](#table-request-dp-delivery-options)\n- [Analyses with search data](#analyses-with-search-data)\n- [Correlations with other data](#correlations-with-other-data)\n\n## Table types and relationships\n\nThe search data model consists of three types of tables, each with a specific role:\n\n- **Fact tables:** Store events that have taken place. Each row is a record of an action — a search, a product click, or a displayed impression. These tables have the largest data volume and are the starting point for most analyses. Example: In the `request` table, each row records a search made by a buyer, including the search term, applied filters, and the event timestamp.\n- **Bridge tables:** Establish relationships between two entities. They don't have their own business data, only keys connecting records from other tables. Example: The `impression_click` table contains only `impression_id` and `click_id`, allowing you to answer the question \"Which impressions generated clicks?\" without duplicating the data from either of the original tables.\n- **Dimension tables:** Store descriptive attributes and settings that contextualize events. This type of table changes less often and has a smaller data volume. Example: The `request_setting` table indicates the [Elasticsearch](https://www.elastic.co/elasticsearch) cluster that processed the search and the flags that were active, such as `hide_unavailable_items` or `merchandising_rules_enabled`, which allows analyzing how different settings impact the results.\n\nThe diagram below shows how the tables are organized by type and how they connect to each other:\n\n```mermaid\nflowchart TB\n    subgraph FACT[\"Fact tables (events)\"]\n        tbl_request[\"request\\n(completed search)\"]\n        tbl_response[\"response\\n(returned result)\"]\n        tbl_click[\"click\\n(click on result)\"]\n        tbl_impression[\"impression\\n(displaying results)\"]\n    end\n\n    subgraph BRIDGE[\"Bridge tables (relationships)\"]\n        tbl_response_product[\"response_product\\n(products in results)\"]\n        tbl_impression_click[\"impression_click\\n(impression → click)\"]\n        tbl_impression_order_group[\"impression_order_group\\n(impression → order)\"]\n\n        subgraph FILTERS[\"Filters and request rules\"]\n            tbl_text_filter[\"request_text_filter\"]\n            tbl_number_filter[\"request_number_filter\"]\n            tbl_field_query[\"request_field_query\"]\n            tbl_relevance_rule[\"request_relevance_rule\"]\n            tbl_merchandising_rule[\"request_merchandising_rule\"]\n            tbl_white_label_seller[\"request_white_label_seller\"]\n            tbl_hybrid_search[\"request_hybrid_search\"]\n        end\n\n        subgraph DP[\"Delivery promises\"]\n            tbl_dp_shipping[\"request_dp_shipping\"]\n            tbl_dp_dynamic_estimate[\"request_dp_dynamic_estimate\"]\n            tbl_dp_delivery_options[\"request_dp_delivery_options\"]\n        end\n    end\n\n    subgraph DIMENSION[\"Dimension table (settings)\"]\n        tbl_request_setting[\"request_setting\\n(search engine configuration)\"]\n    end\n\n    tbl_request -->|search_id| tbl_response\n    tbl_response -->|search_id| tbl_response_product\n    tbl_impression -->|impression_id| tbl_impression_click\n    tbl_impression_click -->|click_id| tbl_click\n    tbl_impression -->|impression_id| tbl_impression_order_group\n    tbl_request -->|search_id| tbl_request_setting\n    tbl_request -->|search_id| FILTERS\n    tbl_request -->|search_id| DP\n```\n\n### Usage examples\n\nBelow are three distinct flows for using the data:\n\n- Flow 1: Shows the journey of a search request and its components. Example: a buyer searches for \"running shoes\" with brand and price filters.\n\n```mermaid\nflowchart TD\n    REQ[\"request\\nBuyer searches: running shoes\"]\n\n    REQ -->|\"search_id\"| SETTING[\"request_setting\\nES Cluster: is-intelligent-search-v8-05\\nFlags: hide_unavailable_items = true\"]\n    REQ -->|\"search_id\"| RESP[\"response\\nLatency: 150ms, Match: 42 products\"]\n    REQ -->|\"search_id\"| TF[\"request_text_filter\\nbrand = Nike\"]\n    REQ -->|\"search_id\"| NF[\"request_number_filter\\nprice: 100 to 500\"]\n\n    RESP -->|\"search_id\"| RP[\"response_product\\n#1 Air Max Shoes - score: 95\\n#2 Pegasus Shoes - score: 87\\n#3 ZoomX Shoes - score: 82\"]\n```\n\n- Flow 2: Shows the full buyer journey from retrieving results → clicking → purchasing. Example: The buyer views the results, clicks product #2, and completes the purchase.\n\n```mermaid\nflowchart LR\n    IMP[\"impression\\nResults shown\\nto buyer\"]\n    IC[\"impression_click\\nLinks impression\\nto click\"]\n    CLK[\"click\\nBuyer clicked\\nproduct #2, position: 2\"]\n    IOG[\"impression_order_group\\nLinks impression\\nto order\"]\n    ORD[\"Order \\ndata model\\n- order_group\"]\n\n    IMP -->|\"impression_id\"| IC\n    IC -->|\"click_id\"| CLK\n    IMP -->|\"impression_id\"| IOG\n    IOG -->|\"order_group\"| ORD\n```\n\n- Flow 3: each search request can have multiple associated details, all linked by `search_id`. For example, a single search may have two text filters, one number filter and three active sellers at the same time.\n\n```mermaid\nflowchart TD\n    REQ[\"request\\nsearch_id: X\"]\n\n    REQ -->|\"search_id\"| TF[\"request_text_filter\\nbrand = 'Nike'\\ncategory = 'Shoes'\"]\n    REQ -->|\"search_id\"| NF[\"request_number_filter\\nprice: 100 to 500\"]\n    REQ -->|\"search_id\"| FQ[\"request_field_query\\nsku:123\"]\n    REQ -->|\"search_id\"| RR[\"request_relevance_rule\\ntype: click, weight: 5\"]\n    REQ -->|\"search_id\"| WLS[\"request_white_label_seller\\nseller_1, seller_2\"]\n    REQ -->|\"search_id\"| MR[\"request_merchandising_rule\\nrule: promo-summer-2026\"]\n    REQ -->|\"search_id\"| HS[\"request_hybrid_search\\nmodel: openai:text-embedding-3-small\\nratio: 0.5\"]\n    REQ -->|\"search_id\"| DPS[\"request_dp_shipping\\nshipping: pickup\"]\n```\n\n## Search data characteristics\n\n|  **Characteristic** | **Description** |\n| :---------------:   | :-------------: |\n|       **Data source**       | Obtained from Intelligent Search API requests and responses and Activity Flow events. |\n|       **Availability**      |                  This metric is only available through Data Pipeline.                 |\n|         **History**         |                         Historical data starts in August 2025.                        |\n| **Minimum update interval** |            One hour.            |\n\n## Table: request\n\nStores core information about buyer search queries, including the search text, filters, sorting, pagination, and search configuration. Each row represents a single search request event. Not all search requests made on the frontend are recorded in this table, as some requests are served from the cache and are not logged.\n\n> 'Bloom filters' are enabled on the following columns: `search_id`, `account_name`, `query`. <br />'Bloom filters' help ignore data files that don't contain matching values, significantly improving query performance for equality predicates on these columns.\n\nThe table fields are described below:\n\n| **Column name**  | **Column type** | **Column description** |\n| :-----------:    | :-------------: | :------------: |\n|  search_id   |  string     |  Search UUID. Unique identifier for each search request used to join with response tables and other search-related tables.   |\n|  account_name  |  string     |  Name of the account where the search was completed. Identifies the store associated with the search. |\n|  event_time  |    timestamp    | Search event timestamp. Indicates when the search request was received and processed by the search API.  |\n| origin |string  |   Request origin. Indicates where the search originated from, such as 'autocomplete', 'search', or other entry points. Used to understand user search behavior patterns. |\n| default_locale | string |  Default locale of the tenant. The store's default language and region setting (example: 'en-US', 'pt-BR').  |\n|  locale   | string |  Locale requested by the buyer. The specific language and region setting requested for this search (example: 'en-US', 'pt-BR'). It may differ from the `default_locale` if the user selects a different language. |\n| query  |  string |  Full-text search query string entered by the buyer. The search term or phrase used to find products. This may be empty for searches that only use field queries or filters.|\n| operator  |  string |  Query operator. Defines how multiple search terms are combined: 'and' requires all terms to match while 'or' requires at least one term to match. |\n| fuzzy  |  string  | Tolerance for query errors. Controls the tolerance for typos and spelling errors in the search. It can be '0' (exact match), '1' (difference of one character), '2' (difference of two characters), or 'auto' (automatic calculation). |\n|  sort_field  |  string  |  Product field used to sort the results. The may be 'relevance', 'price', 'name', or other product attributes. |\n|  sort_order   | string |  Sort order for the results. Specifies if the results are sorted in ascending ('asc') or descending ('desc') order based on the sort_field. |\n|   page  |  int  | Current page number in the search results. Used for pagination, starting at page 1. Each page is considered a separate search request. |\n|  products_per_page |  int |  Number of products per page. The page size requested for search results, typically 10, 20, 30, etc. |\n|  trade_policy  | string |  Sales channel (trade policy) of the session. Identifies the specific sales channel used for this search. |\n| delivery_promises_enabled | boolean |  Indicates whether the Delivery Promise feature is enabled on the account. |\n|  delivery_promises_active |  boolean |  Indicates whether the Delivery Promise feature is active for this search. |\n| record_created_at | timestamp |  Timestamp for when this record was created in the lakehouse. |\n|record_updated_at | timestamp  | Timestamp when this record was last updated in the lakehouse. |\n|  batch_id  |  timestamp | Identifier used when data is loaded into the table for data ingestion quality control. It also serves as a partition key.  |\n\n## Table: response\n\nTable that stores search response information. Contains metadata about the search results returned to the buyer, including redirect information, performance metrics, match count, and query processing details. Each row represents the response to a single search request, linked to the request table via 'search_id'.\n\nThe table fields are described below:\n\n| **Column name** | **Column type** | **Column description** |\n| :--------: | :-------------: | :----------: |\n|  search_id  | string     |  Search UUID. Unique identifier linking this response to the corresponding search request.   |\n| account_name |  string  | Name of the account where the search was completed. Identifies the store associated with the search.  |\n| event_time |    timestamp    | Search event timestamp. Indicates when the search request was received and processed by the search API. |\n| redirect  |string  | Redirect URL, if applicable. This field is completed when a search triggers a redirect rule (for example, to specific brand pages). Otherwise, it returns null. |\n|  latency | int | Search response latency in milliseconds. Measures the time taken to process and return the search results. |\n|  misspelled| boolean|  Indicates if there is a misspelled word in the query. |\n|  match | int | Number of matching products. Indicates the total number of items that match the search and applied filters.|\n|  operator | string | Query operator after fallback. Indicates the operator used after fallback or corrections are applied to the search. |\n|  fuzzy  | string | Query tolerance level after fallback. Indicates the final tolerance value used after any query processing or fallback logic.|\n| record_created_at | timestamp| Date and time when this record was created in the lakehouse. |\n| record_updated_at | timestamp |  Date and time of the last time this record was updated in the lakehouse.|\n| batch_id  | timestamp | Identifier generated when the data is loaded into the table. This is used for ingestion quality control and also as a partition key. |\n\n## Table: response_product\n\nTable that contains the products returned in the search response. It stores detailed information about each product in the search results, including position, availability, relevance score, and identification details. Each row represents a single product in a search results set.\n\nThe table fields are described below:\n\n| **Column name** | **Column type** | **Column description** |\n| :---------: | :-------------: | :--------------: |\n| search_id | string| Search UUID. Unique identifier linking this product result to the corresponding search request and response. |\n| account_name  | string |  Name of the account where the search was completed. Identifies the store associated with the search.  |\n| local_index  |  bigint  |  Product index relative to the current page. The position of the product within the current search results page (0-based index).  |\n| global_index | bigint  | Product index relative to the complete result set. The position of the product within the complete search result set across all pages. |\n| internal_product_id | string | Internal product ID. Unique identifier for the product variant within the search engine. When splitting SKUs by specification is enabled, this differs from product_id and includes the specification value (example: '124633-blue'). |\n|product_id |  string |  Product ID. The default product identifier that can be joined with the Catalog data model. This is the base product ID without specification details. |\n| specification | string | Product specification. The product specification value when separating SKUs by specification is enabled. |\n| available |  boolean | Indicates if the product is available. Shows if the product is currently in stock and available for purchase. |\n|  score  |bigint  |Relevance score. The numerical score assigned by the search engine indicating how relevant this product is to the query. Higher scores indicate better relevance. |\n| cosine_similarity_match | boolean  | Indicates if the product matched based on cosine similarity in hybrid search. Shows if the product was matched by semantic similarity (vector search) when hybrid search is enabled. |\n| record_created_at | timestamp | Timestamp of when this record was created in the lakehouse.|\n| record_updated_at  |  timestamp| Timestamp of when this record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier used when data is loaded into the table for quality control of data ingestion. It also serves as a partition key. |\n\n## Table: click\n\nTable containing search result clicks. Stores information about buyers clicking products in search result pages, including click position, product details, and user session information. Each row represents a single click event on a search result.\n\nThe table fields are described below:\n\n|  **Column name** | **Column type** |  **Column description** |\n| :------------: | :-------------: | :-----------------: |\n| click_id  | string | Unique identifier for the click event. UUID that uniquely identifies each search result click.  |\n| search_id  | string  |  Search UUID that generated the results. Links the click to the corresponding search request. |\n|session_id  | string  | Unique session ID from Activity Flow. Links the click to the user's browsing session.  |\n| mac_id| string | Unique ID (UUID) to identify recurring users from Activity Flow. Links the click to the device identifier of the user. |\n| account_name  |  string|VTEX account of the store where the click occurred. Identifies the store associated with the click.|\n| event_time| timestamp    | Timestamp of when the click event was ingested. Indicates when the event was received and processed by the data pipeline. |\n| client_time |    timestamp    | Timestamp of the event on the buyer's device. Indicates when the click actually occurred on the client side. May be inconsistent if the time on the user's device is incorrect. |\n| url | string  |Full URL where the click occurred. The complete web address of the page where the search results were displayed and the click happened.   |\n|  ref |   string |   URL of the page that directed the buyer to this page. The reference URL that indicates where the user came from before viewing the search results.  |\n|   workspace |  string     |     Workspace the user is visiting (example: master). Relevant for A/B testing on the IO platform.|\n| access_type  |  string     |  Type of page access. Can be 'internal' for internal VTEX URLs (myvtex domains) or 'public' for customer-facing pages. |\n| sp_variant |      string     |   Current experiment ID and variant ID from the Intelligent Search A/B testing service. Identifies the A/B test variant the user was exposed to.  |\n|search_anonymous |      string     | Anonymous ID from the Intelligent Search pixel. Anonymous identifier used for tracking and analytics.    |\n| search_session  |      string     |  Session ID from the Intelligent Search pixel. Session identifier used to track user sessions within the search context.   |\n|  page_x   | float |    X coordinate of the click on the page. Horizontal position where the user clicked in pixels.   |\n|  page_y   | float |     Y coordinate of the click on the page. Vertical position where the user clicked in pixels.    |\n|element| string|  HTML element that was clicked. Identifies the type of element that received the click event (example: 'button', 'link', 'div').  |\n| element_source  |  string     |      Identifies the origin of the event on the frontend. In the search context, this can be 'search-result' or 'search-autocomplete'.     |\n|  product_id  |  string  | Product ID of the clicked item. When SKU separation by specification is enabled, this value may not be unique because it represents the base product ID without specification details.  |\n| product_specification  |  string |Product specification of the clicked item. The value of the specification when SKU separation by specification is enabled.        |\n|  product_position   |int |  Position of the clicked product. The position of the product in the search results when it was clicked (starts at 1). |\n| record_created_at | timestamp | Timestamp of when this record was created in the lakehouse. |\n| record_updated_at | timestamp | Timestamp of when this record was last updated in the lakehouse. |\n|  batch_id |    timestamp    |  Identifier used when data is loaded into the table for quality control of data ingestion. It also serves as a partition key. |\n\n## Table: impression\n\nTable containing impressions of search results. Stores information about when search results are displayed to shoppers, including impression type, element details, and user session information. Each row represents a single impression event.\n\nThe table fields are described below:\n\n|  **Column name** | **Column type** |  **Column description**  |\n| :-----: | :-------------: | :----------: |\n| impression_id  |      string     |  Unique identifier for the impression event. UUID that uniquely identifies each search result impression.     |\n| search_id  |      string     |  UUID of the search that generated the results. Links the impression to the corresponding search request.     |\n| session_id  |      string  | Unique session ID from Activity Flow. Links the impression to the user's browsing session. |\n|  mac_id  |      string     | Unique ID (UUID) to identify recurring users from Activity Flow. Links the impression to the user's device identifier.        |\n| account_name | string |   VTEX account of the store where the impression occurred. Identifies the store associated with the impression.  |\n| event_time  | timestamp |  Timestamp of when the impression event was ingested. Indicates when the event was received and processed by the data pipeline.              |\n| client_time  | timestamp | Timestamp of the event on the buyer's device. Indicates when the search results were actually displayed on the client side. May be inconsistent if the time on the user's device is incorrect. |\n|  url  |  string | Full URL where the impression occurred. The complete web address of the page where the search results were displayed.  |\n|  ref  | string | URL of the page that directed the buyer to this page. The reference URL that indicates where the user came from. |\n| workspace | string     | Workspace the user is visiting (example: master). Relevant for A/B testing on the IO platform. |\n| access_type | string |  Type of page access. Can be 'internal' for internal VTEX URLs (myvtex domains) or 'public' for customer-facing pages.  |\n| sp_variant  | string |  Current experiment ID and variant ID from the Intelligent Search A/B testing service. Identifies the A/B test variant the user was exposed to. |\n| search_anonymous  |  string | Anonymous ID from the Intelligent Search pixel. Anonymous identifier used for tracking and analytics. |\n| search_session | string | Session ID from the Intelligent Search pixel. Session identifier used to track user sessions within the search context. |\n| impression_type  |  string |Impression type. Categorizes the type of search result impression (example: 'search', 'autocomplete', 'recommendation').   |\n|element|  string  |HTML element that was displayed. Identifies the type of element that generated the impression event (example: 'product-card', 'search-result').   |\n|  element_source  |  string  |  Identifies the origin of the event on the frontend. In the search context, it can be 'search-result' or 'search-autocomplete'. |\n| record_created_at |    timestamp    |Timestamp of when this record was created in the lakehouse. |\n| record_updated_at |    timestamp    | Timestamp of when this record was last updated in the lakehouse. |\n| batch_id|    timestamp |  Identifier used when data is loaded into the table for quality control of data ingestion. It also serves as a partition key. |\n\n## Table: impression_click\n\nTable that assigns clicks to impressions. Establishes the relationship between impression events and click events, enabling analysis of click-through and impression-to-click conversion rates. Each row represents a link between a specific impression and the click it generated. When no click is generated from the impression, a row isn't created.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| account_name | string | VTEX account of the store. |\n| impression_id | string | Unique identifier for the impression event. Links to the impression table. |\n| click_id | string | Unique identifier for the click event. Links to the click table. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n## Table: impression_order_group\n\nTable that assigns order groups to impressions. Establishes the relationship between impression events and completed purchases, enabling analysis of search-to-order conversion rates and the impact of search impressions on purchases. Each row represents a link between a specific impression and an order group from a completed purchase. When no order is placed from the impression, a row isn't created.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| impression_id | string | Unique identifier for the impression event. Links to the impression table. |\n| account_name | string | VTEX account of the store. Order groups are unique to the account_name, not globally. |\n| order_group | string | Links the impression to a specific order transaction, enabling analysis of the customer journey from search to purchase. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n## Request detail tables\n\nThe following tables provide additional details about search requests. Each table links to the `request` table via `search_id`.\n\nAll request detail tables include the standard metadata columns `record_created_at`, `record_updated_at`, `batch_id` for data lineage and quality control tracking.\n\n### Table: request_white_label_seller\n\nTable containing the list of active sellers in the session where the search occurred. It's related to the regionalization feature, which allows stores to filter search results based on sellers or specific regions. Each row represents a seller that was active during the search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this seller to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| seller_id | string | ID of the seller active in the session during the search. Used for regionalization analysis. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_merchandising_rule\n\nTable containing the list of merchandising rules considered in the search request. Merchandising rules allow merchants to modify search results to prioritize and return more relevant products to customers based on custom criteria such as brand, category, or product attributes. Each row represents a merchandising rule that was active during the search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this merchandising rule to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| merchandising_rule_id | string | Unique identifier of the merchandising rule applied to this search. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_field_query\n\nTable containing information about \"get by ID\" queries. These are queries like `sku:123` or `product:456` that fetch specific products using the identifier instead of a full-text search. Each row represents a field query used in the search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this field query to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| field | string | Product field used in the query (e.g. `product`, `sku`). |\n| query | string | Specific value searched for the given field (e.g. `123` for `sku:123`). |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_text_filter\n\nTable containing information about text filters applied to facets instead of text searches. These filters are applied to text attributes such as brand, category, or other categorical attributes to refine search results. Each row represents a single text filter applied to a search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this text filter to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| key | string | Name of the product attribute that was filtered (e.g. `brand`, `category`, `color`). |\n| value | string | Specific value selected for the filter (e.g. `apple` for brand, `electronics` for category). |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_number_filter\n\nTable containing information about numeric filters applied to facets instead of text searches. Numeric filters differ from text filters because they are applied as a range (from-to) instead of a single value. These filters are typically used for numeric attributes such as price, rating, or other measurable product characteristics. Each row represents a single numeric filter applied to a search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this numeric filter to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| key | string | Name of the numeric attribute that was filtered (e.g. `price`, `rating`, `weight`). |\n| from | string | Lower threshold of the range filter. |\n| to | string | Upper threshold of the range filter. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_relevance_rule\n\nTable containing information about relevance rules applied to search requests. Relevance rules define the order in which products are displayed in search results on product listing pages (PLP). The order changes depending on the criteria and priorities set for the search engine. Each row represents a relevance rule that was active during the search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this relevance rule to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| type | string | Type of relevance rule or boost applied (e.g. `click`, `newness`, `revenue`). |\n| composition_weight | int | Weight assigned to this rule when multiple criteria are combined. Higher values indicate greater influence on ranking. |\n| priority_index | int | Priority order of this rule when multiple priority criteria are applied. Lower indexes indicate higher priority. |\n| priority | boolean | Whether this rule is treated as a priority criterion that takes precedence over other ranking factors. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_hybrid_search\n\nTable containing details about hybrid search for queries that use that feature. Hybrid search combines traditional lexical search, which is keyword-based, with semantic search, which is vector-based, to provide more relevant results. This table stores the configuration and parameters used for hybrid search.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this hybrid search configuration to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| model | string | Identifier of the ML model used to generate embeddings for semantic search (e.g. `openai:text-embedding-3-small:1024`). |\n| ratio | float | Balance between semantic and lexical search, from 0 to 1. A value of `0.5` means equal weight for both. |\n| binning | float | Granularity level used to group similarity scores (e.g. `0.01`). |\n| similarity | float | Minimum cosine similarity score required for a product to be considered a match. |\n| products | int | Number of products to be returned from the semantic search. |\n| candidates | int | Number of products to be returned from each shard. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_setting\n\nTable containing details about the search engine settings for each search request. Stores configuration information including details about the Elasticsearch cluster, feature flags, and search behavior settings. Each row represents the settings applied to a single search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking these settings to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| elasticsearch_cluster | string | Name of the Elasticsearch cluster used to process this search (e.g. `is-intelligent-search-v8-05`). |\n| elasticsearch_group | string | Name of the Elasticsearch group used to process this search (e.g. `shared-01`). |\n| hide_unavailable_items | boolean | Whether out-of-stock or unavailable products are filtered out from search results. |\n| show_invisible_items | boolean | Whether products marked as invisible in the catalog are included in search results. |\n| merchandising_rules_enabled | boolean | Whether the merchandising rules feature is active for this search. |\n| priority_boosts_enabled | boolean | Whether the priority boost features are active for this search. |\n| secondary_boosts_enabled | boolean | Whether the secondary boost features are active for this search. |\n| diacritics_boost_enabled | boolean | Whether the diacritics (accents) boost is active for this search. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_dp_shipping\n\nTable containing delivery information from delivery promises. Stores the delivery methods selected as filters when the Delivery Promise feature is active in a search request. The delivery promise allows buyers to filter products based on delivery options, such as pickup points or specific delivery methods. Each row represents a delivery filter applied to a search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this delivery filter to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| shipping | string | Delivery method selected as a filter (e.g. `pickup-in-point`, `delivery`). |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_dp_dynamic_estimate\n\nTable containing dynamic estimate information from the delivery promise. Stores the selected dynamic delivery time estimates as filters when the Delivery Promise feature is active in a search request. Dynamic estimates allow buyers to filter products based on delivery time windows, such as same-day or next-day delivery. Each row represents a dynamic estimate filter applied to a search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this dynamic estimate filter to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| dynamic_estimate | string | Delivery time estimate selected as a filter (e.g. `same-day`, `next-day`). |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n### Table: request_dp_delivery_options\n\nTable containing information about delivery options originating from delivery promises. Stores the identifiers of delivery options selected as filters when the Delivery Promise feature is active in a search request. Delivery options are specific settings for delivery services that can be used to filter products. Each row represents a delivery option filter applied to a search request.\n\nThe table fields are described below:\n\n| **Column name** | **Type** | **Description** |\n|:--|:--|:--|\n| search_id | string | Unique identifier linking this delivery option filter to the corresponding search request. |\n| account_name | string | Name of the account where the search was completed. |\n| delivery_options | string | Hash of the JSON object describing the selected delivery option filter. Actual values are not currently available. |\n| record_created_at | timestamp | When the record was created in the lakehouse. |\n| record_updated_at | timestamp | When the record was last updated in the lakehouse. |\n| batch_id | timestamp | Identifier for data load batches; also serves as a partition key. |\n\n## Analyses with search data\n\nSome of the analyses that you can run using the search tables are listed below:\n\n- **Search performance metrics:** Calculate search latency, query success rates, and result relevance by analyzing the request and response tables. Track performance trends over time and identify slow queries that need optimization.\n- **Click-through rate analysis:** Measure click rates by product position, search query, or category. Identify the positions in the search results that generate more clicks and optimize product ranking.\n- **Search-to-order conversion:** Track the complete customer journey from search impression to click and purchase. Calculate conversion rates at each stage: **impression → click → order** to understand search effectiveness.\n- **Query analysis:** Analyze the most common search queries, identify searches with no results, and understand query patterns. Use this to improve product discovery and search relevance.\n- **Product ranking performance:** Evaluate how product positions in the search results affect clicks and conversions. Identify products that rank well but don't convert, or products that convert well but have a low ranking.\n- **Use of filters and facets:** Indentify the filters and facets that are most commonly used by buyers. Analyze how filters impact search results and conversion rates.\n- **Hybrid search effectiveness:** Compare the performance of semantic search (hybrid search) versus traditional keyword search. Measure how hybrid search affects relevance scores and conversion rates.\n- **A/B test analysis:** Use sp_variant and experiment data to analyze the impact of different search settings, relevance rules, or UI changes on user behavior and conversion.\n- **Search source analysis:** Compare performance across different search sources to understand user behavior patterns and optimize all entry points. Example: Autocomplete _versus_ full search.\n- **Impact of merchandising rules:** Measure how merchandising rules affect product visibility, clicks, and conversions. Identify the rules that are most effective for driving sales.\n\n## Correlations with other data\n\n| **Dataset** | **Description** |\n|:--|:--|\n| Navigation | Correlating search queries with navigation paths helps understand how users discover products, optimizing both search and navigation experiences. |\n| Orders | Linking impressions and clicks to order data enables search-to-purchase conversion analysis, identifying queries, positions, or filters that drive the highest conversion rates. |\n| Catalog | Joining search results with catalog data allows analysis of product discovery, attribute influence on ranking, and identification of underranked products. |\n| Inventory | Combining search data with inventory information helps identify how out-of-stock events affect search results and conversions. |\n| Cart availability | Correlating search results with cart availability data helps identify products that become unavailable during checkout, optimizing availability in search results. |\n\n### Learn more about other datasets\n\n- [Catalog](https://help.vtex.com/docs/tutorials/catalog-data-pipeline)\n- [Cart Availability](https://help.vtex.com/docs/tutorials/cart-availability-data-pipeline)\n- [Inventory](https://help.vtex.com/docs/tutorials/inventory-data-pipeline-beta)\n- [Navigation](https://help.vtex.com/docs/tutorials/navigation-data-pipeline)\n- [Payments](https://help.vtex.com/docs/tutorials/payments)\n- [Orders](https://help.vtex.com/docs/tutorials/orders-data-pipeline-beta)\n- [Prices](https://help.vtex.com/docs/tutorials/prices-data-pipeline-beta)\n- [Promotions](https://help.vtex.com/docs/tutorials/promotions)\n- [Gift Card](https://help.vtex.com/docs/tutorials/gift-card-data-pipeline)\n- [Bridge logs](https://help.vtex.com/docs/tutorials/bridge-logs-data-pipeline)\n- [Marketplace in](https://help.vtex.com/docs/tutorials/marketplace-in-data-pipeline)"}