
How to Build an Advanced Search Widget with Google Vertex AI: Step-by-Step

Katherine Lim

Building a Generative Search Widget with Google Vertex AI Agent Builder

Google Vertex AI has emerged as a game-changing platform for building and deploying machine learning solutions, with its Agent Builder standing out as a particularly powerful tool for creating intelligent conversational experiences. In this blog post, we'll break down the complexities of Vertex AI's Agent Builder and provide a comprehensive guide to implementing a sophisticated search application.



Understanding Data Stores

The Data Store is a revolutionary feature within Agent Builder that allows developers to create intelligent conversational interfaces powered by their own data. Unlike generic chatbots, this feature can:

  • Leverage your organisation's specific knowledge base

  • Provide context-aware and accurate responses

  • Integrate seamlessly with existing data sources

  • Offer a customisable conversational experience


Key Benefits

  1. Contextual Understanding: The agent goes beyond simple keyword matching, comprehending the nuanced context of user queries.

  2. Data-Driven Responses: Pulls information directly from your curated data stores.

  3. Scalability: Easily handles large volumes of conversational data.

  4. Customisation: Tailors responses to your specific use case.


Step-by-Step: Creating an AI Search Widget with Data Store Agent Builder


Agent Builder Information Flow diagram

Prerequisite Setup

Before diving into development, ensure you have:

  • A Google Cloud account (sign in to your Google Cloud Console account)

  • Access to a Google Cloud project 

  • Permission to enable the Dialogflow API (navigate to the Dialogflow API Service Details page, then click the Enable button)

  • Permission to enable the Vertex AI Search and Conversation API (navigate to the Vertex AI Search and Conversation console, then click Continue and activate the API)

  • Access to create a Google Cloud Storage bucket (for the example implementation)
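The two APIs can also be enabled from a terminal instead of the console. The sketch below only builds the `gcloud services enable` command and prints it. The service names are the ones we understand to back these two APIs, and `my-sample-project` is a hypothetical project ID, so check both against your own project before running the command.

```python
import shlex
from typing import List, Sequence

# Assumed service names for the two APIs in the prerequisite list.
REQUIRED_SERVICES = (
    "dialogflow.googleapis.com",       # Dialogflow API
    "discoveryengine.googleapis.com",  # Vertex AI Search and Conversation API
)


def build_enable_command(
    project_id: str, services: Sequence[str] = REQUIRED_SERVICES
) -> List[str]:
    """Return the gcloud argument list that enables the services in a project."""
    if not project_id:
        raise ValueError("project_id must not be empty")
    return ["gcloud", "services", "enable", *services, "--project", project_id]


# "my-sample-project" is a placeholder, not a real project.
command = build_enable_command("my-sample-project")
print(shlex.join(command))
```

To run it rather than print it, the printed command can be pasted into a shell where the gcloud CLI is installed and authenticated.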


1. Setting Up Your Data Store

Begin by creating a comprehensive data store in Vertex AI:

  • Organise your data sources (documents, FAQs, knowledge bases)

  • Ensure data is clean, structured, and well-indexed

  • Use Google Cloud Console to create and configure your Data Store

For the example implementation, create a Google Cloud Storage bucket to store the data sources. Note that files must be placed at the top level of the bucket rather than in a folder, as directories are ignored during the import process.

If a file is not in a supported format, this error will be logged:

Field "content.mime_type" must be one of [application/json, application/pdf, application/vnd.google-apps.document, application/vnd.google-apps.presentation, application/vnd.google-apps.spreadsheet, application/vnd.ms-excel.sheet.macroenabled.12, application/vnd.openxmlformats-officedocument.presentationml.presentation, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.wordprocessingml.document, image/bmp, image/gif, image/jpeg, image/png, image/tiff, text/html, text/plain, text/xml], got "application/xml".
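Both pitfalls above (files inside folders and unsupported formats) can be checked before starting an import. The following is a minimal sketch. It assumes you already have each object's name and content type, for example from the bucket listing. The list of supported types is copied from the error message above, and the sample file names are made up for illustration.

```python
from typing import Dict, List

# MIME types copied from the import error message quoted above.
SUPPORTED_MIME_TYPES = frozenset({
    "application/json",
    "application/pdf",
    "application/vnd.google-apps.document",
    "application/vnd.google-apps.presentation",
    "application/vnd.google-apps.spreadsheet",
    "application/vnd.ms-excel.sheet.macroenabled.12",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "image/bmp",
    "image/gif",
    "image/jpeg",
    "image/png",
    "image/tiff",
    "text/html",
    "text/plain",
    "text/xml",
})


def find_import_problems(objects: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each object name to the reasons it is likely to be skipped or rejected."""
    problems: Dict[str, List[str]] = {}
    for name, mime_type in objects.items():
        reasons = []
        if "/" in name:
            # Object names containing "/" sit inside a folder.
            reasons.append("inside a folder")
        if mime_type not in SUPPORTED_MIME_TYPES:
            reasons.append(f"unsupported MIME type {mime_type}")
        if reasons:
            problems[name] = reasons
    return problems


# Hypothetical bucket contents: object name -> content type.
sample = {
    "pixel.html": "text/html",
    "feed.xml": "application/xml",
    "archive/nest.pdf": "application/pdf",
}
for name, reasons in find_import_problems(sample).items():
    print(name, "->", "; ".join(reasons))
```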

2. Configuring the Data Store Agent


To begin configuring the Data Store Agent, first navigate to the Agent Builder console.



The first step is to create an App in Agent Builder. For this implementation, the “Search for your website” app is the most suitable choice, as it can search structured data and unstructured documents stored in Google Cloud Storage. Click “Create app”, then under “Search and assistant” find “Search for your website” and click “Create”.



Once the type of App has been chosen, the next step is configuration. In “Website search app configuration”, enter “google_products” for both the “app name” and the “company name”, then click “Continue”. For this implementation, keep the default settings: the Enterprise edition features are required for website search, and the Advanced LLM features must be enabled to use search summarisation and follow-ups. Unless you have compliance or regulatory requirements to locate your data in a particular region, keep the multi-region setting as “global”.



Next, configure the data store for the app. In “Data stores” click “Create Data Store”.



For this implementation, our data will be imported from a storage bucket. Select the “Cloud Storage” option.


Select cloud bucket as a data source

Data will be imported from a bucket which contains Google product sample information downloaded from Wikipedia (the bucket URL has been intentionally redacted). To import the folder, type the name of the Cloud Storage folder after the text shown in the screenshot, in the section “Select a folder or a file that you want to import”. The default settings are suitable here, as we are importing unstructured documents (PDF, HTML, TXT, etc.) in a one-time ingestion. A periodic option is also available to resynchronise the data.

Import data from Cloud Storage
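The same import can in principle be requested over REST rather than through the console. The sketch below only assembles the URL and JSON body. The endpoint path, the `gcsSource` fields and the `content` schema for unstructured documents reflect our understanding of the Discovery Engine API that backs Agent Builder, so treat them as assumptions to confirm against the official reference. The project ID, data store ID and bucket name are hypothetical placeholders.

```python
import json
from typing import Any, Dict, Tuple

# Assumed base URL of the Discovery Engine REST API.
API_ROOT = "https://discoveryengine.googleapis.com/v1"


def build_import_request(
    project_id: str, data_store_id: str, bucket_uri: str
) -> Tuple[str, Dict[str, Any]]:
    """Return the (url, body) pair for a one-time import of unstructured documents."""
    if not bucket_uri.startswith("gs://"):
        raise ValueError("bucket_uri must start with gs://")
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/global"
        f"/collections/default_collection/dataStores/{data_store_id}"
        "/branches/0/documents:import"
    )
    body = {
        "gcsSource": {
            # Wildcard over the objects under the given bucket path.
            "inputUris": [bucket_uri.rstrip("/") + "/*"],
            # "content" is the schema assumed here for unstructured documents.
            "dataSchema": "content",
        },
        "reconciliationMode": "INCREMENTAL",
    }
    return url, body


# All three arguments are placeholders for illustration.
url, body = build_import_request(
    "my-sample-project", "google_products", "gs://my-sample-bucket"
)
print(url)
print(json.dumps(body, indent=2))
```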

Next, provide “google_products” as the Data Store Name and click “Create”.


Create a data store

Then ensure that “google_products” is selected on the “Data stores” page and click “Create”. This links the data store to the App.

Create app

Once the app has been successfully created, the data import process begins.

Data store import in progress

For the sample data, the import took about an hour to complete. At this stage, check for errors in the document import. As the default option was a one-time ingestion, fixing import errors will require re-ingesting the data source into a new data store.

Data store import completed


Once the import has completed, try out the Search preview by typing in some queries related to the sample data. In the screenshot below, the example question asks which Pixel phones have 5G. The generative response does a good job of answering with a list of Pixel phones that have 5G. The search results all link to the storage bucket; if a website had been used as the data source, we would expect to see links to the website instead.

Example query number 1

In the screenshot below, there is an example query about whether the Pixel Watch is water resistant. Note that the generative response correctly points out that the Pixel Watch is not in our data store.

Example query number 2

The final test query asks which Google products are discontinued. In the screenshot, a list of discontinued Google products is generated. However, the response may not be accurate, as it is based on downloaded Wikipedia pages. A more accurate response could be produced by ingesting a table that records which products are current and which are discontinued.

Example query number 3

Integrating the widget into a web application - Frontend Implementation

Click on the “Integration” tab to find out how to add the widget to a web page. In the screenshot below, note that there are specific tabs for “widget” and “api”. The screen for “widget” is included. Be sure to add the allowed domains for the widget to further protect your integration from misuse.


Integration widget
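For the “api” route, the request can be prepared in Python. The sketch below builds a search request without sending it. The endpoint path and the `query` and `pageSize` fields follow our understanding of the Discovery Engine search API, while the project ID, app ID and access token are hypothetical placeholders. The “api” tab in the console shows the exact values for your app.

```python
import json
import urllib.request
from typing import Any, Dict, Tuple

# Assumed base URL of the Discovery Engine REST API.
API_ROOT = "https://discoveryengine.googleapis.com/v1"


def build_search_request(
    project_id: str, engine_id: str, query: str, page_size: int = 10
) -> Tuple[str, Dict[str, Any]]:
    """Return the (url, body) pair for a search against the app's default serving config."""
    if not query.strip():
        raise ValueError("query must not be empty")
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/global"
        f"/collections/default_collection/engines/{engine_id}"
        "/servingConfigs/default_search:search"
    )
    return url, {"query": query, "pageSize": page_size}


def to_http_request(
    url: str, body: Dict[str, Any], access_token: str
) -> urllib.request.Request:
    """Wrap the URL and body in a POST request carrying an OAuth bearer token."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholders only: substitute your own project ID, app ID and token.
url, body = build_search_request(
    "my-sample-project", "my-app-id", "Which Pixel phones have 5G?"
)
request = to_http_request(url, body, access_token="ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it. That step is left out here because it needs real credentials.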

To test the widget from your local computer, you’ll need a web server and an HTML file to embed the widget code. First, change the “authorisation type” from “JWT or OAuth based” to “Public” and add “localhost” to the allowed domains for the widget. If the error “Problem loading results, please try again” appears, note that changes may take up to 30 minutes to apply.

Change authorisation type

Create an index.html file and paste in the sample widget code replacing the “Widget Javascript Bundle” line.

Create an index.html file
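As an illustration of the resulting file, the sketch below generates an index.html from Python. The widget markup is an assumption about the shape of the sample code on the Integration tab, and the config ID is a dummy value. Always copy the real snippet from your own console.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Assumed shape of the sample widget code; copy the real one from the console.
PAGE_TEMPLATE = Template("""<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Search widget test</title>
  </head>
  <body>
    <!-- Widget JavaScript bundle -->
    <script src="https://cloud.google.com/ai/gen-app-builder/client?hl=en_US"></script>

    <!-- Search widget element, hidden until the trigger is clicked -->
    <gen-search-widget configId="$config_id" triggerId="searchWidgetTrigger"></gen-search-widget>

    <!-- Trigger element; it does not have to be an input -->
    <input placeholder="Search here" id="searchWidgetTrigger" />
  </body>
</html>
""")


def render_page(config_id: str) -> str:
    """Return the test page with the (HTML-escaped) config ID filled in."""
    return PAGE_TEMPLATE.substitute(config_id=html.escape(config_id, quote=True))


def write_page(directory: str, config_id: str) -> Path:
    """Write index.html into the directory and return its path."""
    path = Path(directory) / "index.html"
    path.write_text(render_page(config_id), encoding="utf-8")
    return path


# Demo in a temporary directory, using a dummy config ID.
with tempfile.TemporaryDirectory() as tmp:
    page = write_page(tmp, "00000000-0000-0000-0000-000000000000")
    print(page.read_text(encoding="utf-8"))
```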

For this example, Python will be used as the web server. To serve the index.html file, open a command prompt in the folder where the index.html file is located and run the command “python3 -m http.server 8000”. You should then see the output “Serving HTTP on :: port 8000 (http://[::]:8000/) …”.
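The command above listens on all network interfaces. If you would rather keep the test page reachable only from your own machine, a small script built on the same standard-library module can bind to the loopback address instead. This is a sketch of one way to do it and not part of the original setup. The function names are our own.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_server(directory: str = ".", port: int = 8000) -> ThreadingHTTPServer:
    """Create a static file server for the directory, bound to 127.0.0.1 only."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


def serve(directory: str = ".", port: int = 8000) -> None:
    """Serve files until interrupted with Ctrl+C."""
    server = make_server(directory, port)
    print(f"Serving {directory} on http://localhost:{server.server_address[1]}/")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        pass
    finally:
        server.server_close()
```

Calling `serve(".")` from the folder that contains index.html starts the server. It is not called in the block itself because it runs until interrupted.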

To test the widget code, browse to “localhost:8000”. Click on the input box “Search here” (it’s noted in the sample code comment that the element does not have to be an input).


Browse to localhost:8000

This will display the search widget.

Display the search widget

In the screenshot below, the query asks “what is YouTube?” and the generated result is shown.

Example query about YouTube

Best Practices and Considerations


  • Data Quality: Regularly update and clean your data store by implementing a scheduled data review process to remove outdated or irrelevant information

  • Privacy: Implement robust access controls using role-based access control (RBAC) to limit access and create distinct permission levels for data managers, content editors and administrators

  • Performance: Monitor and optimise agent response times by establishing a response time monitoring framework; for example, set up Google Cloud Monitoring alerts for chatbot response times exceeding 2 to 3 seconds, high error rates, or failed query resolutions

  • Continuous Learning: Use feedback mechanisms to improve agent responses such as a user rating system after each interaction, e.g., thumbs up/down
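The quantities mentioned in the Performance and Continuous Learning points can be prototyped in a few lines before wiring up real alerting. The sketch below is an in-process illustration only, not a substitute for Google Cloud Monitoring. The class name, the 3-second threshold (the upper end of the range above) and the stand-in search function are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List

# Upper end of the 2 to 3 second range suggested above.
SLOW_RESPONSE_SECONDS = 3.0


@dataclass
class SearchStats:
    """Collects latency, error and thumbs up/down counts for search calls."""

    latencies: List[float] = field(default_factory=list)
    errors: int = 0
    thumbs_up: int = 0
    thumbs_down: int = 0

    def timed(self, search: Callable[[str], object], query: str) -> object:
        """Run the search, recording its duration and whether it raised."""
        start = time.perf_counter()
        try:
            return search(query)
        except Exception:
            self.errors += 1
            raise
        finally:
            self.latencies.append(time.perf_counter() - start)

    def record_feedback(self, helpful: bool) -> None:
        """Record one thumbs up (True) or thumbs down (False)."""
        if helpful:
            self.thumbs_up += 1
        else:
            self.thumbs_down += 1

    def slow_count(self) -> int:
        """Number of calls slower than the threshold."""
        return sum(1 for seconds in self.latencies if seconds > SLOW_RESPONSE_SECONDS)

    def error_rate(self) -> float:
        """Share of calls that raised, or 0.0 when nothing was recorded."""
        return self.errors / len(self.latencies) if self.latencies else 0.0

    def approval_rate(self) -> float:
        """Share of ratings that were thumbs up, or 0.0 without ratings."""
        total = self.thumbs_up + self.thumbs_down
        return self.thumbs_up / total if total else 0.0


# Stand-in for a real search call.
stats = SearchStats()
stats.timed(lambda query: f"results for {query}", "what is YouTube?")
stats.record_feedback(True)
print(stats.slow_count(), stats.error_rate(), stats.approval_rate())
```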


Conclusion

Google Vertex AI's Agent Builder represents a significant leap in conversational AI technology. By providing a flexible, powerful platform for creating intelligent chat interfaces, it empowers developers to build sophisticated solutions tailored to specific organisational needs.

Next Steps

  • Experiment with different data store configurations

  • Explore advanced customisation options

  • Continuously refine your agent's performance




About Innablr


Discover the future of cloud technology with Innablr, your premier consultancy specialising in delivering cutting-edge solutions for businesses. With a rich track record of successful migrations and transformational projects across various industries, we stand at the forefront of the digital revolution. Our team of seasoned professionals combines deep technical knowledge with strategic insights to guide our clients through every step of their cloud journey seamlessly.


Katherine Lim, Lead Engineer @ Innablr
