{"id":24499932,"url":"https://github.com/luillyfe/semantic-search","last_synced_at":"2025-09-20T09:48:08.423Z","repository":{"id":198977359,"uuid":"701928073","full_name":"luillyfe/semantic-search","owner":"luillyfe","description":null,"archived":false,"fork":false,"pushed_at":"2023-10-24T16:30:01.000Z","size":418,"stargazers_count":2,"open_issues_count":4,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-07-08T17:51:02.507Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://medium.com/google-cloud/building-a-semantic-search-with-vertex-ai-f3ff5303de6a","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/luillyfe.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-10-08T01:37:26.000Z","updated_at":"2024-11-14T22:18:26.000Z","dependencies_parsed_at":"2023-10-16T13:19:03.526Z","dependency_job_id":"3f297b3b-cb66-42f9-bc6e-3789f695c0e1","html_url":"https://github.com/luillyfe/semantic-search","commit_stats":null,"previous_names":["luillyfe/semantic-search"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/luillyfe/semantic-search","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luillyfe%2Fsemantic-search","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luillyfe%2Fsemantic-search/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luillyfe%2Fsemantic-search/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luilly
fe%2Fsemantic-search/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/luillyfe","download_url":"https://codeload.github.com/luillyfe/semantic-search/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/luillyfe%2Fsemantic-search/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":276077894,"owners_count":25581305,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-20T02:00:10.207Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-21T22:16:24.234Z","updated_at":"2025-09-20T09:48:08.381Z","avatar_url":"https://github.com/luillyfe.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Semantic Search with Vertex AI\n\nThis project provides a step-by-step guide on how to build a semantic search engine using the Vertex AI APIs. Semantic search is a more sophisticated way to find content relevant to a query than traditional keyword-based search. It takes into account relevant information that may not be present in the query. 
It does so by understanding the context of the text that the user types in the search box, such as the user's query history and the location of the input, among others.\n\n### Limitations\n\nHere the model will be trained to provide answers based on semantic similarity.\n\nThis project uses the following Vertex AI APIs:\n\n### APIs:\n\n**Text Embeddings API**: To generate text embeddings for the search index and the query text.\n\n**Vector Search API**: To perform semantic search on the text embeddings.\n\n# Getting Started\n\nTo get started, you will need to create a Google Cloud project. Once you have created a project, you will need to enable the Vertex AI API.\n\n### Building the Search Index\n\nTo build the search index, you will first need to generate text embeddings for the documents that you want to include in the index. You can use the Text Embeddings API to generate them.\n\nOnce you have generated text embeddings for the documents, you will need to upload the embeddings to Vector Search. You can use the Vector Search API to upload them.\n\n### Performing Semantic Search\n\nTo perform semantic search, you will need to generate a text embedding for the query text and send it to the Vector Search API. 
The Vector Search API will return a list of documents that are semantically similar to the query text.\n\n### Example\n\nThe following code shows how to perform semantic search using the Vertex AI APIs:\n\n## Import the Vertex AI SDK for Go\n\n```go\nimport (\n\taiplatform \"cloud.google.com/go/aiplatform/apiv1\"\n\t\"cloud.google.com/go/aiplatform/apiv1/aiplatformpb\"\n\t\"google.golang.org/api/option\"\n\t\"google.golang.org/protobuf/types/known/structpb\"\n)\n```\n\n## Get the text embeddings API client.\n\n```go\npredictionServiceClient, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(vertexAIEndpoint))\n```\n\n## Generate text embeddings for a given text.\n\n```go\nresponse, err := s.client.Predict(ctx, \u0026aiplatformpb.PredictRequest{\n\tEndpoint:   s.endpoint,\n\tInstances:  []*structpb.Value{instances},\n\tParameters: parameters,\n})\n```\n\n# Training a model using AutoML\n\n```go\nctx := context.Background()\n// AI platform regional endpoint\nendpoint := \"us-central1-aiplatform.googleapis.com:443\"\n// Get a prediction client\npredictClient := ai.NewPredictionClient(ctx, endpoint)\n\n// The dataset demonstrates the use of the Text Embedding API with a vector database.\n// gs://cloud-samples-data/vertex-ai/dataset-management/datasets/bert_finetuning/wide_and_deep_trainer_container_tests_input.jsonl\nfileName := \"wide_and_deep_trainer_container_tests_input.jsonl\"\nlinesChan := make(chan []interface{})\ngo utils.ReadJSONL(fileName, ai.AIDataset{}, linesChan)\n\n// Build the text to embed (limited to one line to ease interpretation of the results)\nlines := \u003c-linesChan\ndataFrame := ai.NewDataFrame(predictClient.BuildInstance, lines)\n\n// Get Prediction Response\npredictionsChan := make(chan []*structpb.Value)\n\n// The prediction API for AutoML models has a restriction of 5 Instances per request.\nreqInstancesLimit := 5\nnumOfRequests := 0\n// Use \u003c (not \u003c=) so that no empty batch is sent when i reaches len(dataFrame)\nfor i := 0; i \u003c len(dataFrame); i += reqInstancesLimit {\n\tnumOfRequests++\n\n\tend := i + reqInstancesLimit\n\tif end \u003e len(dataFrame) {\n\t\tend = len(dataFrame)\n\t}\n\n\tgo 
func(dataInBatch []*structpb.Value) {\n\t\tpredictClient.Predict(ctx, predictionsChan, dataInBatch)\n\t}(dataFrame[i:end])\n\t// TODO: Come up with a better rate limiting algorithm\n\ttime.Sleep(12 * time.Second)\n}\n\n// Writing vector embeddings in batches to JSONL file\nutils.WriteJSONLInBatches(\n\t\"vectorEmbeddings.json\",\n\tpredictionsChan,\n\tai.GetVectors,\n\t// Number of issued requests to the Prediction API\n\tnumOfRequests,\n)\n```\n\n# Perform semantic search on the text embeddings.\n\n## Get the Vector Search API client.\n\n```go\nclient, err := aiplatform.NewMatchClient(ctx, option.WithEndpoint(\"102531040.us-central1-145252452137.vdb.vertexai.goog\"))\nif err != nil {\n\tpanic(err)\n}\n```\n\n## Querying the model\n\n```go\nindexEndpoint := os.Getenv(\"GCP_INDEX_ENDPOINT\")\ndeployedIndexId := os.Getenv(\"GCP_INDEX_ID\")\n\n// Query the model\nqueries := []*aiplatformpb.FindNeighborsRequest_Query{\n\t{Datapoint: \u0026aiplatformpb.IndexDatapoint{\n\t\tDatapointId:   uuid.NewString(),\n\t\tFeatureVector: embedding,\n\t}},\n}\n\nrequest := \u0026aiplatformpb.FindNeighborsRequest{\n\tIndexEndpoint:   indexEndpoint,\n\tDeployedIndexId: deployedIndexId,\n\tQueries:         queries,\n}\n// Find the neighbors of the query embedding\nresponse, err := client.FindNeighbors(ctx, request)\nif err != nil {\n\tpanic(err)\n}\n\n// Get the nearest neighbors of your feature vector (your query)\nresponse.GetNearestNeighbors()\n```\n\n# Conclusion\n\nThis project provides a step-by-step guide on how to build a semantic search engine using the Vertex AI APIs. Semantic search is a powerful tool that can be used to improve the search experience for your users.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluillyfe%2Fsemantic-search","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fluillyfe%2Fsemantic-search","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fluillyfe%2Fsemantic-search/lists"}