Let's Try the Retrieval in RAG using Go

Implement a simple example of the retrieval part of Retrieval-Augmented Generation using Golang.

In my previous post, we explored the concept of RAG and its origin.

This time, we will implement the retrieval part of Retrieval-Augmented Generation using Go, along with the Google Text Embedding API and SQLite.

SQLite

SQLite is a lightweight database that can run either from a single file on disk or entirely in memory.

As explained in the previous post, other databases such as PostgreSQL, MongoDB, and Elasticsearch can also be used, but SQLite is chosen here because it needs no separate server or elaborate configuration.

Google Text Embedding API

Google Text Embedding API is a text embedding API provided by Google, which converts sentences into vectors.

APIs from OpenAI or other providers would work as well, but this post uses the Google Text Embedding API, which Cowork AI currently uses.

A new model, text-embedding-large-exp-03-07, was recently released. Its output dimensionality has increased from 768 to 3072, and the considerable performance improvement has drawn significant attention.

In this post, we will also check the results of the new model.

1. DB Setup

To run SQLite in Go, a driver needs to be installed in advance.

go get github.com/mattn/go-sqlite3

To set up sqlite-vec, the vector search extension for SQLite, run the following command:

go get -u github.com/asg017/sqlite-vec-go-bindings/cgo

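Then open the database in code as follows. The first argument to sql.Open is the driver name, and the second is the path of the .db file to open. Pass a file path for a file-backed database, or :memory: for an in-memory one. The driver package also has to be imported, typically as a blank import (import _ "github.com/mattn/go-sqlite3"), so that it registers itself under the name sqlite3.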

db, err := sql.Open("sqlite3", "main.db")
if err != nil {
    log.Panic(err)
}

Next, write the schema. This example uses a job table, on the assumption that job postings have been crawled.

-- in schema.sql
CREATE TABLE IF NOT EXISTS job(
    id TEXT PRIMARY KEY,
    company TEXT,
    title TEXT,
    content TEXT,
    embedding BLOB
)

Then read the SQL file using the go:embed feature and execute it.

// in main.go
import _ "embed"

//go:embed schema.sql
var schema string

func initSchema(ctx context.Context, db *sql.DB) error {
    if _, err := db.ExecContext(ctx, schema); err != nil {
        return err
    }
    return nil
}

2. Fetch Documents for Search

Before creating JobStore, call sqlite_vec.Auto() in the package's init function so that the functions built into sqlite-vec are available later. Go runs a package's init function once, when the package is initialized at program startup.

It is assumed that the crawled documents are already stored in the DB.

Then define the store methods for saving and retrieving jobs; retrieval uses the QueryContext method.

Here, GetJobs simply fetches every job. (In practice, you would likely need queries with appropriate filters.)

// in store/job.go
package store

import (
	"context"
	"database/sql"

	sqlite_vec "github.com/asg017/sqlite-vec-go-bindings/cgo"
	"github.com/pkg/errors"
)

func init() {
	sqlite_vec.Auto()
}

type Job struct {
	ID        string
	Title     string
	Company   string
	Content   string
	Embedding []byte
	Distance  float64 // Cosine distance for display (Not stored)
}
type Jobs []Job

type JobStore interface {
	GetJobs(ctx context.Context) (Jobs, error)
	InsertOrReplaceJob(ctx context.Context, job Job) error
	Search(ctx context.Context, query []byte) (Jobs, error)
}

type jobStore struct {
	db *sql.DB
}

func NewJobStore(db *sql.DB) JobStore {
	return &jobStore{db: db}
}

func (j jobStore) GetJobs(ctx context.Context) (Jobs, error) {
	rows, err := j.db.QueryContext(ctx,
		"SELECT id, title, company, content, embedding FROM job",
	)
	if err != nil {
		return nil, errors.Wrap(err, "failed to get jobs")
	}
	defer rows.Close()

	var jobs Jobs
	for rows.Next() {
		var job Job
		if err = rows.Scan(&job.ID, &job.Title, &job.Company, &job.Content, &job.Embedding); err != nil {
			return nil, errors.Wrap(err, "failed to scan job")
		}
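		jobs = append(jobs, job)
	}
	// Surface any error that ended the iteration early.
	if err := rows.Err(); err != nil {
		return nil, errors.Wrap(err, "failed to iterate jobs")
	}
	return jobs, nil
}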

3. Vectorizing Existing Jobs with Embedding API

Now we vectorize existing Jobs using the Embedding API.

To call Google’s Embedding API from Go, use the aiplatform package from the Google Cloud client libraries.

go get cloud.google.com/go/aiplatform

The embedding request goes through the Predict() method of PredictionClient, as shown in the code below.

3-1. Creating an Embedder

You could use Google’s example code as is, but it is long and hard to follow, so here the embedding logic is extracted into a separate Embedder.

It may look complicated, but most of the code follows Google’s official reference.

// in embed/embedder.go
package embed

import (
	"context"
	"fmt"

	aiplatform "cloud.google.com/go/aiplatform/apiv1"
	"cloud.google.com/go/aiplatform/apiv1/aiplatformpb"
	"github.com/pkg/errors"
	"google.golang.org/api/option"
	"google.golang.org/protobuf/types/known/structpb"
)

type Embedder interface {
	Embedding(ctx context.Context, taskType string, dimensionality int, input string) ([]float32, error)
}
type embedder struct {
	client    *aiplatform.PredictionClient
	projectID string
	location  string
	model     string
}

func NewEmbedder(ctx context.Context, location string, projectID string, model string) (Embedder, error) {
	apiEndpoint := fmt.Sprintf("%s-aiplatform.googleapis.com:443", location)
	client, err := aiplatform.NewPredictionClient(ctx, option.WithEndpoint(apiEndpoint))
	if err != nil {
		return nil, errors.Wrap(err, "failed to create aiplatform client")
	}

	return &embedder{
		client:    client,
		projectID: projectID,
		location:  location,
		model:     model,
	}, nil
}

func (e embedder) Embedding(ctx context.Context, taskType string, dimensionality int, input string) ([]float32, error) {
	endpoint := fmt.Sprintf("projects/%s/locations/%s/publishers/google/models/%s", e.projectID, e.location, e.model)

	instances := []*structpb.Value{
		structpb.NewStructValue(&structpb.Struct{
			Fields: map[string]*structpb.Value{
				"content":   structpb.NewStringValue(input),
				"task_type": structpb.NewStringValue(taskType),
			},
		}),
	}

	params := structpb.NewStructValue(&structpb.Struct{
		Fields: map[string]*structpb.Value{
			"outputDimensionality": structpb.NewNumberValue(float64(dimensionality)),
		},
	})

	req := &aiplatformpb.PredictRequest{
		Endpoint:   endpoint,
		Instances:  instances,
		Parameters: params,
	}
	resp, err := e.client.Predict(ctx, req)
	if err != nil {
		return nil, errors.Wrap(err, "failed to predict")
	}
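	if len(resp.Predictions) == 0 {
		return nil, errors.New("no predictions returned")
	}
	// Only one instance was sent, so only the first prediction is needed.
	// The generated getters are nil-safe, so a missing field yields an empty list.
	values := resp.Predictions[0].GetStructValue().GetFields()["embeddings"].
		GetStructValue().GetFields()["values"].GetListValue().GetValues()
	if len(values) == 0 {
		return nil, errors.New("prediction contains no embedding values")
	}
	embedding := make([]float32, len(values))
	for i, value := range values {
		embedding[i] = float32(value.GetNumberValue())
	}

	return embedding, nil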
}

3-2. Storing Embeddings in DB using Embedder

Now let’s vectorize each Job using the Embedder. The search targets are Title and Content, so the two are embedded together.

The task type for the vectors being stored is RETRIEVAL_DOCUMENT, which produces embeddings optimized for documents in information retrieval.

Several task types are available, so consult the official reference and choose the one that fits your use case. This post uses RETRIEVAL_DOCUMENT when saving document vectors.
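To make the choice of task type less error-prone, the string values can be wrapped in a small typed set. The following is only a sketch: TaskType and documentTaskFor are hypothetical names that do not appear in this post's code, the constants cover only some of the values listed in Google's documentation, and the pairing of query-side and document-side types reflects my reading of that documentation, so check it against the official reference.

```go
package main

import "fmt"

// TaskType is a hypothetical wrapper around the task_type string sent to the
// embedding API. Only a subset of the documented values is listed here.
type TaskType string

const (
	RetrievalDocument  TaskType = "RETRIEVAL_DOCUMENT"
	RetrievalQuery     TaskType = "RETRIEVAL_QUERY"
	QuestionAnswering  TaskType = "QUESTION_ANSWERING"
	SemanticSimilarity TaskType = "SEMANTIC_SIMILARITY"
)

// documentTaskFor returns the task type assumed for stored documents, given
// the task type used for the query. The pairing is an assumption to verify
// against the official reference.
func documentTaskFor(query TaskType) (TaskType, error) {
	switch query {
	case RetrievalQuery, QuestionAnswering:
		// Retrieval-style queries are matched against document embeddings.
		return RetrievalDocument, nil
	case SemanticSimilarity:
		// Symmetric comparison: both sides use the same task type.
		return SemanticSimilarity, nil
	default:
		return "", fmt.Errorf("unknown query task type %q", query)
	}
}

func main() {
	docType, err := documentTaskFor(QuestionAnswering)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(docType) // RETRIEVAL_DOCUMENT
}
```

With this in place, the string passed to Embedding would be string(docType) rather than a hand-typed literal.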

First, add a function to JobStore for saving a job.

SQLite offers INSERT OR REPLACE, which plays a role similar to UPSERT or MERGE in other databases. It inserts the row if no row with the same primary key exists; otherwise it deletes the existing row and inserts the new one in its place.

The following function therefore handles both creation and update:

func (j jobStore) InsertOrReplaceJob(ctx context.Context, job Job) error {
    if _, err := j.db.ExecContext(ctx,
        "INSERT OR REPLACE INTO job (id, title, company, content, embedding) VALUES (?, ?, ?, ?, ?)",
        job.ID, job.Title, job.Company, job.Content, job.Embedding,
    ); err != nil {
        return errors.Wrap(err, "failed to insert or replace job")
    }
    return nil
}

Then, write the code to update the embedding field of Job after obtaining the embedding.

// in main.go
// ...
for i, job := range jobs {
    // Skip if an embedding already exists (calling the embedding model is costly)
    if len(job.Embedding) > 0 {
        slog.InfoContext(ctx, "skip job vector update",
            "id", job.ID,
            "progress", fmt.Sprintf("%d/%d", i+1, len(jobs)),
        )
        continue
    }
    
    // Generate embedding
    embedding, err := embedder.Embedding(ctx, "RETRIEVAL_DOCUMENT", 3072, job.Title+" "+job.Content)
    if err != nil {
        log.Fatal(err)
    }
    
    // Update embedding field of job with serialized embedding
    job.Embedding, err = sqlite_vec.SerializeFloat32(embedding)
    if err != nil {
        log.Fatal(errors.Wrap(err, "failed to serialize embedding"))
    }
    
    // Update job
    if err := jobStore.InsertOrReplaceJob(ctx, job); err != nil {
        log.Fatal(err)
    }
    
    slog.InfoContext(ctx, "job vector updated",
        "id", job.ID,
        "progress", fmt.Sprintf("%d/%d", i+1, len(jobs)),
    )
}

By doing so, you can verify that embeddings are stored in the DB.
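One way to check a stored value is to decode the BLOB back into a float slice. The sketch below assumes the BLOB holds raw little-endian float32 values (4 bytes per dimension), which is my understanding of what sqlite_vec.SerializeFloat32 produces; encodeFloat32 and decodeFloat32 are illustrative helpers written for this example, not part of the library.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeFloat32 packs a vector as consecutive little-endian float32 values.
// It stands in for the library serializer under the assumption stated above.
func encodeFloat32(vec []float32) ([]byte, error) {
	buf := new(bytes.Buffer)
	if err := binary.Write(buf, binary.LittleEndian, vec); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeFloat32 reverses encodeFloat32. It rejects BLOBs whose length is not
// a multiple of 4, since those cannot be a whole number of float32 values.
func decodeFloat32(blob []byte) ([]float32, error) {
	if len(blob)%4 != 0 {
		return nil, fmt.Errorf("blob length %d is not a multiple of 4", len(blob))
	}
	vec := make([]float32, len(blob)/4)
	if err := binary.Read(bytes.NewReader(blob), binary.LittleEndian, vec); err != nil {
		return nil, err
	}
	return vec, nil
}

func main() {
	blob, err := encodeFloat32([]float32{0.5, -1, 2})
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	vec, err := decodeFloat32(blob)
	if err != nil {
		fmt.Println("decode:", err)
		return
	}
	fmt.Println(len(blob), vec) // 12 [0.5 -1 2]
}
```

Under the same assumption, a 3072-dimensional embedding occupies 3072 × 4 = 12288 bytes, so checking that len(job.Embedding) equals 12288 is a quick sanity test.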

4. Vectorizing Queries

Next, vectorize the user's input, that is, the query.

For testing, I wrote the query in the form of a question and set the task type to QUESTION_ANSWERING, so the query can be written as natural prose. The code looks like this:

// in main.go
// ...
query := "As an expert developer handling servers, I have experience in RDB, MongoDB, ElasticSearch, etc. Additionally, I have experience building servers using cloud services like AWS and GCP. Find suitable job postings for me, please."
embedding, err := embedder.Embedding(ctx, "QUESTION_ANSWERING", 3072, query)
if err != nil {
    log.Fatalf("failed to embed query: %v", err)
}

queryVector, err := sqlite_vec.SerializeFloat32(embedding)
if err != nil {
    log.Fatal(errors.Wrap(err, "failed to serialize query vector"))
}

5. Searching

Let’s run the search now. Use sqlite-vec's built-in vec_distance_cosine function to compute the cosine distance between each stored embedding and the query vector. Sorting by distance ASC puts the most relevant (closest) documents first; sorting DESC puts the least relevant (farthest) documents first.
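For intuition about the numbers that come back, here is a plain Go sketch of cosine distance, defined as 1 minus cosine similarity. It is not the sqlite-vec implementation; cosineDistance is an illustrative function, and the error handling for empty or zero vectors is my own choice.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineDistance returns 1 - cos(angle between a and b).
// 0 means same direction, 1 means orthogonal, 2 means opposite direction.
func cosineDistance(a, b []float32) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cosine distance is undefined for a zero vector")
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB)), nil
}

func main() {
	same, _ := cosineDistance([]float32{1, 0}, []float32{2, 0})
	orth, _ := cosineDistance([]float32{1, 0}, []float32{0, 3})
	opp, _ := cosineDistance([]float32{1, 0}, []float32{-1, 0})
	fmt.Println(same, orth, opp) // 0 1 2
}
```

Because only the angle matters, scaling a vector does not change its distance, which is why the vectors {1, 0} and {2, 0} are at distance 0.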

Add a Search method to the JobStore.

// in store/job.go
type JobStore interface {
    // ...
    Search(ctx context.Context, query []byte) (Jobs, error)
}

// ...

func (j jobStore) Search(ctx context.Context, query []byte) (Jobs, error) {
    rows, err := j.db.QueryContext(ctx, `
            SELECT 
                j.id, 
                j.title,
                j.company,
                j.content, 
                j.embedding,
                vec_distance_cosine(j.embedding, ?) as distance
            FROM job j
            ORDER BY distance ASC
            LIMIT 10;
        `, query)
    if err != nil {
        return nil, errors.Wrap(err, "failed to search jobs")
    }
    defer rows.Close()
    
    var jobs Jobs
    for rows.Next() {
        var job Job
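        if err = rows.Scan(&job.ID, &job.Title, &job.Company, &job.Content, &job.Embedding, &job.Distance); err != nil {
            return nil, errors.Wrap(err, "failed to scan job")
        }
        jobs = append(jobs, job)
    }
    // Surface any error that ended the iteration early.
    if err := rows.Err(); err != nil {
        return nil, errors.Wrap(err, "failed to iterate jobs")
    }
    return jobs, nil
}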

You can then perform searches as follows:

// in main.go
// ...
    // queryVector is the serialized query embedding created in step 4;
    // declaring it again with := in the same scope would not compile.

    jobs, err = jobStore.Search(ctx, queryVector)
    if err != nil {
        log.Fatal(err)
    }

    for i, job := range jobs {
        fmt.Println(fmt.Sprintf("%d.", i),
            "ID:", job.ID,
            "Company:", job.Company,
            "Title:", job.Title,
            "Length of Content:", len(job.Content),
            "Distance:", job.Distance,
        )
    }

// ...

With this, the Retrieval step, the R in Retrieval-Augmented Generation, is implemented.

Results

0. ID:  d31fa2dd-5923-4381-84ac-520b92f40713 Company:  Baemin Title:  [Tech] Server Developer in Robot Delivery Platform Team Length of Content:  3873 Distance:  0.23126929998397827
1. ID:  500d06b1-82b4-413e-b37c-29890243dccc Company:  Banksalad Title:  Server Engineer Length of Content:  6501 Distance:  0.24647511541843414
2. ID:  6653e247-228c-420b-b902-761880d89978 Company:  Naver Title:  [NAVER] Advertisement Platform Developer (Experienced) Length of Content:  3673 Distance:  0.2595069408416748
3. ID:  b2ab0026-b31c-476a-93b2-9ee03eddab2c Company:  Baemin Title:  [Tech] CS Product Backend Developer Length of Content:  4322 Distance:  0.26557084918022156
4. ID:  025a2ba9-7a35-47b7-8a62-a3e7b989598b Company:  Kakao Title:  AI Model Platform Development Length of Content:  3246 Distance:  0.2677247226238251
5. ID:  9defd036-6d00-48c1-9d43-18fb9f887a2e Company:  Kakao Title:  DKOS (Kubernetes as a Service) Developer (Experienced) Length of Content:  3819 Distance:  0.2718163728713989
6. ID:  fc5057e1-f658-4605-a327-fe0b56c09b26 Company:  Naver Title:  [NAVER] Container Technology Developer (Experienced) Length of Content:  5755 Distance:  0.272807776927948
7. ID:  2d247818-ef6d-463e-801a-fb281a1255dc Company:  Ohouse Title:  [Focused Recruitment] Senior Software Engineer, Backend Length of Content:  6254 Distance:  0.2729299068450928
8. ID:  fde0e8fd-0cbb-445c-b2c7-c9fc58f832fd Company:  Kakao Title:  [Community] Kakao Games Service Developer (FE/BE) Length of Content:  2198 Distance:  0.2742129862308502
9. ID:  af3e0e63-ded6-4fe9-bcf8-57bdce4735b9 Company:  Dunamu Title:  Cloud Security Officer Length of Content:  3066 Distance:  0.2753821313381195

Impressively, the results are job postings well suited to someone asking about server development.

You might suspect that only relevant data was inserted, but changing the sort order to DESC, with Distance still logged, gives results like the following:

0. ID:  7cb7e73e-f91d-4242-a035-21730cc4eb90 Company:  Line Title:  Customer Care AI Project Associate Manager Length of Content:  1570 Distance:  0.38529109954833984
1. ID:  27eaab37-2048-4440-86ed-0430d1f83748 Company:  Line Title:  Japanese Translation Specialist (Open Recruitment) Length of Content:  2174 Distance:  0.36668115854263306
2. ID:  dc46b690-7757-4eb7-beed-72a2dfce42ce Company:  Line Title:  Android Developer Length of Content:  1572 Distance:  0.3642077147960663
3. ID:  837e64ac-0772-49d5-ac11-a0b0d53df925 Company:  Line Title:  HR Operation (Special Recruitment for Veterans) NEW Length of Content:  4220 Distance:  0.36187198758125305
4. ID:  6d44cb36-9fce-4276-8626-95931403c6c0 Company:  Gangnam Unnie Title:  Talent Experience Manager Length of Content:  4997 Distance:  0.3570791482925415
5. ID:  a1730346-2ea0-43d6-bb16-5980529ebb10 Company:  Dunamu Title:  Global Market Research Intern Length of Content:  3607 Distance:  0.3563074767589569
6. ID:  2eabc4d1-21fc-4aff-a823-1bd2f7ce0a27 Company:  Kurly Title:  [Kurly] HMR MD Length of Content:  5009 Distance:  0.3562372922897339
7. ID:  e1eec84a-c0a2-4d1f-890c-3384790383b7 Company:  Danggeun Title:  Accounting Manager - Finance Length of Content:  2506 Distance:  0.35504525899887085
8. ID:  ab758bb0-cb33-4ee6-b4aa-7f054258c26e Company:  Musinsa Title:  Business Analyst (Commerce BA Team) Length of Content:  5931 Distance:  0.35138028860092163
9. ID:  a23ed914-9946-42c6-a2dc-552efcaf150f Company:  Yanolja Title:  Digital Communications Manager Length of Content:  6341 Distance:  0.3509758412837982

As expected, positions unrelated to server development, such as a Japanese translation specialist, appear.

How about an irrelevant question?

query := "Why is the sky blue?"
1. ID:  4c33e6b4-1d85-4df7-b48f-0c31cd8e88b2 Company:  Yanolja Title:  MS PowerBI Engineer (1 year, parental leave replacement) Length of Content:  5337 Distance:  0.46892309188842773
2. ID:  1d1ebbb3-9c9e-41ba-a4fa-6255dda5adc7 Company:  Naver Title:  2025 Team Naver New Hire: Tech Length of Content:  97 Distance:  0.4691943824291229
3. ID:  c3e1a0d0-239c-4504-9529-d2904d4701d6 Company:  Yanolja Title:  [Yanolja Cloud Go Global] HR Manager Length of Content:  6413 Distance:  0.47171342372894287
4. ID:  193f8760-7bca-488f-a9a5-a0d2a0513880 Company:  Yanolja Title:  Global HR Planning Manager Length of Content:  6848 Distance:  0.4720369577407837
5. ID:  cf53fcb0-9978-440d-9529-51e082c8b77e Company:  Yanolja Title:  HR Officer (1 year, contract) Length of Content:  5033 Distance:  0.47699248790740967
6. ID:  6653e247-228c-420b-b902-761880d89978 Company:  Naver Title:  [NAVER] Advertisement Platform Developer (Experienced) Length of Content:  3673 Distance:  0.4846186637878418
7. ID:  1970e9fc-654d-47e8-a4eb-5b3fdec1ac98 Company:  Gangnam Unnie Title:  Search Product Owner Length of Content:  4091 Distance:  0.48492535948753357
8. ID:  ef4376e5-ad58-4bb8-9503-9029b052f3bf Company:  Ohouse Title:  [Focused Recruitment] Senior Software Engineer, Frontend Length of Content:  6087 Distance:  0.48521751165390015
9. ID:  af3e0e63-ded6-4fe9-bcf8-57bdce4735b9 Company:  Dunamu Title:  Cloud Security Officer Length of Content:  3066 Distance:  0.4853784739971161

Results still come back: because the query always returns the top 10, even a completely unrelated query yields results.

However, note the difference in Distance values. For the server developer query, the most relevant posting had a distance of about 0.23, whereas for the unrelated query the closest posting was at about 0.47.

In production, test beforehand with a variety of queries, and apply a filter so that only results within a certain distance, e.g., 0.3 or less, are shown.
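Such a cutoff can be applied in Go after the search returns. The sketch below is a minimal illustration: scoredJob and filterByMaxDistance are hypothetical names, the distances are made-up sample values rather than real output, and 0.3 is simply the example threshold mentioned above, which would need tuning against your own data.

```go
package main

import "fmt"

// scoredJob is a trimmed-down stand-in for a search result with its distance.
type scoredJob struct {
	Title    string
	Distance float64
}

// filterByMaxDistance keeps only results whose cosine distance is at most
// maxDistance, preserving the original (ascending) order.
func filterByMaxDistance(jobs []scoredJob, maxDistance float64) []scoredJob {
	kept := make([]scoredJob, 0, len(jobs))
	for _, j := range jobs {
		if j.Distance <= maxDistance {
			kept = append(kept, j)
		}
	}
	return kept
}

func main() {
	// Illustrative values only.
	results := []scoredJob{
		{Title: "Server Engineer", Distance: 0.25},
		{Title: "Cloud Security Officer", Distance: 0.28},
		{Title: "HR Manager", Distance: 0.47},
	}
	for _, j := range filterByMaxDistance(results, 0.3) {
		fmt.Println(j.Title, j.Distance)
	}
}
```

The same cutoff could instead go into the SQL query as a WHERE condition on the distance expression; filtering in Go keeps the threshold easy to adjust while experimenting.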

Conclusion

Considering that converting a free-form query into a vector suitable for search was once considered impossible, this implementation represents impressive progress.

While much of the attention goes to AI, fields such as search and data processing keep evolving steadily as well.
