Using MongoDB $in with Go: Best Practices & Performance

Suppose you have a Go struct like this:

type Student struct {
 Name string `bson:"name"`
 Age int `bson:"age"`
}

Say there are many student names in a slice and you want to fetch the corresponding students. For example:

var names = []string{"lily", "bob", "tom"}

We want to end up with a slice of the matching students:

var students []Student

Avoid querying them one by one in a loop:

// Don't do this: every iteration is a separate round trip to the server,
// so total latency grows to tens of ms or more and is hard to predict.
for _, name := range names {
 searchOne("name", name)
}

Use the MongoDB $in operator to fetch all matching documents in a single query.

The $in operator selects documents where the value of a field equals any value in the specified array. To specify an $in expression, use the following prototype:

{ field: { $in: [<value1>, <value2>, ... <valueN> ] } }
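As a rough illustration of how that prototype maps onto Go values, the sketch below builds the same shape with plain maps and prints it as JSON. The helper names inFilter and filterJSON are hypothetical and only used here. Because the driver's bson.M type is defined as map[string]interface{}, the same structure can be written as bson.M{field: bson.M{"$in": values}}.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// inFilter builds the document {field: {"$in": values}} using plain maps.
// Hypothetical helper: with the driver you would normally write
// bson.M{field: bson.M{"$in": values}} directly.
func inFilter(field string, values []string) map[string]any {
    return map[string]any{field: map[string]any{"$in": values}}
}

// filterJSON renders the filter as JSON, purely for inspection.
func filterJSON(field string, values []string) string {
    b, err := json.Marshal(inFilter(field, values))
    if err != nil {
        return ""
    }
    return string(b)
}

func main() {
    // Prints: {"name":{"$in":["lily","bob","tom"]}}
    fmt.Println(filterJSON("name", []string{"lily", "bob", "tom"}))
}
```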

Below is a more idiomatic and robust Go example using the official mongo-go-driver.

Idiomatic Go example

This example demonstrates:

passing a context.Context (with timeout)
using *mongo.Collection
unmarshalling into a typed slice

package main

import (
    "context"
    "fmt"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

type Student struct {
    Name string `bson:"name"`
    Age  int    `bson:"age"`
}

// GetStudentsInValues returns students where key is in the provided values slice.
func GetStudentsInValues(ctx context.Context, coll *mongo.Collection, key string, values []string) ([]Student, error) {
    if len(values) == 0 {
        return nil, nil
    }

    filter := bson.M{key: bson.M{"$in": values}}
    opts := options.Find()

    cur, err := coll.Find(ctx, filter, opts)
    if err != nil {
        return nil, err
    }
    defer cur.Close(ctx)

    var students []Student
    if err := cur.All(ctx, &students); err != nil {
        return nil, err
    }
    return students, nil
}

func main() {
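    // Placeholder: in real code, coll must come from a connected *mongo.Client
    // (client.Database(...).Collection(...)); calling Find on a nil collection panics.
    var coll *mongo.Collection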
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    names := []string{"lily", "bob", "tom"}
    students, err := GetStudentsInValues(ctx, coll, "name", names)
    if err != nil {
        fmt.Println("query error:", err)
        return
    }
    fmt.Printf("found %d students\n", len(students))
}

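A shorter variant of the same function, assuming a package-level db.Coll collection handle and using context.TODO() instead of a caller-supplied context:

// GetStudentsInValues finds students whose key field matches any of values.
func GetStudentsInValues(key string, values []string) (students []Student, err error) {
 filter := bson.M{key: bson.M{"$in": values}}
 cursor, err := db.Coll.Find(context.TODO(), filter)
 if err != nil {
  return nil, err
 }
 // All decodes every result into students and closes the cursor.
 err = cursor.All(context.TODO(), &students)
 if err != nil {
  return nil, err
 }
 return students, nil
}

func main() {
 names := []string{"lily", "bob", "tom"}

 students, err := GetStudentsInValues("name", names)
 if err != nil {
  fmt.Println("query error:", err)
  return
 }
 fmt.Printf("found %d students\n", len(students))
}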

In our (simple) measurements, fetching ~30 matching documents with a single $in query completed in ~20 ms, while 30 separate queries in a loop took ~120 ms. Results vary by network, server, and indexing.

Why prefer $in over looping queries?

Single network round-trip: one query retrieves all matches instead of many small requests.
Simpler server-side optimization: MongoDB can use indexes and internal optimizations for set membership.

Indexing and performance

$in benefits from indexes: if the queried field is indexed, MongoDB will use that index to match values from the array. However, large arrays or arrays of low-selectivity values can still be costly.

Tips:

Ensure the field you’re searching (e.g., name) is indexed for larger datasets.
Avoid sending very large arrays in one $in (chunk the input into batches, e.g. 500–1000 values) to avoid request size limits and high memory use on the server.
Use explain() in the Mongo shell or the explain command to inspect the query plan and confirm index usage.
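The batching tip above can be sketched as a small generic helper. This is a minimal sketch rather than a definitive implementation: the name chunk and the batch size are assumptions, and the right size depends on your values and server.

```go
package main

import "fmt"

// chunk splits items into consecutive batches of at most size elements.
// It returns nil when size is not positive or items is empty.
// The batches share the backing array of items; they are not copies.
func chunk[T any](items []T, size int) [][]T {
    if size <= 0 || len(items) == 0 {
        return nil
    }
    batches := make([][]T, 0, (len(items)+size-1)/size)
    for start := 0; start < len(items); start += size {
        end := start + size
        if end > len(items) {
            end = len(items)
        }
        batches = append(batches, items[start:end])
    }
    return batches
}

func main() {
    names := []string{"lily", "bob", "tom", "ann", "joe"}
    for i, batch := range chunk(names, 2) {
        // Each batch would be sent as one $in query and the results appended.
        fmt.Println(i, batch)
    }
}
```

With a helper like this, each batch is passed to one $in query and the decoded results are appended to a single slice.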

Using non-string values (ObjectIDs, ints)

If you have ObjectID strings, convert them to primitive.ObjectID values and use them in $in:

// example: convert hex id strings to ObjectIDs
import "go.mongodb.org/mongo-driver/bson/primitive"

var hexIDs = []string{"5f3a...", "5f3b..."}
var ids []primitive.ObjectID
for _, h := range hexIDs {
    id, err := primitive.ObjectIDFromHex(h)
    if err != nil {
        // handle parse error
        continue
    }
    ids = append(ids, id)
}
filter := bson.M{"_id": bson.M{"$in": ids}}
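The loop above silently skips IDs that fail to parse. If you would rather report them to the caller, one option is to separate valid from invalid strings before converting. The sketch below uses only the standard library and relies on an ObjectID being 12 bytes, that is, 24 hex characters. The function name splitHexIDs and the sample IDs are made up for illustration.

```go
package main

import (
    "encoding/hex"
    "fmt"
)

// splitHexIDs separates strings that look like ObjectID hex (24 hex
// characters, decoding to 12 bytes) from those that do not.
// Hypothetical helper; the actual conversion is still done by the driver.
func splitHexIDs(hexIDs []string) (valid, invalid []string) {
    for _, h := range hexIDs {
        b, err := hex.DecodeString(h)
        if err != nil || len(b) != 12 {
            invalid = append(invalid, h)
            continue
        }
        valid = append(valid, h)
    }
    return valid, invalid
}

func main() {
    // Sample values invented for this example.
    ids := []string{"5f3a1b2c3d4e5f6a7b8c9d0e", "not-an-id", "5f3a"}
    valid, invalid := splitHexIDs(ids)
    fmt.Println("valid:", valid)
    fmt.Println("invalid:", invalid)
}
```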

Benchmarking: looped FindOne vs $in

Here’s a small benchmark you can use (put in a _test.go file) to compare single-query $in vs multiple FindOne calls. Run it with go test -bench . -benchmem.

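// ctx, coll, and names are assumed to be package-level variables that are
// initialized (connection opened, test data seeded) before the benchmarks run.

func BenchmarkFindIn(b *testing.B) {
    for i := 0; i < b.N; i++ {
        if _, err := GetStudentsInValues(ctx, coll, "name", names); err != nil {
            b.Fatal(err)
        }
    }
}

func BenchmarkFindOneLoop(b *testing.B) {
    for i := 0; i < b.N; i++ {
        for _, n := range names {
            var s Student
            // Decode returns an error if no document matches the name.
            if err := coll.FindOne(ctx, bson.M{"name": n}).Decode(&s); err != nil {
                b.Fatal(err)
            }
        }
    }
}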

Caveats & best practices

Sanitize and validate input before building query arrays (avoid untrusted/unlimited inputs).
Watch for the maximum BSON document size (16 MB) and request/response limits: chunk when necessary.
For very large sets consider alternative designs (temporary collection + aggregation, $lookup, or server-side pagination).
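The first two caveats can be combined into a small pre-processing step. The sketch below trims whitespace, drops empty strings and duplicates, and rejects inputs that exceed a caller-chosen limit. The names prepareValues and errTooManyValues, and the limit itself, are assumptions for illustration; what counts as valid input depends on your application.

```go
package main

import (
    "errors"
    "fmt"
    "strings"
)

// errTooManyValues is returned when the cleaned input exceeds the limit.
var errTooManyValues = errors.New("too many values for a single $in query")

// prepareValues trims each value, drops empty strings and duplicates
// (keeping first occurrences in order), and enforces an upper bound.
func prepareValues(raw []string, max int) ([]string, error) {
    seen := make(map[string]struct{}, len(raw))
    out := make([]string, 0, len(raw))
    for _, v := range raw {
        v = strings.TrimSpace(v)
        if v == "" {
            continue
        }
        if _, dup := seen[v]; dup {
            continue
        }
        seen[v] = struct{}{}
        out = append(out, v)
    }
    if len(out) > max {
        return nil, errTooManyValues
    }
    return out, nil
}

func main() {
    values, err := prepareValues([]string{" lily", "bob", "", "lily"}, 1000)
    if err != nil {
        fmt.Println("rejected:", err)
        return
    }
    fmt.Println(values)
}
```

Returning an error instead of truncating keeps the decision about oversized input with the caller, who can choose to batch or to reject the request.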

Conclusion

Prefer a single $in query over issuing many individual queries in a loop. Use correct indexing and moderate array sizes for best performance.
