
Using MongoDB $in with Go: Best Practices & Performance

Efficiently query multiple values in MongoDB from Go without per-item loops

If you have a Go struct like this:

type Student struct {
 Name string `bson:"name"`
 Age int `bson:"age"`
}

Say there are many student names in a slice and you want to fetch the corresponding students. For example:

var names = []string{"lily", "bob", "tom"}

We want to obtain a slice of the matching students:

var students []Student

Avoid querying them one by one in a loop:

// don't do this: one query per name means one network round-trip
// per name; total latency is typically tens of milliseconds
// and unpredictable
for _, name := range names {
    searchOne("name", name) // searchOne: a single-document lookup helper
}

Use the MongoDB $in operator to fetch all matching documents in a single query.

The $in operator selects documents where the value of a field equals any value in the specified array. To specify an $in expression, use the following prototype:

{ field: { $in: [<value1>, <value2>, ... <valueN> ] } }
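For the student example above, the filter matching any of the three names would be:

{ name: { $in: ["lily", "bob", "tom"] } }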

Below is a more idiomatic and robust Go example using the official mongo-go-driver.

Idiomatic Go example

This example demonstrates:

  • passing a context.Context (with timeout)
  • using *mongo.Collection
  • unmarshalling into a typed slice

package main

import (
    "context"
    "fmt"
    "time"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

type Student struct {
    Name string `bson:"name"`
    Age  int    `bson:"age"`
}

// GetStudentsInValues returns students where key is in the provided values slice.
func GetStudentsInValues(ctx context.Context, coll *mongo.Collection, key string, values []string) ([]Student, error) {
    if len(values) == 0 {
        return nil, nil
    }

    filter := bson.M{key: bson.M{"$in": values}}
    opts := options.Find()

    cur, err := coll.Find(ctx, filter, opts)
    if err != nil {
        return nil, err
    }
    defer cur.Close(ctx)

    var students []Student
    if err := cur.All(ctx, &students); err != nil {
        return nil, err
    }
    return students, nil
}

func main() {
    // assume 'coll' is *mongo.Collection already connected
    var coll *mongo.Collection
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    names := []string{"lily", "bob", "tom"}
    students, err := GetStudentsInValues(ctx, coll, "name", names)
    if err != nil {
        fmt.Println("query error:", err)
        return
    }
    fmt.Printf("found %d students\n", len(students))
}

In our (simple) measurements, fetching an array of ~30 matches using $in completed in ~20ms while individual requests in a loop (30 separate queries) took ~120ms; results vary by network, server, and indexing.

Why prefer $in over looping queries?

  • Single network round-trip: one query retrieves all matches instead of many small requests.
  • Simpler server-side optimization: MongoDB can use indexes and internal optimizations for set membership.

Indexing and performance

$in benefits from indexes: if the queried field is indexed, MongoDB will use that index to match values from the array. However, large arrays or arrays of low-selectivity values can still be costly.

Tips:

  • Ensure the field you’re searching (e.g., name) is indexed for larger datasets.
  • Avoid sending very large arrays in one $in (chunk the input into batches, e.g. 500-1000 values) to avoid request size limits and high memory use on the server.
  • Use explain() in the Mongo shell or the explain command to inspect the query plan and confirm index usage.
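In mongosh, creating the index and inspecting the plan might look like this (the students collection name is illustrative):

db.students.createIndex({ name: 1 })
db.students.find({ name: { $in: ["lily", "bob", "tom"] } }).explain("executionStats")

In the explain output, an IXSCAN stage indicates the index was used; COLLSCAN means a full collection scan.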

Using non-string values (ObjectIDs, ints)

If you have ObjectID strings, convert them to primitive.ObjectID values and use them in $in:

// example: convert hex id strings to ObjectIDs
import "go.mongodb.org/mongo-driver/bson/primitive"

var hexIDs = []string{"5f3a...", "5f3b..."}
var ids []primitive.ObjectID
for _, h := range hexIDs {
    id, err := primitive.ObjectIDFromHex(h)
    if err != nil {
        // handle parse error
        continue
    }
    ids = append(ids, id)
}
filter := bson.M{"_id": bson.M{"$in": ids}}

Benchmarking: looped FindOne vs $in

Here’s a small benchmark you can use (put in a _test.go file) to compare single-query $in vs multiple FindOne calls. Run it with go test -bench . -benchmem.

// Assumes ctx, coll, and names are initialized in package-level
// setup (e.g. TestMain) against a running MongoDB instance.
func BenchmarkFindIn(b *testing.B) {
    for i := 0; i < b.N; i++ {
        if _, err := GetStudentsInValues(ctx, coll, "name", names); err != nil {
            b.Fatal(err)
        }
    }
}

func BenchmarkFindOneLoop(b *testing.B) {
    for i := 0; i < b.N; i++ {
        for _, n := range names {
            var s Student
            if err := coll.FindOne(ctx, bson.M{"name": n}).Decode(&s); err != nil {
                b.Fatal(err)
            }
        }
    }
}

Caveats & best practices

  • Sanitize and validate input before building query arrays (avoid untrusted/unlimited inputs).
  • Watch for the maximum BSON document size (16 MB) and request/response limits: chunk when necessary.
  • For very large sets consider alternative designs (temporary collection + aggregation, $lookup, or server-side pagination).
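The batching advice above can be sketched with a small helper that splits the input into fixed-size chunks; each chunk would then be sent as its own $in query. This is a minimal sketch: the chunk size of 500 is an illustrative choice, not a driver requirement.

```go
package main

import "fmt"

// chunk splits values into batches of at most size elements each.
func chunk(values []string, size int) [][]string {
	var batches [][]string
	for size > 0 && len(values) > 0 {
		n := size
		if len(values) < n {
			n = len(values)
		}
		batches = append(batches, values[:n])
		values = values[n:]
	}
	return batches
}

func main() {
	ids := make([]string, 1234)
	for i := range ids {
		ids[i] = fmt.Sprintf("id-%d", i)
	}
	batches := chunk(ids, 500)
	fmt.Println(len(batches)) // 3 batches: 500 + 500 + 234
	// each batch would then be queried separately, e.g.:
	// coll.Find(ctx, bson.M{"_id": bson.M{"$in": batch}})
}
```

Issuing one query per batch keeps each request comfortably under the server's size limits while still avoiding a round-trip per individual value.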

Conclusion

Prefer a single $in query over issuing many individual queries in a loop. Use correct indexing and moderate array sizes for best performance.
