If you have a Go struct like this:
type Student struct {
	Name string `bson:"name"`
	Age  int    `bson:"age"`
}
Say there are many student names in a slice and you want to fetch the corresponding students. For example:
var names = []string{"lily", "bob", "tom"}
We want to fetch the matching documents into a slice of students:
var students []Student
Avoid querying them one by one in a loop:
// Don't do this: every iteration is a separate query, so the total
// time is more than tens of ms and is also unpredictable.
for _, name := range names {
	searchOne("name", name)
}
Use the MongoDB $in operator to fetch all matching documents in a single query.
The $in operator selects documents where the value of a field equals any value in the specified array. To specify an $in expression, use the following prototype:
{ field: { $in: [<value1>, <value2>, ... <valueN> ] } }
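As a rough illustration of how that prototype maps onto Go, the sketch below builds the same nested shape with plain maps and prints it as JSON. The helper name buildInFilter is hypothetical, and plain map[string]interface{} values stand in for the driver's bson.M so that the snippet runs without a database or the driver installed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildInFilter returns a filter of the form {field: {"$in": values}}.
// Hypothetical helper for illustration: plain maps are used here as a
// stand-in for bson.M, so no MongoDB driver is needed to run it.
func buildInFilter(field string, values []string) map[string]interface{} {
	return map[string]interface{}{
		field: map[string]interface{}{"$in": values},
	}
}

func main() {
	filter := buildInFilter("name", []string{"lily", "bob", "tom"})

	// JSON is used only to show the shape of the filter document.
	out, err := json.Marshal(filter)
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out)) // prints {"name":{"$in":["lily","bob","tom"]}}
}
```

With the real driver, the equivalent filter would be written with bson.M in place of the plain maps.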
Below is a more idiomatic and robust Go example using the official mongo-go-driver.
Idiomatic Go example
This example demonstrates:
- passing a context.Context (with timeout)
- using *mongo.Collection
- unmarshalling into a typed slice
package main

import (
	"context"
	"fmt"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

type Student struct {
	Name string `bson:"name"`
	Age  int    `bson:"age"`
}

// GetStudentsInValues returns students where key is in the provided values slice.
func GetStudentsInValues(ctx context.Context, coll *mongo.Collection, key string, values []string) ([]Student, error) {
	// Nothing to match: skip the round trip entirely.
	if len(values) == 0 {
		return nil, nil
	}

	filter := bson.M{key: bson.M{"$in": values}}
	opts := options.Find()

	cur, err := coll.Find(ctx, filter, opts)
	if err != nil {
		return nil, err
	}
	defer cur.Close(ctx)

	// Decode every matching document into the typed slice.
	var students []Student
	if err := cur.All(ctx, &students); err != nil {
		return nil, err
	}
	return students, nil
}

func main() {
	// coll is only a placeholder here: obtain it from a connected
	// *mongo.Client before running this code, since a nil collection
	// cannot be queried.
	var coll *mongo.Collection

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	names := []string{"lily", "bob", "tom"}
	students, err := GetStudentsInValues(ctx, coll, "name", names)
	if err != nil {
		fmt.Println("query error:", err)
		return
	}
	fmt.Printf("found %d students\n", len(students))
}
// Shorter variant of the $in query that uses the package-level db.Coll collection.
func GetStudentsInValues(key string, values []string) (students []Student, err error) {
	filter := bson.M{key: bson.M{"$in": values}}
	cursor, err := db.Coll.Find(context.TODO(), filter)
	if err != nil {
		return nil, err
	}
	if err = cursor.All(context.TODO(), &students); err != nil {
		return nil, err
	}
	return students, nil
}

func main() {
	names := []string{"lily", "bob", "tom"}
	students, err := GetStudentsInValues("name", names)
	if err != nil {
		fmt.Println("query error:", err)
		return
	}
	fmt.Printf("found %d students\n", len(students))
}
In our (simple) measurements, fetching an array of ~30 matches using $in completed in ~20ms while individual requests in a loop (30 separate queries) took ~120ms, but results vary by network, server, and indexing.
Why prefer $in over looping queries?
- Single network round-trip: one query retrieves all matches instead of many small requests.
- Simpler server-side optimization: MongoDB can use indexes and internal optimizations for set membership.
Indexing and performance
$in benefits from indexes: if the queried field is indexed, MongoDB will use that index to match values from the array. However, large arrays or arrays of low-selectivity values can still be costly.
Tips:
- Ensure the field you’re searching (e.g., name) is indexed for larger datasets.
- Avoid sending very large arrays in one $in (chunk the input into batches, e.g. 500 to 1000 values) to avoid request size limits and high memory use on the server.
- Use explain() in the Mongo shell or the explain command to inspect the query plan and confirm index usage.
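The chunking tip can be sketched as follows. The helper name chunkStrings and the batch size used in main are illustrative assumptions, not part of the MongoDB driver; the point is only that each batch becomes one $in query and the results are appended together.

```go
package main

import "fmt"

// chunkStrings splits values into consecutive batches of at most size
// elements. Hypothetical helper for illustration; it returns nil when
// size is not positive or when there are no values.
func chunkStrings(values []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var batches [][]string
	for start := 0; start < len(values); start += size {
		end := start + size
		if end > len(values) {
			end = len(values)
		}
		batches = append(batches, values[start:end])
	}
	return batches
}

func main() {
	names := []string{"lily", "bob", "tom", "amy", "joe"}

	// A batch size of 2 is used only to keep the output short; the text
	// above suggests something like 500 to 1000 values per query.
	for i, batch := range chunkStrings(names, 2) {
		// In real code each batch would be passed to one $in query
		// and the returned students appended to a combined slice.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```

The batches share memory with the input slice, which is fine for read-only use such as building query filters.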
Using non-string values (ObjectIDs, ints)
If you have ObjectID strings, convert them to primitive.ObjectID values and use them in $in:
// example: convert hex id strings to ObjectIDs
import "go.mongodb.org/mongo-driver/bson/primitive"

var hexIDs = []string{"5f3a...", "5f3b..."}
var ids []primitive.ObjectID
for _, h := range hexIDs {
	id, err := primitive.ObjectIDFromHex(h)
	if err != nil {
		// handle parse error
		continue
	}
	ids = append(ids, id)
}
filter := bson.M{"_id": bson.M{"$in": ids}}
Benchmarking: looped FindOne vs $in
Here’s a small benchmark you can use (put in a _test.go file) to compare single-query $in vs multiple FindOne calls. Run it with go test -bench . -benchmem.
func BenchmarkFindIn(b *testing.B) {
	// ctx, coll, and names are assumed to be prepared in the test setup
	for i := 0; i < b.N; i++ {
		_, _ = GetStudentsInValues(ctx, coll, "name", names)
	}
}

func BenchmarkFindOneLoop(b *testing.B) {
	// same ctx, coll, and names as above: one FindOne per name
	for i := 0; i < b.N; i++ {
		for _, n := range names {
			var s Student
			_ = coll.FindOne(ctx, bson.M{"name": n}).Decode(&s)
		}
	}
}
Caveats & best practices
- Sanitize and validate input before building query arrays (avoid untrusted/unlimited inputs).
- Watch for the maximum BSON document size (16 MB) and request/response limits: chunk when necessary.
- For very large sets consider alternative designs (temporary collection + aggregation, $lookup, or server-side pagination).
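One possible way to apply the first caveat is to normalize the input before it reaches the query. The sketch below trims whitespace, drops empty and duplicate names, and rejects input above a cap. The names cleanNames, maxInValues, and errTooManyValues, as well as the cap of 1000, are assumptions chosen for illustration rather than limits imposed by MongoDB.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// maxInValues is an assumed application-level cap on how many values
// are allowed in a single $in query; tune it for your workload.
const maxInValues = 1000

// errTooManyValues is returned when the cleaned input exceeds the cap.
var errTooManyValues = errors.New("too many values for a single $in query")

// cleanNames trims whitespace, drops empty strings and duplicates
// (keeping first-seen order), and enforces the cap. Hypothetical helper.
func cleanNames(raw []string) ([]string, error) {
	seen := make(map[string]struct{}, len(raw))
	cleaned := make([]string, 0, len(raw))
	for _, r := range raw {
		name := strings.TrimSpace(r)
		if name == "" {
			continue
		}
		if _, dup := seen[name]; dup {
			continue
		}
		seen[name] = struct{}{}
		cleaned = append(cleaned, name)
	}
	if len(cleaned) > maxInValues {
		return nil, errTooManyValues
	}
	return cleaned, nil
}

func main() {
	names, err := cleanNames([]string{" lily", "bob", "", "lily", "bob "})
	if err != nil {
		fmt.Println("invalid input:", err)
		return
	}
	fmt.Println(names) // prints [lily bob]
}
```

Instead of rejecting oversized input outright, a caller could also split the cleaned slice into batches and issue one query per batch.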
Conclusion
Prefer a single $in query over issuing many individual queries in a loop. Use correct indexing and moderate array sizes for best performance.
References
- https://www.mongodb.com/docs/manual/reference/operator/query/in/
- https://www.mongodb.com/docs/manual/core/indexes/
- https://pkg.go.dev/go.mongodb.org/mongo-driver/mongo
Conclusion
Don’t send queries to the database one by one in a for loop; use a single bulk search instead.