Skip to main content
โšก Calmops

Python Performance: list.index() and Faster Lookup Alternatives

Introduction

list.index() is one of Python’s most commonly misused methods for performance-sensitive code. It performs a linear search โ€” O(n) โ€” which is fine for small lists but becomes a bottleneck when called repeatedly on large collections. This article explains why, benchmarks the difference, and shows faster alternatives.

How list.index() Works

list.index(value) scans the list from left to right until it finds the first occurrence of value, then returns its position:

animals = ['cat', 'dog', 'rabbit', 'horse']
index = animals.index('dog')
print(index)  # => 1

Time complexity: O(n) โ€” in the worst case, it checks every element.

# What Python is doing internally:
def my_index(lst, value):
    for i, item in enumerate(lst):
        if item == value:
            return i
    raise ValueError(f"{value} is not in list")

The Performance Problem

The issue isn’t a single call โ€” it’s calling index() repeatedly, especially inside loops:

import time

# Large list
items = list(range(100_000))

# BAD: O(n) per lookup, O(nยฒ) total
start = time.time()
for target in range(1000):
    items.index(target)  # scans up to 100,000 elements each time
print(f"list.index: {time.time() - start:.3f}s")  # => ~0.5s

# GOOD: O(1) per lookup with dict
lookup = {v: i for i, v in enumerate(items)}

start = time.time()
for target in range(1000):
    lookup[target]  # instant hash lookup
print(f"dict lookup: {time.time() - start:.3f}s")  # => ~0.0001s

The dictionary approach is typically 1000-10000x faster for repeated lookups on large lists.

Faster Alternatives

1. Dictionary for Value โ†’ Index Mapping

When you need to look up indices frequently, build a reverse mapping once:

animals = ['cat', 'dog', 'rabbit', 'horse', 'fish']

# Build once: O(n)
animal_to_index = {animal: i for i, animal in enumerate(animals)}

# Lookup: O(1)
print(animal_to_index['dog'])     # => 1
print(animal_to_index['rabbit'])  # => 2

# Handle missing values
print(animal_to_index.get('lion', -1))  # => -1

2. Set for Membership Testing

If you only need to check whether a value exists (not its index), use a set:

animals = ['cat', 'dog', 'rabbit', 'horse']

# O(n) โ€” scans the list
'dog' in animals  # True

# O(1) โ€” hash lookup
animal_set = set(animals)
'dog' in animal_set  # True, much faster for large collections

3. bisect for Sorted Lists

For sorted lists, bisect provides O(log n) search:

import bisect

sorted_items = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Find insertion point (O(log n))
pos = bisect.bisect_left(sorted_items, 50)
print(pos)  # => 4

# Verify it's actually there
if pos < len(sorted_items) and sorted_items[pos] == 50:
    print(f"Found 50 at index {pos}")

4. NumPy for Numerical Arrays

For numerical data, NumPy’s np.where is vectorized and much faster:

import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Find index of value 30
indices = np.where(arr == 30)[0]
print(indices)  # => [2]

# Find all indices matching a condition
indices = np.where(arr > 25)[0]
print(indices)  # => [2, 3, 4]

# argmax / argmin
print(arr.argmax())  # => 4 (index of 50)
print(arr.argmin())  # => 0 (index of 10)

Benchmarking

import timeit
import random

# Setup
n = 10_000
data = list(range(n))
targets = [random.randint(0, n-1) for _ in range(1000)]

# list.index
def bench_list_index():
    for t in targets:
        data.index(t)

# dict lookup
lookup = {v: i for i, v in enumerate(data)}
def bench_dict():
    for t in targets:
        lookup[t]

# set membership
data_set = set(data)
def bench_set():
    for t in targets:
        t in data_set

list_time = timeit.timeit(bench_list_index, number=10)
dict_time = timeit.timeit(bench_dict, number=10)
set_time  = timeit.timeit(bench_set, number=10)

print(f"list.index: {list_time:.4f}s")
print(f"dict:       {dict_time:.4f}s  ({list_time/dict_time:.0f}x faster)")
print(f"set:        {set_time:.4f}s   ({list_time/set_time:.0f}x faster)")

Typical results for n=10,000, 1000 lookups:

list.index: 0.2500s
dict:       0.0001s  (2500x faster)
set:        0.0001s  (2500x faster)

When list.index() Is Fine

Don’t over-optimize. list.index() is perfectly acceptable when:

  • The list is small (< 100 elements)
  • You only call it once or a few times
  • You need the index of the first occurrence in an unsorted list
  • Simplicity matters more than performance
# Fine: small list, called once
colors = ['red', 'green', 'blue']
pos = colors.index('green')  # totally fine

# Fine: finding position in a short config list
VALID_MODES = ['fast', 'balanced', 'thorough']
mode_index = VALID_MODES.index(user_mode)

Choosing the Right Data Structure

Use Case Best Structure Lookup Time
Ordered sequence, rare lookup list O(n)
Frequent value โ†’ index lookup dict O(1)
Membership testing only set O(1)
Sorted data, range queries list + bisect O(log n)
Numerical arrays numpy.ndarray O(n) vectorized
Ordered + fast lookup dict (Python 3.7+ preserves order) O(1)

Practical Pattern: Build Once, Query Many Times

class FastLookupList:
    """List with O(1) index lookup via a backing dict."""

    def __init__(self, items):
        self._items = list(items)
        self._index = {v: i for i, v in enumerate(self._items)}

    def index(self, value):
        if value not in self._index:
            raise ValueError(f"{value} is not in list")
        return self._index[value]

    def __contains__(self, value):
        return value in self._index

    def __getitem__(self, i):
        return self._items[i]

    def __len__(self):
        return len(self._items)

# Usage
words = FastLookupList(['apple', 'banana', 'cherry', 'date'])
print(words.index('cherry'))  # => 2  (O(1))
print('banana' in words)      # => True  (O(1))

Resources

Comments