Mutation Testing: Verify Your Tests Actually Work

Created: February 23, 2026 Larry Qu 11 min read

Introduction

You can have 100% code coverage and still have weak tests. Mutation testing checks test quality by introducing small changes (mutations) to your code and running the tests against each changed version (a mutant). If the tests still pass, the mutant survives, which means no test is actually verifying that behavior.

This guide covers mutation testing with Stryker (JavaScript/TypeScript) and PIT (Java), including advanced configuration, CI integration, cost optimization strategies, and how to interpret mutation scores to improve test quality.

How Mutation Testing Works

┌─────────────────────────────────────────────────────────────────┐
│                  Mutation Testing Process                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                   │
│  Original Code:                                                  │
│  ┌───────────────────────────────────────────┐                  │
│  │  function add(a, b) {                     │                  │
│  │    return a + b;                          │                  │
│  │  }                                        │                  │
│  └───────────────────────────────────────────┘                  │
│                       │                                          │
│                       ▼                                          │
│  Mutation: Change + to -                                        │
│  ┌───────────────────────────────────────────┐                  │
│  │  function add(a, b) {                     │                  │
│  │    return a - b;    ← MUTANT               │                  │
│  │  }                                        │                  │
│  └───────────────────────────────────────────┘                  │
│                       │                                          │
│                       ▼                                          │
│  Run Tests:                                                      │
│  ┌───────────────────────────────────────────┐                  │
│  │  Expect: add(2, 2) = 4                    │                  │
│  │                                           │                  │
│  │  Mutant: 2 - 2 = 0 ≠ 4                   │                  │
│  │  Result: Test FAILED → Mutant KILLED ✓   │                  │
│  └───────────────────────────────────────────┘                  │
│                                                                   │
│  Mutation Score = Killed Mutants / Total Mutants × 100           │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘
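
The process in the diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Stryker or PIT work internally: the mutants are written by hand, and the names `mutationScore`, `weakSuite`, and `strongSuite` are hypothetical.

```typescript
type BinaryFn = (a: number, b: number) => number;

// The code under test.
const original: BinaryFn = (a, b) => a + b;

// Hand-written mutants: each one changes a single operator in `original`.
const mutants: Record<string, BinaryFn> = {
  "+ -> -": (a, b) => a - b,
  "+ -> *": (a, b) => a * b,
  "+ -> /": (a, b) => a / b,
};

// A suite returns true when every assertion passes for the given implementation.
type Suite = (impl: BinaryFn) => boolean;

const weakSuite: Suite = (add) => add(2, 2) === 4;
const strongSuite: Suite = (add) => add(2, 2) === 4 && add(3, 0) === 3;

// Mutation score = killed mutants / total mutants x 100.
function mutationScore(suite: Suite): number {
  if (!suite(original)) {
    throw new Error("The suite must pass on the original code first");
  }
  const names = Object.keys(mutants);
  // A mutant is killed when the suite fails against it.
  const killed = names.filter((name) => !suite(mutants[name])).length;
  return (killed * 100) / names.length;
}
```

With these inputs the weak suite lets the `+ -> *` mutant survive, because 2 * 2 is also 4. Adding a second assertion with different operands kills it.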

Key Terminology

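| Term | Definition | Example |
|------|------------|---------|
| Mutant | A modified version of the source code | a - b instead of a + b |
| Killed | At least one test fails against the mutant | Expected 4, got 0 |
| Survived | All tests pass despite the mutation | A test using add(0, 0) passes for both a + b and a - b |
| Equivalent mutant | Behaves identically to the original, so no test can kill it | if (x > 0) vs if (x >= 1) for integer x |
| Mutation score | Percentage of mutants killed | 45 of 50 killed = 90% |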

JavaScript/TypeScript: Stryker

Setup

npm install -D @stryker-mutator/core @stryker-mutator/vitest-runner @stryker-mutator/typescript-checker
npx stryker init

Configuration

// stryker.conf.json
{
  "$schema": "./node_modules/@stryker-mutator/core/schema/stryker-schema.json",
  "packageManager": "npm",
  "reporters": ["html", "clear-text", "dashboard", "progress"],
  "buildCommand": "npm run build",
  "testRunner": "vitest",
  "coverageAnalysis": "perTest",
  "concurrency": 4,
  "timeoutMS": 5000,
  "mutate": [
    "src/**/*.ts",
    "!src/**/*.spec.ts",
    "!src/**/*.test.ts",
    "!src/types/**"
  ],
  "thresholds": {
    "high": 80,
    "low": 70,
    "break": 60
  },
  "mutator": {
    "excludedMutations": ["StringLiteral"]
  },
  "checkers": ["typescript"],
  "tsconfigFile": "tsconfig.json"
}
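
To make the `thresholds` block concrete, here is a small sketch of how a score maps onto the three values. It follows the documented meaning of `high`, `low`, and `break` as I understand it; the function `evaluateScore` and the band names are hypothetical and are not part of Stryker's API.

```typescript
interface Thresholds {
  high: number;
  low: number;
  break: number | null; // null means the build is never failed on score
}

type Band = "high" | "warning" | "danger";

// Classify a mutation score (0-100) against the configured thresholds.
function evaluateScore(
  score: number,
  t: Thresholds,
): { band: Band; failsBuild: boolean } {
  const band: Band =
    score >= t.high ? "high" : score >= t.low ? "warning" : "danger";
  // Only `break` affects the exit code; `high` and `low` affect report coloring.
  const failsBuild = t.break !== null && score < t.break;
  return { band, failsBuild };
}

const configured: Thresholds = { high: 80, low: 70, break: 60 };
```

With the values from the configuration above, a score of 65 is reported in the danger band but does not fail the run, while 55 does.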

Running

# Run mutation testing
npx stryker run

# Illustrative output (the real report layout differs):
# [Mutation test] Finished in 45 seconds
# Mutant killed: 45/50 (90%)
# ┌───────────────────────────────────────────────┐
# │ File          │ Mutation score                │
# │ ──────────────│─────────────────────────────── │
# │ math.ts       │ 95%   ████████████████████    │
# │ string.ts     │ 85%   █████████████████░░░    │
# │ utils.ts      │ 75%   ███████████████░░░░░ ⚠️ │
# │ payments.ts   │ 60%   ████████████░░░░░░░ 🔴 │
# └───────────────────────────────────────────────┘

Advanced Stryker Configuration

{
  "mutate": ["src/**/*.ts", "!src/**/*.spec.ts"],
  "testRunner": "vitest",
  "vitest": {
    "configFile": "vitest.config.ts"
  },
  "coverageAnalysis": "perTest",
  "concurrency": 8,
  "timeoutMS": 10000,
  "inPlace": false,
  "cleanTempDir": true,

  "htmlReporter": {
    "fileName": "reports/mutation/mutation.html"
  },
  "dashboard": {
    "project": "github.com/myorg/myrepo",
    "version": "main",
    "module": "core",
    "baseUrl": "https://dashboard.stryker-mutator.io/api/reports"
  },

  "incremental": true,
  "incrementalFile": "reports/mutation/stryker-incremental.json",

  "plugins": [
    "@stryker-mutator/vitest-runner",
    "@stryker-mutator/typescript-checker"
  ],

  "tempDirName": ".stryker-tmp",
  "maxTestRunnerReuse": 8
}

Test Runner Configuration

// vitest.config.ts — Stryker-compatible Vitest config
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    globals: true,
    environment: 'node',
    include: ['src/**/*.spec.ts'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'html'],
      include: ['src/**/*.ts'],
      exclude: ['src/**/*.spec.ts'],
    },
    testTimeout: 10000,
    hookTimeout: 10000,
    pool: 'forks',
    poolOptions: {
      forks: {
        singleFork: true, // Runs all test files in one child process; check whether your Stryker and Vitest versions need this
      },
    },
  },
});

Java: PIT

Setup with Maven

<!-- pom.xml -->
<build>
  <plugins>
    <plugin>
      <groupId>org.pitest</groupId>
      <artifactId>pitest-maven</artifactId>
      <version>1.16.0</version>
      <configuration>
        <targetClasses>
          <param>com.example.*</param>
        </targetClasses>
        <targetTests>
          <param>com.example.*</param>
        </targetTests>
        <mutators>
          <mutator>ALL</mutator>
        </mutators>
        <threads>4</threads>
        <timeoutConstant>5000</timeoutConstant>
        <outputFormats>
          <param>HTML</param>
          <param>XML</param>
          <param>CSV</param>
        </outputFormats>
        <mutationThreshold>80</mutationThreshold>
        <coverageThreshold>85</coverageThreshold>
      </configuration>
    </plugin>
  </plugins>
</build>

Setup with Gradle

// build.gradle
plugins {
    id 'info.solidsoft.pitest' version '1.15.0'
}

pitest {
    targetClasses = ['com.example.*']
    targetTests = ['com.example.*']
    mutators = ['ALL']
    threads = 4
    outputFormats = ['HTML', 'XML']
    mutationThreshold = 80
    coverageThreshold = 85
    timestampedReports = false
}

Running

# Maven
mvn org.pitest:pitest-maven:mutationCoverage

# Gradle
gradle pitest

# Illustrative results (PIT's real console report is formatted differently):
# ================================================================================
# >> mutation coverage report
# ================================================================================
#
# Classes : 85% (34/40)
# Methods : 80% (120/150)
# Mutants : 75% (150/200)
#
# >> SURVIVING MUTATIONS:
# ================================================================================
# com.example.Utils.java:
#   Line 45: replaced + with -                           SURVIVED
#   Line 67: removed conditional - omitted <            SURVIVED
# com.example.PaymentService.java:
#   Line 123: replaced boolean return with false         SURVIVED

Mutation Operators

Operator Reference

arithmetic:
  - "+ → -"
  - "- → +"
  - "* → /"
  - "/ → *"
  - "% → *"

comparison:
  - "< → <="
  - "<= → <"
  - "> → >="
  - ">= → >"
  - "== → !="
  - "!= → =="

boolean:
  - "true → false"
  - "false → true"
  - "&& → ||"
  - "|| → &&"
  - "! → (removed)"

conditional:
  - "Remove if body"
  - "Remove else body"
  - "Negate condition"

string:
  - "Empty string → non-empty"
  - "Non-empty → empty"
  - ".length() → .length() + 1"

number:
  - "x → x + 1"
  - "x → 0"
  - "x → Integer.MAX_VALUE"
  - "x → -x"

void_method:
  - "Remove method call"
  - "Remove return value"

null:
  - "Return null instead of value"
  - "Non-null → null assertion"

collection:
  - "Return empty collection"
  - "Remove element from collection"
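
To show what "applying an operator" means mechanically, the sketch below generates one mutant per operator occurrence by plain text replacement. This is a deliberate simplification: real tools such as Stryker and PIT mutate a syntax tree or bytecode, and a text-level approach like this one would also corrupt arrow functions, strings, and comments. The names `SWAPS` and `generateMutants` are hypothetical.

```typescript
// A small subset of the operator swaps listed above.
const SWAPS: Record<string, string> = {
  "+": "-",
  "-": "+",
  "*": "/",
  "/": "*",
  "<": "<=",
  "<=": "<",
  ">": ">=",
  ">=": ">",
  "&&": "||",
  "||": "&&",
};

// Produce one mutated copy of `source` for each operator occurrence.
function generateMutants(source: string): string[] {
  // Two-character operators are listed first so ">=" is not read as ">".
  const operator = /<=|>=|&&|\|\||[+\-*\/<>]/g;
  const result: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = operator.exec(source)) !== null) {
    const op = match[0];
    const start = match.index;
    result.push(
      source.slice(0, start) + SWAPS[op] + source.slice(start + op.length),
    );
  }
  return result;
}
```

Each mutant contains exactly one change, which is what lets a failing test be attributed to a specific mutation.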

Code Examples of Mutations

// Original code
function isAdult(age: number): boolean {
  return age >= 18;
}

// Mutation 1: comparison boundary change
function isAdult(age: number): boolean {
  return age > 18;  // >= changed to >
}

// Mutation 2: boolean return flipped
function isAdult(age: number): boolean {
  return !(age >= 18);  // Negated return
}

// Mutation 3: constant change
function isAdult(age: number): boolean {
  return age >= 0;  // 18 changed to 0
}
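
The following sketch checks which of the three mutants above survive a given set of test cases. It is an illustration only: the mutants are copied by hand, and `ageMutants`, `survivors`, and the case lists are hypothetical names.

```typescript
type AgeCheck = (age: number) => boolean;

const isAdult: AgeCheck = (age) => age >= 18;

// The three mutations shown above, written out by hand.
const ageMutants: Record<string, AgeCheck> = {
  "boundary (>= to >)": (age) => age > 18,
  "negated return": (age) => !(age >= 18),
  "constant (18 to 0)": (age) => age >= 0,
};

// Each test case is an [input, expected result] pair.
type Case = [number, boolean];

const happyPathOnly: Case[] = [[30, true]];
const boundaryCases: Case[] = [
  [17, false],
  [18, true],
];

// A mutant survives when it satisfies every test case.
function survivors(cases: Case[]): string[] {
  return Object.keys(ageMutants).filter((name) =>
    cases.every(([age, expected]) => ageMutants[name](age) === expected),
  );
}
```

A single happy-path case (age 30) kills only the negated return. Testing 17 and 18, the values on either side of the boundary, kills all three.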

What Gets Mutated

| Operator | Code Before | Code After | Tests Must |
|----------|-------------|------------|------------|
| Arithmetic | total = price + tax | total = price - tax | Assert the correct total |
| Comparison | if (age >= 18) | if (age > 18) | Test the exact boundary value (18) |
| Boolean | return isValid | return !isValid | Test both true and false paths |
| Conditional | if (x) { doA() } | if (!x) { doA() } | Test both branches |
| String | name.length() | name.length() + 1 | Assert the exact length |
| Number | return 100 | return 0 | Assert the exact return value |
| Null | return user | return null | Assert the returned value is not null |
| Collection | items.add(item) | (removed) | Assert the item is added |

Interpreting Results

Mutation Score Guide

| Score | Rating | Meaning | Action |
|-------|--------|---------|--------|
| 90-100% | Excellent | Tests catch almost all code changes | Maintain |
| 80-89% | Good | Most bugs caught, minor gaps | Review surviving mutants |
| 70-79% | Warning | Notable gaps in test coverage | Add tests for survivors |
| 60-69% | Poor | Many bugs would slip through | Major test improvement needed |
| < 60% | Critical | Tests provide little value | Rewrite test suite |

Surviving Mutant Analysis

// Example: surviving mutant analysis

// Source: discount.ts
export function calculateDiscount(price: number, coupon: string): number {
  if (coupon === 'SAVE10') {
    return price * 0.1;  // Mutant: 0.1 → 0.2 survives!
  }
  return 0;
}

// Test that lets the mutant survive
describe('calculateDiscount', () => {
  it('returns 10% discount for SAVE10', () => {
    // This test passes even if discount is 20%
    const result = calculateDiscount(100, 'SAVE10');
    expect(result).toBeGreaterThan(0);  // Too vague!
  });
});

// Better test that kills the mutant
describe('calculateDiscount fixed', () => {
  it('returns exactly 10% for SAVE10 coupon', () => {
    const result = calculateDiscount(100, 'SAVE10');
    expect(result).toBe(10);  // Exact assertion kills the mutant
  });

  it('returns 0 for invalid coupon', () => {
    const result = calculateDiscount(100, 'INVALID');
    expect(result).toBe(0);
  });
});

Equivalent Mutants

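Some mutations produce code that behaves exactly like the original. No test can kill such a mutant, so it is reported as a survivor and lowers the score even though the tests are fine. These are the false positives of mutation testing.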

// Not every surviving-looking mutant is equivalent

// Original
function canAccess(role: string): boolean {
  return role === 'admin' || role === 'superadmin';
}

// Mutant: || → &&
// NOT equivalent: role can never equal both strings, so this always
// returns false and any test that expects true kills it.
function canAccessMutant(role: string): boolean {
  return role === 'admin' && role === 'superadmin';
}

// An actual equivalent mutant, when x is always an integer:
//   original: if (x > 0)  { ... }
//   mutant:   if (x >= 1) { ... }
// For non-integers the two differ (for example x = 0.5), so there the mutant can be killed.
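
One practical way to investigate a suspected equivalent mutant is to search for an input on which it disagrees with the original. The sketch below does this for the `x > 0` versus `x >= 1` pair. Sampling can only disprove equivalence, never prove it, and the names `firstDifference` and `integers` are hypothetical.

```typescript
const originalCheck = (x: number): boolean => x > 0;
const mutantCheck = (x: number): boolean => x >= 1;

// Return the first input on which the two versions disagree, if any.
function firstDifference(inputs: number[]): number | undefined {
  for (const x of inputs) {
    if (originalCheck(x) !== mutantCheck(x)) {
      return x;
    }
  }
  return undefined;
}

// Integer samples from -100 to 100.
const integers: number[] = [];
for (let i = -100; i <= 100; i++) {
  integers.push(i);
}

const noDifference = firstDifference(integers); // undefined
const difference = firstDifference(integers.concat([0.5])); // 0.5
```

If the function's contract guarantees integer input, the mutant can be documented as equivalent. If fractional values are possible, the search result is a ready-made test input.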

CI Integration

GitHub Actions

# .github/workflows/mutation-testing.yml
name: Mutation Testing

on:
  pull_request:
    paths:
      - 'src/**'
  schedule:
    - cron: '0 6 * * 1' # Weekly on Monday

jobs:
  mutation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22

      - run: npm ci

      - name: Run mutation tests
        # Exits non-zero when the score is below thresholds.break in the Stryker config,
        # so no separate score-check step is needed
        run: npx stryker run
        env:
          STRYKER_DASHBOARD_API_KEY: ${{ secrets.STRYKER_DASHBOARD_API_KEY }}

      - name: Upload HTML report
        if: always() # Keep the report even when the threshold check fails
        uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: reports/mutation/
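
If a pipeline needs the numeric score itself, one option is to compute it from Stryker's JSON report. The sketch below assumes the `json` reporter is enabled and that the report follows the mutation-testing report schema, with mutants grouped per file and a `status` on each. The field names reflect my understanding of that schema, so verify them against the file your version writes. `MutationReport` and `scoreFromReport` are hypothetical names.

```typescript
type MutantStatus =
  | "Killed"
  | "Timeout"
  | "Survived"
  | "NoCoverage"
  | "CompileError"
  | "RuntimeError"
  | "Ignored"
  | "Pending";

// Only the parts of the report this sketch reads.
interface MutationReport {
  files: Record<string, { mutants: Array<{ status: MutantStatus }> }>;
}

// Score = detected / (detected + undetected) x 100.
// Ignored mutants and compile/runtime errors are left out of the total.
function scoreFromReport(report: MutationReport): number {
  let detected = 0;
  let undetected = 0;
  for (const fileName of Object.keys(report.files)) {
    for (const mutant of report.files[fileName].mutants) {
      if (mutant.status === "Killed" || mutant.status === "Timeout") {
        detected++;
      } else if (mutant.status === "Survived" || mutant.status === "NoCoverage") {
        undetected++;
      }
    }
  }
  const valid = detected + undetected;
  return valid === 0 ? NaN : (detected * 100) / valid;
}
```

In a real pipeline the object would come from parsing the report file; the parsing step is omitted here to keep the sketch free of runtime dependencies.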

GitLab CI

# .gitlab-ci.yml
mutation-testing:
  stage: test
  image: node:22
  script:
    - npm ci
    - npx stryker run
  artifacts:
    when: always # Keep the report even when the threshold check fails
    paths:
      - reports/mutation/
  only:
    - merge_requests
  # STRYKER_DASHBOARD_API_KEY is read from the project's CI/CD variables

Performance Optimization

Mutation testing is computationally expensive. Optimize for speed:

{
  "incremental": true,
  "incrementalFile": "reports/mutation/stryker-incremental.json",
  "concurrency": 8,
  "coverageAnalysis": "perTest",
  "maxTestRunnerReuse": 8,

  // Skip static mutants (code that runs once at module load), which are costly to test
  "ignoreStatic": true
}

# Reuse results from the previous run (incremental mode)
npx stryker run --incremental

# Run on specific files for fast feedback
npx stryker run --mutate src/math.ts

# Limit mutation to a line range within one file
npx stryker run --mutate "src/math.ts:10-25"
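
For per-PR runs, a script can turn the list of changed files into a `--mutate` argument. The sketch below is one hypothetical way to do that: it assumes the changed paths are already available (for example from the CI system) and mirrors the `mutate` patterns used earlier in this guide. `selectMutateTargets` and `buildStrykerCommand` are illustrative names, not part of Stryker.

```typescript
// Keep only production TypeScript files under src/, excluding tests and type-only code.
function selectMutateTargets(changedFiles: string[]): string[] {
  return changedFiles.filter(
    (file) =>
      /^src\/.+\.ts$/.test(file) &&
      !/\.(spec|test)\.ts$/.test(file) &&
      !/^src\/types\//.test(file),
  );
}

// Build the command line, or return null when there is nothing to mutate.
function buildStrykerCommand(targets: string[]): string | null {
  if (targets.length === 0) {
    return null;
  }
  // --mutate accepts a comma-separated list of files or globs.
  return `npx stryker run --mutate ${targets.join(",")}`;
}
```

Returning null for an empty list lets the calling script skip the mutation step entirely on documentation-only changes.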

Mutation Testing at Scale

Selective Mutation

Running every mutator on a large codebase is expensive. Stryker selects mutators by exclusion, so focus on high-value operators (arithmetic, equality, conditional, boolean) by excluding the lower-value ones with mutator.excludedMutations.

{
  "mutator": {
    "excludedMutations": [
      "StringLiteral",
      "ObjectLiteral",
      "ArrayDeclaration",
      "Regex"
    ]
  }
}

Cost-Benefit Analysis

| Strategy | Runtime Savings | Coverage Impact | Use Case |
|----------|-----------------|-----------------|----------|
| Full mutation | 0% (baseline) | 100% | Release gate, small projects |
| Selective mutators | 40-60% | 85-95% | Regular CI |
| Incremental | 70-90% | 95% | Per-PR testing |
| File-limited | 90-95% | 100% (changed files) | Pre-commit hooks |
| Timed mode | 50-80% | Varies | Large codebases |
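
To turn the savings column into wall-clock time, a back-of-envelope estimate is often enough. The model below is an assumption of mine, not a formula from any tool: it treats every mutant as costing the same test time and assumes perfect parallelism, so real runs will differ. `estimateMinutes` is a hypothetical helper.

```typescript
// Rough runtime estimate in minutes.
//   mutantCount      number of mutants to test
//   secondsPerMutant average test time per mutant (assumed uniform)
//   concurrency      number of parallel workers
//   savingsPercent   expected savings from the chosen strategy (0-100)
function estimateMinutes(
  mutantCount: number,
  secondsPerMutant: number,
  concurrency: number,
  savingsPercent: number = 0,
): number {
  const baselineSeconds = (mutantCount * secondsPerMutant) / concurrency;
  return (baselineSeconds * (100 - savingsPercent)) / 100 / 60;
}

const fullRun = estimateMinutes(2000, 3, 8); // 12.5 minutes
const incrementalRun = estimateMinutes(2000, 3, 8, 80); // 2.5 minutes
```

The example figures (2,000 mutants, 3 seconds each, 8 workers) are made up for illustration; measure a real run before setting CI time limits.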

Time Budget Configuration

{
  "timeoutMS": 5000,
  "timeoutFactor": 1.5,
  "maxTestRunnerReuse": 8,

  // Bound total run time by mutating only critical modules
  "mutate": [
    "src/payments/**",
    "src/auth/**",
    "src/billing/**"
  ]
}

Language-Specific Tools

| Language | Tool | Installation | Integrations |
|----------|------|--------------|--------------|
| JavaScript/TypeScript | Stryker | npm i -D @stryker-mutator/core | Jest, Vitest, Mocha, Jasmine |
| Java | PIT | Maven/Gradle plugin | JUnit, TestNG, Mockito |
| Python | mutmut | pip install mutmut | pytest, unittest |
| Python | Cosmic Ray | pip install cosmic-ray | pytest, unittest |
| Go | go-mutesting | go install github.com/zimmski/go-mutesting/cmd/go-mutesting@latest | go test |
| Rust | cargo-mutants | cargo install cargo-mutants | cargo test |
| Ruby | mutant | gem install mutant | RSpec, Minitest |
| Kotlin | Pitest | Gradle pitest plugin | JUnit, Kotest |
| C# | Stryker.NET | dotnet tool install -g dotnet-stryker | xUnit, NUnit, MSTest |

Python: mutmut Example

pip install mutmut

# Run mutation testing (mutmut 2.x CLI; mutmut 3 reads the paths from its config file instead)
mutmut run --paths-to-mutate src/

# See results
mutmut results

# Show the source diff for mutant #1
mutmut show 1

# Illustrative summary (not mutmut's literal output)
# ----------
# Legend for mutated file: src/calculator.py
# ⚡ 1: def add(a, b):
# ⚡ 2:     return a - b  # ← mutant here
# ⚡ 3:
# ⚡ 4: def multiply(a, b):
# ⚡ 5:     return a / b  # ← mutant here
#
# Killed: 12/15 (80%)
# Survived: 3/15 (20%)
# Timeout: 0/15 (0%)

Go: go-mutesting Example

go install github.com/zimmski/go-mutesting/cmd/go-mutesting@latest

# Run mutation testing
go-mutesting ./...

# Run on specific package
go-mutesting ./src/math/

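# Illustrative output (simplified). In go-mutesting, PASS means the tests
# killed the mutant and FAIL means the mutant survived.
# PASS    "src/math/divide.go"   "/" -> "*"
# FAIL    "src/math/multiply.go" "*" -> "/"
# Mutation score: 8/10 (80%)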

Best Practices

1. Set Realistic Thresholds

Start with a 60% mutation score target for existing projects. Increase to 80%+ for new code. Use thresholds.break to fail CI below the minimum.

2. Run Incrementally

Run full mutation testing nightly or weekly. Use incremental mode in CI so that PR runs reuse earlier results and only re-test mutants affected by changed code or tests.

3. Focus on Business Logic

Prioritize mutation testing for core business logic, payment processing, and security-critical code. Infrastructure code and simple getters/setters provide less value.

4. Review Surviving Mutants

Not all surviving mutants indicate bad tests. Some may be equivalent mutants. Review each survivor and either add a test or document why it’s acceptable.

5. Combine with Coverage

Use mutation testing alongside code coverage. High coverage with low mutation score means tests exist but don’t verify behavior.

| Scenario | Coverage | Mutation Score | Meaning |
|----------|----------|----------------|---------|
| Good tests | 90% | 90% | Tests verify behavior thoroughly |
| False confidence | 90% | 40% | Tests exist but don’t verify correctly |
| Missing tests | 40% | 60% | Some code untested |
| Over-specified | 95% | 95% | Strong verification, but tests may be coupled to implementation details |
| Integration-heavy | 70% | 85% | Integration tests catch more per test |
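
The comparison between the two metrics can be automated as a simple triage rule. The sketch below collapses the scenarios into three outcomes; the cut-offs (70% coverage, 80% mutation score) are assumptions chosen for illustration, and `diagnose` is a hypothetical helper.

```typescript
type Diagnosis = "good tests" | "false confidence" | "missing tests";

// Triage a module from its line coverage and mutation score (both 0-100).
function diagnose(
  coverage: number,
  mutationScore: number,
  minCoverage: number = 70, // assumed cut-off
  minScore: number = 80, // assumed cut-off
): Diagnosis {
  if (coverage < minCoverage) {
    return "missing tests"; // code is not even executed by tests
  }
  if (mutationScore < minScore) {
    return "false confidence"; // code runs under test, but results are not checked
  }
  return "good tests";
}
```

Checking coverage first reflects the order of fixes: untested code needs tests before assertion strength can be improved.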

6. Integrate into Code Review

# PR checklist with mutation testing
pr_checklist:
  - "Mutation score ≥ 70% for modified files"
  - "No surviving mutants in critical business logic"
  - "All surviving mutants reviewed and documented"
  - "Coverage increased or maintained"

Common Pitfalls

1. Testing Only Happy Path

// ❌ Tests that miss error handling
describe('processPayment', () => {
  it('processes valid payment', () => {
    expect(processPayment(validCard)).toBe(true);
  });
  // Missing: failure cases, edge cases, timeouts
});

// ✅ Comprehensive tests
describe('processPayment', () => {
  it('processes valid payment', () => {
    expect(processPayment(validCard)).toBe(true);
  });

  it('rejects expired card', () => {
    expect(() => processPayment(expiredCard)).toThrow('Card expired');
  });

  it('rejects invalid CVV', () => {
    expect(() => processPayment(badCvv)).toThrow('Invalid CVV');
  });

  it('handles gateway timeout', () => {
    expect(() => processPayment(timeoutCard)).toThrow('Gateway timeout');
  });
});

2. Asserting Too Generically

// ❌ Weak assertions let mutants survive
it('returns users', () => {
  const users = getUsers();
  expect(users).toBeDefined();
  expect(users.length).toBeGreaterThan(0);
});

// ✅ Strong assertions kill mutants
it('returns active users sorted by name', () => {
  const users = getUsers();
  expect(users).toHaveLength(3);
  expect(users[0].name).toBe('Alice');
  expect(users[0].status).toBe('active');
  expect(users[1].name).toBe('Bob');
});
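
The effect of assertion strength can be demonstrated without a test framework. In the sketch below the data, the `getUsers` implementation, and the mutant (which drops the status filter) are all hypothetical stand-ins for the code the tests above would exercise.

```typescript
interface User {
  name: string;
  status: "active" | "inactive";
}

const records: User[] = [
  { name: "Carol", status: "active" },
  { name: "Alice", status: "active" },
  { name: "Dave", status: "inactive" },
  { name: "Bob", status: "active" },
];

const byName = (a: User, b: User): number => a.name.localeCompare(b.name);

// Original: active users, sorted by name.
const getUsers = (): User[] =>
  records.filter((u) => u.status === "active").sort(byName);

// Mutant: the status filter has been removed.
const getUsersMutant = (): User[] => records.slice().sort(byName);

type Check = (users: User[]) => boolean;

// Mirrors the weak test: defined and non-empty.
const weakCheck: Check = (users) => users !== undefined && users.length > 0;

// Mirrors the strong test: exact length, order, and status.
const strongCheck: Check = (users) =>
  users.length === 3 &&
  users[0].name === "Alice" &&
  users[0].status === "active" &&
  users[1].name === "Bob";

const weakKills = !weakCheck(getUsersMutant()); // false: the mutant survives
const strongKills = !strongCheck(getUsersMutant()); // true: the mutant is killed
```

The weak check accepts any non-empty list, so it cannot tell four unfiltered users from three active ones. The exact length assertion is what kills the mutant.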
