Cross-File Analysis

How the cross-file analyzer traces indirect violations through the import graph using reverse BFS, with optional AI enhancement via the Claude API.

The cross-file analyzer is a Phase 3 detector that finds indirect violations -- risky patterns that exist in utility functions outside request handlers but are reachable from handlers through the import graph. It uses reverse BFS on the import graph and optionally enhances results with Claude API analysis.

Why Cross-File Analysis Matters

Scope-aware detectors like the runtime risk detector only flag patterns inside request handlers. But what happens when a handler calls a utility, and that utility contains readFileSync?

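// utils/file-helper.ts  (NOT in a handler -- runtime detector skips this)
import { readFileSync } from 'fs';

export function loadTemplate(name: string): string {
  // Synchronous read: blocks the event loop until the file is loaded
  return readFileSync(`./templates/${name}.html`, 'utf-8');
}

// controllers/email.controller.ts
import { Body, Controller, Post } from '@nestjs/common';
import { loadTemplate } from '../utils/file-helper';

@Controller('email')
export class EmailController {
  // EmailService and SendEmailDto are defined elsewhere in the application
  constructor(private readonly emailService: EmailService) {}

  @Post('send')
  async send(@Body() dto: SendEmailDto) {
    const template = loadTemplate(dto.template);  // Calls the sync util!
    return this.emailService.send(dto.to, template);
  }
}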

The runtime detector does not flag readFileSync in file-helper.ts because it is outside a handler. But the cross-file analyzer traces the import graph and discovers that EmailController.send() (a @Post handler) calls loadTemplate(), which blocks the event loop.

How It Works

Step 1: Collect Unflagged Patterns

During Phase 1, the runtime risk detector reports two outputs:

  • Violations: Risky patterns found inside handler scope (flagged immediately)
  • Unflagged patterns: Risky patterns found outside handler scope (passed to cross-file analyzer)

Each unflagged pattern contains:

interface UnflaggedPattern {
  ruleId: string;       // e.g., "sync-fs-in-handler"
  file: string;         // File containing the pattern
  line: number;         // Line number
  functionName: string; // Function containing the pattern
  patternType: string;  // Human-readable pattern type
  message: string;      // Description
  codeSnippet: string;  // Relevant code for AI analysis
}

Step 2: Build Reverse Import Graph

The cross-file analyzer inverts the import graph. Instead of "A imports B", it builds "B is imported by A". This allows efficient backward traversal from the file containing the risky pattern to all files that directly or indirectly use it.

Forward graph:                  Reverse graph:
controller.ts → service.ts     service.ts ← controller.ts
service.ts → utils/helper.ts   utils/helper.ts ← service.ts
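
The inversion can be sketched in a few lines. This is a minimal illustration, not the analyzer's actual code: the `ImportGraph` type and the `buildReverseGraph` name are assumptions made for the example.

```typescript
// Hypothetical shape: each file maps to the list of files it imports.
type ImportGraph = Map<string, string[]>;

// Invert "A imports B" into "B is imported by A".
function buildReverseGraph(forward: ImportGraph): ImportGraph {
  const reverse: ImportGraph = new Map();
  for (const [importer, imported] of forward) {
    // Keep every file as a key, even when nothing imports it.
    if (!reverse.has(importer)) reverse.set(importer, []);
    for (const target of imported) {
      const importers = reverse.get(target) ?? [];
      if (!importers.includes(importer)) importers.push(importer);
      reverse.set(target, importers);
    }
  }
  return reverse;
}

const forwardGraph: ImportGraph = new Map([
  ['controller.ts', ['service.ts']],
  ['service.ts', ['utils/helper.ts']],
]);
const reverseGraph = buildReverseGraph(forwardGraph);
console.log(reverseGraph.get('utils/helper.ts')); // files that import the helper
```

Building the reverse map once up front means each backward step in the later trace is a single lookup instead of a scan over every file's import list.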

Step 3: BFS Caller Tracing

For each unflagged pattern, the analyzer performs a breadth-first search up to 2 levels deep (configurable via MAX_CALLER_TRACE_DEPTH) through the reverse import graph:

  1. Find all files that import the file containing the pattern
  2. Check if any of those files contain request handlers
  3. If not, continue one more level -- find files that import the importers
  4. Classify each caller as handler, background, or utility
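
The traversal described above can be sketched as a depth-limited BFS. This is a simplified illustration under stated assumptions: the `ReverseGraph` and `TracedCaller` types and the `traceCallers` function are hypothetical names, and the sketch collects every importer within the depth limit instead of stopping early once a handler is found.

```typescript
// Hypothetical shape: each file maps to the files that import it.
type ReverseGraph = Map<string, string[]>;

interface TracedCaller {
  file: string;
  depth: number; // import hops away from the file containing the pattern
}

const MAX_CALLER_TRACE_DEPTH = 2;

function traceCallers(
  reverse: ReverseGraph,
  sourceFile: string,
  maxDepth: number = MAX_CALLER_TRACE_DEPTH,
): TracedCaller[] {
  const visited = new Set<string>([sourceFile]);
  const callers: TracedCaller[] = [];
  let frontier = [sourceFile];

  // Expand one level of importers per iteration, up to maxDepth levels.
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const importer of reverse.get(file) ?? []) {
        if (visited.has(importer)) continue; // guards against import cycles
        visited.add(importer);
        callers.push({ file: importer, depth });
        next.push(importer);
      }
    }
    frontier = next;
  }
  return callers;
}

const exampleReverse: ReverseGraph = new Map([
  ['utils/crypto-helper.ts', ['services/auth.service.ts']],
  ['services/auth.service.ts', ['controllers/auth.controller.ts']],
]);
const tracedCallers = traceCallers(exampleReverse, 'utils/crypto-helper.ts');
console.log(tracedCallers);
```

The `visited` set matters in practice because import graphs often contain cycles; without it, two files that import each other would be revisited at every level.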

Step 4: Classify Callers

Each file in the call chain is classified:

Classification | Detection | Risk Level
Handler | NestJS @Get/@Post decorators, Express app.get() callbacks, Koa ctx params, Hapi (req, h) params | High -- violation generated
Background | @Cron, @Process, @Interval, @Timeout, @Processor class | Low -- not a request-path risk
Utility | Everything else -- intermediate modules, services | Traced further if within depth limit
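
As a rough illustration of the decorator-based rows, the sketch below classifies a file from a list of decorator names already extracted from it. Everything here is an assumption made for the example: the `classifyCaller` name, the input shape, the inclusion of other NestJS route decorators such as `@Put`, and the rule that a handler decorator wins over a background one. Express, Koa, and Hapi detection depends on callback and parameter shapes and is left out.

```typescript
type CallerKind = 'handler' | 'background' | 'utility';

// Assumption: other NestJS route decorators are treated like @Get/@Post.
const HANDLER_DECORATORS = new Set(['Get', 'Post', 'Put', 'Patch', 'Delete']);
const BACKGROUND_DECORATORS = new Set(['Cron', 'Process', 'Interval', 'Timeout', 'Processor']);

// `decorators` holds decorator names found in a file, without the leading "@".
function classifyCaller(decorators: string[]): CallerKind {
  // Assumption: a file with any request handler counts as a handler,
  // even if it also contains background jobs.
  if (decorators.some((d) => HANDLER_DECORATORS.has(d))) return 'handler';
  if (decorators.some((d) => BACKGROUND_DECORATORS.has(d))) return 'background';
  return 'utility';
}

console.log(classifyCaller(['Controller', 'Post'])); // handler
console.log(classifyCaller(['Injectable', 'Cron'])); // background
console.log(classifyCaller(['Injectable']));         // utility
```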

Step 5: Generate Violations

Only unflagged patterns that are reachable from at least one handler produce violations. The violation is attributed to the handler file, not the utility file:

interface CrossFileViolation extends Violation {
  sourceFile: string;       // File containing the actual risky code
  sourceLine: number;       // Line of the risky pattern
  sourceFunction: string;   // Function containing the pattern
  aiExplanation?: string;   // Optional AI analysis
  aiRecommendation?: string;
}

Cross-file violations have confidence: 'medium' (vs. 'high' for direct detections) and gateAction: 'warn' (they never block merge on their own).
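
The sketch below shows how such a violation might be assembled. The base `Violation` fields, the `HandlerCaller` type, the message wording, and the `toCrossFileViolation` function are assumptions for illustration; only the `CrossFileViolation` extension fields and the `confidence` and `gateAction` values come from the description above.

```typescript
// Hypothetical base type: the real Violation fields are not shown here.
interface Violation {
  ruleId: string;
  file: string;
  line: number;
  message: string;
  confidence: 'high' | 'medium' | 'low';
  gateAction: 'block' | 'warn';
}

interface CrossFileViolation extends Violation {
  sourceFile: string;
  sourceLine: number;
  sourceFunction: string;
  aiExplanation?: string;
  aiRecommendation?: string;
}

interface UnflaggedPattern {
  ruleId: string;
  file: string;
  line: number;
  functionName: string;
  patternType: string;
  message: string;
  codeSnippet: string;
}

// Hypothetical: location of the handler that reaches the pattern.
interface HandlerCaller {
  file: string;
  line: number;
}

function toCrossFileViolation(
  pattern: UnflaggedPattern,
  handler: HandlerCaller,
  crossFileRuleId: string,
): CrossFileViolation {
  return {
    ruleId: crossFileRuleId,
    file: handler.file, // attributed to the handler, not the utility
    line: handler.line,
    message: `Reaches ${pattern.patternType} in ${pattern.functionName}() -- indirect runtime risk`,
    confidence: 'medium', // file-level trace, not call-site analysis
    gateAction: 'warn',   // never blocks merge on its own
    sourceFile: pattern.file,
    sourceLine: pattern.line,
    sourceFunction: pattern.functionName,
  };
}

const exampleViolation = toCrossFileViolation(
  {
    ruleId: 'sync-crypto',
    file: 'utils/crypto-helper.ts',
    line: 4,
    functionName: 'hashPassword',
    patternType: 'sync-crypto',
    message: 'Synchronous crypto call',
    codeSnippet: "return pbkdf2Sync(password, salt, 100000, 64, 'sha512');",
  },
  { file: 'controllers/auth.controller.ts', line: 8 },
  'indirect-sync-crypto',
);
console.log(exampleViolation.file, '<-', exampleViolation.sourceFile);
```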

Traced Violation Example

Consider this three-file chain:

File 1: utils/crypto-helper.ts

import { pbkdf2Sync } from 'crypto';

// Unflagged: sync-crypto outside handler scope
export function hashPassword(password: string, salt: string): Buffer {
  return pbkdf2Sync(password, salt, 100000, 64, 'sha512');
}

File 2: services/auth.service.ts

import { hashPassword } from '../utils/crypto-helper';

@Injectable()
export class AuthService {
  async validatePassword(password: string, user: User): Promise<boolean> {
    const hash = hashPassword(password, user.salt);
    return hash.equals(user.passwordHash);
  }
}

File 3: controllers/auth.controller.ts

import { AuthService } from '../services/auth.service';

@Controller('auth')
export class AuthController {
  @Post('login')
  async login(@Body() dto: LoginDto) {
    const valid = await this.authService.validatePassword(
      dto.password, await this.userService.findByEmail(dto.email),
    );
    if (!valid) throw new UnauthorizedException();
    return this.authService.createToken(dto.email);
  }
}

Trace:

  1. pbkdf2Sync in crypto-helper.ts is an unflagged sync-crypto pattern
  2. Reverse BFS finds auth.service.ts imports crypto-helper.ts (depth 1) -- auth.service.ts is a utility
  3. Reverse BFS finds auth.controller.ts imports auth.service.ts (depth 2) -- auth.controller.ts has @Post('login') handler
  4. Violation generated:
[auth.controller.ts:8] Calls validatePassword() which contains sync-crypto
  -- indirect runtime risk

Source: hashPassword() in utils/crypto-helper.ts:4 uses pbkdf2Sync.
  Replace with async alternative or move to a worker.

Rule ID Mapping

Cross-file violations use distinct rule IDs to differentiate them from direct detections:

Source Pattern | Cross-File Rule ID
sync-fs-in-handler | indirect-sync-fs
sync-crypto | indirect-sync-crypto
sync-compression | indirect-sync-compression
busy-wait-loop | indirect-busy-wait
unbounded-json-parse | indirect-unbounded-json-parse
dynamic-buffer-alloc | indirect-dynamic-buffer-alloc
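
The mapping is a plain lookup. A minimal sketch follows; the `CROSS_FILE_RULE_IDS` and `toCrossFileRuleId` names are hypothetical, and returning `undefined` for an unmapped rule is an assumption, since the behavior for unknown rule IDs is not documented here.

```typescript
// Source rule ID -> cross-file rule ID, copied from the table above.
const CROSS_FILE_RULE_IDS = new Map<string, string>([
  ['sync-fs-in-handler', 'indirect-sync-fs'],
  ['sync-crypto', 'indirect-sync-crypto'],
  ['sync-compression', 'indirect-sync-compression'],
  ['busy-wait-loop', 'indirect-busy-wait'],
  ['unbounded-json-parse', 'indirect-unbounded-json-parse'],
  ['dynamic-buffer-alloc', 'indirect-dynamic-buffer-alloc'],
]);

// Assumption: unmapped rule IDs yield undefined so the caller decides what to do.
function toCrossFileRuleId(sourceRuleId: string): string | undefined {
  return CROSS_FILE_RULE_IDS.get(sourceRuleId);
}

console.log(toCrossFileRuleId('sync-fs-in-handler'));
console.log(toCrossFileRuleId('busy-wait-loop'));
```

Note that the mapping is not a uniform prefix: `sync-fs-in-handler` drops its `-in-handler` suffix and `busy-wait-loop` drops `-loop`, so a table is safer than string manipulation.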

AI Enhancement (Optional)

When enabled, the cross-file analyzer sends suspect patterns and their caller chains to Claude for contextual risk assessment. The AI determines:

  • Whether the pattern is a real risk for each specific caller
  • Which callers are at risk and which are safe (e.g., background jobs are safe, handlers are not)
  • An explanation of the risk chain
  • A concrete recommendation for fixing the issue
  • Severity assessment (warning vs. critical)

How AI Analysis Works

  1. Up to 10 suspects (configurable via MAX_CROSS_FILE_SUSPECTS) are batched into a single Claude API call
  2. Each suspect includes:
    • The pattern type and rule ID
    • The source code snippet
    • The full caller chain with classifications
  3. Claude returns a structured JSON response classifying each suspect

AI Request Format

Stack: NestJS + Prisma

SUSPECT 0:
  Pattern: sync-crypto
  Rule: sync-crypto
  Location: hashPassword() in utils/crypto-helper.ts:4
  Code:
    return pbkdf2Sync(password, salt, 100000, 64, 'sha512');
  Callers:
    Caller 0: login() in auth.controller.ts:8 -- HANDLER (@Post)
    Caller 1: resetPassword() in cron/password-reset.ts:15 -- BACKGROUND
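
A request body in this shape could be rendered as follows. This is a sketch under assumptions: the `Suspect` and `SuspectCaller` types and the `formatSuspects` function are hypothetical, and the code only builds the text; it does not call any API.

```typescript
// Hypothetical input types for the example.
interface SuspectCaller {
  functionName: string;
  file: string;
  line: number;
  label: string; // e.g. "HANDLER (@Post)" or "BACKGROUND"
}

interface Suspect {
  patternType: string;
  ruleId: string;
  functionName: string;
  file: string;
  line: number;
  codeSnippet: string;
  callers: SuspectCaller[];
}

const MAX_CROSS_FILE_SUSPECTS = 10;

function formatSuspects(stack: string, suspects: Suspect[]): string {
  const lines: string[] = [`Stack: ${stack}`];
  // Only the first MAX_CROSS_FILE_SUSPECTS suspects go into the batch.
  suspects.slice(0, MAX_CROSS_FILE_SUSPECTS).forEach((s, i) => {
    lines.push('', `SUSPECT ${i}:`);
    lines.push(`  Pattern: ${s.patternType}`);
    lines.push(`  Rule: ${s.ruleId}`);
    lines.push(`  Location: ${s.functionName}() in ${s.file}:${s.line}`);
    lines.push('  Code:');
    for (const codeLine of s.codeSnippet.split('\n')) lines.push(`    ${codeLine}`);
    lines.push('  Callers:');
    s.callers.forEach((c, j) => {
      lines.push(`    Caller ${j}: ${c.functionName}() in ${c.file}:${c.line} -- ${c.label}`);
    });
  });
  return lines.join('\n');
}

const exampleSuspect: Suspect = {
  patternType: 'sync-crypto',
  ruleId: 'sync-crypto',
  functionName: 'hashPassword',
  file: 'utils/crypto-helper.ts',
  line: 4,
  codeSnippet: "return pbkdf2Sync(password, salt, 100000, 64, 'sha512');",
  callers: [
    { functionName: 'login', file: 'auth.controller.ts', line: 8, label: 'HANDLER (@Post)' },
    { functionName: 'resetPassword', file: 'cron/password-reset.ts', line: 15, label: 'BACKGROUND' },
  ],
};
console.log(formatSuspects('NestJS + Prisma', [exampleSuspect]));
```

Caller indices are zero-based and stable within the request, which is what lets the response refer back to callers by index.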

AI Response

[
  {
    "suspectIndex": 0,
    "isRisk": true,
    "riskyCallerIndices": [0],
    "safeCallerIndices": [1],
    "explanation": "pbkdf2Sync in hashPassword() blocks the event loop for 100-500ms when called from the @Post login handler.",
    "recommendation": "Use async crypto.pbkdf2() or move password hashing to a worker thread",
    "severity": "warning"
  }
]
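
A consumer of this response might validate it before trusting it, then keep only the callers marked risky. The sketch below assumes the response shape shown above; the `AiVerdict` type and the `parseVerdicts` and `selectRiskyCallers` functions are hypothetical names, and dropping malformed entries silently is a design choice made for the example.

```typescript
interface AiVerdict {
  suspectIndex: number;
  isRisk: boolean;
  riskyCallerIndices: number[];
  safeCallerIndices: number[];
  explanation: string;
  recommendation: string;
  severity: 'warning' | 'critical';
}

// Runtime check, since model output is untyped text until validated.
function isVerdict(value: unknown): value is AiVerdict {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.suspectIndex === 'number' &&
    typeof v.isRisk === 'boolean' &&
    Array.isArray(v.riskyCallerIndices) &&
    Array.isArray(v.safeCallerIndices) &&
    typeof v.explanation === 'string' &&
    typeof v.recommendation === 'string' &&
    (v.severity === 'warning' || v.severity === 'critical')
  );
}

function parseVerdicts(raw: string): AiVerdict[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error('Expected a JSON array of verdicts');
  return parsed.filter(isVerdict); // malformed entries are dropped
}

// Keep only callers the verdict marks as risky, ignoring out-of-range indices.
function selectRiskyCallers<T>(verdict: AiVerdict, callers: T[]): T[] {
  if (!verdict.isRisk) return [];
  return verdict.riskyCallerIndices
    .filter((i) => Number.isInteger(i) && i >= 0 && i < callers.length)
    .map((i) => callers[i]);
}

const rawResponse = JSON.stringify([
  {
    suspectIndex: 0,
    isRisk: true,
    riskyCallerIndices: [0],
    safeCallerIndices: [1],
    explanation: 'pbkdf2Sync in hashPassword() blocks the event loop.',
    recommendation: 'Use async crypto.pbkdf2() or a worker thread',
    severity: 'warning',
  },
]);
const verdicts = parseVerdicts(rawResponse);
console.log(selectRiskyCallers(verdicts[0], ['login', 'resetPassword']));
```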

Credit Costs

AI cross-file analysis uses the Anthropic API (Claude Sonnet). Costs are tracked per analysis:

  • Model: Claude Sonnet (latest)
  • Max tokens per call: 3,000 output tokens
  • Typical cost: ~$0.01-0.03 per analysis (depends on suspect count)
  • Scoring weight: ai_concern: 2 debt points per AI-confirmed concern

AI analysis is optional and requires an Anthropic API key. Without it, cross-file analysis still works using deterministic BFS tracing -- the AI layer adds contextual understanding but is not required for basic detection.

Configuration

Cross-file analysis is automatically enabled when the runtime risk detector produces unflagged patterns and the import graph has edges. No explicit configuration is needed.

Tunable constants:

Constant | Default | Description
MAX_CROSS_FILE_SUSPECTS | 10 | Maximum unflagged patterns to trace
MAX_CALLER_TRACE_DEPTH | 2 | Maximum BFS depth through import graph

Limitations

  • Depth limit: With the default depth of 2, a handler more than two import hops away from the pattern (that is, separated from it by two or more intermediate files) is not detected. This is a deliberate trade-off between coverage and noise.
  • Dynamic imports: Only static import and require() statements are tracked. Dynamic import() expressions are not in the import graph.
  • Cross-package: The import graph only includes files in the changed set. If the risky utility is in a separate npm package, the trace cannot follow the import across the package boundary.
  • Confidence: Cross-file violations have confidence: 'medium' because the trace is based on file-level imports, not actual call-site analysis. The function might be imported but not used in the handler path.
Technical Debt Radar Documentation