SEARCH vs SEARCH ALL - Performance Tuning
Table of Contents
Interview Question
"Your payment routing program has performance issues:
Current Implementation:
01 BANK-ROUTING-TABLE.
05 BANK-ENTRY OCCURS 50000 TIMES INDEXED BY BANK-IDX.
10 ROUTING-NUMBER PIC 9(9).
10 BANK-NAME PIC X(30).
10 SWIFT-CODE PIC X(11).
PROCEDURE DIVISION.
* Look up bank by routing number
SET BANK-IDX TO 1
SEARCH BANK-ENTRY
AT END
MOVE 'NOT FOUND' TO ERROR-MSG
WHEN ROUTING-NUMBER(BANK-IDX) = WS-INPUT-ROUTING
MOVE BANK-NAME(BANK-IDX) TO WS-BANK-NAME
END-SEARCH.
Performance Data:
- Transaction volume: 10,000/minute during peak
- Average lookup time: 250ms (unacceptable - SLA is 50ms)
- Table loaded once at program initialization
- Routing numbers are sequential (101000001 to 101050000)
Requirements:
1. Reduce lookup time to <10ms
2. No additional storage (table is 2.5MB already)
3. Must handle 'not found' gracefully
4. Maintain existing functionality
Show me the optimized solution."
What This Tests
Good Answer Should Include
1. Performance Analysis:
Current (Linear SEARCH):
- Best case: 1 comparison (first element)
- Worst case: 50,000 comparisons (last element)
- Average case: 25,000 comparisons
- Average time: 25,000 × 10µs = 250ms ✓ Matches symptom!
Optimized (Binary SEARCH ALL):
- Always: log₂(50,000) = 15.6 ≈ 16 comparisons
- Average time: 16 × 10µs = 160µs = 0.16ms ✓ Meets <10ms requirement!
2. Optimized Solution:
WORKING-STORAGE SECTION.
01 BANK-ROUTING-TABLE.
05 BANK-ENTRY OCCURS 50000 TIMES
ASCENDING KEY IS ROUTING-NUMBER
INDEXED BY BANK-IDX.
10 ROUTING-NUMBER PIC 9(9).
10 BANK-NAME PIC X(30).
10 SWIFT-CODE PIC X(11).
01 WS-SORT-NEEDED PIC X VALUE 'Y'.
PROCEDURE DIVISION.
* One-time sort at program initialization
IF WS-SORT-NEEDED = 'Y'
PERFORM SORT-BANK-TABLE
MOVE 'N' TO WS-SORT-NEEDED
END-IF.
* Binary search lookup
SEARCH ALL BANK-ENTRY
AT END
MOVE 'NOT FOUND' TO ERROR-MSG
PERFORM LOG-INVALID-ROUTING
WHEN ROUTING-NUMBER(BANK-IDX) = WS-INPUT-ROUTING
MOVE BANK-NAME(BANK-IDX) TO WS-BANK-NAME
MOVE SWIFT-CODE(BANK-IDX) TO WS-SWIFT
END-SEARCH.
SORT-BANK-TABLE SECTION.
* In-memory sort using COBOL SORT
SORT SORT-WORK-FILE
ON ASCENDING KEY SORT-ROUTING-NBR
USING INPUT-FILE
GIVING OUTPUT-FILE.
* Or use table sort (if available in compiler)
PERFORM VARYING BANK-IDX FROM 1 BY 1
UNTIL BANK-IDX > 49999
PERFORM VARYING BANK-IDX2 FROM BANK-IDX BY 1
UNTIL BANK-IDX2 > 50000
IF ROUTING-NUMBER(BANK-IDX) > ROUTING-NUMBER(BANK-IDX2)
PERFORM SWAP-ENTRIES
END-IF
END-PERFORM
END-PERFORM.
3. Key Differences Documented:
| Feature | SEARCH | SEARCH ALL |
|---|---|---|
| Algorithm | Linear (sequential) | Binary |
| Data Requirement | Any order | MUST be sorted |
| ASCENDING KEY | Not required | Required in OCCURS |
| Index Initialization | Must SET to 1 | Not needed |
| Conditions | Any (>, <, =) | Only = (equality) |
| WHEN Clauses | Multiple allowed | Single only |
| Performance | O(n) | O(log n) |
| Best for | Small tables (<100) | Large tables (>100) |
4. Edge Cases Handled:
* Validate table is sorted (in test environment)
IF TEST-MODE = 'Y'
PERFORM VARYING BANK-IDX FROM 1 BY 1
UNTIL BANK-IDX >= 50000
IF ROUTING-NUMBER(BANK-IDX) > ROUTING-NUMBER(BANK-IDX + 1)
DISPLAY 'ERROR: Table not sorted at index ' BANK-IDX
MOVE 16 TO RETURN-CODE
STOP RUN
END-IF
END-PERFORM
END-IF.
* Handle duplicate routing numbers
SEARCH ALL BANK-ENTRY
WHEN ROUTING-NUMBER(BANK-IDX) = WS-INPUT-ROUTING
* SEARCH ALL finds first match
* If duplicates possible, validate uniqueness
MOVE BANK-IDX TO WS-FOUND-IDX
IF BANK-IDX < 50000
IF ROUTING-NUMBER(BANK-IDX + 1) = WS-INPUT-ROUTING
DISPLAY 'WARNING: Duplicate routing number'
END-IF
END-IF
END-SEARCH.
5. Alternative: Hash Table Pattern
* For ultimate performance: hash table
01 HASH-TABLE.
05 HASH-BUCKET OCCURS 10000 TIMES INDEXED BY HASH-IDX.
10 BUCKET-ENTRY OCCURS 5 TIMES.
15 ROUTING-NUMBER PIC 9(9).
15 BANK-NAME PIC X(30).
COMPUTE HASH-IDX = FUNCTION MOD(WS-INPUT-ROUTING, 10000) + 1.
* Direct access - O(1) average case!
Red Flags
- ❌ Doesn't know SEARCH ALL requires sorted data
- ❌ Can't calculate performance difference
- ❌ Suggests caching without understanding binary search
- ❌ Doesn't mention ASCENDING KEY requirement
- ❌ Can't explain O(n) vs O(log n) complexity
Follow-Up Questions
- "What if the table is loaded unsorted from a file?"
- "How would you test that SEARCH ALL is working correctly?"
- "What happens if you use SEARCH ALL on unsorted data?"
- "At what table size does SEARCH ALL become worthwhile?"
Difficulty Level
Senior
Relevant Roles
Performance Analyst, Senior Developer, Architect