Skip to content

feat(optimizer): [3/N] Analyzer#533

Merged
mkuchenbecker merged 209 commits into
mainfrom
mkuchenb/optimizer-3
May 27, 2026
Merged

feat(optimizer): [3/N] Analyzer#533
mkuchenbecker merged 209 commits into
mainfrom
mkuchenb/optimizer-3

Conversation

@mkuchenbecker
Copy link
Copy Markdown
Collaborator

@mkuchenbecker mkuchenbecker commented Apr 7, 2026

Optimizer Stack

PR Content
#527 Data Model
#530 Database Repos
#531 REST service
#533 (this) Analyzer app
#534 Scheduler app
#599 Spark BatchedOFD app
#tbd Infra, docker-compose, smoke test

Summary

PR 3 of N in the optimizer stack.

Introduces apps/optimizer-analyzer, a Spring Boot CommandLineRunner that evaluates every table in table_stats against pluggable OperationAnalyzer strategies. The first strategy, OrphanFilesDeletionAnalyzer, schedules OFD operations with 24h success / 1h failure retry cadence, a 6h SCHEDULED timeout, and a 5-strike circuit breaker.

Key design choices:

  • Bulk-loads operations and history into maps (one query per type), then iterates the stats list — O(types) queries, not O(tables).
  • Uses the existing generic find() repository methods with null params.
  • Pure unit tests with Mockito — no Spring context needed.

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

Core: AnalyzerRunner — loads table_stats, pre-loads operations and history into maps, evaluates each table against all analyzers, circuit breaker logic.

Strategy interface: OperationAnalyzerisEnabled(table), shouldSchedule(table, currentOp, latestHistory), getCircuitBreakerThreshold().

Cadence policy: CadencePolicy — encapsulates time-based retry logic shared across operation types.

OFD analyzer: OrphanFilesDeletionAnalyzer — enabled via maintenance.optimizer.ofd.enabled table property.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.

25 unit tests:

  • AnalyzerRunnerTest (7 tests) — eligible table insertion, cadence skip, disabled table, shouldSchedule=false, null UUID, circuit breaker trip, below-threshold pass
  • OrphanFilesDeletionAnalyzerTest (18 tests) — isEnabled variants, shouldSchedule for no-op/PENDING/SCHEDULING/SCHEDULED with history combinations
./gradlew :apps:optimizer-analyzer:test
# BUILD SUCCESSFUL — 25 tests pass

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

mkuchenbecker and others added 14 commits April 3, 2026 11:00
Introduces the optimizer service module with:
- MySQL/H2 schema for table_operations, table_stats, table_stats_history,
  and table_operations_history
- JPA entities with JSON column support (vladmihalcea hibernate-types)
- All model/DTO/enum types: OperationType, OperationStatus, TableStats,
  CompleteOperationRequest, JobResult, OperationMetrics, etc.
- JPA AttributeConverters for JobResult and OperationMetrics JSON columns
- MapStruct mapper (OptimizerMapper) for entity→DTO conversion
- Spring Boot application shell and build wiring (settings.gradle,
  build.gradle dockerPrereqs)

No repositories, controllers, or service layer yet — those follow in
subsequent PRs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove OperationMetrics class and converter; stats are read
  directly from table_stats instead of duplicating into operations
- Remove orphanFilesDeleted/orphanBytesDeleted from history entity,
  DTO, and schema; operation-specific data belongs in the result JSON
- Add addedSizeBytes to CommitDelta for tracking write volume
- Fix OperationType javadoc to describe current state, not roadmap
- Fix TableOperationsHistoryRow javadoc: written on operation
  complete, not by Spark app directly
- Add field comments to all DTOs and request objects

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spring Data JPA repositories for all four optimizer tables with
filtered query support. Includes tests exercising save/find,
filtered queries, upsert semantics, and append-only history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address PR review comments: rename findFiltered → find across all repos,
remove redundant findByTableUuid/findByTableUuidSince from history repos,
add explicit assertion to context test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Service interface and implementation for all optimizer CRUD operations
including complete-operation lifecycle, stats upsert with history
double-write, and filtered queries. Three REST controllers expose the
endpoints. The apps/optimizer shared module provides lightweight
entity/repo copies for the analyzer and scheduler apps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align OptimizerDataServiceImpl with renamed repository methods from
optimizer-1 review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Shared JPA entities and repositories for optimizer apps (analyzer,
scheduler). All repos expose a single find method with optional filters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve repo conflicts by taking optimizer-1's clean find-only versions.
Scheduler-specific methods and streamAll removed per review feedback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These fields never belonged in the data model — remove them at the
source rather than adding then deleting in a later PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagate CompleteOperationRequest orphan field removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H2 integration tests for OptimizerDataServiceImpl covering
completeOperation (write history, not-found) and upsertTableStats
(create, update, history append).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strengthen upsertTableStats test to verify history rows contain the raw
delta stats from each call, not just the row count.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ling

Introduces apps/optimizer-analyzer, a Spring Boot CommandLineRunner that
evaluates every table in table_stats against pluggable OperationAnalyzer
strategies. The first strategy, OrphanFilesDeletionAnalyzer, schedules
OFD operations with 24h success / 1h failure retry cadence, a 6h
SCHEDULED timeout, and a 5-strike circuit breaker.

Key design choices:
- Bulk-loads operations and history into maps (one query per type),
  then iterates the stats list — O(types) queries, not O(tables).
- Uses the existing generic find() repository methods with null params.
- Pure unit tests with Mockito — no Spring context needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker changed the base branch from mkuchenb/optimizer-2 to main May 23, 2026 00:02
- AnalyzerRunner.analyzeDatabase: @transactional; try/catch+continue per
  table save; per-table log dropped to DEBUG, aggregate INFO per database;
  comment on dual defaultLimit (Mockito tests vs @value); replace
  TODO(scale-test) inline comment with link to BDP-102738.
- CadencePolicy: extract intervalFor(status) with exhaustive switch +
  throwing default so new HistoryStatusDto values surface explicitly.
- CadenceBasedOrphanFilesDeletionAnalyzer: drop redundant @Autowired on
  the single constructor.
- CadenceBasedOrphanFilesDeletionAnalyzerTest: rename SUCCESS_INTERVAL/
  FAILURE_INTERVAL to TEST_*; type tableWithProperty as boolean; drop
  redundant tableWithoutProperty (covered by _whenTablePropertiesEmpty).
- TableOperationsHistoryDto: new after() method (replaces ad-hoc
  AnalyzerRunner::moreRecentHistory comparator); trusts service-set
  completedAt invariant.
- application.properties: ofd.success-retry-hours 24 -> 16 so a re-eval
  is guaranteed within any rolling 24h window regardless of when the
  prior run landed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@abhisheknath2011
Copy link
Copy Markdown
Member

Regarding the directory structure, can we make follow two options:

  • Root project can be apps/optimizer/analyzer
  • Root project can be existing one apps/spark and source dir can reflect optimizer.analyzer in the package name.

Address PR #533 review (abhisheknath2011):
> Root project can be apps/optimizer/analyzer

Move:
- apps/optimizer-analyzer/  → apps/optimizer/analyzer/
- package com.linkedin.openhouse.analyzer
  → com.linkedin.openhouse.optimizer.analyzer
- Gradle module :apps:optimizer-analyzer → :apps:optimizer:analyzer

Same shape will land on opt-4 (apps/optimizer-scheduler → apps/optimizer/
scheduler) in a follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mkuchenbecker added a commit that referenced this pull request May 26, 2026
…duler

Brings the analyzer rename from opt-3 (apps/optimizer-analyzer →
apps/optimizer/analyzer, package com.linkedin.openhouse.analyzer →
com.linkedin.openhouse.optimizer.analyzer).

Applies the same shape to the scheduler per PR #533 review:
- apps/optimizer-scheduler/  → apps/optimizer/scheduler/
- package com.linkedin.openhouse.scheduler
  → com.linkedin.openhouse.optimizer.scheduler
- Gradle module :apps:optimizer-scheduler → :apps:optimizer:scheduler

settings.gradle conflict resolved to include both nested modules.
…nalyzer

Per PR #533 review (abhisheknath2011, 2026-05-26):
> As analyzer is going to a service with APIs exposed, can we move the
> code to the services module instead of keeping under apps module?

Move:
- apps/optimizer/analyzer/  →  services/optimizer/analyzer/
- Gradle module :apps:optimizer:analyzer → :services:optimizer:analyzer
- Package unchanged (com.linkedin.openhouse.optimizer.analyzer)

Same shape will land on opt-4 (apps/optimizer/scheduler → services/
optimizer/scheduler).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@abhisheknath2011
Copy link
Copy Markdown
Member

Regarding the directory structure, can we make follow two options:

  • Root project can be apps/optimizer/analyzer
  • Root project can be existing one apps/spark and source dir can reflect optimizer.analyzer in the package name.

Somehow I don't see previous comment to move Analyzer from apps to services module as Analyzer is a service. So added the comment here.

@mkuchenbecker
Copy link
Copy Markdown
Collaborator Author

mkuchenbecker and others added 3 commits May 26, 2026 15:54
…per-database

Per PR #533 review (abhisheknath2011):
> Is this related to max default number of tables to be processed by on
> execution cycle of Analyzer run? Can we update the doc here and reflect
> the same on the property name as well?

Yes — it bounds the per-database working set across the three pre-load
reads. Rename for clarity:

- Property: optimizer.repo.default-limit → analyzer.max-tables-per-database
- Field:    defaultLimit → maxTablesPerDatabase
- Drop the Mockito-flavored implementation note; replace with a javadoc
  describing semantics (per-database cap; tables beyond the bound defer
  to the next cycle).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per PR #533 review (abhisheknath2011):
> Going by the paging default limit of 10k, the DBs with more that 10k
> tables would bot be processed. Can we ensure that all the tables within
> a given DB are process by adjusting the pagination list until we reach
> the end?

Previous code fetched page 0 only — tables past index 10000 were silently
dropped, not "deferred to the next cycle" as the stale doc claimed. Now
analyzeDatabase iterates pages until the page comes back smaller than
the page size (terminates on partial or empty page).

- Wrap the read + per-table loop in a while-pageNumber loop.
- Re-load currentOps and latestHistory per page (per-page bound on the
  in-memory maps; affected tables whose ops fall in a different page get
  treated as "no current op" → may produce a duplicate PENDING that the
  scheduler's cancelDuplicates handles).
- Rename property: analyzer.max-tables-per-database
  → analyzer.tables-page-size  (it caps a page, not the cycle)
- Rename field: maxTablesPerDatabase → tablesPageSize
- New test: analyze_iteratesAllPages_processesEveryTableAcrossPageBoundary
  exercises 3 tables across 2 pages with page size 2.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…se/optimizer/analyzer/AnalyzerRunner.java

Co-authored-by: Abhishek Nath <anath1@linkedin.com>
…ps/optimizer/analyzerapp (#602)

## Status

**(WIP)** — inspection branch on top of `mkuchenb/optimizer-3` (PR
#533). Splits the analyzer into library + deployable.

## Summary

Library/deployable split:

- `services/optimizer/analyzer/` — library (analysis logic;
`AnalyzerRunner`, `OperationAnalyzer`, `CadencePolicy`,
`CadenceBasedOrphanFilesDeletionAnalyzer`)
- `apps/optimizer/analyzerapp/` — deployable Spring Boot wrapper (just
`AnalyzerApplication` + `application.properties`)

The analysis code stays in `services/`; only the
`@SpringBootApplication` entry point moves to `apps/`.

## Why `analyzerapp` and not `analyzer` as the leaf name

Two Gradle leaf projects both named `analyzer` — one at
`:services:optimizer:analyzer`, one at `:apps:optimizer:analyzer` —
produced a self-referential `compileJava → compileJava` cycle.
Disambiguating the apps leaf to `analyzerapp` avoids it. The exact
Gradle mechanism wasn't pinned down (could be the leaf collision, could
be the also-colliding implicit parent at `:apps:optimizer` vs explicit
`:services:optimizer`), but the rename is the smallest change that makes
it build.

## Build changes

`services/optimizer/analyzer/build.gradle`:
- `api project(':services:optimizer')` so `OperationTypeDto`, repos,
etc. are visible on consumers' compile classpath.
- `bootJar { enabled = false }` — no `@SpringBootApplication` here
anymore.
- `jar.archiveClassifier = ''` so the library publishes `analyzer.jar`.

`apps/optimizer/analyzerapp/build.gradle` (new):
- `openhouse.springboot-ext-conventions` + Spring Boot 2.7.8 plugins.
- `implementation project(':services:optimizer:analyzer')` + Spring Boot
starter + JPA + MySQL driver.

`settings.gradle`: `include ':apps:optimizer:analyzerapp'`.

## Testing

- `./gradlew :services:optimizer:analyzer:test` — 20 tests green.
- `./gradlew :apps:optimizer:analyzerapp:bootJar` — produces
`build/analyzerapp/libs/analyzerapp.jar`.

## What does NOT change

- Package: every class keeps
`com.linkedin.openhouse.optimizer.analyzer`. `AnalyzerApplication`'s
`@SpringBootApplication` component-scan picks up the library classes
because they sit in the same package, just in a different module on the
classpath.
- The scheduler (opt-4) is unchanged in this PR.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: mkuchenbecker <mkuchenbecker@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
@mkuchenbecker mkuchenbecker merged commit 38e5467 into main May 27, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants