
SPIKE: Caterpillar set-up for claude-agent#52

Draft
Mahesh Kamble (ma-gk) wants to merge 1 commit into main from claude-set-up

Conversation

@ma-gk
Contributor

SPIKE: Claude Agent Set-up for Caterpillar

Summary

Adds a comprehensive Claude Code agent configuration to the Caterpillar data pipeline tool. This introduces AI-assisted pipeline authoring, validation, debugging, and optimization through a structured set of agents, skills, rules, hooks, and commands — enabling developers to build, review, and ship pipelines faster with Claude as a copilot.

What's Included

47 new files across the .claude/ directory and a top-level CLAUDE.md project guide.

CLAUDE.md — Project Context File

  • Provides Claude with an overview of Caterpillar's pipeline structure, all 18 task types, available agents, and example pipeline locations so it can reason about pipelines correctly.

.claude/agents/ — 9 Specialized Sub-Agents

| Agent | Purpose |
| --- | --- |
| pipeline-builder-interactive | Conversational pipeline builder: asks targeted questions, then writes YAML |
| pipeline-lint | Structural validation: types, required fields, credential security |
| pipeline-validate | Semantic validation: context keys, JQ expressions, data flow |
| pipeline-permissions | AWS IAM policy generation and region checks |
| pipeline-optimizer | Concurrency, batching, error handling, production-readiness review |
| pipeline-review | Orchestrates lint + validate + permissions + optimize in one pass |
| pipeline-debugger | Error diagnosis, echo probe insertion, fix suggestions |
| pipeline-runner | Builds the binary, executes the pipeline, and interprets output |
| source-schema-detector | Detects schema from live sources (HTTP, S3, SQS, Kafka, file) |

.claude/skills/ — 22 Task-Specific Skills

One skill per Caterpillar task type (file, kafka, sqs, http, jq, split, join, replace, flatten, xpath, converter, compress, archive, sample, delay, echo, sns, http-server, aws-parameter-store, heimdall) plus two meta-skills:

  • pipeline-builder — schema reference for direct YAML generation
  • pipeline-tester — generates step-by-step test plans with probe pipelines
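The per-task skill split above implies that pipelines are authored as YAML task lists. As a purely illustrative sketch of the shape such a pipeline might take (the field names `name`, `tasks`, `type`, and `config` are assumptions for illustration, not taken from the actual Caterpillar schema):

```yaml
# Hypothetical sketch only: task and field names below are illustrative
# assumptions, not the actual Caterpillar pipeline schema.
name: copy-and-transform
tasks:
  - name: read-input        # source task: read records from a local file
    type: file
    config:
      path: ./input.json
  - name: uppercase-ids     # transform task: apply a JQ expression per record
    type: jq
    config:
      expression: '.id |= ascii_upcase'
  - name: write-output      # sink task: write transformed records back out
    type: file
    config:
      path: ./output.json
```

In this layout, each `type` value corresponds to one skill file under `.claude/skills/`, so the agent only needs to load the `file` and `jq` skills to reason about this pipeline.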

.claude/rules/ — 3 Authoring Rules

  • pipeline-authoring.md — conventions for task naming, ordering, and structure
  • pipeline-security.md — credential handling, secret management guardrails
  • pipeline-testing.md — testing standards and probe pipeline patterns

.claude/commands/ — 7 Diagnostic Commands

Quick-check commands for infrastructure connectivity:
check-aws, check-http, check-kafka, check-s3, check-sns, check-sqs, check-ssm
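Claude Code slash commands are plain Markdown prompt files under `.claude/commands/`, so one of these checks might look roughly like the following (the prompt wording is an illustrative assumption, not the actual file contents; `$ARGUMENTS` is Claude Code's placeholder for the text typed after the command):

```markdown
<!-- .claude/commands/check-sqs.md (illustrative sketch) -->
Check SQS connectivity for the queue named in $ARGUMENTS:

1. Run `aws sqs get-queue-url --queue-name $ARGUMENTS` and report whether it succeeds.
2. If the queue exists, fetch its attributes with `aws sqs get-queue-attributes`.
3. Summarize connectivity status and any permission or credential errors encountered.
```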

.claude/hooks/ — 3 Lifecycle Hooks

  • preflight-check.sh — pre-tool-use validation before shell commands
  • validate-on-save.sh — auto-validates pipeline YAML on file write/edit
  • run-summary.sh — post-execution summary after shell commands
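These scripts would be wired up through the hooks section of `.claude/settings.json`. A minimal sketch using Claude Code's hook event names (the matchers shown are assumptions about this PR's configuration, not confirmed by it):

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/preflight-check.sh" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/validate-on-save.sh" }
        ]
      },
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/run-summary.sh" }
        ]
      }
    ]
  }
}
```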

.claude/scripts/ & .claude/settings.json

  • run-pipeline.sh — helper script to build and run pipelines
  • settings.json — permission allow/deny lists and hook configuration
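The permission lists use Claude Code's allow/deny rule syntax. A hedged sketch of what the deny list described in the test plan might look like (the specific allow entries are assumptions; the deny entries echo the examples given below under Test Plan):

```json
{
  "permissions": {
    "allow": [
      "Bash(go build:*)",
      "Bash(go test:*)"
    ],
    "deny": [
      "Bash(git push --force:*)",
      "Bash(aws s3 rm:*)"
    ]
  }
}
```

Deny rules take precedence over allow rules, so destructive commands stay blocked even when a broader allow pattern would otherwise match.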

Motivation

Caterpillar pipelines have a rich task ecosystem (18 types) with complex configuration options (auth, context keys, JQ transforms, AWS integrations). This set-up gives Claude the domain knowledge and guardrails to:

  1. Author pipelines interactively with schema-aware guidance
  2. Validate structure, semantics, and data flow before execution
  3. Debug failures with echo probes and targeted diagnosis
  4. Audit AWS IAM permissions and generate minimal policies
  5. Optimize for production readiness (concurrency, batching, error handling)
  6. Test pipelines with isolated probe pipelines and step-by-step plans

Test Plan

  • Verify CLAUDE.md is picked up as project context in Claude Code sessions
  • Test interactive pipeline builder agent creates valid YAML for common patterns (file-to-file, HTTP-to-SQS, Kafka-to-S3)
  • Confirm pipeline-review agent runs the full lint → validate → permissions → optimize sequence
  • Validate hooks fire correctly: validate-on-save triggers on .yaml writes, preflight-check runs before shell commands
  • Confirm permission deny list blocks destructive operations (git push --force, aws s3 rm, etc.)
  • Test diagnostic commands (check-aws, check-sqs, etc.) report connectivity status accurately
  • Run pipeline-runner agent against example pipelines in test/pipelines/examples/

@prasadlohakpure
Contributor

Nice start Mahesh!
Do you think it would be possible to have a single skill file for all task types?
Otherwise, if we introduce any change tomorrow, we will need to update all of the skills. For example:

  1. If we introduce metadata for records, all of the skills need to be updated.
  2. If we change the Record's context type to string : map{string:string}, the change will need to propagate to all skills.

Let me know what you think?

@ma-gk
Contributor Author

Mahesh Kamble (ma-gk) commented Mar 31, 2026

> Nice start Mahesh! Do you think it would be possible to have a single skill file for all task types? Otherwise, if we introduce any change tomorrow, we will need to update all of the skills. For example:
>
> 1. If we introduce metadata for records, all of the skills need to be updated.
> 2. If we change the Record's context type to string : map{string:string}, the change will need to propagate to all skills.
>
> Let me know what you think?

Thanks @prasadlohakpure! I split the skills by task type so the agent uses only the relevant instructions, which should improve accuracy, reduce hallucination, and save context.

It also makes task-specific updates easier and more scalable. For common changes that need to be applied everywhere, we can add one parent/base skill and propagate those updates through all task-specific skills.

So a hybrid approach may work best: shared parent skill for common logic, and separate skills for each task.

What do you think about the parent skill approach for handling future common changes across all skills?

