By Johnny Chan · UI/UX Designer, Hong Kong
How to Apply Harness Engineering: Rules, Skills, and Evals
A week-one harness checklist for Cursor-style agents: project contract, rules, skills, hooks, planning habits, and lightweight evals your team can actually maintain.

Concepts do not ship interfaces; a maintained harness does. This is the operational follow-up to What Is Harness Engineering: how to build layers your team will still trust in six weeks, not just after a demo. The patterns apply to Cursor-class IDEs and compatible setups (Claude Code, Codex, Kiro, or synced .cursor/.agents trees). I use the same stack when Hong Kong startups ask me to tighten design–dev handoff with agents sitting between designers and developers.
Start with a project contract
Put AGENTS.md at the repo root and treat it like a README for machines. Cap it at one screen: how to build, how to test, where routes and components live, and one or two canonical examples, not a Figma dump. Use real paths (src/components/ui/button.tsx) so the agent can open the file directly and a stale reference is easy to spot after a refactor. At minimum, cover:
- Exact commands for build, typecheck, and test—and when the agent must run them.
- Token file and component entry points the team actually imports from.
- Non-negotiable UX states on new flows: empty, loading, error, success.
- Hard stops: no force-push, no --no-verify, no credentials in diffs.
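To keep the contract honest over time, the items above can be checked mechanically. The following is a minimal sketch, not a standard tool: the function name checkContract, the keyword patterns, and the 60-line reading of "one screen" are all assumptions for illustration.

```typescript
// Hypothetical lint for an AGENTS.md project contract.
// Topic patterns and the 60-line "one screen" cap are assumptions.

interface ContractReport {
  lineCount: number;
  tooLong: boolean;
  missingTopics: string[];
}

const REQUIRED_TOPICS: { topic: string; pattern: RegExp }[] = [
  { topic: "build", pattern: /\bbuild\b/i },
  { topic: "typecheck", pattern: /\btypecheck\b/i },
  { topic: "test", pattern: /\btests?\b/i },
  { topic: "tokens", pattern: /\btokens?\b/i },
  // All four UX states must be named, in this order.
  {
    topic: "ux states",
    pattern: /\bempty\b[\s\S]*\bloading\b[\s\S]*\berror\b[\s\S]*\bsuccess\b/i,
  },
  { topic: "hard stops", pattern: /force-push|--no-verify|credentials/i },
];

function checkContract(text: string, maxLines = 60): ContractReport {
  const lineCount = text.split("\n").length;
  const missingTopics = REQUIRED_TOPICS
    .filter((entry) => !entry.pattern.test(text))
    .map((entry) => entry.topic);
  return { lineCount, tooLong: lineCount > maxLines, missingTopics };
}

// Sample contract text; the script names are placeholders.
const sample = [
  "Build: npm run build",
  "Typecheck: npm run typecheck",
  "Test: npm test (run before every commit)",
  "Components: src/components/ui/button.tsx",
  "States on new flows: empty, loading, error, success",
].join("\n");

// The sample omits the token file and the hard stops,
// so missingTopics is ["tokens", "hard stops"].
console.log(checkContract(sample));
```

A keyword check like this only proves a topic is mentioned, not that the command still works, so it complements a human read of the file and does not replace one.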
Layer 1: Rules (always-on context)
Files in .cursor/rules/ that are set to always apply load every session. Add a rule only after the same mistake happens twice; bloated rules get ignored. Good rules read like runbooks: import aliases, the spacing scale, a pointer to src/components/ui/button.tsx. Bad rules read like brand PDFs the model cannot act on.
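The "same mistake twice" threshold is easy to apply if the team keeps a small log of agent mistakes. This is a rough sketch under that assumption; the Mistake shape, the tag names, and the ruleCandidates helper are hypothetical, not part of any IDE.

```typescript
// Hypothetical mistake log, tagged by hand during PR review.
interface Mistake {
  tag: string; // short label for the mistake
  seenIn: string; // where it was noticed
}

// Return tags that occurred at least `threshold` times (default: twice).
function ruleCandidates(log: Mistake[], threshold = 2): string[] {
  const counts: Record<string, number> = {};
  for (const m of log) {
    counts[m.tag] = (counts[m.tag] || 0) + 1;
  }
  return Object.keys(counts).filter((tag) => (counts[tag] || 0) >= threshold);
}

// Illustrative entries only.
const log: Mistake[] = [
  { tag: "wrong-import-alias", seenIn: "settings form PR" },
  { tag: "rogue-hex", seenIn: "pricing card PR" },
  { tag: "wrong-import-alias", seenIn: "nav refactor PR" },
];

// Only "wrong-import-alias" has happened twice, so only it earns a rule.
console.log(ruleCandidates(log));
```

Keeping the log outside the rules folder means a one-off slip never costs context in every session.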
Layer 2: Skills (on demand)
Skills (SKILL.md, agentskills.io-compatible) load when the task matches—deploy checklist, SEO pass, research synthesis—so you are not pasting playbooks into every chat. Turn recurring team rituals into skills: open a PR, run design QA on a branch, refresh the sitemap. Designers can own copy-review or research skills without touching application code.
Layer 3: Hooks and guardrails
Hooks run scripts around agent actions: block commits until tests pass, warn on edits under /api/payments, format on save. Community templates typically combine a JSON map of path-based risk levels with memory files and CI gates. Calibrate strictness to blast radius: tight where money and auth live, lighter on marketing pages.
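The decision logic inside such a hook can stay small. The sketch below assumes a hand-written risk map and is not any IDE's hook API: the names RISK_RULES, classifyPath, and decideCommit are hypothetical, as are the api/auth/ and marketing/ prefixes. Wiring it to a real hook event is left to your tool's own configuration.

```typescript
type Risk = "high" | "medium" | "low";

interface RiskRule {
  prefix: string; // repo-relative path prefix
  risk: Risk;
}

// Hypothetical risk map: tight where money and auth live, lighter on marketing.
const RISK_RULES: RiskRule[] = [
  { prefix: "api/payments/", risk: "high" },
  { prefix: "api/auth/", risk: "high" },
  { prefix: "marketing/", risk: "low" },
];

function classifyPath(path: string, rules: RiskRule[] = RISK_RULES): Risk {
  // Strip a leading "./" or "/" so "/api/payments/x.ts" matches "api/payments/".
  const normalized = path.replace(/^\.?\/+/, "");
  for (const rule of rules) {
    if (normalized.indexOf(rule.prefix) === 0) return rule.risk;
  }
  return "medium"; // assumption: unlisted paths get a middle setting
}

interface CommitDecision {
  allowed: boolean;
  warnings: string[];
}

// Block the commit until tests pass; warn on every high-risk path touched.
function decideCommit(paths: string[], testsPassed: boolean): CommitDecision {
  const warnings = paths
    .filter((p) => classifyPath(p) === "high")
    .map((p) => `high-risk edit: ${p}`);
  return { allowed: testsPassed, warnings };
}

console.log(decideCommit(["/api/payments/charge.ts", "marketing/hero.tsx"], true));
```

Keeping the risk map as plain data makes it reviewable by designers and engineers alike, and the same map can feed a CI gate.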
Layer 4: Tools and MCP
Tools are the agent’s hands: patch edits, ripgrep, terminal, browser, MCP bridges to Figma or Sentry. Match edit format to the model (unified diff vs search-replace). Each MCP server costs context and can fail silently—add integrations that shorten handoff, not ones you will forget to maintain.
Planning and context hygiene
- Use plan mode before multi-file work; store plans under .cursor/plans/ for the team.
- One task per conversation; @-reference prior threads instead of dumping history.
- Attach files you are sure about; let codebase search find the rest.
- Define done with checks: tests green, typecheck clean, named states in the UI.
Evals: know the harness is working
Vendors validate harness changes against benchmark suites before shipping them. You can run a micro-version: five fixed tasks on your repo (for example, add a validated field, fix a flaky test, implement a card from Figma), graded pass/fail after each rules edit. When something breaks, tag the failure (wrong file, skipped test, rogue hex) and fix the layer that failed rather than patching a one-off prompt.
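A scorecard for this micro-eval fits in a few lines. The sketch below is illustrative: the tag-to-layer mapping is my assumption about where each failure usually originates, the summarize helper is hypothetical, and the sample results are made up to show the shape of the data (only three of the five tasks are listed).

```typescript
type Layer = "contract" | "rules" | "skills" | "hooks" | "tools";

interface EvalResult {
  task: string;
  passed: boolean;
  failureTag?: string; // set by the reviewer when a task fails
}

// Assumed mapping from failure tag to the layer most likely at fault.
const TAG_TO_LAYER: Record<string, Layer> = {
  "wrong file": "contract",
  "skipped test": "hooks",
  "rogue hex": "rules",
};

interface EvalSummary {
  passRate: number;
  layersToFix: Layer[];
}

function summarize(results: EvalResult[]): EvalSummary {
  const passed = results.filter((r) => r.passed).length;
  const layersToFix: Layer[] = [];
  for (const r of results) {
    if (r.passed || !r.failureTag) continue;
    const layer = TAG_TO_LAYER[r.failureTag];
    // Skip unknown tags and avoid listing a layer twice.
    if (layer && layersToFix.indexOf(layer) === -1) layersToFix.push(layer);
  }
  const passRate = results.length ? passed / results.length : 0;
  return { passRate, layersToFix };
}

// Made-up results from one run, for illustration only.
const run: EvalResult[] = [
  { task: "add a validated field", passed: true },
  { task: "fix a flaky test", passed: false, failureTag: "skipped test" },
  { task: "implement a card from Figma", passed: false, failureTag: "rogue hex" },
];

console.log(summarize(run));
```

Comparing the summary before and after each rules edit shows whether the change helped, and the layersToFix list keeps the fix pointed at the harness instead of at the prompt.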
Designer–engineer collaboration checklist
- Figma owns visuals; rules point at token files, not PNGs alone.
- Design QA on agent PRs focuses on breakpoints and edge states, not only the happy path.
- Shared slash commands or skills for review so PRs look like your team wrote them.
- Monthly harness hygiene: delete stale rules; promote repeat fixes into skills.
Treat the harness like a design system for agents: versioned, intentional, and owned.
Sources
Cursor’s agent best practices (instructions, skills, hooks, plan mode), its harness improvement write-up, and community rule compilers that sync across IDEs are the right starting points. Run this checklist against your repo: what works on a Next.js marketing site may need edits for a native app.