TECHNICAL INTELLIGENCE BRIEF

LLM Agents • Coding Agents • Harness/Eval • AI SDLC
2026-05-27 23:40 ICT
QUALITY_GATE_PARTIAL
149 candidates

1. Executive Snapshot

149 candidates scanned
0 X (social fallback)
64 GitHub repo signals
20 YT video signals
4/7 Fabbi domains high-impact

2. KOL/OG Feed Watch

Platform | Author | Time (UTC) | Engagement | Title/URL | Why CTO cares
dev_web | homescout | 2026-05-27T15:43:24Z | 2 pts / 0 comments | I built an agentic coding harness across three CLI hosts | HN dev discourse proxy
dev_web | dash0r | 2026-05-27T14:51:35Z | 1 pts / 0 comments | Peers – Multi-agent AI coding with measurable convergence | HN dev discourse proxy
dev_web | Tval | 2026-05-27T14:16:48Z | 1 pts / 0 comments | Show HN: Mneme HQ – repo-native architectural rules for AI coding agents | HN dev discourse proxy
dev_web | aming557 | 2026-05-27T13:59:01Z | 1 pts / 0 comments | Aming Claw – Zero-orchestration multi-agent coding | HN dev discourse proxy
dev_web | D3F | 2026-05-27T12:06:19Z | 5 pts / 0 comments | Show HN: Unspaghettit – executable behavior specs for AI coding agents | HN dev discourse proxy
dev_web | vbutsomesayw | 2026-05-27T04:01:44Z | 3 pts / 0 comments | Bill Gates AI on AI (one month later) | HN dev discourse proxy
dev_web | armcat | 2026-05-24T19:37:43Z | 3 pts / 0 comments | Show HN: Simple Sprite Sheet Generation | HN dev discourse proxy
dev_web | jeroen_stulen | 2026-05-24T10:07:13Z | 3 pts / 4 comments | Show HN: My first app, artisanally vibe-coded in 4 months | HN dev discourse proxy
dev_web | xendo | 2026-05-23T11:13:35Z | 3 pts / 0 comments | Zero – Programming Language for Agents | HN dev discourse proxy
dev_web | goodroot | 2026-05-21T14:59:15Z | 2 pts / 0 comments | Show HN: opub, donated compute for open-source | HN dev discourse proxy
dev_web | ramayac | 2026-05-20T04:31:50Z | 2 pts / 0 comments | Show HN: GoPOSIX – a Go-native POSIX userland, ~97% BusyBox-compatible | HN dev discourse proxy
dev_web | gruyaume | 2026-05-12T14:37:45Z | 1 pts / 0 comments | Implicit Knowledge Is a Liability | HN dev discourse proxy
dev_web | straydusk | 2026-05-08T22:57:31Z | 1 pts / 1 comments | Ask HN: Is agent-driven QA a thing? | HN dev discourse proxy
dev_web | jdw64 | 2026-04-19T08:42:37Z | 10 pts / 5 comments | Ask HN: May be a basic question, but how can I use AI well? | HN dev discourse proxy
dev_web | alexblackwell_ | 2026-04-16T15:19:54Z | 100 pts / 83 comments | Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs | HN dev discourse proxy
dev_web | nicola_alessi | 2026-04-16T20:19:18Z | 1 pts / 0 comments | Ask HN: Opus 4.7 – is anyone measuring the real token cost on agentic tasks? | HN dev discourse proxy
dev_web | raghavchamadiya | 2026-04-06T20:15:26Z | 1 pts / 0 comments | Show HN: Repowise – Codebase intelligence for AI coding agents (open source) | HN dev discourse proxy
dev_web | alfredhua | 2026-02-28T15:32:32Z | 1 pts / 1 comments | Show HN: Salacia – The First Runtime OS for Agentic Coding | HN dev discourse proxy
dev_web | extra_cookin | 2026-02-26T22:07:31Z | 1 pts / 0 comments | Show HN: Tracecore: Benchmark AI Agents on Deterministic Coding Tasks | HN dev discourse proxy
dev_web | jyoung105 | 2026-02-25T10:03:54Z | 1 pts / 0 comments | Show HN: Frouter – Live-ping and auto-configure free AI models for coding agents | HN dev discourse proxy
dev_web | gk1 | 2026-04-29T18:16:23Z | 4 pts / 0 comments | ForgeCode: Top open source coding agent in Terminal-Bench 2.0 | HN dev discourse proxy
dev_web | GodelNumbering | 2026-04-27T12:35:55Z | 393 pts / 148 comments | Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview | HN dev discourse proxy
dev_web | _nhynes | 2026-04-13T07:48:11Z | 1 pts / 0 comments | Show HN: Amber, a capability-based runtime/compiler for agent benchmarks | HN dev discourse proxy
dev_web | joozio | 2026-04-01T12:59:36Z | 4 pts / 2 comments | Claude Code ranks 39th on terminal bench. The leaked source shows why | HN dev discourse proxy
dev_web | bcollins34 | 2026-03-31T19:07:11Z | 4 pts / 2 comments | Show HN: Wozcode – double Claude Code output | HN dev discourse proxy
dev_web | suis_siva | 2026-05-27T16:41:36Z | 1 pts / 0 comments | Show HN: Hm – a task runner with a Python DSL, growing into a CI/CD system | HN dev discourse proxy
dev_web | tweezers0x | 2026-05-27T16:22:41Z | 4 pts / 0 comments | Show HN: Workplane – collaborative filesystem for humans and AI | HN dev discourse proxy
dev_web | galaxyLogic | 2026-05-27T15:53:55Z | 3 pts / 1 comments | Codex has dethroned Claude as the king of AI programming | HN dev discourse proxy
dev_web | pixelmash13 | 2026-05-27T15:14:11Z | 1 pts / 0 comments | Show HN: GridPath – Faster and Better Agent for Spreadsheets (Tauri, Rust) | HN dev discourse proxy
dev_web | algera | 2026-05-27T14:59:07Z | 3 pts / 2 comments | Show HN: Zorilla – vibe-code a 3D game in the browser | HN dev discourse proxy
github | openai | 2026-05-27T16:45:19Z | 86279 stars / 12616 forks / 5237 issues | openai/codex | Repo/adoption/build signal
github | pproenca | 2026-05-27T16:42:42Z | 93 stars / 11 forks / 6 issues | pproenca/agent-tui | Repo/adoption/build signal
github | rocketride-org | 2026-05-27T16:45:58Z | 3431 stars / 1062 forks / 130 issues | rocketride-org/rocketride-server | Repo/adoption/build signal
github | lightdash | 2026-05-27T16:41:58Z | 5854 stars / 725 forks / 1687 issues | lightdash/lightdash | Repo/adoption/build signal
github | can1357 | 2026-05-27T16:44:16Z | 7791 stars / 629 forks / 217 issues | can1357/oh-my-pi | Repo/adoption/build signal

3. Trend Radar

Hot now: harness/eval, repo context, CLI agents — ≥126/171 relevant signals.
Emerging: multi-agent convergence, executable specs, architectural rules — 5 HN items today.
Noise: “vibe coding” demos without eval — watch only.
Watchlist: sandbox/security, token-cost metering, enterprise audit logs.

4. CTO Evaluation Matrix

Signal | Thesis | Evidence | Counter-signal | Fabbi implication | Confidence | Decision | Next validation
Harness-first coding agents | Internal evals determine ROI more than model hype | 149 scanned; Terminal-Bench/HN/GitHub links | Public benchmarks easily drift from the target domain | NEXA+SYNCA test harness | 82% | trial | 20 tasks / 2 weeks
Context/codebase layer | Repo memory reduces hallucination | Serena/ctx/Repowise/Mneme signals | Deltas N/A because a snapshot is missing | FARE context pack | 78% | adopt | 3-repo pilot
Vendor-neutral CLI adapter | Claude/Codex/Cursor/Copilot change quickly | 6 product baselines + 64 repos | APIs/ToS differ across vendors | AIOS/NEXA abstraction | 75% | trial | 4-adapter smoke test
Governed HITL workflow | Enterprises need logs + approvals | GitHub Copilot docs + social skepticism | Slower dev flow | SYNCA risk gates | 72% | adopt | policy PR template
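
The vendor-neutral adapter row above can be pictured as a thin interface that every vendor-specific wrapper implements, plus a smoke test that runs one prompt through each. The sketch below is only an illustration under assumptions: CodingAgentAdapter, AgentResult, EchoAdapter, and smoke_test are hypothetical names, and no real vendor CLI is invoked.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class AgentResult:
    """Outcome of one agent run; fields are illustrative, not a vendor schema."""
    passed: bool
    cost_usd: float


class CodingAgentAdapter(Protocol):
    """Hypothetical common interface each vendor-specific adapter would implement."""
    name: str

    def run_task(self, prompt: str) -> AgentResult: ...


class EchoAdapter:
    """Stand-in adapter for this sketch; a real one would wrap a vendor CLI."""

    def __init__(self, name: str, always_pass: bool) -> None:
        self.name = name
        self._always_pass = always_pass

    def run_task(self, prompt: str) -> AgentResult:
        # No agent is called here: the result is canned for demonstration.
        return AgentResult(passed=self._always_pass, cost_usd=0.0)


def smoke_test(adapters: list[CodingAgentAdapter], prompt: str) -> dict[str, bool]:
    """Run one trivial prompt through every adapter and record pass/fail by name."""
    return {adapter.name: adapter.run_task(prompt).passed for adapter in adapters}


results = smoke_test(
    [EchoAdapter("agent-a", True), EchoAdapter("agent-b", False)],
    "print hello",
)
print(results)
```

Keeping the harness dependent only on the interface means a vendor change would touch one adapter class rather than the evaluation code.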

5. Repo Watch

Repo | Metric | Signal
openai/codex | 86279 stars / 12616 forks / 5237 issues | Repo/adoption/build signal
pproenca/agent-tui | 93 stars / 11 forks / 6 issues | Repo/adoption/build signal
rocketride-org/rocketride-server | 3431 stars / 1062 forks / 130 issues | Repo/adoption/build signal
lightdash/lightdash | 5854 stars / 725 forks / 1687 issues | Repo/adoption/build signal
can1357/oh-my-pi | 7791 stars / 629 forks / 217 issues | Repo/adoption/build signal
stevesolun/ctx | 371 stars / 47 forks / 1 issues | Repo/adoption/build signal
microsoft/skills | 2400 stars / 268 forks / 49 issues | Repo/adoption/build signal
gastownhall/gascity | 840 stars / 272 forks / 422 issues | Repo/adoption/build signal
l3gi0nXXXX/Metis-agent | 130 stars / 13 forks / 0 issues | Repo/adoption/build signal

6. Impact Coverage

Domain | Now 0-2w | Next 1-2m | Later 3-6m | Move
FARE | Repo context pack | Architectural rules | Memory eval | adopt
NEXA | CLI harness | Multi-agent runner | Vendor-neutral agent platform | trial
SYNCA | Quality gates | Risk scoring | Audit evidence lake | adopt
DOMUS | Monitor | Workflow automation | Domain agents | monitor
Japan/VN/Global | Enterprise coding-agent PoC | Compliance-led offer | Managed AI SDLC package | trial

7. CTO Recommendations

1. Build NEXA eval harness
ROI/time-saving 18-25%; risk 2/5; owner Head of AI Eng; TTV 10 days; validate: 20 terminal tasks, pass@1/cost.
2. FARE context pack
ROI/time-saving 12-20%; risk 2/5; owner Platform Lead; TTV 7 days; validate: 3 repos, bugfix accuracy.
3. Vendor-neutral CLI adapter
ROI/time-saving 10-15%; risk 3/5; owner DevEx Lead; TTV 14 days; validate: Claude/Codex/Cursor/OpenCode smoke test.
4. SYNCA governance gate
ROI/time-saving 8-12%; risk 2/5; owner QA/Security Lead; TTV 10 days; validate: PR policy + audit logs.
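
The "pass@1/cost" validation in recommendation 1 could be computed roughly as follows. This is a minimal sketch that assumes the phrase means two separate numbers: the share of tasks solved on the first attempt, and the total spend divided by the number of solved tasks. TaskRun and both function names are hypothetical, and the sample runs are made up for illustration rather than measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRun:
    """One first-attempt run of an agent on a task (illustrative fields)."""
    task_id: str
    passed_first_try: bool
    cost_usd: float


def pass_at_1(runs: list[TaskRun]) -> float:
    """Fraction of tasks solved on the first attempt; 0.0 for an empty list."""
    if not runs:
        return 0.0
    return sum(run.passed_first_try for run in runs) / len(runs)


def cost_per_solved_task(runs: list[TaskRun]) -> Optional[float]:
    """Total spend divided by solved tasks; None when nothing was solved."""
    solved = sum(run.passed_first_try for run in runs)
    if solved == 0:
        return None
    return sum(run.cost_usd for run in runs) / solved


# Made-up sample data, not results from any real evaluation.
sample_runs = [
    TaskRun("t1", True, 0.25),
    TaskRun("t2", True, 0.5),
    TaskRun("t3", False, 0.25),
    TaskRun("t4", True, 0.5),
]
print(pass_at_1(sample_runs), cost_per_solved_task(sample_runs))
```

Charging failed runs to the solved tasks keeps the cost figure honest: an agent that burns tokens on failures looks more expensive per useful result.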

8. Must-read Sources / Source Appendix

  1. S01 [dev_web] I built an agentic coding harness across three CLI hosts — 2 pts / 0 comments; homescout; HN dev discourse proxy
  2. S02 [dev_web] Peers – Multi-agent AI coding with measurable convergence — 1 pts / 0 comments; dash0r; HN dev discourse proxy
  3. S03 [dev_web] Show HN: Mneme HQ – repo-native architectural rules for AI coding agents — 1 pts / 0 comments; Tval; HN dev discourse proxy
  4. S04 [dev_web] Aming Claw – Zero-orchestration multi-agent coding — 1 pts / 0 comments; aming557; HN dev discourse proxy
  5. S05 [dev_web] Show HN: Unspaghettit – executable behavior specs for AI coding agents — 5 pts / 0 comments; D3F; HN dev discourse proxy
  6. S06 [dev_web] Bill Gates AI on AI (one month later) — 3 pts / 0 comments; vbutsomesayw; HN dev discourse proxy
  7. S07 [dev_web] Show HN: Simple Sprite Sheet Generation — 3 pts / 0 comments; armcat; HN dev discourse proxy
  8. S08 [dev_web] Show HN: My first app, artisanally vibe-coded in 4 months — 3 pts / 4 comments; jeroen_stulen; HN dev discourse proxy
  9. S09 [dev_web] Zero – Programming Language for Agents — 3 pts / 0 comments; xendo; HN dev discourse proxy
  10. S10 [dev_web] Show HN: opub, donated compute for open-source — 2 pts / 0 comments; goodroot; HN dev discourse proxy
  11. S11 [dev_web] Show HN: GoPOSIX – a Go-native POSIX userland, ~97% BusyBox-compatible — 2 pts / 0 comments; ramayac; HN dev discourse proxy
  12. S12 [dev_web] Implicit Knowledge Is a Liability — 1 pts / 0 comments; gruyaume; HN dev discourse proxy
  13. S13 [dev_web] Ask HN: Is agent-driven QA a thing? — 1 pts / 1 comments; straydusk; HN dev discourse proxy
  14. S14 [dev_web] Ask HN: May be a basic question, but how can I use AI well? — 10 pts / 5 comments; jdw64; HN dev discourse proxy
  15. S15 [dev_web] Launch HN: Kampala (YC W26) – Reverse-Engineer Apps into APIs — 100 pts / 83 comments; alexblackwell_; HN dev discourse proxy
  16. S16 [dev_web] Ask HN: Opus 4.7 – is anyone measuring the real token cost on agentic tasks? — 1 pts / 0 comments; nicola_alessi; HN dev discourse proxy
  17. S17 [dev_web] Show HN: Repowise – Codebase intelligence for AI coding agents (open source) — 1 pts / 0 comments; raghavchamadiya; HN dev discourse proxy
  18. S18 [dev_web] Show HN: Salacia – The First Runtime OS for Agentic Coding — 1 pts / 1 comments; alfredhua; HN dev discourse proxy
  19. S19 [dev_web] Show HN: Tracecore: Benchmark AI Agents on Deterministic Coding Tasks — 1 pts / 0 comments; extra_cookin; HN dev discourse proxy
  20. S20 [dev_web] Show HN: Frouter – Live-ping and auto-configure free AI models for coding agents — 1 pts / 0 comments; jyoung105; HN dev discourse proxy
  21. S21 [dev_web] ForgeCode: Top open source coding agent in Terminal-Bench 2.0 — 4 pts / 0 comments; gk1; HN dev discourse proxy
  22. S22 [dev_web] Show HN: OSS Agent I built topped the TerminalBench on Gemini-3-flash-preview — 393 pts / 148 comments; GodelNumbering; HN dev discourse proxy
  23. S23 [dev_web] Show HN: Amber, a capability-based runtime/compiler for agent benchmarks — 1 pts / 0 comments; _nhynes; HN dev discourse proxy
  24. S24 [dev_web] Claude Code ranks 39th on terminal bench. The leaked source shows why — 4 pts / 2 comments; joozio; HN dev discourse proxy
  25. S25 [dev_web] Show HN: Wozcode – double Claude Code output — 4 pts / 2 comments; bcollins34; HN dev discourse proxy
  26. S26 [dev_web] Show HN: Hm – a task runner with a Python DSL, growing into a CI/CD system — 1 pts / 0 comments; suis_siva; HN dev discourse proxy
  27. S27 [dev_web] Show HN: Workplane – collaborative filesystem for humans and AI — 4 pts / 0 comments; tweezers0x; HN dev discourse proxy
  28. S28 [dev_web] Codex has dethroned Claude as the king of AI programming — 3 pts / 1 comments; galaxyLogic; HN dev discourse proxy
  29. S29 [dev_web] Show HN: GridPath – Faster and Better Agent for Spreadsheets (Tauri, Rust) — 1 pts / 0 comments; pixelmash13; HN dev discourse proxy
  30. S30 [dev_web] Show HN: Zorilla – vibe-code a 3D game in the browser — 3 pts / 2 comments; algera; HN dev discourse proxy
  31. S31 [github] openai/codex — 86279 stars / 12616 forks / 5237 issues; openai; Repo/adoption/build signal
  32. S32 [github] pproenca/agent-tui — 93 stars / 11 forks / 6 issues; pproenca; Repo/adoption/build signal
  33. S33 [github] rocketride-org/rocketride-server — 3431 stars / 1062 forks / 130 issues; rocketride-org; Repo/adoption/build signal
  34. S34 [github] lightdash/lightdash — 5854 stars / 725 forks / 1687 issues; lightdash; Repo/adoption/build signal
  35. S35 [github] can1357/oh-my-pi — 7791 stars / 629 forks / 217 issues; can1357; Repo/adoption/build signal
  36. S36 [github] stevesolun/ctx — 371 stars / 47 forks / 1 issues; stevesolun; Repo/adoption/build signal
  37. S37 [github] microsoft/skills — 2400 stars / 268 forks / 49 issues; microsoft; Repo/adoption/build signal
  38. S38 [github] gastownhall/gascity — 840 stars / 272 forks / 422 issues; gastownhall; Repo/adoption/build signal
  39. S39 [github] l3gi0nXXXX/Metis-agent — 130 stars / 13 forks / 0 issues; l3gi0nXXXX; Repo/adoption/build signal
  40. S40 [product] Anthropic Claude Code — N/A docs; Anthropic; Product baseline
  41. S41 [product] OpenAI Codex CLI — N/A live via GitHub page; OpenAI; Product/repo baseline
  42. S42 [benchmark] SWE-bench — N/A benchmark site; SWE-bench; Benchmark baseline
  43. S43 [benchmark] Terminal-Bench — N/A benchmark site; Stanford/Terminal-Bench; Benchmark baseline
  44. S44 [product] Cursor Agents — N/A docs; Cursor; IDE agent baseline
  45. S45 [product] GitHub Copilot coding agent — N/A docs; GitHub; Enterprise coding-agent baseline

9. Data Quality / Scan Health

Status: QUALITY_GATE_PARTIAL. Counts: {'dev_web': 30, 'github': 64, 'papers_product': 10, 'reddit': 25, 'youtube': 20, 'x': 0, 'facebook_public': 0}. Gates: source_volume=True, social_completeness=False, cited_30_possible=True. Caveat: papers_product=0 due to arXiv 429/timeout; Facebook public returned 0 usable items; X relied on the search fallback because direct unauthenticated access was N/A → confidence Medium.
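
One way the three gates and the overall status might be derived from the per-source counts is sketched below. The thresholds, the list of social channels, and the status names other than QUALITY_GATE_PARTIAL are assumptions chosen so that the sketch reproduces the reported values; the brief does not state the actual rules.

```python
# Counts copied from the scan-health line above.
counts = {
    "dev_web": 30, "github": 64, "papers_product": 10,
    "reddit": 25, "youtube": 20, "x": 0, "facebook_public": 0,
}

# Assumed parameters; the real pipeline's rules are not given in the brief.
SOCIAL_CHANNELS = ("reddit", "youtube", "x", "facebook_public")
MIN_TOTAL_CANDIDATES = 100
MIN_CITABLE_SOURCES = 30


def evaluate_gates(source_counts: dict[str, int]) -> dict[str, bool]:
    """Evaluate each gate from the raw per-source counts."""
    total = sum(source_counts.values())
    return {
        "source_volume": total >= MIN_TOTAL_CANDIDATES,
        # Complete only if every social channel returned at least one item.
        "social_completeness": all(
            source_counts.get(channel, 0) > 0 for channel in SOCIAL_CHANNELS
        ),
        "cited_30_possible": total >= MIN_CITABLE_SOURCES,
    }


def overall_status(gates: dict[str, bool]) -> str:
    """Collapse gate results into one status string (PASS/FAIL names are hypothetical)."""
    if all(gates.values()):
        return "QUALITY_GATE_PASS"
    if any(gates.values()):
        return "QUALITY_GATE_PARTIAL"
    return "QUALITY_GATE_FAIL"


gates = evaluate_gates(counts)
print(sum(counts.values()), gates, overall_status(gates))
```

Under these assumptions the counts sum to 149, matching the candidates scanned, and the two empty social channels are what keep the status at partial.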