A working board of repos, threads, screenshots, and notes I'm using to stay sharp on AI engineering, agent skills, and Claude Code. Everything is paste-and-go, with an actionable 'How to use' on every card.
Soul-driven AI agent with permission-hardened tools, token budgets, and multi-channel access. Runs 24/7 from CLI or Telegram with SQLite-backed Second Brain memory for persistent context. Extensible via custom skills. MIT license, npm v1.0.6. Designed for autonomous long-running tasks.
How to use
1. Install: npm install mercury-agent
2. Create a soul/identity configuration file for your agent
3. Configure token budgets and permission-hardened tools
4. Connect to the Telegram bot or CLI interface
5. Run the agent: mercury-agent start
6. The agent runs 24/7 with SQLite-backed Second Brain memory — it remembers context across sessions
7. Extend with custom skills by adding .js files to the skills directory
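The "permission-hardened tools" idea from step 3 is worth making concrete. Here is a minimal Python sketch of the pattern (an explicit allowlist gate in front of every tool call); the names are hypothetical and this is not mercury-agent's actual API:

```python
# Illustrative sketch of a permission-hardened tool gate (hypothetical
# names, not mercury-agent's API): every tool call must pass an allowlist.
ALLOWED_TOOLS = {"read_file", "search_web"}

def call_tool(name, handler, *args):
    """Run `handler` only if `name` is on the explicit allowlist."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{name}' is not permitted")
    return handler(*args)
```

Denying by default and enumerating allowed tools is what keeps a 24/7 agent from quietly acquiring capabilities you never granted.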
Best open-source 24/7 autonomous agent with true persistent memory. The 'soul' concept and permission-hardened tools make it safer than most agents. Perfect for long-running background tasks.
Autonomous white-box AI pentester for web applications and APIs. Analyzes source code, identifies attack vectors, executes real exploits to find vulnerabilities before they reach production. #1 trending on GitHub. AGPL-3.0 license. Built for authorized security testing.
How to use
1. Clone the Shannon repo
2. Install its dependencies
3. Point Shannon at your web app's source code directory
4. Run the autonomous pentester: python shannon.py --target /path/to/source
5. Review the security audit report with real exploit attempts and vulnerability findings
6. Use for authorized security testing only — ensure you have permission before testing any system
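To make "identifies attack vectors" concrete, here is a toy white-box pass in Python, not Shannon's implementation, that flags a few classic risky sinks in source code as candidates for deeper testing:

```python
# Toy static pass (illustrative only, not Shannon's code): flag classic
# risky sinks in Python source as candidate attack vectors.
RISKY_SINKS = {
    "eval(": "code injection",
    "os.system(": "command injection",
    "pickle.loads(": "unsafe deserialization",
}

def flag_sinks(source: str):
    """Return (line_number, sink, risk) tuples for each flagged line."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for sink, risk in RISKY_SINKS.items():
            if sink in line:
                findings.append((lineno, sink, risk))
    return findings
```

A real white-box pentester goes far beyond string matching (data-flow analysis, exploit execution), but the shape is the same: enumerate sinks, then try to prove reachability.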
First credible autonomous AI pentester that executes real exploits. Game-changer for finding vulnerabilities before production. Use responsibly and only on systems you own or have explicit permission to test.
Open-source AI workflow automation platform with ~400 MCP (Model Context Protocol) servers. 21K+ stars. MCP has achieved critical ecosystem mass — activepieces is the best open-source hub for connecting agents to tools via the emerging 'USB-C for AI' standard. Replaces Zapier/Make for AI-native workflows.
How to use
1. Self-host: docker run -d -p 8080:80 activepieces/activepieces
2. Or use the cloud version at activepieces.com
3. Browse the 400+ MCP server integrations in the marketplace
4. Build a workflow: trigger (webhook/schedule) → AI step (Claude/GPT) → action (Slack/email/DB)
5. Expose your workflow as an MCP server for other agents to call
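The trigger → AI step → action shape from step 4 can be sketched in a few lines of Python. The stubs below stand in for the webhook payload, the model call, and the Slack/email/DB action; none of this is activepieces' API:

```python
# Minimal sketch of a trigger -> AI step -> action workflow (stubs only).
def ai_step(payload: dict) -> dict:
    # A real workflow would call Claude/GPT here; we fake a "summary".
    return {**payload, "summary": payload["text"].upper()}

def action_step(result: dict, sink: list) -> None:
    # Stands in for the Slack/email/DB delivery step.
    sink.append(result["summary"])

def run_workflow(trigger_payload: dict, sink: list) -> None:
    action_step(ai_step(trigger_payload), sink)
```

Platforms like activepieces give you this composition visually, plus retries, auth, and the 400+ integrations, so you never hand-write the glue.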
MCP is becoming the standard for agent tool integration. activepieces is the fastest way to connect your agents to 400+ services without writing integration code.
CLI tool for configuring and monitoring Claude Code by davila7. +1,445 stars this week, 26K total. Provides templates, monitoring dashboards, and configuration management for Claude Code projects. Makes it easy to bootstrap new Claude Code projects with best-practice configurations.
How to use
1. Install: pip install claude-code-templates
2. Initialize a new project: cct init --template fullstack
3. Browse available templates: cct list
4. Monitor your Claude Code session: cct monitor
5. Export your custom config as a reusable template: cct export --name my-setup
Fastest way to bootstrap Claude Code projects with proven configurations. The monitoring dashboard alone is worth it for understanding how your agent spends tokens.
Open-source infrastructure for Computer-Use Agents by trycua. Provides sandboxed desktop environments for agents that control macOS, Linux, and Windows. 14K stars. Enables agents to use any desktop application — not just APIs. Critical infrastructure for the next generation of autonomous agents.
How to use
1. Clone: git clone https://github.com/trycua/cua
2. Install the sandbox runtime for your OS (macOS/Linux/Windows supported)
3. Create a sandboxed environment: cua create --os macos
4. Write an agent that uses the CUA API to control the desktop
5. Run your agent in the sandbox — it can use any GUI app, browser, or terminal
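Steps 4-5 boil down to an observe → decide → act loop. A minimal sketch of that cycle in Python, with hypothetical sandbox method names (not CUA's real API):

```python
# Hedged sketch of a computer-use agent loop; `sandbox` and `policy`
# are illustrative interfaces, not CUA's actual API.
def agent_loop(sandbox, policy, max_steps=10):
    for _ in range(max_steps):
        observation = sandbox.screenshot()  # observe the desktop
        act = policy(observation)           # decide (LLM call in practice)
        if act is None:                     # policy signals completion
            break
        sandbox.perform(act)                # act: click/type in the GUI
```

The sandbox boundary is the safety story: the agent can do anything inside the environment and nothing outside it.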
Unlocks computer-use agents for any desktop app — not just web. If you're building agents that need to interact with legacy software or GUIs, this is the foundation.
Garry Tan (YC CEO) open-sourced his entire personal Claude Code configuration: 23 opinionated slash-command skills that turn Claude Code into a virtual engineering team with structured roles — CEO, Designer, Eng Manager, Release Manager, Doc Engineer, QA. 88K stars. MIT license. In 60 days with gstack, Garry shipped 3 production services and 40+ features part-time while running YC full-time — 810× his 2013 coding pace.
How to use
1. Clone the repo: git clone https://github.com/garrytan/gstack
2. Run the installer: ./bin/install.sh (copies skills to your ~/.claude/commands/)
3. In any Claude Code project, type /office-hours to start a product interrogation session
4. Use /plan-ceo-review to scope features, /review for code review, /qa for browser testing
5. Use /land-and-deploy to automate deployment and /canary for post-deploy monitoring
gstack is the highest-leverage Claude Code skill pack available — 23 skills that give you a full virtual engineering team. If you use Claude Code, this is mandatory.
ByteDance's long-horizon SuperAgent with sandboxes, persistent memory, and subagents. Handles multi-hour autonomous tasks. 63K+ stars. Designed for tasks that take minutes to hours — not just single-turn completions. Features memory across sessions, parallel subagent execution, and sandboxed code execution.
How to use
1. Clone the repo
2. Install: pip install -r requirements.txt, then configure your LLM keys
3. Define a long-horizon task (e.g., 'Research competitors and produce a report')
4. Launch the agent: python run.py --task 'your task here'
5. Monitor subagent progress via the built-in dashboard; review memory artifacts after completion
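The subagent coordination pattern praised below has a simple core: fan out subtasks, run them in parallel, fan results back into persistent memory. A toy Python version (not this repo's API):

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of the fan-out/fan-in subagent pattern: a coordinator splits a
# long-horizon task, runs subagent workers in parallel, and persists
# their outputs as memory artifacts that outlive the run.
def run_subagents(subtasks, worker, memory: dict):
    with ThreadPoolExecutor() as pool:
        for task, result in zip(subtasks, pool.map(worker, subtasks)):
            memory[task] = result  # memory artifact keyed by subtask
    return memory
```

In a real system each worker is an LLM-driven subagent and `memory` is a database, but the coordinator shape is the same.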
Best open-source implementation of long-horizon agent execution with memory. The subagent coordination pattern is directly applicable to complex automation workflows.
Hugging Face's open-source ML engineer agent that reads papers, trains models, and ships ML models end-to-end. +3,157 stars this week. Represents a qualitative leap in agent autonomy — not just a coding assistant but a full research-to-production pipeline. The agent can autonomously run experiments, iterate on results, and deploy models.
Multi-agent LLM financial trading framework by TauricResearch. Specialized agents for market analysis, risk management, and trade execution working in concert. 60K+ stars, +6,152 stars this week. A strong example of domain-specific multi-agent architecture that can be studied and adapted for other verticals.
How to use
1. Clone the TradingAgents repo
2. Install its dependencies
3. Configure your LLM API keys and market data sources in .env
4. Run the multi-agent system: python run_agents.py
5. Study the agent architecture — adapt the role-specialization pattern for your own domain
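The role-specialization pattern worth stealing here is just this: each agent owns one concern, and the pipeline threads shared state through them in order. A toy Python sketch (illustrative roles and rules, not TradingAgents' code):

```python
from dataclasses import dataclass
from typing import Callable

# Toy role-specialization pipeline: each agent owns one concern and
# transforms shared state. Rules here are placeholders for LLM calls.
@dataclass
class Agent:
    role: str
    act: Callable[[dict], dict]

def run_pipeline(agents, state: dict) -> dict:
    for agent in agents:
        state = agent.act(state)
    return state

pipeline = [
    Agent("analyst", lambda s: {**s, "signal": "buy" if s["price"] < s["fair_value"] else "hold"}),
    Agent("risk",    lambda s: {**s, "approved": s["signal"] == "buy" and s["exposure"] < 0.1}),
]
```

Swap the lambdas for persona-prompted model calls and the roles for your domain's specialists, and the architecture transfers directly.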
Best open-source example of production-grade multi-agent coordination for a specific domain. The architecture pattern is directly reusable for any vertical.
Use Claude Code for free in the terminal, VSCode extension, or Discord — like OpenClaw but with voice support. Surged +12,928 stars in a single week (April 2026), the largest single-week gain in the Claude Code ecosystem. Bypasses Anthropic's $100-200/month pricing friction. 19K+ total stars.
The canonical curated list of Claude Code skills, hooks, slash-commands, agent orchestrators, applications, and plugins. Start here when scoping new Claude Code workflows.
How to use
1. Bookmark and skim the README weekly.
2. Pick one new skill/hook to install per week.
3. Watch the contributors — they signal what's actually working.
4. PR back any high-leverage skills you build.
Treat awesome-claude-code as the radar — if it's not on the list, it's probably not yet worth integrating.
Run LLMs locally with one command. The de facto runtime for local inference — supports DeepSeek, Qwen, Gemma, Llama, GLM, and dozens more. Now integrates directly with Claude Code, Codex, and OpenCode.
April 2026 thread from Karpathy on turning raw files (articles, papers, repos, datasets, images) into evolving wikis using LLMs. The new pattern: ingest → index → continuously summarize.
How to use
1. Stand up a `raw/` directory for every source you collect.
2. Run an LLM indexer to produce structured summaries.
3. Maintain a living `wiki/` regenerated from raw + prior wiki.
4. Query the wiki, not the raw — that's where leverage compounds.
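The steps above can be sketched as a small regeneration loop. The version below rebuilds `wiki/` from `raw/` only (a full pipeline would also feed in the prior wiki), and `summarize()` is a stub where the LLM call would go:

```python
from pathlib import Path

# Minimal ingest -> index -> summarize loop; summarize() is a stub
# standing in for an LLM call.
def summarize(text: str) -> str:
    return text.splitlines()[0][:80]  # stub: first line as the "summary"

def rebuild_wiki(raw_dir: Path, wiki_dir: Path) -> None:
    """Regenerate one wiki page per raw source file."""
    wiki_dir.mkdir(parents=True, exist_ok=True)
    for src in raw_dir.glob("*.txt"):
        page = wiki_dir / (src.stem + ".md")
        page.write_text(f"# {src.stem}\n\n{summarize(src.read_text())}\n")
```

Because the wiki is regenerated rather than hand-edited, adding sources to `raw/` is the only maintenance the knowledge base ever needs.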
Personal knowledge bases are the next personal CRM — start now, or fall behind on compounding context.
Karpathy's `nanochat` — the simplest experimental harness for training an LLM end to end. Capstone project for LLM101n; runs on a single GPU node and covers the full ChatGPT-style pipeline for around $100.
How to use
1. Clone the repo and provision a single GPU node.
2. Follow the README to run the full pipeline (tokenizer → train → serve).
3. Swap dataset/config to fit your domain experiment.
4. Use as a teaching artifact for your team's LLM literacy.
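To give a feel for the first stage of the tokenizer → train → serve pipeline, here is a toy character-level tokenizer in Python. This is a teaching sketch, not nanochat's code (nanochat trains a proper BPE tokenizer):

```python
# Toy character tokenizer: the conceptual first stage of an LLM pipeline.
def build_vocab(text: str):
    chars = sorted(set(text))                     # deterministic vocab
    stoi = {ch: i for i, ch in enumerate(chars)}  # string -> id
    itos = {i: ch for ch, i in stoi.items()}      # id -> string
    return stoi, itos

def encode(text, stoi):
    return [stoi[c] for c in text]

def decode(ids, itos):
    return "".join(itos[i] for i in ids)
```

The round-trip property (decode(encode(x)) == x) is the invariant every real tokenizer must preserve; everything after this stage operates on the integer ids.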
If you can't read every line of nanochat, you don't really understand modern LLMs. Best $100 spent on engineering taste.
The Agency — a curated roster of AI specialist personalities ready to slot into your workflow. Crafted agent personas instead of generic 'helpful assistants'.
How to use
1. Browse the agent catalog and shortlist relevant specialists.
2. Drop the persona files into your agent runtime.
3. Pair specialists into pipelines (e.g., Researcher → Writer → Editor).
4. Iterate persona prompts against real tasks.
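Step 3's Researcher → Writer → Editor pairing is just function composition. A Python sketch with stubs where each persona-prompted model call would go:

```python
# Stub specialists: each function stands in for a persona-prompted
# model call; the pipeline is plain composition.
def researcher(topic: str) -> str:
    return f"notes on {topic}"

def writer(notes: str) -> str:
    return f"draft based on {notes}"

def editor(draft: str) -> str:
    return draft.replace("draft", "final article")

def run_specialists(topic: str) -> str:
    return editor(writer(researcher(topic)))
```

Keeping each persona's contract narrow (notes in, draft out) is what lets you swap or iterate one specialist without retuning the rest of the pipeline.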
Stop writing 'you are a helpful assistant'. Borrow battle-tested personas and ship faster.
AI skill that provides design intelligence for building professional UI/UX. A heavyweight Claude Code skill that injects design taste, layout principles, and component-grade craft into any frontend project.
How to use
1. Clone or download the skill from the repo.
2. Place it under your Claude Code skills directory (`~/.claude/skills/`).
3. Activate the skill in your project and prompt Claude with frontend tasks.
4. Iterate on design quality with high-context prompts.
Treat design as a first-class skill primitive — bolt this onto any frontend project for a baseline of professional taste.
Karpathy-inspired Claude Code guidelines distilled into a single CLAUDE.md with four principles: Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution.
How to use
1. Copy `CLAUDE.md` to the root of your project.
2. Commit it — the agent reads it on every run.
3. Tune the four principles to your team's standards.
4. Use as a baseline for any AI-coded codebase.
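For orientation, a minimal sketch of what such a CLAUDE.md might look like. Only the four principle names come from the repo; the one-line glosses are illustrative and should be replaced with your team's actual standards:

```markdown
# CLAUDE.md (illustrative sketch — tune to your team)

## Think Before Coding
State the plan before touching any files.

## Simplicity First
Prefer the smallest change that works.

## Surgical Changes
Touch only the lines the task requires.

## Goal-Driven Execution
Tie every edit back to the stated goal.
```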
A single, opinionated CLAUDE.md raises the floor of every Claude Code session — taste compounds.
Persistent memory layer for Claude Code. Cuts token usage by ~95%, ships a local web viewer at localhost:37777, and exposes a `mem-search` skill for semantic recall.
How to use
1. Install the plugin per the README.
2. Run a session — memory is captured automatically.
3. Open `http://localhost:37777` to inspect the memory store.
4. Use the `mem-search` skill in any future session for instant recall.
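To illustrate what semantic recall like `mem-search` does under the hood, here is a toy Python version using bag-of-words cosine similarity. The plugin almost certainly uses real embeddings; this is only the concept:

```python
import math
from collections import Counter

# Toy "semantic" recall (illustrative, not the plugin's implementation):
# bag-of-words cosine similarity over stored notes.
def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def mem_search(query: str, notes: list[str]) -> str:
    """Return the stored note most similar to the query."""
    return max(notes, key=lambda n: cosine(vectorize(query), vectorize(n)))
```

Replace word counts with embedding vectors and the ranking logic is exactly the same; that similarity lookup is why recall stays cheap even as the memory store grows.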
Persistent memory is the single biggest unlock for long-running Claude Code work — both quality and cost win.