
Building Vox Labs, an AI-powered outbound sales platform for founders, small sales teams, and agencies. The idea is you work with it the way you'd direct an SDR: give it guidance and it handles research, enrichment, personalization, and sequencing.

Stack is React frontend, Node.js backend, Claude as the primary AI layer for orchestration.

The most interesting engineering problem has been the orchestration layer. Getting an agent to handle multi-step outbound workflows with real judgment, not just automation, takes a lot of iteration on prompting and state management. Feedback from users on the best workflows for collaborating with agents has mattered just as much.
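
Roughly the shape of it, in case it's useful to anyone building something similar: each lead gets a small resumable state machine, and the model output for each step has to pass an acceptance check before the run advances. Simplified sketch only; all the names here are made up for illustration, not our actual code:

    // Per-lead state machine (illustrative names). Each step is resumable,
    // so a bad model response doesn't lose prior work.
    type Step = "research" | "enrich" | "personalize" | "sequence" | "done";

    interface LeadRun {
      leadId: string;
      step: Step;
      context: Record<string, unknown>; // accumulated findings per step
      attempts: number;
    }

    const NEXT: Record<Step, Step> = {
      research: "enrich",
      enrich: "personalize",
      personalize: "sequence",
      sequence: "done",
      done: "done",
    };

    async function advance(
      run: LeadRun,
      callModel: (run: LeadRun) => Promise<unknown>,
    ): Promise<LeadRun> {
      const output = await callModel(run); // prompt built from run.step + run.context
      if (!isValid(run.step, output)) {
        // Off-spec response: retry the same step with feedback, don't advance.
        return { ...run, attempts: run.attempts + 1 };
      }
      return {
        ...run,
        step: NEXT[run.step],
        context: { ...run.context, [run.step]: output },
        attempts: 0,
      };
    }

    function isValid(step: Step, output: unknown): boolean {
      // Stand-in for per-step acceptance checks, e.g. personalization
      // must cite a fact surfaced during research.
      return output != null;
    }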


The local-first angle is interesting, especially for CRM data. I'm seeing the same trend in observability and data engineering use cases as well.


Been running Claude Code on the $200/month plan, and it's been one of the better value decisions I've made as a founder.

The more interesting question is where the margins go as inference costs keep dropping. At some point the pricing pressure flows to users.


The practical question I keep coming back to: if the output is meaningfully different and faster, at what point does the reimplementation argument become less about the code and more about the reputation and distribution the original project built? That seems like the harder thing to replicate, and the harder thing to protect.


The supply-side problem is what killed this. Asking artists to opt in to something that most of their peers openly oppose is a brutal cold outreach problem before you even get to the product. Only 1 in 4 artists using the free tier for their own work is actually the most telling stat in here: if the people being compensated don't want to use it themselves, the ethical framing alone isn't enough to drive adoption.

Props for the postmortem.


The acceptance criteria point translates directly outside of coding too. Using Claude Code for sales and operational workflows, having acceptance criteria upfront (along with some manual checks along the way, depending on the task) definitely improves the output.


The backlog problem is real; projects that sat untouched for months are actually getting built now.

The shift I noticed is the bottleneck moved from execution to judgment pretty fast. You spend less time writing and more time deciding what actually matters. For anyone coming back to building after years in management, that's a good trade.


The behavioral dataset problem is the most interesting part of this. Response latency by role, channel preference by demographic: that data genuinely doesn't exist anywhere off the shelf, so you have to build it from real interactions. Curious how long it took before you had enough signal to start making confident decisions on, say, follow-up timing by segment.

The staffing use case makes a lot of sense as a wedge. 1000+ interviews a week across SMS and email is exactly the kind of workflow where coordinators are drowning and no one's built the right tool yet. Good luck with it.


Thank you, Nick!


The human-in-the-loop framing gets undersold in these debates. From what I've seen, the people getting the most out of this stuff aren't replacing judgment, they're delegating the parts that didn't need judgment in the first place.


Nice work shipping this. The BEAM's fault tolerance model makes a lot of sense for agent workloads; I've been thinking about similar tradeoffs on the orchestration side. Curious what failure recovery looks like when an agent hits a bad LLM response mid-run vs. a process crash.
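
For what it's worth, on the Node side I've been sketching it as two distinct failure classes: a bad LLM response is an in-band failure you validate and retry in place, while a crash is out-of-band and gets replayed from the last durable checkpoint (which is roughly what BEAM supervisors give you for free). Illustrative TypeScript only; the names are hypothetical:

    // Two failure classes for an agent step (hypothetical names).
    // Bad LLM response: expected, in-band -> validate and retry in place.
    // Process crash: out-of-band -> resume from the last durable checkpoint.
    class BadResponseError extends Error {}

    async function runStep(
      stepId: string,
      callModel: () => Promise<string>,
      checkpoint: (stepId: string, result: string) => Promise<void>,
      maxRetries = 3,
    ): Promise<string> {
      for (let attempt = 1; attempt <= maxRetries; attempt++) {
        try {
          const result = await callModel();
          if (!looksValid(result)) throw new BadResponseError(`attempt ${attempt}`);
          await checkpoint(stepId, result); // durable, so a crash resumes past this step
          return result;
        } catch (err) {
          if (!(err instanceof BadResponseError) || attempt === maxRetries) throw err;
          // retry with the same inputs; supervisor-style restart handles everything else
        }
      }
      throw new Error("unreachable");
    }

    function looksValid(result: string): boolean {
      return result.trim().length > 0; // stand-in for real schema checks
    }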

