
As another commenter implied, the title is a reference to this - https://www.stilldrinking.org/programming-sucks - which is an incredible read as well.

It's in their ToS to allow using a Copilot subscription with OpenCode - https://github.blog/changelog/2026-01-16-github-copilot-now-...

Absolutely the cheapest way to get a lot of tokens through a solid harness for $10/month. Until now


Loosely related, though I don't think Benjamin Bennett's intention was ever to improve focus/productivity

But the consistency and sheer time Benjamin spends sitting and smiling, and on other similar endeavors, never ceases to amaze me - https://www.youtube.com/@BenjaminBennetttt/streams


That is insane.

One interesting thing I found comparing OpenAI and Gemini image editing: Gemini rejects anything involving a well-known person. Anything. OpenAI was happy to edit and change it every time I tried.

I have a sideproject where I want to display standup comedy shows. I thought I could edit standup comedy posters with some AI to fit my design. Gemini straight up refuses to change any image of any standup comedy poster involving a well-known person. OpenAI does not care and is happy to edit away.


How does it determine they are well known and not just similar looking?


Gemini often rejects photos of random people (even ones it generated itself) because it thinks they look too similar to some well known person.


I don't know tbh. I've tried it on 10-20 standups of varying levels of fame, and Gemini refuses every time.

Just for testing, I just tried this https://i.ytimg.com/vi/_KJdP4FLGTo/sddefault.jpg ("Redesign this image in a brutalist graphic design style"). Gemini refuses (API as well as UI), OpenAI does it.


It's not super deterministic but it didn't fail once on my attempts. See: https://imgur.com/a/james-acaster-cold-lasagne-1R7fpzQ


Very interesting. It fails every single time for me. I'm in Germany, maybe Google is stricter here?

See https://imgur.com/a/77BRDQv


That makes sense to me. I just Googled around like a fool and got here https://en.wikipedia.org/wiki/Personality_rights#Germany

It seems like they're trying to follow local law. What a nightmare to have to manage all jurisdictions around such a product. Surprised it didn't kill image generation entirely.


Yea, especially when they know all that work will be completely pointless in a few years, when open source / local models will be just as good and won't have any legal limitations. People will be generating fake images of famous people like crazy with nothing stopping them.


What if you change the prompt to tell it specifically that it's not a famous person? Or try it without text?


There are models specifically for detecting well known people https://docs.aws.amazon.com/rekognition/latest/dg/celebritie...


OpenAI wouldn't make me a Looney Tunes Roadrunner Martin Scorsese "Absolute Cinema" parody, but Gemini didn't blink about the trademark violation. Also, the output was really nice:

https://imgur.com/a/Jclezyi


Are you using Google Gemini directly? I've found the Vertex API seems to be significantly less strict.


I think these pledges offload some of the risk onto Amazon/Oracle/etc

If Anthropic/OpenAI miss projections, infra providers can likely still turn around and sell the capacity to the next guy, or use it themselves. If they have more demand than expected (as Anthropic currently does), VCs will throw money at them and they can outbid the competition.

If they built it themselves and missed projections it's a much more expensive mistake

It's just risk sharing. Infra providers take some of the risk and some of the upside


> If they built it themselves and missed projections it's a much more expensive mistake

Not if their pricing comes with multiyear reserved-capacity commitments. No doubt they get a huge volume discount, but the advertised AWS reserved pricing is already enough to pay for a whole 8x HX00 pod, plus the NVIDIA enterprise license, plus the staff to manage it, after only a one-year commitment. On-demand pricing is significantly more expensive, so they’re going to be boxed in by errors in capacity planning anyway (as has been happening the last few months).

The economics here are absurd unless you’re involved in a giant circular investment scheme to pump up valuations.
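As a rough sanity check of the reserved-vs-own comparison above, here is a back-of-the-envelope sketch. Every figure below is a hypothetical placeholder for illustration only, not actual AWS, NVIDIA, or staffing pricing:

```python
# All numbers are made-up placeholders, chosen only to illustrate the
# shape of the comparison the comment is making.
HOURS_PER_YEAR = 24 * 365  # 8760

reserved_hourly = 55.0                 # assumed 1-yr reserved rate for an 8-GPU pod
reserved_annual = reserved_hourly * HOURS_PER_YEAR

pod_capex = 300_000                    # assumed purchase price of an 8-GPU pod
license_and_staff_annual = 150_000     # assumed enterprise license + ops staff share
own_year_one = pod_capex + license_and_staff_annual

print(f"1-yr reserved: ${reserved_annual:,.0f} vs own hardware, year one: ${own_year_one:,.0f}")
```

Under these assumed numbers, one year of reserved pricing already exceeds the year-one cost of owning the pod outright, and the gap widens every year the owned hardware keeps running.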


The pricing models that are published on AWS' website almost certainly have nothing to do with the pricing models that are discussed behind closed doors for a $100 billion commitment.


Of course not, but unless they’re getting the sweetheart deal of a lifetime from Amazon of all places, it’s still hogwash. We’re talking about enough capital to build their own fab and a dozen datacenters*. This deal isn’t going to be buying existing capacity, because that’s already stretched; it will be paying for new buildouts.

Afterwards Amazon will be milking the machines these commitments buy for nearly a decade. That tradeoff makes sense at a small scale (even up to $X00 million or even billions), but at $Y0 or $Z00 billion?

Color me skeptical. There are plenty of other side benefits like upgrading to the newest GPUs every few years, but again we’re talking about paying for new buildouts with upfront commitments anyway.

* obviously the timelines, scientific risk, and opportunity cost make this completely infeasible but that’s the scale we’re talking about. It’s a major industrial project on the scale of the thirty year space shuttle program (~$200 billion).


You can get a significant AWS discount with an annual spend starting around $1M/year.


The idea is that smarter models might use fewer turns to accomplish the same task - reducing the overall token usage

Though, from my limited testing, the new model is far more token hungry overall


Well, you’ll need the same prompt for input tokens?


Only the first one. Ideally now there is no second prompt.


Are you aware that every tool call produces output which also counts as input to the LLM?


Are you aware that a lot of model tool calls are useless and a smarter model could avoid those?

Are you aware that output tokens are priced 5x higher than input tokens?
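To make the turns-vs-price tradeoff concrete, here is a back-of-the-envelope sketch. All token counts and prices are made-up placeholders, not any vendor's actual rates; the only structural assumption is that each tool-call turn re-feeds the prior context, so input tokens grow roughly linearly with the turn number:

```python
# Hypothetical numbers for illustration only; real token counts and
# prices vary by model and provider.
def session_cost(turns, input_tok_per_turn, output_tok_per_turn,
                 input_price_per_m, output_price_per_m):
    """Total dollar cost of an agentic session.

    Turn t re-sends roughly t * input_tok_per_turn tokens of context,
    so total input tokens grow quadratically with the turn count.
    """
    total_in = sum(input_tok_per_turn * t for t in range(1, turns + 1))
    total_out = turns * output_tok_per_turn
    return (total_in * input_price_per_m + total_out * output_price_per_m) / 1e6

# "Dumber" model: cheap tokens, but many wasted tool calls.
cheap = session_cost(turns=30, input_tok_per_turn=5_000, output_tok_per_turn=1_000,
                     input_price_per_m=1.0, output_price_per_m=5.0)

# "Smarter" model: 3x the per-token price, a third of the turns.
smart = session_cost(turns=10, input_tok_per_turn=5_000, output_tok_per_turn=1_000,
                     input_price_per_m=3.0, output_price_per_m=15.0)

print(f"cheap model session: ${cheap:.2f}, smart model session: ${smart:.2f}")
```

With these placeholder numbers, the 3x-pricier model still comes out cheaper per session, because cutting the turn count shrinks the quadratically growing input-token bill.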


> a lot of model tool calls are useless

That’s just wrong. File reads, searches, compiler output, are the top input token consumers in my workflow. None of them can be removed. And they are the majority of my input tokens. That’s also why labs are trying to make 1M input work, and why compaction is so important to get right.

Regarding output - yes, but that wasn’t the topic in this thread. It’s just easier to argue with input tokens that price has gone up. I have a hunch the price for output will go up similarly, but can’t prove it. The jury’s out IMO: https://news.ycombinator.com/item?id=47816960


This has no bearing on my comment. The point is that a better model avoids dozens of prompts and tool calls by making fewer CORRECT tool calls, with the user needing no more prompts.

I’m surprised this is even a question; obviously a better prompter has the same properties and it’s not in dispute?


Opus 4.5 became significantly cheaper than Opus 4.1


From recent personal examples

We have some somewhat complicated OpenSearch reindexing logic, and we had an issue where reindexing happened more often than it should. I vibecoded a dashboard visualizing in a graph exactly which index gets reindexed when, and into what. The code works, a little rough around the edges, but it serves the purpose and saved me a ton of time.

Another example: in an internal project we made a recent change where we need to send specific headers depending on the environment. These are mostly GET endpoints, where my workflow is checking the API through the browser. The list of headers is long, but predetermined. I vibecoded an extension that lets you pick the header and allows me to keep my regular workflow, rather than Postman or cURL or whatever. A slightly buggy UI, but good enough. The whole team uses it.

I'm not a frontend developer and either of these would take me a lot of time to do by hand


My best guess is that Nvidia is unhappy with how OpenAI is fishing for compute with its competitors (Jensen had some opinions on the AMD-OpenAI deal when it was announced). If this actually becomes a feasible reality, it gives OpenAI (and co) negotiating power - which is bad for Nvidia

Nvidia might have wanted more exclusivity/attachment. And OpenAI still seems to have no problem raising money. So maybe there was just a commitment mismatch

Pure speculation though


I would agree. I've been using VSCode Copilot for the past (nearly) year. And it has gotten significantly better. I also use CC and Antigravity privately - and got access to Cursor (on top of VSCode) at work a month ago

CC is, imo, the best. The rest are largely on par with each other. The benefit of VSCode and Antigravity is that they have the most generous limits. I ran through Cursor's $20 limit in 3 days, whereas the same-tier VSCode subscription can last me 2+ weeks.

