The Flash version is 284B A13B in mixed FP8 / FP4 and the full native precision ...

sbinnee · 2026-04-24T04:31:56 1777005116

Price is appealing to me. I have been using gemini 3 flash mainly for chat. I may give it a try.

input/output: $0.14/$0.28 (whereas gemini is $0.5/$3)

Does anyone know why output prices have such a big gap?
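
For a sense of scale, here is a small sketch that computes the output-to-input price ratio for the two price pairs quoted above. The dictionary keys and the function name are made up for illustration, and the numbers are simply the ones from this comment, not checked against any price list.

```python
# Price pairs (input, output) as quoted in the comment above.
# The labels are illustrative; the unit is whatever the quoted prices use.
QUOTED_PRICES = {
    "flash_model": (0.14, 0.28),
    "gemini": (0.5, 3.0),
}


def output_to_input_ratio(input_price: float, output_price: float) -> float:
    """Return how many times more an output token costs than an input token."""
    if input_price <= 0:
        raise ValueError("input_price must be positive")
    return output_price / input_price


ratios = {
    name: round(output_to_input_ratio(inp, out), 2)
    for name, (inp, out) in QUOTED_PRICES.items()
}

for name, ratio in ratios.items():
    print(f"{name}: output costs {ratio}x input")
```

On these numbers the markup on output is 2x for one model and 6x for the other, so the size of the gap itself differs a lot between providers.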

girvo · 2026-04-24T05:56:27 1777010187

Output is where most of the compute goes; generating tokens basically costs more hardware time than prompt processing (input), which is a lot faster

tokenmaxxinej · 2026-04-24T06:24:09 1777011849

input tokens are processed at 10-50 times the speed of output tokens, since you can process them in batches rather than one at a time like output tokens
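
A toy model of that claim, as a rough sketch only: it assumes each output token costs one sequential decode step, and that input tokens are processed `prefill_speedup` times faster (the 10-50x figure from the comment above). The function name and the example token counts are hypothetical, and real serving systems have many more factors (batching across users, caching, memory bandwidth).

```python
def relative_hardware_time(
    prompt_tokens: int,
    output_tokens: int,
    prefill_speedup: float = 10.0,
) -> tuple[float, float]:
    """Return (prefill_time, decode_time) in units of one decode step.

    Assumption: one output token takes 1 unit of time, and input tokens
    are processed `prefill_speedup` times faster because they can be
    handled in batches instead of one at a time.
    """
    if prefill_speedup <= 0:
        raise ValueError("prefill_speedup must be positive")
    prefill_time = prompt_tokens / prefill_speedup
    decode_time = float(output_tokens)
    return prefill_time, decode_time


# Hypothetical request: 2000 prompt tokens in, 500 tokens out.
for speedup in (10.0, 50.0):
    prefill, decode = relative_hardware_time(2000, 500, speedup)
    print(f"speedup {speedup:>4}: prefill={prefill:.0f} decode={decode:.0f}")
```

Under these assumptions, 500 output tokens take more hardware time than 2000 input tokens at either end of the 10-50x range, which is consistent with output being priced higher per token.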

regularfry · 2026-04-24T11:10:07 1777029007

I'm going to blow my bandwidth allowance again this month, aren't I.