The trick here is that their xmame (Docker) may not be the same build on all hosts, so it may not play all the ROMs the same way, or even support all of them. A standard works to improve interoperability between different builds, hosts, etc., and to provide an expected set of operations and their results. If all they provide is one version of one product and they call that standardized, that's like releasing a new version of Internet Explorer and calling it a web standard.
Well, this is a good first step in the "free market" standardization process, though: put out a public implementation of what you imagine standard-conformance would look like. Then let the other guys (e.g. Heroku) put out their competing implementations. Then find the similarities, resolve the differences, and write it down. Now you've got a standard.
In practice that does not work. Things get broken, and people end up having to support 20 edge cases to use this "universal", "standardized" thing. Depends on the implementation, though.
"HTML 1.0" was the particular standard I had in mind. I guess I'm too used to coding multiplatform JavaScript, but "end[ing] up having to support 20 edge cases to use this 'universal', 'standardized' thing" sounds like success in my book, in that you now have a (painfully) interoperating ecosystem where before you had none. And it all gradually gets smoothed out as the spec evolves over the years, until you can't really tell the difference from a big-design-up-front (BDUF) spec.