- 100MB 'image' (i.e. executable code: the executable itself plus all the OS libraries it loads)
- 40MB heap
- 50MB "mapped file", mostly fonts opened with mmap() or the windows equivalent
- 45MB stack (each thread gets 2MB)
- 40MB "shareable" (no idea)
- 5MB "unusable" (appears to be address space that's not usable because of fragmentation, not actual RAM)
Generally if something's using a lot of RAM, the answer will be bitmaps of various sorts: draw buffers, decompressed textures, fonts, other graphical assets, and so on. In this case it's just allocated but not yet used heap+stacks, plus 100MB for the code.
Edit: I may be underestimating the role of binary code size. Visual Studio "devenv.exe" is sitting at 2GB of 'image'. Zoom is 500MB. VSCode is 300MB. Much of that is app-specific, not just Windows DLLs.
Turning these numbers into "memory consumption" gets complicated to the point of being intractable.
The portions that are allocated but not yet used might just be page table entries with no backing memory, making them free. Except for the memory tracking the page table entries. Almost free....
A lot of "image" will be mmapped and clean. Anything you don't actually use from that will be similarly freeish. Anything that's constantly needed will use memory. Except if it's mapped into multiple processes, then it's needed but responsibility is spread out. How do you count an app's memory usage when there's a big chunk of code that needs to sit in RAM as long as any of a dozen processes are running? How do you count code that might be used sometime in the next few minutes or might not be depending on what the user does?
This assumes that executable code pages can be shared between processes. I'm skeptical that this is still a notable optimization on modern systems because dynamic linking writes to executable memory to perform relocations in the loaded code. So this would counteract copy on write. And at least with ASLR, the result should be different for each process anyway.
The dynamic linker writes to the GOT, which is data. The executable segment where .text lives is not written to (it's position-independent code in dynamic libraries).
ASLR is not an obstacle -- the same exact code can be mapped into different base addresses in different processes, so they can be backed by the same actual memory.
That's true on most systems (modern or not), but it has actually never been true on Windows due to PE/COFF format limitations. But also, that system doesn't/can't do effective ASLR, because the binary slide is part of the object file spec.
I can't reconcile this with the code that GCC generates for accessing global variables. There is no additional indirection there, just a constant 0 address that needs to be replaced later.
Assuming the symbol is defined in the library, when the static linker runs (ld -- we're not talking ld.so), it will decide whether the global variable is preemptable or not, that is, whether it can be resolved to a symbol outside the DSO. Generally, by default it is, though this depends on many things -- visibility attributes, linker scripts, -Bsymbolic, etc. If it is, ld will have the final code reach into the GOT. If not, it can just use instruction (PC) relative offsets.
I'm not sure if you're just trolling, but I'll give the same example I gave before (you can get even wilder simplifications -- called relaxations -- with TLS, since there are four levels of generality there). I'm not sure what you meant by "changing instructions", but in the first case the linker did the fixup indicated by the relocation, and in the second it reduced the generality of the reference (one less level of indirection, by changing mov to lea) because it knew the symbol could not be preempted (more exactly, the R_X86_64_REX_GOTPCRELX relocation allows the linker to do the relaxation if it can determine that it's safe to).
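For anyone following along, here's a hypothetical sketch of the kind of code being discussed (the variable names are made up; the assembly in the comments is roughly what gcc -O2 -fPIC emits on x86-64):

    /* Hypothetical illustration: preemptable vs. non-preemptable global
     * variable access in position-independent code (x86-64, -fPIC).     */
    extern int shared_counter;   /* default visibility: preemptable       */
    static int local_counter;    /* local to this object: not preemptable */

    int read_shared(void) {
        /* Goes through the GOT (a data page), so .text stays read-only:
         *     movq  shared_counter@GOTPCREL(%rip), %rax
         *     movl  (%rax), %eax
         * If the linker can prove the symbol is non-preemptable, the
         * R_X86_64_REX_GOTPCRELX relocation lets it relax the first
         * instruction to
         *     leaq  shared_counter(%rip), %rax
         * dropping the load through the GOT.                             */
        return shared_counter;
    }

    int read_local(void) {
        /* No GOT needed, just a PC-relative load:
         *     movl  local_counter(%rip), %eax                            */
        return local_counter;
    }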
OK, I spent a few additional minutes digging into this. It's been too long since I looked at those mechanisms. Turns out my brain was stuck in pre-PIE world.
Global variables in PIC shared libraries are really weird: the shared library's variable gets placed into the main program image's data segment (a copy relocation), and the relocation happens in the shared library, which means that there is an indirection generated in the library's machine code.
Dynamic linking doesn't have to write to code. I'm not familiar with other platforms, but on macOS, relocations are all in data, and any code that needs a relocation will indirect through non-code pages. I assume it's similar on other OSes.
This optimization is essential. A typical process maps in hundreds of megabytes of code from the OS. There are hundreds of processes running at any given time. Eyeballing the numbers on an older Mac I have here (a newer one would surely be worse) I'd need maybe 50GB of RAM just to hold the code of all the running processes if the pages couldn't be shared.
As pointed out below, quite a lot of that isn't in RAM - see "working set".
There's a common noob complaint about "Linux using all my RAM!" where people are confused about the headline free/buffers numbers. If there's a reasonable chance data could be used again soon it's better to leave it in RAM; if the RAM is needed for something else, the current contents will get paged out. Having a chunk of RAM be genuinely unallocated to anything is doing nothing for you.
Nitpick: what you're describing is the disk cache. If a process requests more memory than is free, the OS will not page out pages used for the cache; it will simply either release them (if they're in the read cache) or flush them (if they're in the write cache).
Of course it's doing something for you. Room to defrag other areas of RAM, room to load something new without moving something else out of the way first.
Your perspective sounds like the concept that space in a room does nothing for you until/unless you cram it full of hoarded items.
If you didn't have the "random" buffers, you'd complain how slow it is. Syntax highlighting? Needs a boatload of caching to be efficient. Code search? Hey, you want a cached code index. Plugins? Gotta run your Python code somewhere.
Run vi/nano/micro/joe - they're optimizing for memory to some extent. vi clocks in at under 8 MB. You're giving up a lot of "nice" things to get there.
And how does that break down in vmmap? I'm guessing that's working set vs. the whole virtual memory allocation (which is definitely always an overestimate and not the same as RAM).
Some ten years ago I used an earlier version of https://unity.com/how-to/analyze-memory-usage-memory-profili... to accidentally discover a memory leak that was due to some 3rd party code with a lambda that captured an ancient, archived version of Microsoft's C# vector which had a bug. There were multiple layers of impossibility of me finding that through inspection. But, with a functional tool, it was obvious.
Ten years before that I worked on a bespoke commercial game engine that had its own memory tracker. First thing we did with it was fire up a demo program, attach the memory analyzer to it, then attach a second instance of the memory analyzer to the first one and found a memory error in the memory analyzer.
Now that I'm out of gamedev, I feel like I'm working completely blind. People barely acknowledge the existence of debuggers. I don't know how y'all get anything to work.
A quick google for open-source C++ solutions turns up https://github.com/RudjiGames/MTuner which happens to have been updated today. From a game developer, of course XD
> I look at memory profiles of normal apps and often think "what is burning that memory".
As a corollary to this: I look at CPU utilization graphs. The programs are completely idle. "What is burning all that CPU?!"
I remember using a computer with RAM measured in two-digit amounts of MiB. CPU measured in low hundreds of MHz. It felt just as fast -- sometimes faster -- as modern computers. Where is all of that extra RAM being used?! Where is all of that extra performance going?! There's no need for it!
Next time you see someone on HN blithely post "CPU / RAM is cheaper than developer time", it's them. That is the sort of coder who is collectively wasting our CPU and RAM.
If you ran a business, would you rather your devs work on feature X that could bring in Y revenue, or spend that same time reducing CPU/RAM/storage utilization by Z% and gives the benefit of ???
You work on both. Sometimes you need to prioritize one, sometimes the other. And the benefit of the second option is "it makes our product higher quality, both because that is our work ethic but also because our customers will appreciate a quality product".
The business is only going to care about the bottom line. If it's not slow enough to cause business problems, they are not going to say "here's a week to make software faster"
Likewise, engineers are only going to care about doing their job. If the business doesn't reward them for taking on optimization work, why would they do it?
This is not true of all engineers and all businesses. Some businesses really do 'get it' and will allow engineers to work on things that don't directly help stated goals. Some engineers are intrinsically motivated and will choose to work on things despite that work not helping their career.
What I'm really getting at is: yes, engineers choose "slower" technologies (e.g. Electron, React) because there are other benefits, e.g. being able to get work done faster. This is a completely rational choice even if it does lead to "waste" and poor performance.
I agree with this. What you focus on depends on the circumstances. As Knuth put it, premature optimization is the root of all evil. Early on, you're trying to ship and get a functioning product out the door; if spending a bit of money on extra RAM at that time helps you, it's worth it. Over time, as you are trying to optimize, it makes sense to think more about memory management, etc.
There is probably some low-hanging fruit to be harvested in terms of memory optimizations, and it could be a selling point for the next while as the memory shortage persists.
That is why we have slow, bloated software. The companies that create the software do not have to pay any of the operational costs to run it.
If you have to buy extra RAM or pay unnecessary electrical or cooling expenses because the code is bad, it's not their problem. There is no software equivalent of MPG measurements for cars, where efficient engine designs are rewarded at the time of purchase.
Even an editor running under Inferno, plus Inferno itself, would be far lighter than current editors. And that's with the VM's overhead included. And Limbo is a somewhat high-level language...
> I remember using a computer with RAM measured in two-digit amounts of MiB
Yes, so do I. It was limited to 800x600x16 color mode or 320x200x256. A significant amount of memory gets consumed by graphical assets, especially in web browsers which tend to keep uncompressed copies of images around so they can blit them into position.
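For a sense of scale: a single 1920x1080 frame decoded to 4-byte RGBA is 1920 * 1080 * 4 ≈ 8MB, and a 12-megapixel photo is around 48MB uncompressed, so a browser keeping a handful of decoded images around can easily dwarf everything else in the process.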
But a lot is wasted, often by routing things through single bottlenecks in the whole system. Antivirus programs. Global locks. Syncing to the filesystem at the wrong granularity. And so on.
FWIW a two-digit amount of MB is usually at least 16MB (though with low hundreds of MHz it was probably at least 32MB if not 64MB), and most such systems could easily do 1024x768 at 16-bit, 24-bit or 32-bit color. At least my mid-90s PC could :-P (24-bit color specifically; I had some slow Cirrus Logic adapter that stored the framebuffer in triplets of R,G,B, probably to save RAM but at the cost of performance).
I too wonder that. And it is true on the OS level as well. The only worthwhile change in desktop environments since the early 2000s has been search-as-you-type launchers. Other than that I would happily use something equivalent to Windows XP or (more likely) Linux with KDE 3. It seems everything else since then has mostly been bloat and stylistic design changes, the latter being a waste of time in my opinion.
Of course, some software other than desktop environments has seen important innovation, such as LSPs in IDEs, which avoid every IDE having to implement support for every language. And SSDs were truly revolutionary in hardware, in making computers feel faster. Modern GPUs can push a lot more advanced graphics in games as well. And so on. My point above was just about your basic desktop environment. Unless you use a tiling window manager (which I tried but never liked), nothing much has happened for a very long time. So just leave it alone, please.
>The only worthwhile change in desktop environments since the early 2000s has been search as you type launchers.
Add to that: unicode handling, support for bigger displays, mixed-DPI, networking and device discovery is much less of a faff, sound mixing is better, power management and sleep modes much improved. And some other things I'm forgetting.
There are some people who would exclude all of those as enhancements because they don't care about them (yes, even Unicode; I've seen some people on here argue against supporting anything other than ASCII).
Unicode is a fair point; I do speak a language that has a couple of letters that are affected, and of course many more people across the world are far more affected by it. I didn't really consider it part of the desktop environment though, but I can see the argument for why it might be (the file manager, for example, will need to deal with it, as would translations in the menus, etc.).
I was primarily thinking about enhancements to the user interactions, things you see on a day-to-day basis. You really don't see whether a system uses Unicode, ASCII, ISO-somenumber, Shift-JIS, etc. (except when transferring information between systems).
Basically, the short answer is that most memory managers allocate more memory than a process needs, and then reuse it.
For example, in a JVM (Java) or .NET (C#) process, the garbage collector allocates some memory from the operating system and keeps reusing it as it finds free memory and the program needs it.
These systems are built with the assumption that RAM is cheap and CPU cycles aren't, so they are highly optimized CPU-wise, but otherwise are RAM inefficient.
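As a toy sketch of the general idea (emphatically not how HotSpot or the CLR actually work): a runtime-style allocator grabs one big block from the OS and keeps reusing it, so the process footprint sits at the high-water mark even when little of that memory is live.

    /* Toy arena allocator: reserve one big block up front, hand out
     * pieces, and "free" by resetting -- memory is reused, never
     * returned to the OS.                                            */
    #include <stdlib.h>
    #include <stddef.h>

    typedef struct { char *base; size_t cap, used; } Arena;

    static Arena arena_new(size_t cap) {
        Arena a = { malloc(cap), cap, 0 };   /* grabbed once, kept around */
        return a;
    }

    static void *arena_alloc(Arena *a, size_t n) {
        if (a->base == NULL || a->used + n > a->cap)
            return NULL;                 /* real runtimes would grow here */
        void *p = a->base + a->used;
        a->used += n;
        return p;
    }

    static void arena_reset(Arena *a) {
        a->used = 0;                     /* reuse, don't give back */
    }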
Completely agree, it would be very helpful to get even just a breakdown of what the RAM is being used for. It's unfortunately a lot of work to instrument.
> Sublime consumes 200MB. I have 4 text files open. What is it doing?
To add to what others have said: depending on the platform, a good amount will be the system itself, various buffers and caches. If you have a folder open in the sidebar, Sublime Text will track and index all the files in there. There's also no limit on the undo history, which is kept in RAM.
There's also the possibility that the 200MB includes subprocesses, meaning the two Python plugin hosts and any processes your plugins spawn - which can include heavy LSP servers.
It's partly because there are layers of abstractions (frameworks, libraries / runtimes / VM, etc). Also, today's software often has other pressures, like development time, maintainability, security, robustness, accessibility, portability (OS / CPU architecture), etc. It's partly because the complexity / demand has increased.
>>I'm always confused as hell how little insight we have in memory consumption.
>>I look at memory profiles of normal apps and often think "what is burning that memory".
Because companies starting with Microsoft approach it as an infinite resource, and have done so literally for generations of programmers — it is now ancient tradition.
Back in the 16-bit x86 days, when both memory and memory handles were constrained (64k of them, IIRC), I went to an MS developer conference. One problem starting to plague everyone was users' computers running out of memory when actual memory in use was less than half; the problem was not that memory was used up, but that all available handles were consumed.
I randomly ended up talking to the (at the time) leader of the Excel team, so I thought I'd ask him about good practices: "Does it make sense to have the software look at the task, estimate the full amount of RAM required, allocate it off one handle, and track our usage ourselves within that block?" I was speechless when he answered: "Sure, if you wanted to optimize the snot out of it — we just allocate another handle."
That two-line answer just blew my mind and instantly explained so much about problems I saw at the time, and since.
It also made sense in the context of another talk they gave at a previous conference, where the message was that they anticipate the increased power of the next generation of hardware and write their new version for that hardware, not the then-current hardware. It makes sense, but in this new light it seems almost like a cousin of planned obsolescence: "How can we squander all the new power Intel is giving us?" And the result is that decades after word processing and spreadsheets had usable performance on 640K DOS machines, new machines with orders of magnitude more power and RAM actually run slower from the user's perspective.
I'm hoping this memory crunch (having postponed a memory upgrade for my daily driver and now noticing it is 10x the price) will at least have the benefit of driving developers to get back some of the craft of designing with optimization in mind.
Software engineers seem to be more and more abstracted from the hardware they use. Back in the day you also had to worry about things like IRQs and I/O ports, and about optimising away tiny amounts of latency, which you rarely do now.
Personally I am fine with programmers not spending tons of time optimising down to every last piece because we do have so much more ram and compute relative to the old days. My bigger issue is that things are also a laggy mess even when there is plenty of resources available. I understand these things go hand in hand but I would much rather see more optimisations for the things users will actually notice than just going for metrics. A nice combo of the two would be ideal.
That being said, what's probably most appalling is how often some modern programs hard-crash even when they have plenty of resources.
A lot of programs over-allocate virtual memory but don't actually use it, and the OS is smart enough to just pretend it allocated it. I'm sure there's some justification for it somewhere, but it's hard not to see it as an absurd, organically achieved agreement: developers used to ask for more memory than their applications actually needed and caused all sorts of OOM problems for end users; OS developers realised this and made the OS lie to the app, telling it it got what it asked for and only giving it memory as needed; and now developers can't be bothered to request a realistic amount of memory, because what's the point, the OS is going to ignore it anyway.
Electron really loves to claim absurd amounts of memory; e.g. Slack has claimed just over 1TB of virtual memory while only using just north of 200MB.
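For the curious, that kind of number usually comes from reserving address space rather than memory; a rough Linux-flavored sketch of the mechanism (assuming a 64-bit system, and not what Electron literally does):

    /* Reserve a huge virtual range without committing RAM: PROT_NONE +
     * MAP_NORESERVE asks for address space only.  Virtual size jumps by
     * 1 TB, but no physical pages or swap are committed until pieces
     * are made accessible and touched.                                  */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void) {
        size_t len = 1UL << 40;              /* 1 TB of address space */
        void *p = mmap(NULL, len, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Later, commit a small piece on demand: */
        mprotect(p, 1UL << 20, PROT_READ | PROT_WRITE);   /* 1 MB usable */

        printf("reserved %zu bytes at %p\n", len, p);
        getchar();   /* pause: virtual size is huge, RSS is tiny */
        return 0;
    }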
The 390GB is because of implementation details of GPU drivers. It's not wrong but it also doesn't matter at all.
(The only number that actually means anything there is the first one, but the label for it is basically meaningless. Use `footprint` in Terminal for a better explanation of memory use.)
I look at memory profiles of normal apps and often think "what is burning that memory".
Modern compression works so well, so what's happening? Open your task manager and look through the apps and you might ask yourself this.
For example (let's ignore Chrome, MS Teams and all the other bloat), Sublime consumes 200MB. I have 4 text files open. What is it doing?
Chrome alone took YEARS to implement tab suspend, despite everyone being aware of the issue. And add-ons existed that were able to do this.
I bought more RAM just for Chrome...