Anatomy of a Program in Memory, by Gustavo Duarte

The next post discusses how the kernel keeps track of these memory areas. Coming up we'll look at memory mapping, how file reading and writing ties into all this, and what memory usage figures mean.

Comments: Responses to Anatomy of a Program in Memory

JP on January 27th: Thank you! Your posts are some of the most informative I've ever found on the internet. I'm just researching this subject right now and your timing couldn't have been better!

Gustavo Duarte on January 27th: JP: sweet, glad it came at a good time.

michele on January 27th: You're welcome! Your writing is very good! I have a question. I'm studying some code and I need documentation on Linux internals, specifically on process memory management. Can you suggest any books or other documentation?

Gustavo Duarte on January 27th: michele: Take a look at the end of this post.

It has a list of Linux kernel books. They're dry reading, but the authors put monumental effort into covering everything. The Intel manuals are free and also excellent. These books are the best resources I know of.

I hope to write more material for this blog and eventually maybe have a short Intro to the Linux Kernel document online. However, this is subject to my work schedule and so on.

I'll read the docs you suggest and keep reading your blog, thanks!

I've been reading your blog for a while now and found a lot of your other posts really informative and easy to read. Keep up the good work! Quite shocked to learn that Windows takes double the kernel memory compared to Linux.

I'm wondering, though, why does the kernel space consume 1GB? That seems like a lot. And if you don't mind divulging a trade secret, what do you use to draw your diagrams?

Reader1 on January 27th: Great post. You left some stuff out though. Rarely is constant data such as strings held in... Good post though, I like the graphics.

Jose V. on January 27th: By the way, this made it to the front page of reddit, so brace for incoming traffic.

Only for completeness, can you include the program's parameters? I think they go at the bottom of the stack, but I'm not sure. In any case, of course, they are put there by the kernel when an exec* call is made. Regards.

Sushant Srivastava on January 27th: Thank you for this wonderful post.

Gustavo Duarte on January 27th: Thank you all for the feedback! Thanks for the question, though; I clarified this in the post. Both the Linux and Windows kernels are extremely well built. It's hard to find areas where one really has an edge, imho. Two outstanding pieces of software. When the kernels were designed, 2 or 3GB seemed like a lot, so partially it's an evolutionary artifact, one that...

But also, it is good for performance to give the kernel an ample window into memory. I think the next couple posts should clarify why this is. Reader1: thanks for the corrections.

I'll add the heap randomization to the post. Regarding ELF sections, I thought about them, but I'm always balancing what to include in these blog posts. I try hard to keep things crisp, covering one area well, but without dumbing anything down. There's so much interconnected stuff that it's not always clear where to draw the line. I think I'm going to leave ELF sections out for now though. It's the tradeoff between conciseness and completeness.

I'm planning a post covering the stack in detail and talking about buffer overflows, and I think that'd fit in well there.

Carlos on January 27th: Thanks for the post, it is very informative. Can you tell me what software you use to create the graphics?

Gustavo Duarte on January 27th: I use Visio for the diagrams.

Chaitanya Gupta on January 27th: As everyone has said, great post. I am looking forward to your follow-up posts. What did you use to create the diagrams?

NoName on January 27th: This blog includes very interesting articles. Continue your good work and never give up! These are some really informative posts you write. You should consider writing a book on the internals of systems-level software.

David on January 27th: Nice post.

Very clear. I'll keep reading you. Well written and concise. Looked at your Getting Physical With Memory post and it's good too.

John on January 27th: You mentioned, "Each thread in a process gets its own stack." Could you clear up my confusion?

Sesh on January 27th: I will try to thank you in a simple way: for a long time, doubts about where string literals stay in memory would linger in my mind, but so far I was not able to find any easy explanation anywhere.

This post makes it clear now. Thank you very much. Can't wait for the next articles in this series.

Raam Dev on January 27th: I just finished an Introduction to C Programming class and this beautifully written post is a godsend for helping me further my understanding of memory management. Thank you so much!

Gustavo Duarte on January 28th: First off, thank you all for the feedback!

It is great to hear that the post helped out a little bit. Contributing to the community is one of the major reasons I write this stuff, though it doesn't hurt that it's fun.

I see a few issues though: (1) I want to keep the content free, no matter what; (2) the color would be gone in a normal book; (3) the links would be gone. Lately I've been thinking about maybe collecting all the stuff once there's enough, and having an online book of sorts. I really had no idea where this blog would go, though now it's becoming a bit clearer. So yeah, I'm munching on it. John: you are correct.

Basically, the set of threads in a thread group shares all the memory regions except for the stack and thread-local storage. Does this help clear it up? I could dig up the relevant links to kernel code if you'd like to see the stuff in action. Let me know.

Nice articles. These will be very helpful to newbies, students, and anyone who wants to learn about computers. Keep it up!

John on January 28th: Gustavo, thank you so much. All is clear. I used to have a book on the Linux kernel where the code was also annotated, but I unfortunately just don't have the time, so your excellent posts and articles are greatly appreciated!

Ulver on January 28th: Interesting article, very didactic and with figures!

Chanux on January 28th: Great post. I want to learn the art of writing great articles like this. I was looking for a way to get into kernel-level stuff. There won't be any better source than this. Subscribed to RSS. Looking for Twitter. The drawings are neat! What software do you use for them?

Ben Fowler on January 28th: Once again, great article! I think this blog is one of the best websites on introductory OS internals I've seen yet.

Anything beyond that, and I need to start reading my copy of Hennessy and Patterson. I thought I'd spotted a typo in one of the diagrams, but no, it turns out you've really shown attention to detail in these articles. Nice work.

Good job.

JP on January 29th: If you decide to do a small online book with such great content on all aspects of OS management, I will gladly buy it!

Thanks!

IvanM on February 2nd: Very clear explanation of memory, whether physical or virtual. Thank you again!

Prabhu on February 2nd: Hi Gustavo, the explanation was very clear and informative. One request...

Gustavo Duarte on February 2nd: Thank you all for the feedback. Prabhu: that's a great topic. There's a good book about this called Linkers and Loaders.

It's from a while back, so I'm not sure how much has changed since. I'm going to add this to my write queue, though I have no idea when the post would actually come out.

Asmita on February 4th: It's a great post, very helpful. I'm really waiting for the next one, as I'm not too clear on how the heap works. Keep writing. Thanks a lot for sharing this helpful content.

Nix on February 5th: Another excellent series on linkers is Ian Lance Taylor's article series, starting near the bottom of the linked page and proceeding onwards for several pages.

Nix on February 5th: Oh, curses. Fixed link to the linkers series start.

Gustavo Duarte on February 5th: Nix: great link, thanks!

Raminder on February 7th: Hi Gustavo, thank you for all your excellent articles. I have a question, two actually. As you've said, each thread has its own stack area. How are these stack areas located with respect to each other? The second question is similar.

Does each thread have two...

Gustavo Duarte on February 10th: Raminder: you're welcome! Sorry for the delay in an answer here, but I've been swamped with work these past few days. Can you drop me an email so that I can let you know when I've replied here?

Gustavo Duarte on February 13th: Raminder: Sorry for the delay, I've been working quite a bit lately.

Per our email, I'll talk about Windows only. I don't know where in the virtual space the thread stacks go. I googled briefly but didn't see an answer, so I think the easiest thing to do is to write a short test program that spawns a few threads, each calling a function that prints the address of a local variable. If you do a loop of 10 or so, the pattern should become clear.
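For example, here is a minimal sketch of such a test program (shown with POSIX threads on Linux for illustration; the same idea works with CreateThread on Windows):

    /* Spawn a few threads; each prints the address of one of its own
     * local variables, revealing where the per-thread stacks live.
     * A sketch only: thread count and stack layout vary by OS. */
    #include <pthread.h>
    #include <stdio.h>

    #define NUM_THREADS 4

    void *report_stack(void *arg) {
        int local_var;                    /* lives on this thread's stack */
        printf("thread %ld: local variable at %p\n",
               (long)arg, (void *)&local_var);
        return NULL;
    }

    int main(void) {
        pthread_t threads[NUM_THREADS];

        for (long i = 0; i < NUM_THREADS; i++)
            pthread_create(&threads[i], NULL, report_stack, (void *)i);

        for (long i = 0; i < NUM_THREADS; i++)
            pthread_join(threads[i], NULL);

        return 0;
    }

Compile with gcc -pthread and run it a couple of times; the spacing between the printed addresses shows how the thread stacks sit relative to one another.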

The kernel stack is kept in kernel-mode data structures and can't be touched from user mode. Hope this helps.

Jakcy on February 17th: I am from China. Although my English is not so good, I like your articles. I am going to read all the articles on your blog.

Alex on March 8th: Great post. Some questions: so what I understood is that a process cannot use more than 3GB. I remember that I've seen processes that are using more than 3GB as far as top is concerned, but I could be wrong (32-bit system).

Also, for example, in top, why isn't the 1GB reserved for the kernel added to the VIRT figure?

Nagareddy on March 20th: Very useful and comprehensive, to the point. How do you know the Windows size limits?

Gustavo Duarte on March 23rd: Nagareddy: which size limits?

But regardless of which limits, they probably came from either the Windows Internals book, Windows header files, or Intel literature.

Gustavo Duarte on March 23rd: Alex, thanks! That's why, for example, the memcached folks tell you to run multiple instances when your box has more than 3GB and is running in 32-bit mode with PAE. Regarding the numbers in top, that would be interesting to see.

It could be a quirk with the numbers themselves, or it could be that there's some exception going on, but in general your understanding is correct: processes can't use more than 3GB.

It's just a design decision; why worry about it, since it's there for every process?

maverick: Does it grow upwards or downwards? I remember it grows upwards. In your two figures, it's drawn differently.

Narayanan on April 30th: Hi, I have a doubt regarding malloc allocating memory. How does malloc store information about the size of an allocation, given that free takes only the pointer as an argument and not the size?

Thanks in advance.

Keith Johnson on April 30th: Awesome post! Indeed, memory management cannot be overlooked.

Gustavo Duarte on May 3rd: maverick: in x86 Linux it grows as shown in the diagrams, but this varies by CPU architecture and kernel.

Keith: thanks!

Narayanan: malloc does its own housekeeping to know how much was allocated for each pointer. The best place to check this out is the libc source code for malloc and free.
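For illustration, here is a toy sketch of that bookkeeping idea: stash the size in a small header just before the pointer handed back to the caller. Real allocators such as glibc's are far more sophisticated (boundary tags, bins, arenas), but the principle is similar.

    /* Toy allocator wrapper: remember the size in a header just before
     * the returned pointer; free() reads it back. Illustrative only,
     * not how glibc actually implements malloc. */
    #include <stdio.h>
    #include <stdlib.h>

    void *my_malloc(size_t size) {
        size_t *block = malloc(sizeof(size_t) + size);  /* header + data */
        if (!block) return NULL;
        block[0] = size;           /* bookkeeping: record the size       */
        return block + 1;          /* hand back memory after the header  */
    }

    void my_free(void *ptr) {
        if (!ptr) return;
        size_t *block = (size_t *)ptr - 1;   /* step back to the header  */
        printf("freeing %zu bytes\n", block[0]);
        free(block);
    }

    int main(void) {
        char *p = my_malloc(100);
        my_free(p);                /* prints "freeing 100 bytes" */
        return 0;
    }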

Brian on June 4th: Thanks for this post, Gustavo. I have a question, though. I'm mainly a sysadmin, not a low-level developer, but I need to understand this stuff for low-level debugging at the system level. Near the top of this post, you mention "ring 2 or lower" as if we should all just know what that means, and I'm sorry to say that I do not. Could you point me to a doc that'll explain that, or could you expand on what this notion of rings relates to?

In the example above, Firefox has used far more of its virtual address space due to its legendary memory hunger.

The distinct bands in the address space correspond to memory segments like the heap, stack, and so on. Keep in mind these segments are simply a range of memory addresses and have nothing to do with Intel-style segments.
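On Linux you can see these bands for any live process by reading /proc/<pid>/maps; as a small sketch, this program dumps the current process's own map:

    /* Dump this process's memory map (assumes Linux). Each line of
     * /proc/self/maps describes one mapped region: its address range,
     * permissions, and the backing file, if any. */
    #include <stdio.h>

    int main(void) {
        FILE *maps = fopen("/proc/self/maps", "r");
        if (!maps) { perror("fopen"); return 1; }

        char line[512];
        while (fgets(line, sizeof line, maps))
            fputs(line, stdout);      /* stack, heap, libraries, etc. */

        fclose(maps);
        return 0;
    }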

Anyway, here is the standard segment layout in a Linux process:

[diagram: standard segment layout in a Linux process]

When computing was happy and safe and cuddly, the starting virtual addresses for the segments shown above were exactly the same for nearly every process in a machine.

This made it easy to exploit security vulnerabilities remotely. An exploit often needs to reference absolute memory locations: an address on the stack, the address for a library function, etc.

Remote attackers must choose this location blindly, counting on the fact that address spaces are all the same. When they are, people get pwned. Thus address space randomization has become popular. Linux randomizes the stack, memory mapping segment, and heap by adding offsets to their starting addresses. Unfortunately, the 32-bit address space is pretty tight, leaving little room for randomization and hampering its effectiveness.
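You can observe the randomization with a small sketch like the one below: print an address from each region, run the program twice, and compare. With randomization on, the stack and heap addresses change between runs; disable it (for instance via setarch -R, assuming a Linux box) and they stay fixed.

    /* Print one address from each major region, then compare runs. */
    #include <stdio.h>
    #include <stdlib.h>

    int global_var;                       /* data segment */

    int main(void) {
        int local_var;                    /* stack */
        void *heap_ptr = malloc(16);      /* heap  */

        printf("code  : %p\n", (void *)main);
        printf("data  : %p\n", (void *)&global_var);
        printf("heap  : %p\n", heap_ptr);
        printf("stack : %p\n", (void *)&local_var);

        free(heap_ptr);
        return 0;
    }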

The topmost segment in the process address space is the stack, which stores local variables and function parameters in most programming languages. Calling a method or function pushes a new stack frame onto the stack.

The stack frame is destroyed when the function returns. This simple design, possible because the data obeys strict LIFO order, means that no complex data structure is needed to track stack contents: a simple pointer to the top of the stack will do. Pushing and popping are thus very fast and deterministic.
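As a small illustration, printing the address of a local variable at each level of a nested call shows the frames being pushed one below the other (on x86; exact addresses vary with compiler and randomization):

    /* Each nested call pushes a new frame below the previous one. */
    #include <stdio.h>

    void callee(void) {
        int callee_local;
        printf("callee local at %p\n", (void *)&callee_local);
    }

    void caller(void) {
        int caller_local;
        printf("caller local at %p\n", (void *)&caller_local);
        callee();                 /* new frame, lower address on x86 */
    }

    int main(void) {
        caller();
        return 0;
    }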

Also, the constant reuse of stack regions tends to keep active stack memory in the CPU caches, speeding up access. Each thread in a process gets its own stack. It is possible to exhaust the area mapping the stack by pushing more data than it can fit.

This triggers a page fault, and as long as the stack is still below its maximum size (RLIMIT_STACK, usually 8MB), the kernel grows the stack area. This is the normal mechanism whereby stack size adjusts to demand. However, if the maximum stack size has been reached, we have a stack overflow and the program receives a Segmentation Fault.

While the mapped stack area expands to meet demand, it does not shrink back when the stack gets smaller. Like the federal budget, it only expands. Dynamic stack growth is the only situation in which access to an unmapped memory region, shown in white above, might be valid. Any other access to unmapped memory triggers a page fault that results in a Segmentation Fault. Some mapped areas are read-only, hence write attempts to these areas also lead to segfaults.
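To watch the stack limit being hit, a sketch like the following recurses until the process receives SIGSEGV. It assumes Linux with the usual soft limit (check it with ulimit -s) and should be compiled without optimization so the recursion is not turned into a loop.

    /* Recurse until the stack's size limit is reached and the process
     * dies with SIGSEGV (a stack overflow). */
    #include <stdio.h>

    long recurse(long depth) {
        char frame[1024];                 /* consume some stack per call */
        frame[0] = 0;
        if (depth % 1000 == 0)
            printf("depth %ld, frame at %p\n", depth, (void *)frame);
        return recurse(depth + 1) + frame[0];   /* eventually segfaults */
    }

    int main(void) {
        return (int)recurse(0);
    }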

Below the stack, we have the memory mapping segment. Here the kernel maps contents of files directly to memory. It is also possible to create an anonymous memory mapping that does not correspond to any files, being used instead for program data.
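Both kinds of mapping are requested through mmap(); a minimal sketch, assuming Linux/POSIX and with error checks mostly omitted:

    /* One file-backed mapping and one anonymous mapping, both landing
     * in the memory mapping segment. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        /* file-backed: the file's contents appear directly in memory */
        int fd = open("/etc/hostname", O_RDONLY);   /* any readable file */
        void *file_map = mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, fd, 0);

        /* anonymous: zero-filled memory not backed by any file */
        void *anon_map = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        printf("file-backed mapping at %p\n", file_map);
        printf("anonymous mapping at  %p\n", anon_map);

        munmap(anon_map, 4096);
        munmap(file_map, 4096);
        close(fd);
        return 0;
    }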

In Linux, if you request a large block of memory via malloc(), the C library will create such an anonymous mapping instead of using heap memory. Speaking of the heap, it comes next in our plunge into address space. The heap provides runtime memory allocation, like the stack, meant for data that must outlive the function doing the allocation, unlike the stack.

Most languages provide heap management to programs.
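Tying the last two points together, here is a sketch (assuming Linux with glibc, whose default mmap threshold is 128 kB) showing that a small and a large malloc tend to land in different regions:

    /* Compare where small and large allocations end up. The small one
     * typically comes from the heap (program break); the large one is
     * typically served by an anonymous mmap, far away in the address
     * space. Exact behavior depends on the allocator. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void) {
        void *small = malloc(100);              /* likely heap (brk)     */
        void *large = malloc(1024 * 1024);      /* likely anonymous mmap */

        printf("program break : %p\n", sbrk(0));
        printf("small malloc  : %p\n", small);
        printf("large malloc  : %p\n", large);

        free(small);
        free(large);
        return 0;
    }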



