TAMU Operating System 2 Memory Management

Dynamic memory allocation

When does new memory get allocated in the kernel?

New process gets created (fork())
New program gets loaded into memory (execve())
Process stack grows
Process expands heap (malloc())
Process “creates” (attaches to) shared memory segment (shmat())
Map a file to memory (mmap())

Fragmentation

External fragmentation
- Has enough memory, but the free memory are distributed and cannot be grouped to host a large block
Internal fragmentation
- Use memory blocks
- Wasted space inside blocks

Allocation

Use free list to maintain free spaces.

Buddy system allocation:

Maintain multiple free lists, one for each power-of-2 size.

Allocation:

Increase size of request to next power of 2*.
Look up segment in free lists.
If exists, allocate.
If none exists, split next larger segment in half, put first half(the “buddy”) on free list, and return second half.

De-Allocation:

Return segment to free list.
Check if buddy is free. If so, coalesce.

Address spaces

All user programs think their memory spaces starts from 0x0, which are virtual addressing.

Paging

Naive relocation causes (external) fragmentation. Solve it by partition memory into equal-size portions.

Paging

Last frame may not be completely full.
Average internal fragmentation per block is typically half frame size.
- Large frames cause more fragmentation.
- Small frames cause more overhead (page table size, disk I/O)

Page table

Issues:

Page table must store one entry per page.
Naive page table can become very large.
- 32-bit address space, 4kB pages, 4B entries
- 4MB for each process

Hierarchical (Multilevel) paging

Page the page table

Multi-level paging

Too many levels will increase the search time
- Needs to read memory more times

Inverted page table

Inverted paging

Pros:

Scales with physical memory
One table for whole system

Cons:

Long search times for large physical memories

Hashed page table

Hashed paging

Pros:

Scales with physical memory
One table for whole system

Cons:

How about collisions?

Translation Lookaside Buffers (TLB)

Use CPU to cache page number and frame number, allowing fast lookups.

TLB

Split virtual address, use page number.
Look in the TLB to see if we find translation entry for page.
- If YES, use frame number.
- If NO, system must locate page entry in main-memory-resident page table, load it into TLB, and start again.

Note:

One TLB per address space (process)
- Need to flush TLB for every process switch.
One TLB for entire system
- Add address space ID in TLB entries

Software-managed TLBs, MIPS style

TLB MIPS

Principle: Make do with as little hardware as possible.
Apart from a register with the ASID, the MMU is just the TLB.
- The rest is all implemented in software!
When TLB cannot translate an address, a special exception (TLB refill) is raised.
- Software then takes over.

TLB Refill Exception:

Figure out if this virtual address is valid.
- If not, trap to handling of address errors.
If address is valid, construct TLB entry.
If TLB already full, select an entry to discard.
Write the new entry into the TLB.

Segmentation

Users think of memory in terms of segments (data, code, stack, objects, …). Segmentation allows for efficient allocation and management of semantically-related data segments

Segmentation

Pros:

Data in a segment typically semantically
related
Protection can be associated with segments
- read/write protection
- range checks for arrays
Data and code sharing

Cons:

External fragmentation

Segmented paging

To elimimate external fragmentation.

Segmented paging

Paging on x86

Paging x86

CR0 controls whether to use paging
CR3 loads the page directory table address (the last entry in this table points to the table itself)
The page fault is Exception 14
- err_code stores error code
- offending address stored in CR2
use bit set whenever page is referenced
- to identify recently-used pages
dirty bit set whenever page is written
- to minimize I/O write operations
- don’t need to write while not dirty
- clear dirty bit when paging in

Page table entry

Page out

We may swap out entire processes from memory to disk
Can partially swap out memory of processes
- Processes never use all their memory at once

Page fault

Process of a page fault:

Page fault

If all frames are used:

Select a victim frame
Update page table entry of victim to be invalid
Page victim out

Policies

Locality of Reference: A program that references a location n at some point in time is likely to reference the same location n and locations in the immediate vicinity of n in the near future.

We can duplicate valid bit by a software-valid bit and have the kernel turn off the valid bit. The other bits can then be simulated in software.

Paging replacement policies:

FIFO
- Replace the page that has been in memory for the longest period of time.
Optimal
- Replace that page which will not be used for the longest period of time (in the future)
LRU
- replace the page that has not been accessed for longest period of time (in the past)
2nd chance: Associate a use_bit with every frame in memory.
- Upon each reference, set use_bit to 1.
- Keep a pointer to next “victim candidate” page.
- To select victim: If current frame’s use bit is 0, select frame and increment pointer. Otherwise delete use bit and increment pointer.
Enhanced 2nd chance
- Consider read/write activity of page: dirty bit
- Algorithm same as 2nd Chance Algorithm, except that we scan for pages with both use bit and dirty bit equal to 0.
- A dirty page is not selected until two full scans of the list later.

Each process requires a certain minimum set of pages to be resident in memory to run efficiently. Thrashing will happen when the basic needs of processes in a system cannot be satisfied.

Work set model

The memory management strategy follows two rules:

Rule 1: At each reference, the current working set is determined and only those pages belonging to the working set are retained in memory.
Rule 2: A program may run only if its entire current working set is in memory.

Problems:

Difficulty in keeping track of working set.
Estimation of appropriate window size.
- too small -> page out useful pages
- too big -> waste memory

Page buffering

Victim frames are not overwritten directly, but are removed from page table of process, and put into:
- free frame list (clean frames)
- modified frame list (modified frames)
Victims are picked from the free frame list in FIFO order.
If referenced page is in free or modified list, simply reclaim it.
Periodically (or when running out of free frames) write modified frame list to disk and add now clean frames to free list.

Recursive page table lookup

Idea is use the last entry in page directory table to point to the table itself.

Access PTE:

PTE access

Access PDE:

PDE access

Reference

This is my class notes while taking CSCE 611 at TAMU. Credit to the instructor Dr. Bettati.

AI 2

Algorithm 17

Amazon 1

Authorization 1

Blog 3

Bootstrap 1

C++ 1

CCpp 5

CSS 2

Cloud 3

Code 1

Crawler 1

DNS 1

Database 17

DeepLearning 1

Design 16

Development 1

Docker 1

English 1

Express 1

GDB 1

Go 3

Google 4

HTML 3

IOS 1

Java 17

Javascript 4

Jekyll 1

Linux 4

MacOS 2

MachineLearning 16

Markdown 4

Mobile 1

MongoDB 2

Multi-threading 3

NAS 1

Network 11

NeuralNetwork 10

Node 1

OS 8

Public-speaking 1

Python 15

RESTful 1

Rails 9

React 1

Redis 1

Ruby 6

Shell 2

Spring 2

System 16

TCP 1

TDD 1

Thread 2

Vim 1

awk 1

git 1

jQuery 1

media 1

network 1

php 1