File System Journaling: Mechanisms, ext3, and NTFS Recovery

Posted by Anonymous and classified in Computers

Journaling Motivation and Necessity

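The file system checker (fsck) restores metadata consistency after a crash, but it is slow because it must scan the entire file system, and it requires deep knowledge of the on-disk structures. Ideally, recovery time should depend on the number of recent writes rather than on the size of the disk, which is what journaling provides.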

File System Transactions and ACID Properties

Transactions provide ACID guarantees:

  • Atomicity
  • Consistency
  • Isolation
  • Durability

File system operations such as file creation, which touch several metadata blocks, are treated as transactions. After a crash, recovery ensures that committed transactions are fully applied and uncommitted ones are discarded.

ext3 Journaling File System

ext3 is a journaling file system using physical redo logging, adding journaling to existing ext2 structures.

Redo Logging Mechanism in ext3

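Updates are first written to the journal, then the transaction is committed, and only then are the in-place writes performed. Crash recovery replays the committed transactions from the journal. Replay is idempotent: each record holds the full new contents of a block, so applying it twice gives the same result as applying it once, and a crash during recovery is harmless.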
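A minimal sketch of redo replay, assuming a toy model in which the disk is a dictionary of block contents and the journal is a list of tuples. The record layout and the `replay` name are illustrative assumptions, not ext3's on-disk format.

```python
def replay(journal, disk):
    """Apply every committed transaction in the journal to the disk."""
    pending = {}  # txid -> list of (block_no, new_contents) not yet committed
    for record in journal:
        kind, txid = record[0], record[1]
        if kind == "begin":
            pending[txid] = []
        elif kind == "write":
            _, _, block_no, contents = record
            pending.setdefault(txid, []).append((block_no, contents))
        elif kind == "end":
            # TxEnd seen: the transaction committed, so copy its blocks home.
            for block_no, contents in pending.pop(txid, []):
                disk[block_no] = contents
    # Anything still in `pending` never committed and is simply ignored.
    return disk


journal = [
    ("begin", 1), ("write", 1, 7, "inode v2"), ("write", 1, 9, "bitmap v2"), ("end", 1),
    ("begin", 2), ("write", 2, 7, "inode v3"),  # crash happened before TxEnd
]
disk = {7: "inode v1", 9: "bitmap v1"}

once = replay(journal, dict(disk))
twice = replay(journal, dict(once))  # replaying again changes nothing
```

Transaction 2 has no end record, so its write to block 7 never reaches the disk, and running `replay` a second time leaves the result unchanged.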

Physical Block Logging Details

Entire physical blocks are logged even for small updates (e.g., inode updates).
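To make the cost concrete, here is a small sketch that builds the journal payload for a single inode update. The 4096-byte block and 128-byte inode sizes are assumptions chosen for illustration, and the function name is hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes
INODE_SIZE = 128   # assumed on-disk inode size in bytes


def journal_payload_for_inode_update(block: bytes, inode_index: int, new_inode: bytes) -> bytes:
    """Return what physical logging writes to the journal: the whole updated block."""
    assert len(block) == BLOCK_SIZE and len(new_inode) == INODE_SIZE
    offset = inode_index * INODE_SIZE
    return block[:offset] + new_inode + block[offset + INODE_SIZE:]


inode_table_block = bytes(BLOCK_SIZE)  # a zero-filled inode table block
payload = journal_payload_for_inode_update(inode_table_block, 3, b"\x01" * INODE_SIZE)

# Count how many bytes actually differ between the old and the logged block.
changed = sum(a != b for a, b in zip(inode_table_block, payload))
```

Under these assumptions only 128 bytes change, yet the journal receives all 4096 bytes of the block.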

ext3 Write Strategies and Protocol

Common strategies include:

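  • Serial writes: issue each journal write and wait for it to complete before issuing the next. Safe but slow.
  • Simultaneous writes: issue all journal writes at once. Fast but risky, because the disk may reorder them and TxEnd could become durable before the rest of the transaction.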

ext3 uses a staged protocol: write everything except Transaction End (TxEnd), then write TxEnd, then checkpoint.
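The staging can be imitated in user space with an ordinary file standing in for the journal. This is only an analogue: ext3 orders writes inside the kernel, whereas this sketch uses `os.fsync` as the barrier between stages. The record contents and the `commit_transaction` name are made up for the example.

```python
import os
import tempfile


def commit_transaction(journal_fd, records):
    """Append the records, flush, then append the TxEnd marker and flush again."""
    # Stage 1: everything except TxEnd. These writes may reach disk in any order.
    for record in records:
        os.write(journal_fd, record)
    os.fsync(journal_fd)  # barrier: stage 1 is durable before TxEnd is issued
    # Stage 2: TxEnd on its own, so it can never precede the records it covers.
    os.write(journal_fd, b"TXEND\n")
    os.fsync(journal_fd)
    # Stage 3 (checkpointing to the home locations) would follow here.


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "journal.bin")
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        commit_transaction(fd, [b"TXBEGIN 1\n", b"BLOCK 7 inode-v2\n"])
    finally:
        os.close(fd)
    with open(path, "rb") as f:
        journal_bytes = f.read()
```

Two flushes per transaction cost less than one per block (serial writes) while still ruling out a TxEnd that lands before its transaction.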

Journal Structure and Management

The journal is a circular buffer with a superblock recording the start and end. Entries are deallocated after checkpoint.
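A circular journal can be sketched as a fixed array plus two counters. The class and method names below are hypothetical; the `start` and `end` counters play the role of the values kept in the journal superblock.

```python
class CircularJournal:
    """Fixed-size log whose space is reclaimed as entries are checkpointed."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.start = 0  # sequence number of the oldest entry still needed
        self.end = 0    # sequence number the next entry will receive

    def used(self):
        return self.end - self.start

    def append(self, entry):
        if self.used() == self.capacity:
            raise RuntimeError("journal full: checkpoint before logging more")
        self.slots[self.end % self.capacity] = entry  # wrap around the array
        self.end += 1

    def checkpoint(self, count):
        """Release the `count` oldest entries once their blocks are written in place."""
        count = min(count, self.used())
        self.start += count
        return count


demo = CircularJournal(capacity=3)
for entry in ("tx1", "tx2", "tx3"):
    demo.append(entry)
demo.checkpoint(2)   # tx1 and tx2 are now safely in place
demo.append("tx4")   # reuses the slot that held tx1
```

When the journal is full, new transactions must wait until a checkpoint frees space, which is why checkpointing cannot be postponed indefinitely.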

ext3 Journaling Modes

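  • Data mode (mount option data=journal): logs both data and metadata (most consistent, highest cost, since every data block is written twice).
  • Ordered mode (default): logs metadata only; data blocks are written in-place before the metadata that refers to them is committed to the journal.
  • Writeback mode: logs metadata only and does not order data writes relative to the journal; fastest, but a file may contain junk data after a crash.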

Transaction Batching

ext3 collects multiple updates into one transaction using an in-memory dirty block list to reduce redundant updates.
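A rough sketch of the batching idea, assuming an in-memory transaction that keeps only the latest contents of each dirty block. The `Transaction` class and its record tuples are illustrative, not ext3 data structures.

```python
class Transaction:
    """Collects block updates in memory; repeated updates to a block coalesce."""

    def __init__(self):
        self.dirty = {}  # block_no -> latest contents

    def update(self, block_no, contents):
        self.dirty[block_no] = contents  # overwrites any earlier version

    def journal_records(self):
        writes = [("write", block_no, contents)
                  for block_no, contents in sorted(self.dirty.items())]
        return [("begin",)] + writes + [("end",)]


tx = Transaction()
# Creating three files in one directory touches the same two blocks three times.
for n in (1, 2, 3):
    tx.update(12, f"inode bitmap: {n} inodes in use")
    tx.update(40, f"directory: {n} entries")

records = tx.journal_records()
```

Six updates produce only two logged blocks, each holding its final contents.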

Summary of ext3 Operation (Ordered Mode)

  1. Write data.
  2. Journal Transaction Begin (TxBegin) plus metadata.
  3. Write Transaction End (TxEnd).
  4. Checkpoint metadata.
  5. Update journal superblock.
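The five steps can be walked through in a toy simulation that stops at a chosen step to mimic a crash and then runs recovery. The dictionaries, step names, and function names are all assumptions made for illustration.

```python
STEPS = ["data", "journal_metadata", "tx_end", "checkpoint", "superblock"]


def run(disk, journal, data, metadata, crash_after=None):
    """Perform the ordered-mode steps in sequence, stopping after `crash_after`."""
    for step in STEPS:
        if step == "data":
            disk.update(data)                      # 1. data goes straight in place
        elif step == "journal_metadata":
            journal.append(("begin",))             # 2. TxBegin plus metadata blocks
            journal.extend(("write", b, c) for b, c in metadata.items())
        elif step == "tx_end":
            journal.append(("end",))               # 3. commit point
        elif step == "checkpoint":
            disk.update(metadata)                  # 4. metadata written in place
        elif step == "superblock":
            journal.clear()                        # 5. journal space released
        if step == crash_after:
            break


def recover(disk, journal):
    """Replay the journal only if its transaction reached TxEnd."""
    if journal and journal[-1] == ("end",):
        for record in journal:
            if record[0] == "write":
                disk[record[1]] = record[2]
    journal.clear()


def crash_and_recover(crash_after):
    disk, journal = {"inode": "size=0", "block": ""}, []
    run(disk, journal, {"block": "hello"}, {"inode": "size=5"}, crash_after)
    recover(disk, journal)
    return disk


before_commit = crash_and_recover("journal_metadata")
after_commit = crash_and_recover("tx_end")
```

A crash before TxEnd leaves the inode at its old size, so the newly written data block is simply unreferenced. A crash after TxEnd is repaired by replay, and the inode can safely point at the data because the data was written first.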

Advanced Logging Techniques

Redo Logging

Logs new values first, commits, then updates in-place. Replay redoes committed transactions.

Undo Logging

Logs undo instructions first, then updates in-place, then commits. Recovery undoes uncommitted transactions.
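Undo recovery can be sketched as a backward scan that restores old values for every transaction without a commit record. The tuple layout and the `undo_recover` name are assumptions for this toy model.

```python
def undo_recover(log, disk):
    """Roll back every in-place update made by a transaction that never committed."""
    committed = {record[1] for record in log if record[0] == "commit"}
    # Scan backward so that the oldest saved value of a block is restored last.
    for record in reversed(log):
        if record[0] == "undo" and record[1] not in committed:
            _, _, block_no, old_contents = record
            disk[block_no] = old_contents
    return disk


log = [
    ("undo", 1, 5, "A0"), ("commit", 1),   # tx 1 changed block 5 and committed
    ("undo", 2, 6, "B0"),                  # tx 2 changed block 6 from B0 to B1 ...
    ("undo", 2, 6, "B1"),                  # ... then from B1 to B2, and never committed
]
disk_at_crash = {5: "A1", 6: "B2"}
recovered = undo_recover(log, dict(disk_at_crash))
```

Block 5 keeps the committed value, while block 6 is rolled back through B1 to its original contents B0.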

Combined Redo and Undo Logging

This combines benefits: redo lets commits happen before in-place updates; undo allows flushing dirty blocks early. Recovery involves a forward redo pass and a backward undo pass.

NTFS Journaling and Recovery

NTFS, the Windows file system, uses redo plus undo logging, journaling metadata only. It supports file compression and encryption. Special files like $MFT, $LogFile, and $Bitmap contain critical metadata.

NTFS Operation Logging

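NTFS uses operation logging: a log record describes an operation (e.g., “set bit in bitmap”) instead of holding a whole block, so log entries are smaller than in ext3. Each file system operation gets its own transaction, made up of sub-operations. The log record for each sub-operation contains redo information, undo information, and a link to the previous record of the same transaction.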

NTFS Crash Recovery Process

First, redo all sub-operations (even for uncommitted transactions), then undo only those from uncommitted transactions.
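The two passes can be illustrated on a toy bitmap. This is a sketch under simplifying assumptions: the `LogRecord` fields, the operation names, and the `committed` set are invented for the example and do not reflect the real $LogFile format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    txid: int
    redo: tuple            # operation to repeat, e.g. ("set_bit", 3)
    undo: tuple            # operation that reverses it, e.g. ("clear_bit", 3)
    prev: Optional[int]    # index of the previous record of the same transaction


def apply(op, bitmap):
    action, bit = op
    bitmap[bit] = 1 if action == "set_bit" else 0  # idempotent by construction


def recover(log, committed, bitmap):
    # Pass 1 (forward): redo every sub-operation, committed or not.
    for record in log:
        apply(record.redo, bitmap)
    # Pass 2 (backward): for each uncommitted transaction, follow the prev links
    # from its last record and apply the undo operations.
    last = {record.txid: i for i, record in enumerate(log)}
    for txid, i in last.items():
        if txid in committed:
            continue
        while i is not None:
            apply(log[i].undo, bitmap)
            i = log[i].prev
    return bitmap


log = [
    LogRecord(1, ("set_bit", 0), ("clear_bit", 0), None),
    LogRecord(2, ("set_bit", 1), ("clear_bit", 1), None),
    LogRecord(2, ("set_bit", 2), ("clear_bit", 2), 1),
]
# At the crash, tx 1 had committed but its bit was not yet on disk,
# while one bit of uncommitted tx 2 had already been flushed.
bitmap = recover(log, committed={1}, bitmap=[0, 1, 0])
```

After the redo pass the bitmap holds every logged change; the undo pass then clears the two bits set by the uncommitted transaction, leaving only the committed one.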

Rationale for Two Recovery Passes

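Why both passes? The redo pass brings the disk to the state described by the log at the moment of the crash, regardless of which in-place updates had actually reached the disk. Some of those updates belong to transactions that never committed (undo logging lets dirty blocks be flushed early), so the second pass must undo them.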
