String and Linked List Data Structures Explained

Posted by Anonymous and classified in Computers

Written on in English with a size of 6.48 KB

1. Storage of Strings

At its core, a string is a sequence of characters. How computer systems store and manipulate these characters determines code performance and memory usage. Computer memory is linear, so strings must be mapped to sequential data structures using three primary methods:

A. Fixed-Length Storage (Static Allocation)

Each string variable is assigned a fixed number of bytes at compile time.

  • Mechanics: If a string is smaller than the allocated size, the remaining space is padded with blanks or null characters. If it exceeds it, the string is truncated.
  • Pros/Cons: Very fast to access, but highly inefficient with memory.

B. Variable-Length Storage (Contiguous Array)

The string occupies only the space it needs, stored side-by-side in memory. Systems use one of two methods to identify the end:

  1. Null-Terminated (C-Style): A special sentinel character (\0) is placed at the end of the character array.
  2. Length-Prefix (Pascal-Style): The initial byte(s) store an integer representing the exact string length.

C. Linked List Storage (Non-Contiguous Allocation)

Characters are stored in discrete nodes scattered across the memory heap, where each node points to the next.

  • Mechanics: A node can store a single character (high memory overhead due to pointers) or a chunk of characters (more balanced).
Storage TypeRandom Access (O(1))Memory EfficiencyDynamic Resizing
Fixed-LengthYesPoor (Waste via padding)No
Variable-Length (Array)YesExcellentExpensive (Reallocation)
Linked ListNo (O(n))Poor (Pointer overhead)Excellent

2. Fundamental String Operations

Standard string operations function under the hood with specific algorithmic time complexities (where N is the string length):

Length

  • Length-Prefixed: O(1) time as the length is pre-calculated.
  • Null-Terminated: O(N) time as the system must iterate until it hits \0.

Concatenation

  • Process: The system allocates a new block of memory of size Length(S1) + Length(S2), copying S1 and then S2 into it.
  • Time Complexity: O(M + N).

Substring

  • Process: The system locates the starting index and copies the specified number of sequential characters into a new memory location.
  • Time Complexity: O(K), where K is the length of the extracted substring.

3. Structural Editing Operations

Modifying a contiguous string array requires shifting adjacent elements to maintain the sequential layout.

Insertion

  • Process: To insert characters, all elements from index i to the end must be shifted right to create a gap.
  • Time Complexity: O(N).

Deletion

  • Process: Target characters are removed, and remaining characters to the right must be shifted left to close the gap.
  • Time Complexity: O(N).

Replacement

  • Process: If the new substring is the exact same length, it overwrites in place (O(K)). Otherwise, it triggers a deletion followed by an insertion (O(N)).

4. Pattern Matching

Pattern matching involves finding a substring (Pattern, length M) within a larger body of text (Text, length N).

A. Naive / Brute-Force Algorithm

  • How it works: Aligns the pattern at the start and checks character by character. If a mismatch occurs, it shifts the pattern right by one position.
  • Worst-Case Complexity: O(N × M).

B. Knuth-Morris-Pratt (KMP) Algorithm

  • How it works: Uses a "Longest Prefix Suffix" (LPS) table to skip impossible alignments, never stepping backward in the main text.
  • Worst-Case Complexity: O(N + M).

C. Boyer-Moore Algorithm

  • How it works: Scans characters from right to left. Uses "bad character" and "good suffix" rules to shift the pattern forward by large chunks.
  • Worst-Case Complexity: O(N × M), but average-case is sub-linear (O(N / M)).

5. Linked List Fundamentals

A Linked List is a fundamental data structure where each node carries data and a map pointing to the next node.

Introduction & Memory Representation

Unlike an array, a linked list stores elements in independent units called Nodes:

  1. Data: The value stored.
  2. Next (Pointer/Link): A reference to the next node.
  3. Head & Tail: The Head marks the start; the last node points to NULL.

Array vs. Linked List

FeatureArrayLinked List
MemoryStatic / ContiguousDynamic / Scattered
SizeFixedDynamic
AccessFast (O(1))Slow (O(n))
Insertion/DeletionSlow (O(n))Fast (O(1))

6. Linked List Operations

  • Traversing: Visiting every node from Head to NULL (O(n)).
  • Searching: Linear search comparing target values (O(n)).
  • Insertion: Efficient pointer updates (O(1) at start, O(n) at end).
  • Deletion: Re-routing pointers to bypass the target node (O(1) at start, O(n) at end).

7. Types of Linked Lists

  1. Singly Linked List: One-way navigation; last node points to NULL.
  2. Doubly Linked List: Two pointers per node (next and previous) for bidirectional traversal.
  3. Circular Linked List: No NULL pointers; the last node points back to the head.
  4. Header Linked List: Includes a unique header node containing metadata about the list.

Related entries: