#+ANKI_DECK: study_deck_02
#+title: Notes

* DONE 0242. Valid Anagram :easy:arrays:hashing:counting:
:PROPERTIES:
:ROADMAP: [[file:../roadmap.org::*0242. Valid Anagram][0242. Valid Anagram]]
:END:

** Approach
Frequency counter with a fixed-size array (Alpha Frequency Array trick).
Early exit on size mismatch.
One pass over both strings in lockstep: increment for ~s~, decrement for ~t~, then check that every counter is zero.

** C++
#+begin_src cpp
#include <algorithm>  // std::all_of
#include <array>
#include <cstddef>    // std::size_t
#include <string>

class Solution {
public:
    // Assumes both strings contain only lowercase 'a'..'z'.
    bool isAnagram(std::string s, std::string t) {
        if (s.size() != t.size()) return false;  // different lengths: cannot be anagrams

        std::array<int, 26> freq{};  // one counter per letter, zero-initialized
        for (std::size_t i = 0; i < s.size(); i++) {
            freq[s[i] - 'a']++;  // count up for s
            freq[t[i] - 'a']--;  // count down for t
        }
        // Anagrams leave every counter at exactly zero.
        return std::all_of(freq.begin(), freq.end(), [](int x) { return x == 0; });
    }
};
#+end_src

** Why ~std::array~ over C-style arrays?
*** Type safety
- ~std::array<int, 26>~ carries its size in the type system
- C-style ~int freq[26]~ decays to ~int*~ when passed to functions, so the size information is lost
- ~std::array~ knows its size: ~freq.size()~ always returns 26

*** Value semantics
- ~std::array~ can be copied, assigned, and returned from functions like any value
- C arrays decay to pointers and can't be assigned:
#+begin_src cpp
int a[5] = {1, 2, 3, 4, 5};
int b[5];
b = a;  // ERROR: array type 'int[5]' is not assignable

std::array<int, 5> sa = {1, 2, 3, 4, 5};
std::array<int, 5> sb;
sb = sa;  // OK: copies all elements
#+end_src

*** No pointer decay
- C arrays silently decay to ~T*~ in many contexts, a common source of bugs
- ~std::array~ never decays; pass by reference explicitly: ~void f(const std::array<int, 26>& a)~

*** Bounds checking (optional)
- ~.at(i)~ throws ~std::out_of_range~ on a bad index
- ~[i]~ is unchecked (same as a C array), so there is zero overhead when you don't want the check
- C arrays have no checked access option

*** Works with STL algorithms
- ~std::all_of~, ~std::sort~, ~std::find~ etc.
  work on ~std::array~ through its own iterators
- C arrays need the free functions: ~std::all_of(std::begin(arr), std::end(arr), ...)~
- ~std::array~ has ~.begin()~, ~.end()~, ~.size()~ as members

*** Cons of ~std::array~
- Slightly more verbose syntax: ~std::array<int, 26>~ vs ~int[26]~
- Size is a template parameter, so a runtime size is not possible (use ~std::vector~ then)
- Compile-time dependency: the size must be a constant expression

** Why ~std::all_of~?
*** Declarative intent
- "All elements satisfy predicate" reads like English
- Compare the manual loop ~for (int i = 0; i < 26; i++) if (freq[i] != 0) return false;~, which is imperative and takes more mental parsing

*** Zero overhead
- With optimization enabled, it typically compiles to the same machine code as a hand-written loop
- The optimizer inlines the lambda and unrolls if beneficial
- In practice, no performance penalty vs a manual loop

*** Composability
- Sibling algorithms share the same shape: ~std::any_of~, ~std::none_of~, ~std::count_if~
- The lambda can be arbitrarily complex without changing the loop structure
- Easy to swap the predicate without restructuring code

*** Cons
- Slightly harder to debug (breakpoint inside a lambda vs an explicit loop)
- For trivial checks, a manual loop may be more readable to some
- Requires the ~<algorithm>~ include

** How does the compiler know 26 is okay?
*** Compile-time constant
- 26 is a literal, known at compile time
- Template parameter ~N~ in ~std::array<T, N>~ must be a constant expression
- Compiler sees: "allocate space for 26 ints right here, right now"

*** Where does the memory live?
~std::array<int, 26>~ is an aggregate that effectively wraps an ~int[26]~ member (the member's name is implementation-defined).
- If local variable → *stack* allocation
- If global/static → *data/bss* segment
- If member of class → wherever the object lives

Stack allocation is just moving the stack pointer:
#+begin_src asm
sub rsp, 104    ; 26 * 4 bytes = 104 bytes (illustrative; real frames may be padded for alignment)
; freq now lives at [rsp]; the zeroing required by {} is emitted separately (stores or memset)
#+end_src

*** Zero-initialization
- ~std::array<int, 26> freq{};~ is value-initialization: all zeros
- Compiler may emit a ~memset~ call or inline stores to zero the storage
- A local C-style ~int freq[26];~ is *uninitialized*: garbage values
- C-style ~int freq[26] = {};~ or ~int freq[26]{};~ also zero-initializes

*** Size is part of the type
- ~std::array<int, 26>~ and ~std::array<int, 27>~ are *different types*
- Can't accidentally mix them: type error at compile time
- C arrays: ~void f(int a[26])~ actually becomes ~void f(int* a)~, so the size is lost

*** Compile-time evaluation
- ~freq.size()~ is a constexpr 26, so there is no runtime overhead
- Loop ~for (int i = 0; i < 26; i++)~: the compiler may unroll it entirely
- With ~constexpr~ arrays, the entire computation can happen at compile time

** What about ~std::unordered_map~?
When the alphabet is *not* bounded (Unicode, arbitrary keys):
- ~std::map~: O(log n) per op, ordered, tree-based
- ~std::unordered_map~: O(1) average, hash table, unordered

For lowercase a-z, neither beats the array:
- Array: ~freq[c - 'a']~ is a direct index, one memory access
- Map/unordered_map: hash + probe/compare means multiple accesses and branches

** Questions for later
- How does the stack pointer move for arrays of different sizes?
- What's the alignment requirement for ~std::array<int, 26>~?
- Can ~std::array~ be ~constexpr~?
- What about ~std::array~ vs ~std::vector~ for dynamic sizes?
- How does the optimizer decide to unroll the ~all_of~ loop?
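
** Sketch: counting with ~std::unordered_map~
The ~std::unordered_map~ notes above describe the unbounded-alphabet case only in prose. A minimal sketch of that variant might look like the following. ~isAnagramGeneric~ and ~selfCheck~ are hypothetical names, not part of the original solution. The sketch counts per ~char~ (per byte), so real Unicode text would first need decoding into code points, which is not shown here.

#+begin_src cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper (an assumption for illustration): anagram check for
// arbitrary char values, counting with a hash map instead of a 26-slot array.
bool isAnagramGeneric(const std::string& s, const std::string& t) {
    if (s.size() != t.size()) return false;  // same early exit as the array version

    std::unordered_map<char, int> freq;
    for (char c : s) ++freq[c];  // operator[] inserts missing keys with value 0

    for (char c : t) {
        auto it = freq.find(c);
        // t needs a character that s does not have (or has too few of)
        if (it == freq.end() || it->second == 0) return false;
        --it->second;
    }
    // Equal lengths and no shortfall: every count has returned to zero.
    return true;
}

// Hypothetical self-check showing inputs outside 'a'..'z'.
void selfCheck() {
    assert(isAnagramGeneric("Listen!", "!netsiL"));
    assert(!isAnagramGeneric("rat", "car"));
}
#+end_src

The two-loop shape is a design choice: because the lengths already match, the second loop can return as soon as ~t~ asks for a character that ~s~ cannot supply, so no final pass over the map is needed.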