* Ufds: Union-Find Disjoint Set :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Write a minimal ~Ufds~ class with a ~std::vector<int> parent~ member and a constructor that makes every element its own parent.
** Back
#+begin_src c++
#include <vector>
#include <numeric>

class Ufds {
private:
    std::vector<int> parent;  // parent[i] == i means i is a root
public:
    Ufds(int n) : parent(n) {
        // Fill parent with 0, 1, ..., n-1 so each element starts as its own set
        std::iota(parent.begin(), parent.end(), 0);
    }
};
#+end_src

* Ufds find() with path compression :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Write the ~find()~ method for Ufds with path compression
** Back
#+begin_src c++
int find(int x) {
    if (parent[x] != x) {
        parent[x] = find(parent[x]);
    }
    return parent[x];
}
#+end_src
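
A minimal sketch of what the compression does, assuming the ~Ufds~ class from the first card. The ~link~ and ~parentOf~ members and the ~chainIsFlattenedAfterFind~ function are hypothetical helpers added only for this demonstration: they build the chain 3 -> 2 -> 1 -> 0 and check where each node points after a single ~find(3)~.

#+begin_src c++
#include <cassert>
#include <numeric>
#include <vector>

class Ufds {
private:
    std::vector<int> parent;

public:
    explicit Ufds(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        if (parent[x] != x) {
            parent[x] = find(parent[x]);  // re-point x at the root on the way back
        }
        return parent[x];
    }

    // Hypothetical helper: attach node child directly under node p.
    void link(int child, int p) {
        assert(child != p);
        parent[child] = p;
    }

    // Hypothetical accessor: read the stored parent without compressing.
    int parentOf(int x) const { return parent[x]; }
};

// Builds the chain 3 -> 2 -> 1 -> 0, runs find(3), and reports whether
// every node on that path now points straight at the root 0.
bool chainIsFlattenedAfterFind() {
    Ufds u(4);
    u.link(1, 0);
    u.link(2, 1);
    u.link(3, 2);
    const int root = u.find(3);
    return root == 0 && u.parentOf(3) == 0 && u.parentOf(2) == 0 && u.parentOf(1) == 0;
}
#+end_src

Only the nodes on the path that ~find~ walked are re-pointed; nodes elsewhere in the tree keep their old parent until they are visited.
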
* Concise Ufds find() :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct? What does it do?
#+begin_src c++
int find(int x) {
    if (parent[x] == x) return x;
    return parent[x] = find(parent[x]);
}
#+end_src
** Back
Yes, correct and equivalent to the longer version.

- Base case: if ~x~ is its own parent, it is the root, so return ~x~
- Recursive case: find the root of ~parent[x]~, store it in ~parent[x]~, and return it

An assignment expression evaluates to the assigned value, so ~return parent[x] = find(parent[x]);~ compresses the path and returns the root in one statement.

* Ufds constructor: initializer list vs body :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What is the difference between these two ~Ufds~ constructors?
#+begin_src c++
// Initializer list
Ufds(int n) : parent(n) {
    std::iota(parent.begin(), parent.end(), 0);
}

// Body style
Ufds(int n) {
    parent.resize(n);
    std::iota(parent.begin(), parent.end(), 0);
}
#+end_src
** Back
Initializer list: directly constructs ~parent~ with ~n~ elements (one allocation).

Body style: default-constructs ~parent~ as an empty vector (typically without allocating), then ~resize(n)~ allocates.

Both produce the same result. The member initializer list is the idiomatic form and never does more work than the body style; for a ~std::vector~ member the difference is usually negligible.

* What does std::vector::resize do? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~resize(n)~ do on a ~std::vector<int>~?
** Back
Resizes the vector to hold exactly ~n~ elements:

- If ~n~ > current size: appends value-initialized elements (~0~ for ~int~)
- If ~n~ < current size: destroys the elements past index ~n - 1~
- Reallocates only if ~n~ exceeds the current capacity

#+begin_src c++
std::vector<int> v;
v.resize(5); // v = {0, 0, 0, 0, 0}
#+end_src
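
The grow and shrink cases above can be sketched together. ~resizeDemo~ is a hypothetical function name used only for illustration; the comments show the contents after each call.

#+begin_src c++
#include <cassert>
#include <vector>

// Grows an empty vector, overwrites one element, shrinks, then grows again.
std::vector<int> resizeDemo() {
    std::vector<int> v;
    v.resize(5);  // {0, 0, 0, 0, 0}
    assert(v.size() == 5);
    v[1] = 7;     // {0, 7, 0, 0, 0}
    v.resize(2);  // {0, 7}: elements past index 1 are destroyed
    v.resize(4);  // {0, 7, 0, 0}: the new elements are value-initialized again
    return v;
}
#+end_src

Shrinking changes ~size()~ but does not release capacity, so the final ~resize(4)~ here does not need to reallocate.
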
* Ufds destructor: when to omit :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Should ~Ufds~ have a destructor? =~Ufds() {}= vs =~Ufds() = default;=
** Back
Omit it entirely when the only resource is a ~std::vector~ member; the vector's own destructor releases the memory.

If you must write one, prefer ~= default~, which states the intent explicitly.

An empty ~{}~ body also works, but ~= default~ makes it clearer to future maintainers that no custom cleanup is intended.

* Correct Ufds constructor: is this correct? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
Ufds(int n) {
    p.resize(n);
    std::iota(p.begin(), p.end(), 0);
}
#+end_src
** Back
Yes, correct, assuming ~p~ is a ~std::vector<int>~ member and ~<numeric>~ is included. This is the body-style initialization.

One thing to note: ~p~ is default-constructed (empty) first, and ~resize(n)~ then performs the allocation.

Prefer the member initializer list, ~Ufds(int n) : p(n)~, which sizes ~p~ at construction and is the idiomatic form.

* Incorrect Ufds constructor: correct or wrong? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
std::vector<int> p;
Ufds(int n) {
    p = new std::vector<int>(n);
}
#+end_src
** Back
Wrong.

~new std::vector<int>(n)~ returns a ~std::vector<int>*~ (a pointer), but ~p~ is a ~std::vector<int>~ (an object, not a pointer).

A pointer cannot be assigned to a vector, so this does not compile.

* What does new return? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~new Type~ return? And ~new Type[n]~?
** Back
~new Type~ returns a pointer to the new object: ~Type*~

~new Type[n]~ returns a pointer to the first element of the new array: also ~Type*~

#+begin_src c++
int* p1 = new int;     // single int
int* p2 = new int[10]; // array of 10 ints
#+end_src

* What does delete do? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does ~delete~ do? ~delete[]~?
** Back
~delete~ destroys and frees a single object allocated with ~new~

~delete[]~ destroys and frees an array allocated with ~new[]~

#+begin_src c++
int* p1 = new int;
int* p2 = new int[10];
delete p1;   // free single object
delete[] p2; // free array
#+end_src

Mismatching the forms (~delete~ on memory from ~new[]~, or ~delete[]~ on memory from ~new~) is undefined behavior.

* Is this correct Ufds with C-style array? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
class Ufds {
private:
    int* parent;
public:
    Ufds(int n) : parent(new int[n]) {}
    ~Ufds() { delete[] parent; }
};
#+end_src
** Back
Correct as far as allocation and cleanup go. The class manually manages the heap-allocated array.

- ~new int[n]~ allocates an array of ~n~ ints on the heap (the elements are left uninitialized)
- ~delete[] parent~ in the destructor frees it

This works, but it is error-prone compared to ~std::vector~: cleanup is manual, and copying a ~Ufds~ would make two objects delete the same array.
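
One way to make the raw-pointer version safer, as a sketch rather than a recommendation: initialize the elements and forbid copying. The class name ~RawUfds~ and the ~count~ member are assumptions introduced here; they are not part of the card above.

#+begin_src c++
#include <cassert>
#include <numeric>
#include <type_traits>

class RawUfds {
private:
    int* parent;
    int count;

public:
    // Assumes n >= 0.
    explicit RawUfds(int n) : parent(new int[n]), count(n) {
        // new int[n] leaves the elements uninitialized, so fill them here
        std::iota(parent, parent + n, 0);
    }
    ~RawUfds() { delete[] parent; }

    // Copying would leave two objects owning the same array, so forbid it.
    RawUfds(const RawUfds&) = delete;
    RawUfds& operator=(const RawUfds&) = delete;

    int find(int x) {
        assert(x >= 0 && x < count);
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]);
    }
};
#+end_src

With a ~std::vector<int>~ member none of this extra code is needed, which is the main argument for preferring it.
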
* Equivalent std::array version: correct or wrong? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
#include <array>
#include <numeric>

class Ufds {
private:
    std::array<int, 10> parent;
public:
    Ufds() {
        std::iota(parent.begin(), parent.end(), 0);
    }
};
#+end_src
** Back
Correct for a size fixed at compile time.

But the size ~10~ is hardcoded, not a parameter.

~std::array<T, N>~ requires ~N~ to be a compile-time constant.

For a size chosen at runtime, use ~std::vector~.
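
If the size is known at compile time but should not be hardcoded, it can be made a template parameter. This is a sketch under that assumption; ~FixedUfds~ and its ~size()~ member are hypothetical names.

#+begin_src c++
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

template <std::size_t N>
class FixedUfds {
private:
    std::array<int, N> parent;

public:
    FixedUfds() {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        assert(x >= 0 && static_cast<std::size_t>(x) < N);
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]);
    }

    // The element count is part of the type, so it is available at compile time.
    static constexpr std::size_t size() { return N; }
};
#+end_src

~FixedUfds<10>~ and ~FixedUfds<20>~ are distinct types, and ~N~ still cannot come from user input; that case needs ~std::vector~.
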
* std::array vs vector vs C-style array :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Compare ~std::array<T, N>~, ~std::vector<T>~, and ~T*~ for Ufds parent array
** Back
| Type               | Size         | Cleanup             | Use when                   |
|--------------------+--------------+---------------------+----------------------------|
| ~std::array<T, N>~ | compile-time | automatic           | size known at compile time |
| ~std::vector<T>~   | runtime      | automatic           | size known at runtime      |
| ~T* p = new T[n]~  | runtime      | manual (~delete[]~) | legacy code only           |

* Vector vs Array: is vector backed by array? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is ~std::vector~ backed by an array? How does it support variable length?
** Back
Yes, ~std::vector~ stores its elements in one contiguous, heap-allocated block (like a dynamic array).

Typical implementation: three pointers
- ~start~: pointer to the first element
- ~finish~: pointer to one past the last element
- ~end_of_storage~: pointer to the end of the allocated capacity

#+begin_src c++
std::vector<int> v(5);
v.push_back(1); // triggers reallocation if size() == capacity()
#+end_src

When capacity is exceeded, the vector allocates a larger block, moves or copies the elements into it, and deallocates the old block.

* Vector size vs capacity :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What is the difference between ~size()~ and ~capacity()~ in ~std::vector~?
** Back
- ~size()~: number of elements actually stored
- ~capacity()~: number of elements the allocated storage can hold before reallocating

#+begin_src c++
std::vector<int> v(3); // size=3, capacity>=3 (typically 3)
v.push_back(1);        // size=4, capacity>=4 (growth is implementation-defined)
v.push_back(1);        // size=5, capacity>=5
#+end_src

~reserve(n)~ pre-allocates capacity for at least ~n~ elements without changing ~size()~.
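
A small sketch of why ~reserve~ matters: once capacity for ~n~ elements is reserved, appending up to ~n~ elements does not reallocate, so the buffer address stays the same. ~appendsWithoutReallocation~ is a hypothetical function written for this card.

#+begin_src c++
#include <cassert>
#include <cstddef>
#include <vector>

// Reserves room for n elements, appends n values, and reports whether the
// buffer stayed at the same address (that is, no reallocation happened).
bool appendsWithoutReallocation(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);  // capacity() >= n, size() is still 0
    assert(v.empty());
    const int* before = v.data();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    return v.size() == n && v.capacity() >= n && v.data() == before;
}
#+end_src

Without the ~reserve~ call the same loop would still be correct, but it may reallocate several times as the vector grows.
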
* Comparison: which Ufds storage to use? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
When would you choose ~std::array~, ~std::vector~, or ~new T[n]~ for the Ufds parent array?
** Back
~std::array<T, N>~: only if ~N~ is known at compile time

~std::vector<T>~: if the size is determined at runtime (the typical Ufds case)

~T* p = new T[n]~: legacy code only; ~std::vector~ is safer and performs equivalently

For Ufds, ~std::vector~ is the idiomatic choice because the size is determined at runtime.

* C-style array key properties :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What are key properties of C-style arrays (raw arrays)?
** Back
#+begin_src c++
int arr[5];          // fixed size, stack-allocated when declared as a local
int* p = new int[n]; // dynamic size, heap-allocated
#+end_src

- Decay to a pointer when passed to a function (the size information is lost)
- No bounds checking
- ~sizeof(arr)~ gives the size in bytes, not the element count
- Heap arrays require manual lifetime management

C-style arrays in C++ are generally avoided in favor of ~std::array~ / ~std::vector~.
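
The decay point can be illustrated with two functions. Both ~elementCount~ and ~pointerBytes~ are hypothetical names for this sketch: the first takes the array by reference and keeps its bound, the second takes a pointer, which is what a decayed array becomes.

#+begin_src c++
#include <cassert>
#include <cstddef>

// Taking the array by reference keeps its bound N as part of the type.
template <std::size_t N>
std::size_t elementCount(const int (&)[N]) {
    return N;
}

// After decay only a pointer is left, so sizeof reports the pointer's size,
// not the size of the original array.
std::size_t pointerBytes(const int* p) {
    assert(p != nullptr);
    return sizeof(p);
}
#+end_src

For ~int arr[5]~, ~elementCount(arr)~ yields 5 and ~sizeof(arr)~ is ~5 * sizeof(int)~, while ~pointerBytes(arr)~ is just ~sizeof(int*)~.
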
* std::iota for Ufds initialization :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
How do you use ~std::iota~ to initialize the Ufds parent array to ~[0, 1, 2, ..., n-1]~?
** Back
#+begin_src c++
#include <numeric>
#include <vector>

std::vector<int> parent(n);
std::iota(parent.begin(), parent.end(), 0);
#+end_src

~std::iota~ fills the range with sequentially increasing values, starting from the given start value.
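
For comparison, a sketch of ~std::iota~ next to the hand-written loop it replaces. ~makeParents~ and ~makeParentsLoop~ are hypothetical helper names; both are meant to return the same vector.

#+begin_src c++
#include <cassert>
#include <numeric>
#include <vector>

// Builds {0, 1, ..., n-1} with std::iota.
std::vector<int> makeParents(int n) {
    assert(n >= 0);
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    return parent;
}

// The equivalent explicit loop.
std::vector<int> makeParentsLoop(int n) {
    assert(n >= 0);
    std::vector<int> parent(n);
    for (int i = 0; i < n; ++i) {
        parent[i] = i;
    }
    return parent;
}
#+end_src
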
* Path compression in find() :cpp:datastructure:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does "path compression" mean in ~find()~?
** Back
After finding the root, point each visited node directly to the root:

#+begin_src c++
parent[x] = find(parent[x]); // compresses path
#+end_src

This flattens the tree, so later ~find()~ calls on those nodes are faster: amortized O(log n) with path compression alone, and amortized O(α(n)), effectively O(1), when combined with union by rank or size.
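
The amortized O(α(n)) bound relies on pairing path compression with union by rank or size, which these cards do not otherwise cover. A minimal sketch of union by size follows; the ~unite~ and ~same~ member names and the ~size~ vector are assumptions made for illustration.

#+begin_src c++
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

class Ufds {
private:
    std::vector<int> parent;
    std::vector<int> size;  // size[r] is meaningful only while r is a root

public:
    explicit Ufds(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);
    }

    int find(int x) {
        if (parent[x] == x) return x;
        return parent[x] = find(parent[x]);  // path compression
    }

    // Merges the sets containing a and b; returns false if already joined.
    bool unite(int a, int b) {
        int ra = find(a);
        int rb = find(b);
        if (ra == rb) return false;
        if (size[ra] < size[rb]) std::swap(ra, rb);
        assert(size[ra] >= size[rb]);
        parent[rb] = ra;  // attach the smaller tree under the larger root
        size[ra] += size[rb];
        return true;
    }

    bool same(int a, int b) { return find(a) == find(b); }
};
#+end_src

Attaching the smaller tree under the larger root keeps trees shallow even before any compression happens.
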
* Original Ufds mistakes: is this correct? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
Is this correct?
#+begin_src c++
class Udfs {
    int parent[];
    vector<int> p;
public:
    Ufds(int n) {
        p = new vector<int>;
    }
    ~Ufds() {
        p ? free p;
    }
};
#+end_src
** Back
Wrong. Multiple errors:

1. ~int parent[]~: an array member with no bound is not valid standard C++
2. The constructor is named ~Ufds~ but the class is ~Udfs~ (typo), so it is not a constructor of this class
3. ~p = new vector<int>~: a pointer cannot be assigned to a vector object
4. ~p ? free p;~: not valid syntax; ~p~ is not a pointer, and a vector member needs no manual cleanup
5. The destructor is named =~Ufds= but the class is ~Udfs~ (same typo)

* Heap allocation: what and why? :cpp:
:PROPERTIES:
:ANKI_NOTE_TYPE: Basic
:END:
** Front
What does "heap allocated" mean? Why use it?
** Back
Memory allocated on the heap with ~new~ lives until ~delete~ is called or the program ends.

Stack-allocated local variables are freed automatically when they go out of scope.

#+begin_src c++
int arr[10];          // stack: freed when the function returns
int* p = new int[10]; // heap: lives until delete[]
#+end_src

The heap is needed when the size is unknown at compile time or the object must outlive the scope that created it.
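
A sketch of the "must outlive scope" case, with a size chosen at runtime. ~makeIdentity~ is a hypothetical function: the array is created inside it and remains valid after it returns, so the caller becomes responsible for ~delete[]~.

#+begin_src c++
#include <cassert>
#include <numeric>

// Allocates n ints on the heap, fills them with 0..n-1, and hands the
// array to the caller. The storage is not tied to this function's scope.
int* makeIdentity(int n) {
    assert(n > 0);
    int* p = new int[n];
    std::iota(p, p + n, 0);
    return p;  // the caller now owns the array and must delete[] it
}
#+end_src

Returning a ~std::vector<int>~ instead would give the same lifetime without the manual ~delete[]~.
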