2023 Day 25: Snowverload

cantanima · December 26, 2023, 2:31am

Finally

Day 25 strikes me as much harder than most Day 25’s, and I ain’t the only one. I mean, come on, 2020 was the year where you basically played a tribute to Zork. (Or maybe it was 2019, and 2020 was the relatively easy discrete log puzzle.)

But today’s puzzle? Part 1 alone should count as a two- or even three-part problem. I won’t even bother covering up the “spoilers”, as I bet revealing them won’t spoil much at all.

The moment I read the problem, I knew it was a graph theory problem. I don’t know much graph theory – I’m not that kind of mathematician; I did real math with nonlinear polynomials and stuff; see my solution to Day 24 or whatever – so I immediately went to look up stuff on graph connectivity, which I’d heard and read something. That led me to “min cuts”, a new idea to me, then to Karger’s algorithm, and that looked both appealing and tractable, so I said, “aha! let me implement this!”

Three hours and a boatload of annoyance later, I was still struggling to implement Karger’s algorithm. I think I was close, and I may yet get it working, but at this point it was time to spend time with my family, so I did.

When I came to look at it, a stroke of divine inspiration hit – ok, maybe not divine, but it is Christmas so what do I know – and I had the idea to generate a bunch of random paths through the graph and count the edges used. It stood to reason that the edges used most were the load-bearing ones that guaranteed connectivity.

A couple of hours and some deep frustration later, I was still struggling to get that working, inasmuch as my top 3 hits consistently disagreed with the example’s 3 edges to remove. I finally surrendered to the temptation to peruse Reddit. Per Reddit, or at least the Reddit solutions I could be bothered to read, most people employ one of two approaches.

Karger’s algorithm to find the min cut of graph.
A Monte Carlo method that picks two random points a fair number of times, on each turn finds a shortest path (shortest seems to be important, which prompted me to switch from using DFS to BFS – the one time I start with DFS it turns out to be counterproductive!), and keeps a running tally of which edges are used most often. The 3 edges that show up most are, with extremely high probability, the edges you want.

Notice anything familiar? Me, too. I asked myself whether I should feel elated or depressed that most other people were taking one of these two approaches, too – elated that, hey, that’s what the experts were doing; depressed that, yeah, I can’t seem to cut it.

I downloadeded Conscious-Money-7663’s Python solution to use as a sanity check against my code. Its top 3 hits agree with the example’s counsel, so I guess something’s wrong with my code.

After an hour or two of fruitless searching for a bug, I said “What the heck” and tried my code on the input.

It worked.

It ain’t pretty, but I’ll take the ! From there it was spectacularly easy to finish the puzzle (very easy, same as every Day 25 I’ve done, which now covers 2018…2023).

At some point I’ll go back & try to figure out (a) why my Karger’s Algorithm doesn’t work, and (b) why my Monte Carlo method doesn’t work on the example. But for now I can go and now, hope my wife doesn’t divorce me, my parents don’t disown me, and my kids haven’t uploaded too many videos complaining that I was spending Christmas trying to solve a stupid puzzle instead of… well, there wasn’t anything else to do, as far as I know, or I’d have done it. But I still got an evil or two today.

Merry Christmas, even if belated.

cantanima · December 26, 2023, 2:57pm

Oh, I guess I did forget one thing that’s a genuine spoiler: once you identify and remove the three edges that need replacement, you still need to count the components in the two now-disconnected graphs. For that, I picked one vertex at random, then checked if I could find a route to every other node. The nodes where that succeeded were placed in one set of nodes; the ones where it failed were placed in a second. It’s a bit slow, but it does the job.

Also: I’ve discovered that my code doesn’t always produce the correct answer on the puzzle input; usually it does, but if I squint at it the wrong way, comment or un-comment some IO.Put_Line, and recompile, then it computes the wrong connections… once or twice, then it does again. So I still have to debug it.

cantanima · December 26, 2023, 4:41pm

I’m pretty sure I figured this out. It started when I ran the Python code a few times in a row and… got some failures! It succeeded more often than not, unlike my Ada approach, which failed more often than it succeeded. But that gave me the insight I needed.

The reliability of the random approach depends strongly on the paths chosen.

If you choose the same pair of nodes multiple times, you will generate the same path multiple times, which will bias the results.
There are multiple paths from one node to another, and some choices of path seem to work better than others.

As to point 1, my Ada code kept repeating certain pairs more often. For instance, for one run of the data, where 100 random pairs were chosen, the Ada code chose 4 pairs 3 times; the Python code chose only 2 pairs 3 times.

As to point 2, the Ada code followed explored paths in alphabetical order. The moment it found a shortest path, it returned. The Python code did not; it followed… whatever order Python chooses for dictionary values, I dunno. Seemed random to me. While that has its disadvantages, it seemed to help the Python.

To make the Ada-implemented Monte Carlo method work reliably, I realized that

The choice of node pairs ought to be distinct, so you get a better view of the graph. The top hit is reliable under this approach.
Remove the top hit edge from the graph and rerun.

This seems to get me the correct choices each time, on both the example graph and the input data. I’ll comment and post my code later today.

Still need to fix up Karger’s algorithm, though.

jcmoyer · December 28, 2023, 11:09pm

I ended up implementing the stoer-wagner algorithm for this problem but your solution is quite a bit faster! 2.8s vs 4.2s total runtime. It’s really cool that randomized algorithms work so well for this puzzle.

cantanima · December 29, 2023, 12:22am

Yeah, not only is it my first Monte Carlo algorithm ever, but I remember a professor’s explaining patiently to me why I shouldn’t allow the possibility of error in a Monte Carlo algorithm to bother me. This one is especially nice in that it’s easy to obtain a certificate of correctness: you can easily check whether it divides the graph into two.

Nice! I had seen that one, too, and even saw a C++ implementation on Wikipedia, but after glancing at it a few minutes decided for some reason that I wanted nothing to do with it… Maybe because it looks a little more complicated than Karger’s, which I had already started on? Not sure. As I write above, I don’t actually know much graph theory.