I liked this problem! Too bad it didn’t like me! Every year I struggle with problems that are either graph theory or 3-dimensional, and this featured both.
I take refuge in the fact that my approaches were essentially sound. Where I went wrong was in
the implementation (part 1: discounting a case as not worthy of consideration, when uh-oh it was big-time important), or
understanding the question (part 2: the question is vaguely worded, and I’d argue it’s misleadingly worded).
(Some wags might well point out that I was only wrong in the two most important parts of the problem, but you shan’t intimidate me from making a spectacle of my failures all the same.)
What finally worked for me:
For part 1, I use an array to map indices of junction boxes to their circuit numbers. Of course, this first requires knowing the distances between every pair of junction boxes, but that’s not hard. It isn’t necessary to sort the 1000 shortest connections; it suffices simply to find them.
For part 2, a little creativity allows you to reuse your solution to Part 1, but unless you know in advance exactly how many connections you need to make – very unlikely – you’ll definitely need to sort the connections by distances.
Meanwhile, I see from the leaderboard that some of you have already completed Day 9.
I didn’t sort anything by distance initially. Since there are “only” 1000 boxes you can reasonably brute force the next shortest pair. My first solver ran in 12 seconds, but sorting by distance shaved only 4 seconds off. I saw some people mentioning union-find for part 2, which I guess is where my other 8 seconds are - I just bfs the graph after connecting boxes to count its component size.
The main issue with my approach is that I didn’t use an optimal data structure for edges. In my solution, I used an adjacency matrix to store connections. Walking a complete circuit involves picking a box, looking at what boxes are connected to it, and picking those, recursively. Unfortunately this means finding neighboring boxes once is a linear scan over a row in a 1000x1000 matrix (and it’s recursive!). I plan to take another look after day 12.
Actually you can do that without a matrix. It is what I did initially, but with a bug in the implementation.
First you say circuit #i is associated with box #i for each i, then when you connect i and j, you merge all vertices of circuit j into i (it is where I had a subtle bug). All 1D arrays. But in the problem, you could have in the sorted shortest connection list, A connected to B, B to C and later A to C. The A to C would not be detected if you use a circuit-wise grouping and not and pairwise connection matrix. I found the aforemented bug after switching from direct circuit merging to a pairwise matrix. For my data, the switch did not change anything but the algorithm was closer to the text.
My solution was almost identical to that. When I did time alr run it says 0.24s, but if I just run the compiled binary directly with time bin/day8 it says 0.04s.
I think one difference between our solutions is that I assigned a number to a junction box instead of a connection, and I called it a group number. Coord N starts with group number N. Then, whenever I connect two boxes X and Y, I change every box that has the same group number as Y to the group number of X, and I subtract 1 from the group count. As soon as group count becomes 1 I compute that value for the second part. I also add 1 to the number of connections when I make a connection and when that hits 1000, I print out the value for the first part.
Like you, I computed all the distances up front. I stored them in a binary heap. Then I just keep pulling the top element off the binary heap until I am done.
I spent a lot more time on part 1 than I actually wanted to, and then got a ridiculously wrong answer, so I gave up and went on to Day 9. But my subconscious kept thinking about it, and soon suggested that my optimization of counting the length of circuits was incorrect. So I went back and de-optimized it, and got the same answer. But that indicated that the error was in building circuits, and a few minutes of reviewing that showed my stupid mistake. After that, part 2 only took a few minutes.