Breaking and Fixing

The worst and best thing happened to me this week.

So, I mentioned before that we had to replace our last old 32-bit computer with a GIANT new 64-bit one. Data-taking had only started again on Friday after the beech marten incident, so we had to be extra careful to not mess anything else up when we went down to do the replacement.

This old computer is used exclusively for debugging the TRT when data isn't being collected, but it's located in the same rack as some of the data-taking computers. The data-taking computers have hundreds of fiber-optic cables attached to them, hundreds of meters long, receiving data from deep inside ATLAS. The old computer was a normal Dell tower, easy to unplug and move out of the way by simply parting the fibers like a curtain. The new computer, though, was a massive 40 kg beast that needed rails to be installed in the computer rack, as well as three outlets to provide power. It was not easy to push the beast inside of the rack while also avoiding the many optical cables. I had two TRT-experts-in-training with me to help, and over the span of two hours, we screwed in the new rails and barely squeezed this monster into the space left by the old computer. We were sweaty and even bleeding by the end of it (D had pinched his finger between the rack and the computer). After the new drivers were installed, everything was done. Success! Time to go to lunch.

Of course, it's never that easy. Later that night at 9 pm, we got a call saying that 1/128th of our detector wasn't taking data. Where exactly- what part of the detector? Of course, it was in a part of the TRT that feeds data into a computer right above where we installed the devil machine. We had tried to be careful about the optical wires, but it turns out we had broken one. We had pulled a bit too hard on a drooping fiber during the installation. None of us had noticed it at the time.

God, it was the worst feeling.

Yes, mistakes happen. But data-taking had JUST started again. And this would affect the quality of the data in a noticeable way. A dark spot in the detector- an ugly void. What could we do to fix it? There might be spare cables already threaded from ATLAS to the computers, but we wouldn't really know until the LHC turned off in two weeks' time for a few days of development. Then we could go into ATLAS without being irradiated to investigate. God, two weeks of a stunted detector. I didn't sleep at all that night.

And I kept on thinking- it's my fault. I am responsible for this. Maybe I'm not really cut out for this kind of work. People trusted me with the simple task of installing a new computer without ruining anything else, and I let them down. ATLAS is an important experiment. It's unique in the entire world. If I'm working near the cables that bring in data from the detector, why the fuck wouldn't I be more aware? More attentive to what was happening? I felt like a huge fraud. I felt like I should have gone into something where nothing I did really mattered- where I wouldn't be able to fuck up important things. Let the adults of the world handle these grown-up matters. To a kid like me, give me a task that I couldn't really mess up. Something simple and easily undoable with no consequences.

The next day we came in early. The run coordinator had some comforting words to say- "The only people who don't make mistakes are those who don't do anything." Instead of feeling bad, it was time to find a solution. So we brainstormed things to do to fix it. We called and emailed the experts at CERN in fiber optics. And CERN, oh my God, it is filled with competent people. It turns out there exists a $40,000 machine that you can use to splice fiber optic cables together through arc-welding, and CERN has a few of these machines. The solution was evident- cut off the part of our cable that was broken, and splice the good part to a short spare cable. After lunch, the expert came underground with us and spliced us a new connector. It was SO COOL to see the machine in action! Fiber optic cables are essentially glass, so the machine heats the two ends to 1200 C and fuses them together without a seam. And just like that- we had a new cable. Technology is amazing and it made me so grateful that there are inventors out there who come up with things like this, to solve such otherwise difficult and intractable problems. We weren't expecting the fix to be so quick! It kinda caught us by surprise. Because it was so fast, we didn't have the material on hand for re-wrapping the wire, and did a hack job of wrapping up the fiber cable with thick tubing and masking tape- but once we get the new material we'll fix it. But it works now! The TRT is bright and whole!

I beat myself up the whole night, and it turns out that the solution was quicker to do than many other projects I've worked on for the detector. So for next time- a lesson to myself:

  1. Don't lose hope.
  2. Dwelling and feeling sorry for myself takes away time from actually thinking of ways to fix the problem. The best way to make up for the mistake is not to beat myself up but to actively think of solutions.
  3. People want to help more than they want to blame. There are so many people who will help in times of crisis; it's just a matter of asking them and letting them know I need the help.
  4. Once the problem is fixed, create a system to reduce the chance that it happens again. The best thing to do is save someone else from possibly going through the same despair and self-loathing in the future.