April 16, 2026
Rethinking the Code: My Quest to Make Scientific Software as Adaptable as its Hardware
By Shiting Long
Modern scientific computing isn’t just about using faster machines—it’s about rethinking how software works with those machines. DD-αAMG (Domain Decomposition Aggregation-based Algebraic Multigrid) is a high-performance software library designed to solve the Dirac equation in the context of Lattice Quantum Chromodynamics (Lattice QCD). My research looks at how long-standing high-performance computing codes, like DD-αAMG, can be redesigned to run efficiently on today’s diverse architectures. The big question is simple to ask but hard to answer: how do we make complex scientific programs both fast and adaptable when the hardware they run on keeps changing?

So far, we’ve explored ways to reorganize data so that processors can handle multiple operations at once using SIMD (single instruction, multiple data) parallelism. While this sounds like it should give large speedups, the reality is more complicated. The application turns out to be memory-bound: it is limited not by computation, but by how quickly data can move through memory. Even more interestingly, reading and writing data are usually not equally fast, and this imbalance can quietly limit performance. This suggests that designing algorithms today requires thinking carefully about how they interact with memory, not just how many calculations they perform.
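To make the memory-bound picture concrete, here is a minimal roofline-style sketch in Python. It is an illustration under stated assumptions, not a model of DD-αAMG itself: the function names, the example kernel, and every hardware number are hypothetical. The sketch estimates a kernel's run time as the larger of its compute time and its memory time, and it charges reads and writes at separate bandwidths.

```python
def memory_time(bytes_read, bytes_written, read_bw, write_bw):
    """Seconds spent moving data, with reads and writes charged separately."""
    return bytes_read / read_bw + bytes_written / write_bw


def kernel_time(flops, bytes_read, bytes_written, peak_flops, read_bw, write_bw):
    """Estimated run time in seconds.

    Assumption: computation and memory traffic overlap perfectly, so the
    slower of the two determines the run time.
    """
    compute = flops / peak_flops
    memory = memory_time(bytes_read, bytes_written, read_bw, write_bw)
    return max(compute, memory)


# Hypothetical machine (illustrative numbers, not measurements).
PEAK_FLOPS = 1.0e12  # floating-point operations per second
READ_BW = 2.0e11     # bytes per second when reading
WRITE_BW = 1.0e11    # bytes per second when writing

# Hypothetical streaming kernel y[i] = a * x[i] + y[i] on 8-byte numbers:
# 2 operations, 16 bytes read, and 8 bytes written per element.
N = 10**8
FLOPS = 2 * N
BYTES_READ = 16 * N
BYTES_WRITTEN = 8 * N

baseline = kernel_time(FLOPS, BYTES_READ, BYTES_WRITTEN,
                       PEAK_FLOPS, READ_BW, WRITE_BW)

# Pretend wider SIMD units quadruple the peak compute rate.
wider_simd = kernel_time(FLOPS, BYTES_READ, BYTES_WRITTEN,
                         4 * PEAK_FLOPS, READ_BW, WRITE_BW)

print("baseline estimate (s):  ", baseline)
print("wider SIMD estimate (s):", wider_simd)
```

In this toy setting the estimate does not change when the compute peak is quadrupled, because the memory term dominates. The written data is half the volume of the read data but costs just as much time, since the assumed write bandwidth is half the read bandwidth.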
Another challenge is that there is no single best solution. Different supercomputers favor different data layouts and algorithm designs, and something that works well on one system may perform poorly on another. We also observed that improving the mathematical side of an algorithm—like reducing the number of iterations—can sometimes hurt how well the machine is utilized, and vice versa. This creates a constant trade-off between algorithmic efficiency and hardware efficiency.
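The trade-off can be sketched with simple arithmetic: time to solution is the number of iterations multiplied by the time each iteration takes on the machine. The short Python example below uses two hypothetical solver variants with made-up numbers, purely to show how a variant that needs more iterations can still finish first if each iteration uses the hardware better.

```python
def time_to_solution(iterations, seconds_per_iteration):
    """Total solve time in seconds: iteration count times cost per iteration."""
    return iterations * seconds_per_iteration


# Hypothetical variants as (iterations, seconds per iteration).
# The numbers are illustrative assumptions, not measurements.
variants = {
    "fewer iterations, poor hardware use": (50, 0.040),
    "more iterations, good hardware use": (80, 0.020),
}

totals = {name: time_to_solution(*params) for name, params in variants.items()}
fastest = min(totals, key=totals.get)

for name, total in totals.items():
    print(name, "->", total, "s")
print("fastest:", fastest)
```

On a different machine the per-iteration costs would change, and so could the winner, which is why neither the iteration count nor the hardware utilization can be judged in isolation.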
What makes this work exciting is that many questions are still open. How do we balance precision and speed? Can we predict how an algorithm will behave on a new machine? And is it possible to design software that performs well everywhere? These challenges don’t just affect supercomputers—they reflect a broader shift in computing, where understanding the relationship between software and hardware is becoming just as important as raw processing power.
