Efficiency of machine learning optimizers and meta-optimization for nanophotonic inverse design tasks
Nathaniel Morrison, Eric Y. Ma

The success of deep learning has driven the proliferation and refinement of numerous non-convex optimization algorithms. Despite this growing array of options, the field of nanophotonic inverse design continues to rely heavily on quasi-Newton optimizers such as L-BFGS and basic momentum-based methods such as Adam. A systematic survey of these and other algorithms in the nanophotonics context remains lacking. Here, we compare 24 widely used machine learning optimizers on inverse design tasks. We study two prototypical nanophotonic inverse design problems—the mode splitter and wavelength demultiplexer—across various system sizes, using both hand-tuned and meta-learned hyperparameters. We find that Adam derivatives, as well as the Fromage optimizer, consistently outperform L-BFGS and standard gradient descent, regardless of system size. While meta-learning has a negligible-to-negative impact on Adam and Fromage, it significantly improves others, particularly AdaGrad derivatives and simple gradient descent, such that their performance is on par with Adam. In addition, we observe that the most effective optimizers exhibit the lowest correlation between initial and final performance. Our results and codebase (github.com/Ma-Lab-Cal/photonicsOptComp) provide a valuable framework for selecting and benchmarking optimizers in nanophotonic inverse design.
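The comparison described above can be sketched in miniature. The following is an illustrative benchmarking loop only, not the paper's actual pipeline: the objective below is a generic non-convex test function standing in for the electromagnetic figure of merit, and the hand-rolled gradient descent and Adam updates are textbook versions, not the specific implementations in the photonicsOptComp codebase.

```python
import numpy as np

# Illustrative stand-in objective: a non-convex function with many local
# minima. The real benchmark would use a simulated photonic figure of merit.
def objective(x):
    return float(np.sum(x**2 - 2.0 * np.cos(3.0 * x)))

def grad(x):
    # Analytic gradient of the toy objective above.
    return 2.0 * x + 6.0 * np.sin(3.0 * x)

def run_gd(x0, lr=0.01, steps=500):
    # Plain gradient descent (one of the baselines in the comparison).
    x = x0.copy()
    for _ in range(steps):
        x -= lr * grad(x)
    return objective(x)

def run_adam(x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    # Textbook Adam update with bias correction.
    x = x0.copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g**2
        m_hat = m / (1.0 - beta1**t)
        v_hat = v / (1.0 - beta2**t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return objective(x)

# Benchmark each optimizer from the same random initialization,
# mirroring the paper's fixed-budget, common-start comparison protocol.
rng = np.random.default_rng(0)
x0 = rng.uniform(-2.0, 2.0, size=16)
results = {"gd": run_gd(x0), "adam": run_adam(x0)}
print(results)
```

A full study along these lines would sweep many optimizers, initializations, and problem sizes, and optionally meta-learn the hyperparameters (`lr`, `beta1`, `beta2`, etc.) per optimizer before comparing final objective values.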