I will discuss what it means for a method to be optimal in optimization and machine learning. When a method matches a lower bound, does that mean the method is good? How can we develop lower bounds and optimality results that are meaningful? Can theoretical results actually direct progress in what we do? Some of this talk will be speculative, some will cover my own and others' results, and some will likely be polemic.