Introduction to lower bounds in optimization
June 12, 2025
This content is available as a PDF here.
Most of the bounds described in the optimization literature are upper bounds of the form
To formalize this notion, we consider algorithms, here sequences
Note that one can think of more general ways to define first-order methods, but for the results we aim to prove, this level of generality is enough.
With this assumption in mind, how can we design a function that is adversarial to this type of scheme? The idea is to find a function such that the gradient at step
where
Set
that is, only the first coordinate is updated after one iteration. What happens now that we have access to
One can check that for any
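To make the mechanism concrete, here is a minimal numerical sketch. It assumes the classical worst-case quadratic of Nesterov (2018), built from the tridiagonal discrete Laplacian (the exact normalization of the function used below may differ): starting from the origin, each gradient evaluation can reveal at most one new coordinate, so after $k$ iterations any method satisfying Assumption 1 has its iterate supported on the first $k$ coordinates.

```python
import numpy as np

# A minimal sketch of the mechanism, assuming the classical worst-case
# quadratic of Nesterov (2018): f(x) = L/8 * (x' A x - 2 x_1), where A is the
# tridiagonal discrete Laplacian with Dirichlet boundary conditions. The
# normalization of the function used in the text may differ.
d, L = 50, 1.0
A = 2 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)

def grad_f(x):
    # gradient of f: L/4 * (A x - e_1)
    g = (L / 4) * (A @ x)
    g[0] -= L / 4
    return g

# Starting from x_0 = 0, each gradient evaluation can "unlock" at most one new
# coordinate, because A is tridiagonal. We check it for plain gradient descent,
# but the same holds for any method satisfying Assumption 1.
x = np.zeros(d)
for k in range(1, 6):
    x = x - (1.0 / L) * grad_f(x)
    support = np.flatnonzero(np.abs(x) > 1e-12)
    print(k, support)  # after k steps, only the first k coordinates are nonzero
```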
Theorem 1 (Lower bound for smooth convex optimization). For any , , , , there exists a convex function that is and -smooth such that any sequence satisfying Assumption 1 satisfies where is a minimizer of .
Remark that the rate
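To see where this scaling comes from, one can compute, for the classical construction in dimension $d = 2k+1$ with $x_0 = 0$ (a hypothetical stand-in for the function used in Theorem 1, following Nesterov (2018)), the best objective value attainable by any point supported on the first $k$ coordinates. The quantity printed below stays roughly constant in $k$, which is exactly the $\Omega(1/k^2)$ behaviour that the theorem formalizes.

```python
import numpy as np

def laplacian(d):
    # tridiagonal discrete Laplacian with Dirichlet boundary conditions
    return 2 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)

# Assumed worst-case function (classical construction, up to normalization):
# f(x) = L/8 * (x' A x - 2 x_1) in dimension d = 2k + 1, with x_0 = 0.
L = 1.0
for k in [5, 10, 20, 40]:
    d = 2 * k + 1
    A = laplacian(d)
    e1 = np.zeros(d); e1[0] = 1.0
    xstar = np.linalg.solve(A, e1)                    # global minimizer solves A x = e_1
    fstar = L / 8 * (xstar @ A @ xstar - 2 * xstar[0])
    # best value reachable after k iterations: minimize f over span(e_1, ..., e_k)
    xk = np.zeros(d)
    xk[:k] = np.linalg.solve(A[:k, :k], e1[:k])
    fk = L / 8 * (xk @ A @ xk - 2 * xk[0])
    R2 = np.linalg.norm(xstar) ** 2                   # ||x_0 - x*||^2 since x_0 = 0
    print(k, (fk - fstar) * (k + 1) ** 2 / (L * R2))  # roughly constant in k
```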
Theorem 2 (Lower bound for smooth strongly convex optimization). For any , , , there exists a -strongly convex function that is and -smooth such that any sequence satisfying Assumption 1 satisfies, for all , where is the unique minimizer of .
Note that it is common in the literature to see Theorem 2 without the factor
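As a rough reading of this geometric rate, the following sketch assumes the classical factor $q = (\sqrt{\kappa} - 1)/(\sqrt{\kappa} + 1)$ with condition number $\kappa = L/\mu$ (up to the multiplicative constant discussed above) and reports, for a few values of $\kappa$, the smallest number of iterations after which the iterate error can possibly fall below a target accuracy.

```python
import math

# Assuming the geometric factor in Theorem 2 is the classical
# q = (sqrt(kappa) - 1) / (sqrt(kappa) + 1) with kappa = L / mu, no first-order
# method can reach ||x_k - x*|| <= eps * ||x_0 - x*|| before q**k <= eps.
eps = 1e-6
for kappa in [10, 100, 1000, 10000]:
    q = (math.sqrt(kappa) - 1) / (math.sqrt(kappa) + 1)
    k_min = math.ceil(math.log(eps) / math.log(q))  # smallest k with q**k <= eps
    print(f"kappa = {kappa:6d}   q = {q:.4f}   k >= {k_min}")
```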
Lemma 1 (Minimizers of ). Let , , . Then defined in (1) is a -strongly convex (merely convex if ) -function whose gradient is -Lipschitz.
If , it has a unique minimizer satisfying
If , it has a unique minimizer satisfying
for , and for , where .
Proof. We drop the exponents in the definition of . The function being a quadratic form, it is and its partial derivatives read
Thus, the Hessian matrix is given (for any ) by
where is a (thresholded) discrete Laplacian operator with Dirichlet boundary conditions; it is tridiagonal:
Observe that we have (since is a quadratic form)
Note that:
1. The Hessian is positive definite (resp. positive semi-definite) if (resp. ). Indeed,
Since for any , the result follows depending on the value of .
2. Since , we have
Hence,
Thus, we have .
Let us characterize the (unique) solution of the minimization of over . We aim to solve to find a critical point (which will be a minimum since we just proved that the Hessian is at least positive semi-definite), that is
Projecting this relation onto each coordinate , we get that
which leads to
Similarly, we have
Consider defined by for and and . We have the relation
We can rewrite it as the second-order linear recursion . The associated characteristic polynomial is , whose discriminant is given by
We distinguish two cases:
1. If
, then the unique root is given by .
2. If
, then the roots are given by
Case . We have the affine relation with constraints and . In turn, we have and thus
The associated optimal value is given by
Case . The solution can be written as
Thus, we have , hence , and in turn we have
Hence,
□
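As a numerical sanity check of the geometric decay of the minimizer's coordinates established in Lemma 1, one can solve the first-order optimality system directly. The sketch below assumes a large finite-dimensional truncation of the classical strongly convex worst-case function, $f(x) = \frac{L-\mu}{8}\,(x^\top A x - 2x_1) + \frac{\mu}{2}\|x\|^2$ with $A$ the tridiagonal Laplacian; the normalization of (1) in the text may differ, but the coordinates decay geometrically at the smaller root of a quadratic of the kind derived in the proof above.

```python
import numpy as np

# Numerical check of the geometric decay of the minimizer, assuming a large
# finite-dimensional truncation of the classical strongly convex construction
# f(x) = (L - mu)/8 * (x' A x - 2 x_1) + mu/2 * ||x||^2 (the normalization of
# the function (1) in the text may differ).
d, L, mu = 1000, 1.0, 0.01
kappa = L / mu
A = 2 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)
# critical point: ((L - mu)/4 * A + mu * I) x = (L - mu)/4 * e_1
H = (L - mu) / 4 * A + mu * np.eye(d)
b = np.zeros(d); b[0] = (L - mu) / 4
xstar = np.linalg.solve(H, b)

q = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
for i in range(1, 6):
    # the i-th coordinate of the minimizer matches q ** i up to boundary effects
    print(i, xstar[i - 1], q ** i)
```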
We now turn to the proofs of Theorem 1 and Theorem 2.
Proof of Theorem 2. The proof follows the same strategy as before, but we start with a bound on the iterates instead of on the objective function. Assume that , otherwise let . Consider and . We rewrite the coordinate of as
On the one hand, we have:
where we used that for all , we have
Bounding the tail of the geometric sums, we obtain
On the other hand, observe that for , one has , thus
Since , we have
and in turn,
Thus,
Observe that
Combining it with (6) and (7), we have
proving (4). The value bound (5) is obtained by applying the following inequality
to
. □
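As an empirical illustration of the iterate bound (4), one can run plain gradient descent, which satisfies Assumption 1, on a finite-dimensional truncation of the assumed worst-case function and check that its error never beats the geometric envelope $q^k \|x_0 - x^\star\|$ with $q = (\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$. This is only a consistency check under the same assumptions as the sketches above, not a proof.

```python
import numpy as np

# Consistency check of the iterate bound (4) for plain gradient descent on a
# finite-dimensional truncation of the assumed strongly convex worst-case
# function f(x) = (L - mu)/8 * (x' A x - 2 x_1) + mu/2 * ||x||^2, with x_0 = 0.
d, L, mu = 500, 1.0, 0.01
kappa = L / mu
q = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)

A = 2 * np.eye(d) - np.eye(d, k=1) - np.eye(d, k=-1)
H = (L - mu) / 4 * A + mu * np.eye(d)   # Hessian of f, so grad f(x) = H x - b
b = np.zeros(d); b[0] = (L - mu) / 4
xstar = np.linalg.solve(H, b)

x = np.zeros(d)                         # x_0 = 0
r0 = np.linalg.norm(x - xstar)
for k in range(1, 51):
    x = x - (1.0 / L) * (H @ x - b)     # gradient descent with step 1/L
    ratio = np.linalg.norm(x - xstar) / (q ** k * r0)
    if k % 10 == 0:
        print(k, ratio)                 # stays (well) above 1, as the bound requires
```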
More refined versions of these lower bounds, which provide tighter constants, can be found in Drori and Taylor (2022). However, these improvements come at the price of significantly more involved and technical proofs.
References
- Drori, Yoel and Adrien Taylor (2022). “On the Oracle Complexity of Smooth Strongly Convex Minimization”. In: Journal of Complexity 68, p. 101590.
- Nemirovski, Arkadi and David Yudin (1983). Problem Complexity and Method Efficiency in Optimization. Wiley. 388 pp.
- Nesterov, Yurii (2018). Lectures on Convex Optimization. Vol. 137. Springer.