Lipschitz Continuity, convexity, subgradients
First, some definitions:
Lipschitz continuity
A function is called L-Lipschitz over a set S with respect to a norm if for all we have:
Some people will equivalently say is Lipschitz continuous with Lipschitz
constant .
Intuitively, is a measure of how fast the function can change.
Convexity, subgradients
function is convex if :
In a picture, the line between two points is an upper bound on the function evaluated at any point in the line. Here we see an example for , , . Compare it to concave functions.
An equivalent definition of a convex function is that such that:
In this case, we have that is a linear lower bound for the function, which is tight at Here we see a example in 1 dimension for , , .
Subgradients
- is a subgradient of at if inequality (1) holds.
- is the subdifferential of at , or set of subgradients of at
- If is differentiable, .
Relationship between Lipschitz constant and norm of subgradients
There is a nice Lemma (2.6 from this survey) that relates L to the norm of the subgradients of . It states the following:
Let be a convex function. Then, is L-Lipschitz over with respect to norm if and only if for all and we have that , where is the dual norm.
In other words, Lipschitz continuity over some norm implies a bound on the dual norm of the subgradients (and thus the gradients, if the function is differentiable) of the function - and vice versa.
First, we will prove this bound. Then, we will give examples of its applications to some functions and intuition.
Proof (from the survey): Assume that is L-Lipschitz. Choose some , .
Let . Note that by definition, .
Also note that , by the way we defined it.
From the definition of subgradient, we have the following inequality:
Since we assume that is Lipschitz continuous, we have:
Combining the above two inequalities, we conclude that if is L-Lipschitz, .
Now we prove the other direction (if , is L-Lipschitz). We now use the definition of subgradient again, multiplied by -1:
Combining this with the general dual norm result that , we get:
Which by definition means that is L-lipschitz.
This gives us intuition about Lipschitz continuous convex functions: their gradients must be bounded, so that they can't change too much too quickly.
Examples
We now provide some example functions. Lets assume we are using the L2 norm. First, Lets look at :
Is it Lipschitz continous? This function happens to be differentiable, so we have that Clearly, the subgradients are not bounded (they go to infinity as goes to infinity), so this function is NOT Lipschitz continuous. Intuitively, as grows, the change in the axis grows faster and faster, much faster than the change in the x axis times some constant. Is the gradient of Lipschitz continuous? Well, the subgradient of the gradient is , so it is clearly bounded. Thus, we conclude that the gradient of is Lipschitz continuous with .
Now, let :
In this case, it is easy to see that the subgradient is from , at 0 and from . From the theorem, we conclude that the function is Lipchitz continuous with . The gradient is not continuous, so it is also not Lipschitz Continuous (the gradient at the point of discontinuity would be infinite).