The partial derivative of a multivariate function is the derivative of the function with respect to one of its variables while the other variables are held constant. For example, the partial derivative of a bivariate function with respect to \( x \) is given as
\( \displaystyle D_xf=f_x(x,y)=\frac{\partial f(x,y)}{\partial x} \)
The above equation can be computed numerically just like that of a univariate function, with one change: we impose a grid on a plane or higher-dimensional domain instead of on an axis.
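For instance, here is a minimal plain-Java sketch of that idea, independent of any library; the function \( f(x,y)=\sin(xy) \) and the step size are illustrative assumptions only. A central difference along the \( x \)-axis, with \( y \) held fixed, approximates \( \partial f/\partial x \).
import java.util.function.DoubleBinaryOperator;

public class PartialDerivativeSketch {
    // Central difference for df/dx: (f(x + h, y) - f(x - h, y)) / (2h), with y held constant.
    static double dfdx(DoubleBinaryOperator f, double x, double y, double h) {
        return (f.applyAsDouble(x + h, y) - f.applyAsDouble(x - h, y)) / (2 * h);
    }

    public static void main(String[] args) {
        DoubleBinaryOperator f = (x, y) -> Math.sin(x * y); // illustrative function
        // exact value: df/dx = y * cos(xy); at (1, 2) that is 2 * cos(2) ~ -0.8323
        System.out.println(dfdx(f, 1.0, 2.0, 1e-6));
    }
}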
A second-order partial derivative can be written as
\( \displaystyle D_{xx}f=f_{xx}(x,y)=\frac{\partial^2f(x,y)}{\partial x^2} \)
A second-order mixed partial derivative is obtained by differentiating first with respect to one variable and then with respect to the other. For example, a second-order mixed partial derivative with respect to \( x \) and then \( y \) can be written as
\( \displaystyle D_{xy}f=f_{xy}(x,y)=\frac{\partial}{\partial y}\frac{\partial f(x,y)}{\partial x} \)
Higher-order partial and mixed derivatives look like
\( \displaystyle \frac{\partial^{i+j+k}f}{\partial x^i\partial y^j\partial z^k}=\partial_x^i\partial_y^j\partial_z^kf \)
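As a sketch of how a mixed partial such as \( f_{xy} \) can be obtained numerically, one can nest two central differences. The function \( f(x,y)=x^2y \) below is only an illustrative assumption; its exact mixed partial is \( f_{xy}=2x \).
import java.util.function.DoubleBinaryOperator;

public class MixedPartialSketch {
    // f_xy ~ central difference in y of the central difference in x
    static double fxy(DoubleBinaryOperator f, double x, double y, double h) {
        double dfdxAtYPlus  = (f.applyAsDouble(x + h, y + h) - f.applyAsDouble(x - h, y + h)) / (2 * h);
        double dfdxAtYMinus = (f.applyAsDouble(x + h, y - h) - f.applyAsDouble(x - h, y - h)) / (2 * h);
        return (dfdxAtYPlus - dfdxAtYMinus) / (2 * h);
    }

    public static void main(String[] args) {
        DoubleBinaryOperator f = (x, y) -> x * x * y; // illustrative function
        // exact value: f_xy = 2x; at (1.5, 0.7) that is 3.0
        System.out.println(fxy(f, 1.5, 0.7, 1e-4));
    }
}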
Consider, for example, the function
\( z=f(x,y)=x^2+xy+y^2 \)
The graph of this function defines a surface in Euclidean space. Through every point on this surface there pass an infinite number of tangent lines. Partial differentiation is the act of choosing one of these lines and finding its slope. Usually, the lines of most interest are those that are parallel to the \( xz \)-plane, obtained by holding \( y \) constant, and those that are parallel to the \( yz \)-plane, obtained by holding \( x \) constant.

To find the slope of the line tangent to the function at (1,1) and parallel to the \( xz \)-plane, we treat \( y \) as a constant. By differentiating the equation while treating \( y \) as a constant, we find that the slope of \( f \) at the point \( (x,y) \) is
\( \displaystyle \frac{\partial z}{ \partial x}=2x+y \)

This evaluates to 3 at the point (1,1). The following code snippet solves this problem.
// f = x^2 + xy + y^2
RealScalarFunction f = new AbstractBivariateRealFunction() {
    @Override
    public double evaluate(double x, double y) {
        return x * x + x * y + y * y;
    }
};
// df/dx = 2x + y
MultivariateFiniteDifference dx = new MultivariateFiniteDifference(f, new int[]{1});
System.out.println(String.format("Dx(1,1) = %f", dx.evaluate(new DenseVector(1, 1))));
The output is:
Dx(1,1) = 3.000000
Gradient
The gradient function, \( \nabla f \), of a multivariate real-valued function, \( f \), is a vector-valued function or a vector field from \( \mathbf{R^n} \) to \( \mathbf{R^n} \). It takes a point \( \mathbf{x}=(x_1,…,x_n) \) in \( \mathbf{R^n} \) and outputs a vector in \( \mathbf{R^n} \). The gradient at any point, \( \nabla f(\mathbf{x}) \), is a vector whose components are the partial derivatives of \( f \) at \( \mathbf{x} \). Mathematically, it can be represented as
\( \nabla f(\mathbf{x})= \begin{bmatrix}
\frac{\partial f}{\partial x_1}(x)\\ \vdots\\
\frac{\partial f}{\partial x_n}(x)
\end{bmatrix}\)
Each component, \( \frac{\partial f}{\partial x_i}(x),\ i=1,\dots,n \), is the partial derivative of the function along an axis and is the rate of change in that direction. The gradient vector is thus the multivariate analogue of the first derivative, the slope or tangent, of a univariate function. It can be interpreted as the direction and rate of fastest increase. If the gradient of a function is non-zero at a point \( \mathbf{x} \), the direction of the gradient is the direction in which the function increases most quickly from \( \mathbf{x} \), and the magnitude of the gradient is the rate of increase in that direction, which is the greatest absolute directional derivative. Furthermore, the gradient is the zero vector at a point if and only if that point is a stationary point, where the derivative vanishes. The gradient thus plays an important and fundamental role in optimization theory.
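To make the "direction of fastest increase" concrete, here is a small plain-Java sketch of steepest ascent driven by a numerically estimated gradient; the concave function \( f(x,y)=-\big((x-1)^2+(y-2)^2\big) \), the step size, and the iteration count are illustrative assumptions only.
public class SteepestAscentSketch {
    // Illustrative concave function with its unique maximum at (1, 2).
    static double f(double x, double y) {
        return -((x - 1) * (x - 1) + (y - 2) * (y - 2));
    }

    public static void main(String[] args) {
        double h = 1e-6;    // finite-difference step
        double step = 0.1;  // ascent step size
        double x = 0, y = 0;
        for (int i = 0; i < 50; i++) {
            // gradient by central differences
            double gx = (f(x + h, y) - f(x - h, y)) / (2 * h);
            double gy = (f(x, y + h) - f(x, y - h)) / (2 * h);
            // move along the gradient, the direction of fastest increase
            x += step * gx;
            y += step * gy;
        }
        System.out.printf("(%f, %f)%n", x, y); // approaches the maximizer (1, 2)
    }
}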
For example, consider a room where the temperature is given by a scalar field \( T \). The multivariate function, or field, takes a point \( (x,y,z) \) and gives a temperature value \( T(x,y,z) \). At each point in the room, the gradient of \( T \) at that point shows the direction in which the temperature rises most quickly, moving away from \( (x,y,z) \). The magnitude of the gradient determines how fast the temperature rises in that direction. More generally, if a multivariate function \( f \) is differentiable, then the dot product between \( \nabla f \) and a unit vector \( \nu \) is the slope or rate of change of the function in the direction of \( \nu \), called the directional derivative of \( f \) along \( \nu \). The multivariate version of the Taylor expansion shows that the best linear approximation of a function can be expressed in terms of the gradient.
\( f(x)\approx f(x_0)+\nabla f(x_0)\cdot(x-x_0) \)
Here \( x \) and \( x_0 \) are points in \( \mathbf{R^n} \), \( f \) maps \( \mathbf{R^n}\rightarrow\mathbf{R} \), and \( \nabla f:\mathbf{R^n}\rightarrow\mathbf{R^n} \); the dot product in the last term gives a real number.
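The following plain-Java sketch illustrates both facts numerically, reusing the earlier example \( f(x,y)=x^2+xy+y^2 \); the evaluation point, the direction, and the displacement are arbitrary illustrative choices.
public class GradientUseSketch {
    static double f(double x, double y) {
        return x * x + x * y + y * y; // the earlier example function
    }

    public static void main(String[] args) {
        double h = 1e-6, x0 = 1, y0 = 1;
        // gradient at (1, 1) by central differences; exact value is (2x + y, x + 2y) = (3, 3)
        double gx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h);
        double gy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h);

        // directional derivative along the unit vector v = (1, 1) / sqrt(2): grad(f) . v
        double vx = 1 / Math.sqrt(2), vy = 1 / Math.sqrt(2);
        System.out.println(gx * vx + gy * vy); // ~ 6 / sqrt(2) ~ 4.2426

        // first-order approximation f(x) ~ f(x0) + grad(f)(x0) . (x - x0)
        double dx = 0.01, dy = -0.02;
        double approx = f(x0, y0) + gx * dx + gy * dy;
        System.out.println(f(x0 + dx, y0 + dy) + " vs " + approx); // 2.9703 vs 2.97
    }
}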
For example, consider the function
\( f(x,y)=x\exp\left(-(x^2+y^2)\right) \)
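For reference, this gradient can be written down analytically:
\( \displaystyle \nabla f(x,y)=\begin{bmatrix}
(1-2x^2)\,e^{-(x^2+y^2)}\\
-2xy\,e^{-(x^2+y^2)}
\end{bmatrix} \)
so that \( \nabla f(0,0)=(1,0) \) and \( \nabla f(\pm 1,0)=(-e^{-1},0)\approx(-0.367879,0) \), values that the numerical results below should reproduce.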

The following code computes the gradient of this function at several points, both directly and through a gradient function.
// f = x * exp(-(x^2 + y^2))
RealScalarFunction f = new AbstractBivariateRealFunction() {
    @Override
    public double evaluate(double x, double y) {
        return x * Math.exp(-(x * x + y * y));
    }
};
Vector x1 = new DenseVector(0, 0);
Vector g1_0 = new Gradient(f, x1);
System.out.println(String.format("gradient at %s = %s", x1, g1_0));
GradientFunction df = new GradientFunction(f);
Vector g1_1 = df.evaluate(x1);
System.out.println(String.format("gradient at %s = %s", x1, g1_1));
Vector x2 = new DenseVector(-1, 0);
Vector g2 = df.evaluate(x2);
System.out.println(String.format("gradient at %s = %s", x2, g2));
Vector x3 = new DenseVector(1, 0);
Vector g3 = df.evaluate(x3);
System.out.println(String.format("gradient at %s = %s", x3, g3));
The output is:
gradient at [0.000000, 0.000000] = [1.000000, 0.000000]
gradient at [0.000000, 0.000000] = [1.000000, 0.000000]
gradient at [-1.000000, 0.000000] = [-0.367879, 0.000000]
gradient at [1.000000, 0.000000] = [-0.367879, 0.000000]

Jacobian
The Jacobian matrix of a vector-valued function \( \mathbf{f}:\mathbf{R^n}\rightarrow\mathbf{R^m} \) collects all of its first-order partial derivatives. Its \( (i,j) \) entry is
\( \displaystyle J_{ij}=\frac{\partial f_i}{\partial x_j} \)
so that the full matrix is
\( \displaystyle J=\begin{bmatrix}
\frac{\partial f}{\partial x_1} & \cdots & \frac{\partial f}{\partial x_n}
\end{bmatrix}=\begin{bmatrix}
\nabla^Tf_1\\
\vdots\\
\nabla^Tf_m
\end{bmatrix}=\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n}\\
\vdots & \ddots & \vdots\\
\frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n}
\end{bmatrix} \)
When \( f \) is differentiable at \( x \), the Jacobian gives the best linear approximation of \( f \) near \( x \):
\( f(y)=f(x)+J_f(x)(y-x)+o(\|y-x\|) \)
The error term approaches zero faster than the distance between \( y \) and \( x \) as \( y \) approaches \( x \). In this sense, the Jacobian can be regarded as the “first-order derivative” of a vector-valued function of several variables.
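A small plain-Java sanity check of this linearization follows; the vector function \( \mathbf{F}(x,y)=(xy,\;x+y^2) \), the base point, and the displacement are illustrative assumptions. It builds the Jacobian by central differences and compares \( \mathbf{F}(x+\Delta) \) with \( \mathbf{F}(x)+J\Delta \).
public class JacobianSketch {
    // Illustrative F: R^2 -> R^2; its exact Jacobian is [[y, x], [1, 2y]].
    static double[] F(double x, double y) {
        return new double[]{x * y, x + y * y};
    }

    // 2x2 Jacobian by central differences: column j holds dF/dx_j.
    static double[][] jacobian(double x, double y, double h) {
        double[] xp = F(x + h, y), xm = F(x - h, y);
        double[] yp = F(x, y + h), ym = F(x, y - h);
        return new double[][]{
            {(xp[0] - xm[0]) / (2 * h), (yp[0] - ym[0]) / (2 * h)},
            {(xp[1] - xm[1]) / (2 * h), (yp[1] - ym[1]) / (2 * h)}
        };
    }

    public static void main(String[] args) {
        double x = 1, y = 2, dx = 0.01, dy = -0.02;
        double[][] J = jacobian(x, y, 1e-6);
        double[] base = F(x, y);
        double[] exact = F(x + dx, y + dy);
        double[] approx = {
            base[0] + J[0][0] * dx + J[0][1] * dy,
            base[1] + J[1][0] * dx + J[1][1] * dy
        };
        // the two vectors agree up to O(||(dx, dy)||^2)
        System.out.println(exact[0] + " vs " + approx[0]); // 1.9998 vs 2.0000
        System.out.println(exact[1] + " vs " + approx[1]); // 4.9304 vs 4.9300
    }
}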
Consider, for example, the function \( \mathbf{F}:\mathbf{R^2}\rightarrow\mathbf{R^2} \) given by
\( \displaystyle \mathbf{F}\Big(\begin{bmatrix}
x\\
y
\end{bmatrix}\Big)=\begin{bmatrix}
f_1(x,y)\\
f_2(x,y)
\end{bmatrix}=\begin{bmatrix}
x^2y\\
5x+\sin y
\end{bmatrix} \)
Its Jacobian matrix is
\( \displaystyle \mathbf{J_F}(x,y)=\begin{bmatrix}
\frac{\partial f_1}{\partial x} & \frac{\partial f_1}{\partial y}\\
\frac{\partial f_2}{\partial x} & \frac{\partial f_2}{\partial y}
\end{bmatrix}=\begin{bmatrix}
2xy & x^2\\
5 & \cos y
\end{bmatrix} \)
Its determinant is
\( \det\mathbf{J_F}(x,y)=2xy\cos y-5x^2 \)
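Evaluating this at \( (x,y)=(1,\pi) \), for instance, gives \( \det\mathbf{J_F}(1,\pi)=2\pi\cos\pi-5=-2\pi-5\approx-11.283185 \), which is the value the code below reproduces numerically.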
The following code solves this example.
RealVectorFunction F = new RealVectorFunction() {
    @Override
    public Vector evaluate(Vector v) {
        double x = v.get(1);
        double y = v.get(2);
        double f1 = x * x * y;
        double f2 = 5 * x + Math.sin(y);
        return new DenseVector(f1, f2);
    }
    @Override
    public int dimensionOfDomain() {
        return 2;
    }
    @Override
    public int dimensionOfRange() {
        return 2;
    }
};
Vector x0 = new DenseVector(0, 0);
Matrix J00 = new Jacobian(F, x0);
System.out.println(String.format("the Jacobian at %s = %s, the det = %f",
x0,
J00,
MatrixMeasure.det(J00)));
RntoMatrix J = new JacobianFunction(F); // [2xy, x^2], [5, cos y]
Matrix J01 = J.evaluate(x0);
System.out.println(String.format("the Jacobian at %s = %s, the det = %f",
x0,
J01,
MatrixMeasure.det(J01)));
Vector x1 = new DenseVector(1, Math.PI);
Matrix J1 = J.evaluate(x1);
System.out.println(String.format("the Jacobian at %s = %s, the det = %f",
x1,
J1,
MatrixMeasure.det(J1)));
The output is:
the Jacobian at [0.000000, 0.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 0.000000
[2,] 5.000000, 1.000000, , the det = 0.000000
the Jacobian at [0.000000, 0.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 0.000000
[2,] 5.000000, 1.000000, , the det = 0.000000
the Jacobian at [1.000000, 3.141593] = 2x2
[,1] [,2]
[1,] 6.283185, 1.000000
[2,] 5.000000, -1.000000, , the det = -11.283185
Hessian
The Hessian matrix of a multivariate real-valued function is the square matrix of its second-order partial derivatives. It can be computed as the Jacobian of the gradient:
\( H(f(x))=J(\nabla f(x)) \)
Suppose \( f:\mathbf{R^n}\rightarrow\mathbf{R} \). If all second-order derivatives of \( f \) exist and are continuous over the domain of the function, then the Hessian matrix \( H \) of \( f \) is a square \( n\times n \) matrix, usually defined and arranged as follows:
\( \displaystyle H_f=\begin{bmatrix}
\frac{\partial^2f}{\partial x_1^2} & \frac{\partial^2f}{\partial x_1\partial x_2} & \cdots & \frac{\partial^2f}{\partial x_1\partial x_n}\\
\frac{\partial^2f}{\partial x_2\partial x_1} & \frac{\partial^2f}{\partial x_2^2} & \cdots & \frac{\partial^2f}{\partial x_2\partial x_n}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial^2f}{\partial x_n\partial x_1} & \frac{\partial^2f}{\partial x_n\partial x_2} & \cdots & \frac{\partial^2f}{\partial x_n^2}
\end{bmatrix} \)
\( \displaystyle (H_f)_{i,j}=\frac{\partial^2f}{\partial x_i\partial x_j} \)
The Hessian appears in the second-order Taylor expansion of \( f \) around a point \( x \):
\( \displaystyle f(x+\Delta x)\approx f(x)+\nabla f(x)^T\Delta x+\frac{1}{2}\Delta x^TH(x)\Delta x \)
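For the function used in the code below, \( f(x,y)=xy \), these quantities are available in closed form:
\( \displaystyle \nabla f(x,y)=\begin{bmatrix} y\\ x \end{bmatrix},\qquad H_f=\begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix},\qquad \det H_f=-1 \)
Because \( f \) is itself quadratic, the expansion above is exact: \( (x+\Delta x)(y+\Delta y)=xy+y\Delta x+x\Delta y+\Delta x\Delta y \), where the last term equals \( \frac{1}{2}(\Delta x,\Delta y)\,H_f\,(\Delta x,\Delta y)^T \). The determinant of \( -1 \) is what the numerical Hessian below should reproduce, up to finite-difference error.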
The following code computes the Hessian of \( f(x,y)=xy \) at two points, both directly and through a Hessian function.
RealScalarFunction f = new AbstractBivariateRealFunction() {
    @Override
    public double evaluate(double x, double y) {
        return x * y; // f = xy
    }
};
Vector x1 = new DenseVector(1, 1);
Hessian H1 = new Hessian(f, x1);
System.out.println(String.format("the Hessian at %s = %s, the det = %f",
x1,
H1,
MatrixMeasure.det(H1)));
Vector x2 = new DenseVector(0, 0);
Hessian H2 = new Hessian(f, x2);
System.out.println(String.format("the Hessian at %s = %s, the det = %f",
x2,
H2,
MatrixMeasure.det(H2)));
RntoMatrix H = new HessianFunction(f);
Matrix Hx1 = H.evaluate(x1);
System.out.println(String.format("the Hessian at %s = %s, the det = %f",
x1,
Hx1,
MatrixMeasure.det(Hx1)));
Matrix Hx2 = H.evaluate(x2);
System.out.println(String.format("the Hessian at %s = %s, the det = %f",
x2,
Hx2,
MatrixMeasure.det(Hx2)));
The output is:
the Hessian at [1.000000, 1.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 1.000000
[2,] 1.000000, 0.000000, , the det = -0.999999
the Hessian at [0.000000, 0.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 1.000000
[2,] 1.000000, 0.000000, , the det = -1.000000
the Hessian at [1.000000, 1.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 1.000000
[2,] 1.000000, 0.000000, , the det = -0.999999
the Hessian at [0.000000, 0.000000] = 2x2
[,1] [,2]
[1,] 0.000000, 1.000000
[2,] 1.000000, 0.000000, , the det = -1.000000