Finite Differences

Consider a polynomial $$f(x)=\sum_i a_ix^i$$

Taylor expansion at $x=x_0$ ($n=0,1,2,\ldots$):
$$ f(x)=\sum_{n=0}^{\infty}\frac{1}{n!}(x-x_0)^n\frac{d^n}{dx^n}f(x_0)$$

Forward, Backward and Central Difference

$$
\begin{align}
\Delta_hf(x)&=\frac{f(x+h)-f(x)}{h}\\
\widetilde{\Delta}_hf(x)&=\frac{f(x)-f(x-h)}{h}\equiv-\Delta_{-h}f(x)\\
\Delta_{2h}f(x)&=\frac{f(x+h)-f(x-h)}{2h}\equiv \frac{1}{2}[\Delta_hf(x)+\widetilde\Delta_hf(x)]
\end{align}
$$
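As a minimal sketch (not from the notes; the test function $f=\sin$ and step $h$ are illustrative), the three differences in C:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    double x = 1.0, h = 1e-4;
    double fwd = (sin(x + h) - sin(x)) / h;            /* forward  difference, O(h)   */
    double bwd = (sin(x) - sin(x - h)) / h;            /* backward difference, O(h)   */
    double ctr = (sin(x + h) - sin(x - h)) / (2 * h);  /* central  difference, O(h^2) */
    printf("fwd %.10f  bwd %.10f  ctr %.10f  exact %.10f\n", fwd, bwd, ctr, cos(x));
    return 0;
}
```

The central difference is more accurate because the odd Taylor terms cancel between $f(x+h)$ and $f(x-h)$.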

Higher Order and Undetermined Coefficients

$$\Delta_{h}x^{N}=\sum_{i=1}^{N}C[N][i]x^{N-i}h^{i-1}=Nx^{N-1}+\frac{N(N-1)}{2!}x^{N-2}h+\cdots+h^{N-1}$$

Computing C[n][i] recursively (Pascal's triangle):

for (n = 0; n <= N; n++)                       /* row n: C[n][0] = 1, the rest start at 0 */
    for (i = 0; i <= N; i++) C[n][i] = (i == 0);
for (n = 0; n < N; n++)                        /* Pascal's rule */
    for (i = 0; i <= n; i++) C[n+1][i+1] = C[n][i] + C[n][i+1];

For the cancellation of higher-order $h^n$ terms in central differences, see the homework.

Integrals to Sums

$$\Delta_hy(x)=f(x)\quad\Longrightarrow\quad y(Nh)=\sum_{i=0}^{N-1}f(ih)h+c$$

For the different integration methods, see the homework.
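A minimal C sketch of this sum (the integrand $f(x)=x^2$ and $N$ are illustrative choices):

```c
#include <stdio.h>

double f(double x) { return x * x; }   /* illustrative integrand */

int main(void) {
    int N = 1000;
    double h = 1.0 / N, y = 0.0;
    for (int i = 0; i < N; i++)        /* y(Nh) = sum_{i=0}^{N-1} f(ih) h, with c = 0 */
        y += f(i * h) * h;
    printf("y(1) = %f  (exact 1/3)\n", y);
    return 0;
}
```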

Gaussian Integration

$$\int_{-1}^1f(x)dx\simeq\sum_{i=1}^Nw_if(x_i)$$

where the nodes $x_i$ are the roots of the Legendre polynomial $P_n(x)$ and the weights $w_i$ are given by

$$w_i=\frac{2(1-x_i^2)}{[(n+1)P_{n+1}(x_i)]^2}$$

For an integral over $[a,b]$, map $[a,b]\rightarrow[-1,1]$ by the rescaling
$$\int_a^bf(x')dx'=\frac{(b-a)}{2}\int_{-1}^1f\left(\frac{b-a}{2}x+\frac{b+a}{2}\right)dx$$

The weights $w_i$ can be computed recursively; see the homework.
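As a sketch, a fixed 3-point Gauss-Legendre rule in C with the rescaling above; the nodes $0,\pm\sqrt{3/5}$ (roots of $P_3$) and weights $8/9,5/9$ are the standard $n=3$ values, and the helper name `gauss3` is illustrative:

```c
#include <stdio.h>
#include <math.h>

/* 3-point Gauss-Legendre rule on [a,b]; exact for polynomials up to degree 5 */
double gauss3(double (*f)(double), double a, double b) {
    const double x[3] = { -sqrt(3.0 / 5.0), 0.0, sqrt(3.0 / 5.0) };  /* roots of P_3 */
    const double w[3] = { 5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0 };
    double s = 0.0;
    for (int i = 0; i < 3; i++)        /* rescale [-1,1] -> [a,b] */
        s += w[i] * f(0.5 * (b - a) * x[i] + 0.5 * (b + a));
    return 0.5 * (b - a) * s;
}

int main(void) {
    printf("%f  (exact 2)\n", gauss3(sin, 0.0, 3.14159265358979324));
    return 0;
}
```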

Parallelization Methods

  • Threads
  • OpenMP
  • OpenACC
  • MPI
  • CUDA

Root Finding

Bisection

Keep a bracket $[a,b]$ with $f(a)f(b)<0$; bisect it, keep the half whose endpoints still change sign, and repeat. Each step halves the bracket, gaining one bit of accuracy.
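A minimal C sketch (the function $f(x)=x^2-2$, the bracket, and the tolerance are illustrative):

```c
#include <stdio.h>

double f(double x) { return x * x - 2.0; }   /* root at sqrt(2) */

int main(void) {
    double a = 0.0, b = 2.0;                 /* f(a) f(b) < 0 on entry */
    while (b - a > 1e-12) {
        double m = 0.5 * (a + b);
        if (f(a) * f(m) <= 0) b = m;         /* sign change in [a,m] */
        else                  a = m;         /* sign change in [m,b] */
    }
    printf("root = %.12f\n", 0.5 * (a + b));
    return 0;
}
```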

Newton

$$x\leftarrow x-f(x)/f'(x) \quad\text{until } |f(x)/f'(x)|<\text{Error}$$
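A minimal C sketch of the same update for $f(x)=x^2-2$, so $f'(x)=2x$ (starting point and tolerance are illustrative):

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    double x = 1.0, dx;
    do {
        dx = (x * x - 2.0) / (2.0 * x);  /* f(x)/f'(x) */
        x -= dx;                         /* Newton update */
    } while (fabs(dx) > 1e-12);          /* stop when |f/f'| < Error */
    printf("sqrt(2) ~= %.12f\n", x);
    return 0;
}
```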

Time cost

To reach accuracy $1/N$ (i.e. about $\log_2 N$ correct bits),

  • Bisection: O(log2(N)) iterations, since each step gains one bit
  • Newton: O(log2(log2(N))) iterations, since each step roughly doubles the number of correct bits

Gaussian Elimination

Reduce $A$ to upper-triangular form by subtracting multiples of each pivot row from the rows below it, then recover $x$ by back-substitution; the cost is $O(N^3)$.
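A minimal C sketch of elimination plus back-substitution, without pivoting (the 3×3 system is an illustrative example; real code should pivot to avoid dividing by a small A[k][k]):

```c
#include <stdio.h>
#define N 3

int main(void) {
    double A[N][N] = {{2, 1, -1}, {-3, -1, 2}, {-2, 1, 2}};
    double b[N] = {8, -11, -3}, x[N];
    for (int k = 0; k < N; k++)              /* eliminate entries below the pivot */
        for (int i = k + 1; i < N; i++) {
            double m = A[i][k] / A[k][k];
            for (int j = k; j < N; j++) A[i][j] -= m * A[k][j];
            b[i] -= m * b[k];
        }
    for (int i = N - 1; i >= 0; i--) {       /* back-substitution */
        x[i] = b[i];
        for (int j = i + 1; j < N; j++) x[i] -= A[i][j] * x[j];
        x[i] /= A[i][i];
    }
    printf("x = %g %g %g\n", x[0], x[1], x[2]);  /* expect 2 3 -1 */
    return 0;
}
```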

Introduction to Matrices

For $$S[x_1,…,x_N]=\frac{1}{2}\sum_i(\sum_ja_{ij}x_j-b_i)^2$$

What is the minimum of $S$?

$$\frac{\partial S}{\partial x_k}=\sum_ia_{ik}(\sum_ja_{ij}x_j-b_i)=0$$

Let's define $A^{\intercal}=||a_{ji}||$; this implies

$$a_{ik}(a_{ij}x_j-b_i)=0\Leftrightarrow A^{\intercal}(Ax-b)=0$$

This means that if we have $N$ variables and $N$ equations, we can get the minimum of $S$ by solving

$$Ax=b$$

If we have too few variables (more equations than unknowns, an overdetermined system), we instead need to solve

$$A^{\intercal}Ax=A^{\intercal}b$$

Notice that (see Homework 4), for a polynomial model $f(x)=\sum_nc_nx^n$,

$$\chi^2=\sum_{i=0}^{N_0}\frac{(y_i-f(x_i))^2}{\sigma_i^2}=\sum_i\Big(\frac{y_i}{\sigma_i}-\sum_n\frac{x_i^n}{\sigma_i}c_n\Big)^2$$

so we can let $b_i=y_i/\sigma_i$ and $a_{in}=x_i^n/\sigma_i$ to get the equation

$$A^{\intercal}Ac=A^{\intercal}b\Longleftrightarrow\sum_n\Big(\sum_ix_i^{m+n}/\sigma_i^2\Big)c_n=\sum_ix_i^my_i/\sigma_i^2$$

So the new matrix is $A^{\intercal}A=\big\|\sum_i x_i^{m+n}/\sigma_i^2\big\|$ and the new right-hand side is $(A^{\intercal}b)_m=\sum_ix_i^my_i/\sigma_i^2$.

Solving $A^{\intercal}Ax=A^{\intercal}b$ is known as the "pseudo-inverse" problem: it minimizes the residuals of the "new" least-squares problem.
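As a sketch of the bookkeeping (not code from the course), building $A^{\intercal}A$ and $A^{\intercal}b$ for a degree-D polynomial fit in C; `normal_equations` and `D` are my illustrative names, and the resulting small system can be solved with the Gaussian elimination above:

```c
#include <math.h>

#define D 2   /* polynomial degree (illustrative) */

/* Fill (A^T A)[m][n] = sum_i x_i^{m+n}/sigma_i^2 and (A^T b)[m] = sum_i x_i^m y_i/sigma_i^2,
   following a_{in} = x_i^n / sigma_i and b_i = y_i / sigma_i from the text. */
void normal_equations(int N, const double *x, const double *y, const double *sigma,
                      double AtA[D + 1][D + 1], double Atb[D + 1]) {
    for (int m = 0; m <= D; m++) {
        Atb[m] = 0.0;
        for (int n = 0; n <= D; n++) AtA[m][n] = 0.0;
    }
    for (int i = 0; i < N; i++) {
        double s2 = sigma[i] * sigma[i];
        for (int m = 0; m <= D; m++) {
            Atb[m] += pow(x[i], m) * y[i] / s2;
            for (int n = 0; n <= D; n++)
                AtA[m][n] += pow(x[i], m + n) / s2;
        }
    }
}
```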

Solving the Matrix Problem

Let's start from a question about blurring: smooth one pixel px[x][y] using its 4 neighbours, first by simply replacing it with their average:

px[x][y] = 0.25*(px[x+1][y] + px[x-1][y] + px[x][y+1] + px[x][y-1]);

where $x,y \in [0,L-1]$. To make px periodic (so coordinates $\geq L$ or $<0$ wrap around), take each index modulo $L$:

px[x][y] = 0.25*(px[(x+1)%L][y] + px[(x-1+L)%L][y] + px[x][(y+1)%L] + px[x][(y-1+L)%L]);

Sometimes the pixel itself is important too, so we add a parameter $r$ with $0\leq r\leq 1$: the central pixel keeps weight $1-r$ and each neighbour gets $r/4$:

px[x][y] = (1-r)*px[x][y]+r/4*(px[(x+1)%L][y] + px[(x-1+L)%L][y]
+ px[x][(y+1)%L] + px[x][(y-1+L)%L]);

The operation above can be written as the stencil

$$\left(\begin{array}{ccc}0&r/4&0\\r/4&1-r&r/4\\0&r/4&0\end{array}\right)$$

Let's regard px as a function $f$ and let $t$ count the iteration steps; then we have

$$f(x,y,t+1)=(1-r)f(x,y,t)+\frac{r}{4}\left[f(x+1,y,t)+f(x-1,y,t)+f(x,y+1,t)+f(x,y-1,t)\right]$$

Taking one step $t\rightarrow t+\delta t$ and calling the pixel spacing $a$, we have

$$\begin{aligned}
f(x,y,t+\delta t)&=(1-r)f(x,y,t)+\frac{r}{4}\left[f(x+a,y,t)+f(x-a,y,t)+f(x,y+a,t)+f(x,y-a,t)\right] \\
&\approx(1-r)f(x,y,t)+\frac{r}{4}\left[f(x,y,t)+a\frac{df}{dx}(x,y,t)+\frac{a^2}{2}\frac{d^2f}{dx^2}(x,y,t)\right. \\
&\quad+f(x,y,t)-a\frac{df}{dx}(x,y,t)+\frac{a^2}{2}\frac{d^2f}{dx^2}(x,y,t) \\
&\quad+f(x,y,t)+a\frac{df}{dy}(x,y,t)+\frac{a^{2}}{2}\frac{d^{2}f}{dy^{2}}(x,y,t) \\
&\quad+f(x,y,t)-a\frac{df}{dy}(x,y,t)+\frac{a^2}{2}\frac{d^2f}{dy^2}(x,y,t)\bigg] \\
&=(1-r)f(x,y,t)+\frac{r}{4}\left[4f(x,y,t)+2\left(\frac{a^2}{2}\frac{d^2f}{dx^2}(x,y,t)+\frac{a^2}{2}\frac{d^2f}{dy^2}(x,y,t)\right)\right] \\
&=f(x,y,t)+\frac{ra^{2}}{4}\nabla^{2}f(x,y,t),
\end{aligned}$$

where $\nabla^2$ is the Laplacian. Expanding $f(x,y,t+\delta t)$ to first order in $\delta t$, the result above can be written as

$$f(x,y,t)+\delta t\frac{df}{dt}(x,y,t) =f(x,y,t)+\frac{ra^{2}}4\nabla^{2}f(x,y,t) $$

So we get an important equation in physics, the Diffusion Equation (where $f$ would be relabelled appropriately as the temperature $T$):

$$
\frac{df}{dt}(x,y,t) =\frac{ra^2}{4\delta t}\nabla^2f(x,y,t)
$$

[P22 has some code for the Jacobi and Gauss-Seidel iterations; remember to add the algorithms themselves.]
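As a hedged sketch (the slide code itself is not reproduced here): Jacobi sweeps the grid computing every new value from the old array, while Gauss-Seidel writes updates in place and typically converges faster. For the steady state $\nabla^2f=0$ with fixed boundary values, a minimal Jacobi loop in C, with illustrative grid size and tolerance, might look like:

```c
#include <math.h>
#define L 64

/* Jacobi iteration: each interior point becomes the average of its four
   neighbours, computed into a scratch array so all reads see the old sweep. */
void jacobi(double f[L][L]) {
    double g[L][L], diff;
    do {
        diff = 0.0;
        for (int x = 1; x < L - 1; x++)
            for (int y = 1; y < L - 1; y++) {
                g[x][y] = 0.25 * (f[x+1][y] + f[x-1][y] + f[x][y+1] + f[x][y-1]);
                diff = fmax(diff, fabs(g[x][y] - f[x][y]));
            }
        for (int x = 1; x < L - 1; x++)      /* copy the sweep back */
            for (int y = 1; y < L - 1; y++)
                f[x][y] = g[x][y];
    } while (diff > 1e-8);                   /* stop when the update stalls */
}
```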

Curve fitting & Error analysis

This chapter mainly covers the Lagrange interpolating polynomial.


THEOREM:

Given $k+1$ points $(x_i,y_i)$, $i\in [0,k]$, with distinct $x_i$, there is exactly one Lagrange polynomial $L(x)$ such that:

  1. $L(x)$ has degree $\leq k$
  2. $L(x)$ passes through every node, which means, $\forall i\in [0,k]$, $L(x_i)=y_i$.

$L(x)$ is represented as

$$L(x)=\sum_{j=0}^ky_j\ell_j(x)$$

where $\ell_j(x)$ equals
$$\begin{aligned}
\ell_{j}(x)&=\frac{(x-x_0)}{(x_j-x_0)}\cdots\frac{(x-x_{j-1})}{(x_j-x_{j-1})}\frac{(x-x_{j+1})}{(x_j-x_{j+1})}\cdots\frac{(x-x_k)}{(x_j-x_k)} \\
&=\prod_{\substack{0\leq m\leq k\\ m\neq j}}\frac{x-x_m}{x_j-x_m}.
\end{aligned}$$

and the generation of $L(x)$ can be represented as a Polynomial Interpolation algorithm:

f(x) = 0
for i = 1 to n + 1 do
    g(x) = 1
    for j = 1 to n + 1, j != i do
        g(x) *= (x - x_j)/(x_i - x_j)
    end
    f(x) += y_i * g(x)
end
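A direct C transcription of this pseudocode, as a sketch (0-based indices; `lagrange` is my name for it):

```c
/* Evaluate the Lagrange polynomial through the n+1 nodes (xs[i], ys[i]) at x. */
double lagrange(int n, const double *xs, const double *ys, double x) {
    double f = 0.0;
    for (int i = 0; i <= n; i++) {
        double g = 1.0;                  /* build ell_i(x) */
        for (int j = 0; j <= n; j++)
            if (j != i) g *= (x - xs[j]) / (xs[i] - xs[j]);
        f += ys[i] * g;                  /* accumulate y_i ell_i(x) */
    }
    return f;
}
```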

To see why this works, consider the following:

  1. Each term $y_i\ell_i(x)$ passes through $(x_i,y_i)$ and through $(x_j,0)$ for every $j\neq i$.
  2. Adding all the $y_i\ell_i(x)$ together therefore gives a curve that passes through all of these points.

FFT as recursion for divide and conquer

From continuous to discrete

Suppose we sample a periodic signal (with period $T$) at $N$ times, $y_n=f(t_n)$ for $n=0,\ldots,N-1$, at equal time intervals $t_n=n\Delta t$. The Fourier series for a discrete sample of a "time" sequence with $N$ terms can be expanded like this:

$$y_n=a_0+\sum_{k=1}^{N_0}[a_k\cos(2\pi kn/N)+b_k\sin(2\pi kn/N)]$$

where $2N_0+1\geq N$. The frequencies are $\omega_k=2\pi k /N$.

Using Euler's formula
$$\cos(\theta)+i\sin(\theta)\equiv e^{i\theta}$$

we change the equation above into

$$y_n=\sum_{k=0}^{N-1}c_ke^{2\pi ikn/N}$$

This is a linear system, and solving for the $c_k$ from the $y_n$ by Gaussian elimination is an $O(N^3)$ approach.

But there is a better trick. Notice the orthogonality relation

$$\frac{1}{N}\sum_ne^{2\pi ikn/N}e^{-2\pi ik'n/N}=\delta_{kk'}$$

so we have

$$c_k=\frac{1}{N}\sum_{n=0}^{N-1}y_ne^{-i2\pi nk/N}$$

Let's define $\omega_N^n\equiv e^{i2\pi n/N}$, the "$N$th roots of unity". You can regard them as vectors on the unit circle, dividing it evenly into $N$ parts; or, loosely, as a set of "basis" vectors for a linear space (though not quite correct by the strict definition).

FFT details

$$y_k\equiv\mathcal{FT}_N[a_n]=\sum_{n=0}^{N-1}e^{i2\pi nk/N}a_n=\sum_{n=0}^{N-1}(\omega_N^n)^ka_n$$

Use a binary representation of $n$: $n=\sum_p 2^pn_p$, $n_p=0,1$. Then
$$\omega_{N}^{n}=\omega_{N}^{n_{0}}\omega_{N/2}^{n_{1}}\omega_{N/4}^{n_{2}}…\omega_{2}^{n_{p}}$$
$$\sum_{n}\omega_{N}^{nk}=\sum_{n_{0}=0,1}\omega_{N}^{n_{0}k}\sum_{n_{1}=0,1}\omega_{N/2}^{n_{1}k}\cdots\sum_{n_{p}=0,1}\omega_{2}^{n_{p}k}$$

Split the sum into its even and odd parts. Then we have two $N/2$-point FTs:

  • if we do low bit first,
    $$y_k=\sum_{n=0}^{N/2-1}e^{i2\pi nk/(N/2)}a_{2n}+\omega_N^k\sum_{n=0}^{N/2-1}e^{i2\pi nk/(N/2)}a_{2n+1}$$
    lower k:
    $$y_k=\sum_{n=0}^{N/2-1}e^{i2\pi nk/(N/2)}[a_{2n}+\omega_N^ka_{2n+1}]$$
    higher k:
    $$y_{k+N/2}=\sum_{n=0}^{N/2-1}e^{i2\pi nk/(N/2)}[a_{2n}-\omega_N^ka_{2n+1}]$$

  • if we do high bit first,
    $$y_k=\sum_{n=0}^{N/2-1}e^{i2\pi nk/N}a_n+e^{i\pi k}\sum_{n=0}^{N/2-1}e^{i2\pi nk/N}a_{n+N/2}$$

even k:
$$y_{2\tilde{k}}=\sum_{n=0}^{N/2-1}e^{i2\pi n\tilde{k}/(N/2)}[a_n+a_{n+N/2}]$$
odd k:
$$y_{2\tilde{k}+1}=\sum_{n=0}^{N/2-1}e^{i2\pi n\tilde{k}/(N/2)}\omega_N^n[a_n-a_{n+N/2}]$$

This data-flow pattern is called the butterfly network.

In summary,
$$\begin{aligned}
y_{k}&=\mathcal{FT}_{N/2}[a_{2n}+\omega_{N}^{k}a_{2n+1}] \\
y_{k+N/2}&=\mathcal{FT}_{N/2}[a_{2n}-\omega_{N}^{k}a_{2n+1}] \\
y_{2k}&=\mathcal{FT}_{N/2}[a_{n}+a_{n+N/2}] \\
y_{2k+1}&=\mathcal{FT}_{N/2}[\omega_{N}^{n}(a_{n}-a_{n+N/2})]
\end{aligned}$$
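A minimal recursive C sketch of the low-bit-first (even/odd) split above, using the notes' $e^{+i2\pi nk/N}$ sign convention; $N$ must be a power of two, and the function name `fft` is mine:

```c
#include <complex.h>

/* y_k = sum_n e^{+i 2 pi nk/N} a_n via the even/odd recursion */
void fft(int N, const double complex *a, double complex *y) {
    if (N == 1) { y[0] = a[0]; return; }
    double complex even[N / 2], odd[N / 2], ye[N / 2], yo[N / 2];
    for (int n = 0; n < N / 2; n++) {
        even[n] = a[2 * n];                /* split by the low bit of n */
        odd[n]  = a[2 * n + 1];
    }
    fft(N / 2, even, ye);                  /* two FTs of length N/2 */
    fft(N / 2, odd, yo);
    for (int k = 0; k < N / 2; k++) {
        double complex w = cexp(2.0 * 3.14159265358979324 * I * k / N);  /* omega_N^k */
        y[k]         = ye[k] + w * yo[k];  /* lower k  */
        y[k + N / 2] = ye[k] - w * yo[k];  /* higher k */
    }
}
```

Each level does $O(N)$ butterfly work over $\log_2 N$ levels, giving the familiar $O(N\log N)$ cost.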

Multigrid Recursion to speed up Matrix Solvers

The idea: relaxation sweeps like Jacobi or Gauss-Seidel damp short-wavelength errors quickly but long-wavelength ones slowly, so restrict the residual to a coarser grid, solve the smaller problem there (recursively), and interpolate the correction back to the fine grid.