Solving the Wave Equation in Different Languages
The 2D wave equation with a Dirichlet boundary condition is solved in C++ (with BLAS and LAPACK), Python (with numpy), MATLAB and Julia. Central finite differences are used for the discretisation in both time and space.
The aim is to compare the scientific-computing syntax of these four languages and to make a simple comparison of their performance.
Problem setup
Wave equation
The wave equation describes wave fields such as water waves, sound waves and electromagnetic waves. The equation is: \[ \mathbf{\ddot {U}}=C^2\mathbf{\nabla} ^{2}\mathbf{U} \] In 2D, it reads: \[ \frac{\partial^2u}{\partial t^2} = C^2\left(\frac{\partial^2u}{\partial x^2}+\frac{\partial^2u}{\partial y^2}\right), \quad u = u(x,y,t) \] From the formula, we can see it is a second-order, linear, hyperbolic PDE. Although it can be solved analytically (by separation of variables), we ignore that fact and solve it numerically.
Initial condition
\[ u^0_{(x, y)}=\left[r_{(x, y)}^{4}-1\right]\left[\mathrm{e}^{-4 \omega^{2} r_{(x, y)}^{2}}-\mathrm{e}^{-4 \omega^{2}}-\frac{3}{4}\left(\mathrm{e}^{-\omega^{2} r_{(x, y)}^{2}}-\mathrm{e}^{-\omega^{2}}\right)\right] . \]
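As a minimal C++ sketch, the initial condition can be evaluated point by point. Note the assumptions here: \(r_{(x,y)}\) is taken to be the distance of \((x,y)\) from the domain centre and \(\omega\) a shape parameter, since neither is defined above; adjust to the actual setup.

```cpp
#include <cmath>

// Sketch of the initial condition u^0(x, y).
// Assumptions: r is the distance of (x, y) from the domain centre (cx, cy),
// and omega is a shape parameter of the initial bump.
double u0(double x, double y, double cx, double cy, double omega) {
    const double r2 = (x - cx) * (x - cx) + (y - cy) * (y - cy);  // r^2
    const double w2 = omega * omega;                              // omega^2
    return (r2 * r2 - 1.0) *
           (std::exp(-4.0 * w2 * r2) - std::exp(-4.0 * w2)
            - 0.75 * (std::exp(-w2 * r2) - std::exp(-w2)));
}
```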
Boundary condition
A Dirichlet boundary condition is adopted, i.e. \(u\) is fixed to 0 everywhere on the boundary.
Finite difference methods
Time Derivative Discretisation
Central difference is adopted, with a uniform time step \(dt\): \[ \begin{aligned} \left[\frac{\partial^2u}{\partial t^2}\right]^k_{ij} =& \frac{v^{k}_{ij}-v^{k-1}_{ij}}{dt}, \\ \text{where} \quad v^k_{ij} = \left[\frac{\partial u}{\partial t}\right]^k_{ij} =& \frac{u^{k+1}_{ij}-u^{k}_{ij}}{dt} \end{aligned} \]
Space Derivatives Discretisation
Central difference is adopted on a uniformly spaced orthogonal Cartesian mesh (\(dx\) horizontally and \(dy\) vertically); the derivatives become: \[ \begin{aligned} \left[\frac{\partial^2u}{\partial x^2}\right]^k_{ij} = \frac{u^k_{i+1,j}-2u^k_{ij}+u^k_{i-1,j}}{dx^2} \\ \left[\frac{\partial^2u}{\partial y^2}\right]^k_{ij} = \frac{u^k_{i,j+1}-2u^k_{ij}+u^k_{i,j-1}}{dy^2} \end{aligned} \]
Expected result

Iteration steps
Element-wise loop:
This method is adopted in the Codes section. It is efficient for this problem, but it scales poorly and is rarely used in practice.
\[ u^{k+1}_{ij} = u^k_{ij}+v^k_{ij}dt \]
\[ v^{k+1}_{ij} = C^2dt\left(\frac{u^{k+1}_{i+1,j}-2u^{k+1}_{ij}+u^{k+1}_{i-1,j}}{dx^2} + \frac{u^{k+1}_{i,j+1}-2u^{k+1}_{ij}+u^{k+1}_{i,j-1}}{dy^2}\right) + v^{k}_{ij} \]
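A sketch of this element-wise update in C++ (the grid type and function name are hypothetical; the body follows the \(u^{k+1}\)- and \(v^{k+1}\)-updates above, with the Dirichlet boundary reset at the end):

```cpp
#include <vector>

using Grid = std::vector<std::vector<double>>;

// One time step of the element-wise scheme above.
// u, v : fields u^k and v^k on an m x n grid (boundary nodes included),
// C, dt, dx, dy : wave speed, time step and mesh spacings.
void step(Grid& u, Grid& v, double C, double dt, double dx, double dy) {
    const std::size_t m = u.size(), n = u[0].size();

    // u^{k+1}_{ij} = u^k_{ij} + v^k_{ij} dt
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < n; ++j)
            u[i][j] += v[i][j] * dt;

    // v^{k+1}_{ij} = C^2 dt * Laplacian(u^{k+1})_{ij} + v^k_{ij} (interior nodes)
    for (std::size_t i = 1; i + 1 < m; ++i)
        for (std::size_t j = 1; j + 1 < n; ++j)
            v[i][j] += C * C * dt *
                ((u[i + 1][j] - 2.0 * u[i][j] + u[i - 1][j]) / (dx * dx) +
                 (u[i][j + 1] - 2.0 * u[i][j] + u[i][j - 1]) / (dy * dy));

    // Dirichlet condition: keep u = 0 on the boundary
    for (std::size_t i = 0; i < m; ++i) { u[i][0] = 0.0; u[i][n - 1] = 0.0; }
    for (std::size_t j = 0; j < n; ++j) { u[0][j] = 0.0; u[m - 1][j] = 0.0; }
}
```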
Matrix operation form:
\[ \mathbf{U^{k+1}} = \mathbf{V^{k}} dt + \mathbf{U^{k}} \]
In most cases, the 2D matrix is flattened into a 1D column vector*, and the formula becomes: \[ \mathbf{V}^{k+1} = C^2dt\mathbf{AU}^{k+1}+\mathbf{V}^{k} \] where \[ \mathbf{A}=\left[\begin{array}{cccccccccc} e & a & & b & & & &&& \\ a & e & a & & b & & &&& \\ & \ddots &\ddots & \ddots& & \ddots & &&\\ b & & a & e & a & & b&&&\\ & \ddots& & \ddots & \ddots & \ddots & &\ddots&& \\ & & b & & a & e & a & &b& \\ & & & \ddots & & \ddots& \ddots & \ddots& &\\ & & & & b & & a& e & a& \\ & & & & & b & & a& e & \end{array}\right], \quad \mathbf{U} = \left[\begin{array}{c} u_{00} \\ u_{10} \\ \vdots \\ u_{01} \\ u_{11} \\ \vdots \\ \vdots \\ \vdots \\ u_{mn} \end{array}\right], \quad \mathbf{V} = \left[\begin{array}{c} v_{00} \\ v_{10} \\ \vdots \\ v_{01} \\ v_{11} \\ \vdots \\ \vdots \\ \vdots \\ v_{mn} \end{array}\right] \]
\[ \text{where}\quad a = \frac{1}{dx^2}, \qquad b = \frac{1}{dy^2}, \qquad e = -\frac{2}{dx^2}-\frac{2}{dy^2} \]
Finally, the boundary values are set back to 0 to enforce the Dirichlet condition.
*If the matrix is not flattened:
In the 2D case, the equivalent matrix-operation formula is: \[\mathbf{V^{k+1}} = \left(\frac{C^2dt}{dx^2}\mathbf{U^{k+1}}\begin{bmatrix}1 & 1 & & & & \\ & -2 & 1 & & & \\ & 1 & -2 & \ddots & & \\ & & 1 & \ddots & 1 & \\ & & & \ddots & -2 & \\ & & & & 1 & 1 \\\end{bmatrix}+\frac{C^2dt}{dy^2}\begin{bmatrix}1 & & & & & \\1 & -2 & 1 & & & \\ & 1 & -2 & 1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & 1 & -2 & 1 \\ & & & & & 1 \\\end{bmatrix} \mathbf{U^{k+1}}\right) + \mathbf{V^k}\] Unfortunately, this form is not provided by mature CFD codes. That may partly be because it requires two matrix multiply-adds per step, but the main reason is that when dealing with an unstructured mesh, i.e. nodes located freely in space rather than on a Cartesian grid, the 2D matrix form cannot be used.
*The matrix is flattened in column-major order, for example: \[\left[\begin{array}{lll}u_{00} & u_{01} & u_{02} \\u_{10} & u_{11} & u_{12} \\u_{20} & u_{21} & u_{22}\end{array}\right] \Rightarrow\left[\begin{array}{c}u_{00} \\u_{10} \\u_{20} \\u_{01} \\\vdots \\u_{22}\end{array}\right]\]
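As a sketch of the flattened form: assuming column-major flattening with linear index \(p = i + j\,m\) for an \(m\times n\) grid, \(\mathbf{A}\) can be filled directly from \(a\), \(b\) and \(e\). The function names are hypothetical, and \(\mathbf{A}\) is stored densely here only for clarity; in practice it would be banded or sparse.

```cpp
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<std::vector<double>>;

// Build the (m*n) x (m*n) matrix A for a column-major flattened m x n grid:
// e on the diagonal, a at offsets +/-1 (x-neighbours), b at offsets +/-m
// (y-neighbours). Boundary rows are not treated specially here, since the
// boundary values are reset to 0 after each step anyway.
Mat buildA(std::size_t m, std::size_t n, double dx, double dy) {
    const double a = 1.0 / (dx * dx);
    const double b = 1.0 / (dy * dy);
    const double e = -2.0 / (dx * dx) - 2.0 / (dy * dy);
    Mat A(m * n, Vec(m * n, 0.0));
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < m; ++i) {
            const std::size_t p = i + j * m;      // column-major index
            A[p][p] = e;
            if (i > 0)     A[p][p - 1] = a;
            if (i + 1 < m) A[p][p + 1] = a;
            if (j > 0)     A[p][p - m] = b;
            if (j + 1 < n) A[p][p + m] = b;
        }
    return A;
}

// One step in matrix form: U += dt*V, then V += C^2*dt*A*U
// (boundary reset omitted here).
void stepFlat(const Mat& A, Vec& U, Vec& V, double C, double dt) {
    const std::size_t N = U.size();
    for (std::size_t p = 0; p < N; ++p) U[p] += dt * V[p];
    for (std::size_t p = 0; p < N; ++p) {
        double AUp = 0.0;
        for (std::size_t q = 0; q < N; ++q) AUp += A[p][q] * U[q];
        V[p] += C * C * dt * AUp;
    }
}
```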
Codes
All the code is available here: solve-wave-equation. Central differences in time and space with element-wise operations are applied.
Python, Julia, Matlab
These three languages have very similar syntax. The side-by-side listings (Python | Julia | MATLAB) are available in the repository.
C++
Because of the flexibility (and difficulty) of C++, there are some extra concerns:
Matrix Datatype:
To take advantage of the efficiency of the STL (Standard Template Library) without worrying about manual memory management, vector<double> is used to store the data to be calculated.
There are two ways of storing a 2D matrix:
- 1D: vector<double>, the most common way; the matrix is flattened as discussed above.
- 2D: vector<vector<double> >, consistent with the Python/Julia/MATLAB codes.
How to represent a 2D matrix is not a C++-exclusive question; the languages above can also handle both the 1D and 2D representations.
The 2D vector is adopted here for the sake of simplicity, as sketched below.
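For illustration, the two storage options might look like this (the grid size and variable names are hypothetical):

```cpp
#include <vector>

int main() {
    const std::size_t m = 101, n = 101;   // hypothetical grid size

    // Option 1: flattened 1D storage, element (i, j) at index i + j * m
    std::vector<double> u1(m * n, 0.0);
    u1[3 + 5 * m] = 1.0;

    // Option 2: nested 2D storage, consistent with the Python/Julia/MATLAB codes
    std::vector<std::vector<double>> u2(m, std::vector<double>(n, 0.0));
    u2[3][5] = 1.0;
}
```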
Matrix operations
There are two options for matrix operations:
- Nested for-loops performing the element-wise operations. This looks serial and slow, but the compiler can optimise it into data-level-parallel code, so the speed is acceptable.
- The Fortran library BLAS (Basic Linear Algebra Subprograms), famous for its efficiency and high performance on matrix operations and widely used in scientific computing. It supports not only general dense matrices but also triangular, symmetric and banded matrices (similar to the sparse matrix type in MATLAB, but less flexible).
Here I choose option 1; option 2 is still work in progress.
Since BLAS is written in Fortran, there are two ways of calling it from C/C++:
- Call the Fortran BLAS library directly from C++, which needs extra attention to the matrix storage order* and to the compilation and linking*.
- Use the CBLAS wrapper, which may not be available on every HPC system but is more convenient: the storage order is specified per call, and no special compilation steps are needed.
*With a 2D matrix, there are two ways of storing it in memory: row-major or column-major. The C/C++ convention is row-major, whereas Fortran uses column-major. \[\begin{aligned}\text{C/C++ (row-major)}:&\quad \left(\begin{array}{lll}a_{11} & a_{12} & a_{13} \\a_{21} & a_{22} & a_{23}\end{array}\right) \quad \text { stored as }[\underbrace{a_{11} a_{12} a_{13}}_{\text {Row } 1} \underbrace{a_{21} a_{22} a_{23}}_{\text {Row } 2}] \\\text{Fortran (column-major)}:& \quad\left(\begin{array}{lll}a_{11} & a_{12} & a_{13} \\a_{21} & a_{22} & a_{23}\end{array}\right) \quad \text { stored as }[\underbrace{a_{11} a_{21}}_{\text {Col. } 1} \underbrace{a_{12} a_{22}}_{\text {Col. } 2} \underbrace{a_{13} a_{23}}_{\text {Col. } 3}]\end{aligned}\]
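In code, the difference is only the index mapping; a small sketch with hypothetical helper names:

```cpp
#include <cstddef>
#include <vector>

// Element (i, j) of an m x n matrix stored in a flat array:
//   row-major    (C/C++ convention):   index = i * n + j
//   column-major (Fortran/BLAS):       index = j * m + i
inline double& at_row_major(std::vector<double>& a,
                            std::size_t i, std::size_t j, std::size_t n) {
    return a[i * n + j];
}
inline double& at_col_major(std::vector<double>& a,
                            std::size_t i, std::size_t j, std::size_t m) {
    return a[j * m + i];
}
```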
*Fortran and C++ compilers generate the compiled form of a function differently, so in order to use Fortran code we must tell the C++ compiler to look for the compiled symbol under a slightly different name than it would otherwise expect. There are two main differences:
- Fortran appends an underscore (_) to the function name, whereas C/C++ compilers do not.
- Since C++ supports function overloading, it "mangles" function names with the parameter datatypes to ensure they are always unique.
In order to call Fortran 90 functions from our C++ code, we therefore need to add the underscore and turn off the name mangling.
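For example, the Fortran BLAS routine dgemv (matrix-vector product) can be declared and called roughly like this. The extern "C" plus trailing-underscore pattern is the standard approach, but the wrapper function name is hypothetical, and the exact link flags (e.g. -lblas) depend on the system.

```cpp
#include <vector>

// Declare the Fortran BLAS symbol: trailing underscore, C linkage (no C++
// name mangling), and every argument passed by pointer, Fortran-style.
extern "C" {
    void dgemv_(const char* trans, const int* m, const int* n,
                const double* alpha, const double* a, const int* lda,
                const double* x, const int* incx,
                const double* beta, double* y, const int* incy);
}

// y = A * x, with A stored column-major (Fortran convention), size m x n.
void matvec(const std::vector<double>& A, const std::vector<double>& x,
            std::vector<double>& y, int m, int n) {
    const char trans = 'N';
    const double alpha = 1.0, beta = 0.0;
    const int lda = m, inc = 1;
    dgemv_(&trans, &m, &n, &alpha, A.data(), &lda,
           x.data(), &inc, &beta, y.data(), &inc);
}
```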
Plotting methods:
Although there is no official plotting library in C++, many libraries are available, such as GNUPlot, Koolplot, MATLAB and matplotlib-cpp (see plotting package for c++ for details). I prefer GNUPlot, a fast, powerful and widely used plotting tool.
There are two plotting options:
- Joint: embed GNUPlot in the C++ program (with gnuplot-iostream) and update the plot in real time as the iteration proceeds.
- Two-stage: first write out the data for every plotting step, then render them all with GNUPlot afterwards.
To stay aligned with the codes in the previous section, I choose the joint method here.
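A minimal sketch of the joint method using gnuplot-iostream (assuming its Gnuplot class and send1d helper; the function name and plot style are only illustrative):

```cpp
#include <tuple>
#include <vector>
#include "gnuplot-iostream.h"

// Send the current field to an embedded gnuplot process; call this inside
// the time loop to update the plot as the iteration proceeds.
void plotField(Gnuplot& gp, const std::vector<std::vector<double>>& u,
               double dx, double dy) {
    std::vector<std::tuple<double, double, double>> pts;
    for (std::size_t i = 0; i < u.size(); ++i)
        for (std::size_t j = 0; j < u[i].size(); ++j)
            pts.emplace_back(i * dx, j * dy, u[i][j]);
    gp << "set view map\n";
    gp << "splot '-' using 1:2:3 with points palette notitle\n";
    gp.send1d(pts);   // stream the (x, y, u) triples to gnuplot
}
```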
Main code
Parts of the code (main, utils.hpp, gnuplot-support.cpp) are listed in the repository.
Result

Benchmark of the matrix computation
The codes are evaluated without I/O or rendering, timing only the pure matrix computation.
Data is in result.md.
Compared with the original code, Dt = 1e-5 is used to avoid divergence when increasing the matrix dimension.
Conclusions
- C++ is great!
- Although four languages are covered, I feel like I spent my entire time writing C++, and its better performance over all the interpreted codes makes that time feel well spent.
- Python is great
- Although Python is famously slow, with the numpy library (as far as I know, Anaconda's numpy uses OpenBLAS) its speed for matrix computation is not bad.
- Julia is NOT fast, at my level
- Actually, I wrote this blog aiming to learn Julia. I was hoping Julia could run as fast as C++, as claimed, but its performance turned out to be the worst of all the codes. Maybe it is because of the JIT compilation, or maybe I just need to spend more time on it.
- Besides, the plotting is inefficient. Again, this is not quite fair because I used Julia to call the Python plotting API; I haven't looked into Julia's native plotting functions yet.
- For now, I'd rather use Python or MATLAB for quick preliminary development and C++ for deployment.
- MATLAB has a really efficient plotting method, even though I use Octave as an alternative for now. I wonder whether I can get the same plotting performance in C++.


