The data set he mentions is the Johns Hopkins Turbulence Databases: many hundreds of terabytes of direct numerical simulations spanning different governing equations and boundary conditions.
Wednesday, March 27, 2019
Engineering Sketch Pad
I hadn't heard of Engineering Sketch Pad (source code available as part of OpenMDAO) before, but this is yet another NASA-sponsored open source tool that could be useful to you for aircraft conceptual design. I read about it in a post on Another Fine Mesh about some interesting research the folks at Pointwise are doing. It reminds me of, but is different from, Open Vehicle Sketch Pad.
There's a seminar on the software given by one of the developers up on a NASA site: The Engineering Sketch Pad (ESP): Supporting Design Through Analysis. (yea, DARPA!)
It has some neat features that make it useful for supporting high-fidelity analysis: it creates watertight geometry, it can carry attributes with the geometry that can guide mesh resolution, it does "conservative" data transfer for discipline coupling (matched to a solver's numerical scheme), and most of its parts are differentiable, which is useful for optimization.
I added this to my list of Open Source Aeronautical Engineering Tools.
Saturday, December 17, 2016
Hybrid Parallelism Approaches for CFD
Strategies
Recent progress and challenges in exploiting graphics processors in computational fluid dynamics provides some general strategies, based on its review of the literature, for using multiple levels of parallelism across GPUs, CPU cores, and cluster nodes:
- Global memory should be arranged to coalesce read/write requests, which can improve performance by an order of magnitude (theoretically, up to 32 times: the number of threads in a warp)
- Shared memory should be used for global reduction operations (e.g., summing up residual values, finding maximum values) such that only one value per block needs to be returned
- Use asynchronous memory transfer, as shown by Phillips et al. and DeLeon et al. when parallelizing solvers across multiple GPUs, to limit the idle time of either the CPU or GPU.
- Minimize slow CPU-GPU communication during a simulation by performing all possible calculations on the GPU.
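These strategies are mostly about keeping data resident on the device. Here's a minimal CuPy sketch (my illustration, not from the review) of the last two points: the Jacobi sweep and the residual reduction both execute on the GPU, and only a single scalar crosses the CPU-GPU link each iteration.

import cupy as cp

n = 1024
u = cp.random.random((n, n))  # solution array lives in GPU global memory
for it in range(100):
    u_new = u.copy()
    # Jacobi sweep runs entirely on the device; contiguous slices of the
    # row-major array give coalesced global memory accesses
    u_new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                u[1:-1, :-2] + u[1:-1, 2:])
    # the max-abs reduction happens device-side (one value per block
    # internally); float() copies the single scalar back to the host
    res = float(cp.abs(u_new - u).max())
    u = u_new
    if res < 1e-8:
        break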
Saturday, December 20, 2014
Gaussian Processes for Machine Learning
- The Gaussian Processes for Machine Learning book.
- Software around the web, and code to accompany the book
- Data Sets
- Tutorials
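If you want to kick the tires without digging into the code that goes with the book, here's a minimal scikit-learn sketch (my example; scikit-learn is just one of the GP packages floating around the web) that fits a noisy sine with an RBF kernel:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, 20).reshape(-1, 1)            # training inputs
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(20)   # noisy observations

# alpha adds the observation-noise variance to the kernel diagonal
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.01)
gp.fit(X, y)
mean, std = gp.predict(np.linspace(0.0, 5.0, 100).reshape(-1, 1), return_std=True)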
Thursday, February 6, 2014
TPS Sizing with Complex Step Method
[Figure: TPS Sizing Optimization Using Complex Variable Differentiation Sensitivity]
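The complex-step trick at the heart of it is simple enough to sketch in a few lines of Python (my example, not from the paper): because no difference of nearly-equal values is formed, the step can be made absurdly small and the derivative comes out accurate to machine precision, which is what makes the resulting sensitivities attractive for gradient-based TPS sizing.

import numpy as np

def complex_step_derivative(f, x, h=1e-30):
    # f must be analytic and implemented with complex-safe operations;
    # h can be tiny because there is no subtractive cancellation
    return np.imag(f(complex(x, h))) / h

# the standard test function from the complex-step literature:
f = lambda x: np.exp(x) / np.sqrt(np.sin(x)**3 + np.cos(x)**3)
print(complex_step_derivative(f, 1.5))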
Tuesday, January 14, 2014
CFD Vision 2030: Discretizations, Solvers, and Numerics
- Incomplete or inconsistent convergence behavior: "There are many possible reasons for failure, ranging from poor grid quality to the inability of a single algorithm to handle singularities such as strong shocks, under-resolved features, or stiff chemically reacting terms. What is required is an automated capability that delivers hands-off solid convergence under all reasonable anticipated flow conditions with a high tolerance to mesh irregularities and small scale unsteadiness."
- Algorithm efficiency and suitability for emerging HPC: "In order to improve simulation capability and to effectively leverage new HPC hardware, foundational mathematical research will be required in highly scalable linear and non-linear solvers not only for commonly used discretizations but also for alternative discretizations, such as higher-order techniques. Beyond potential advantages in improved accuracy per degree of freedom, higher-order methods may more effectively utilize new HPC hardware through increased levels of computation per degree of freedom."
Wednesday, January 8, 2014
Algorithmic Improvements: just as important as Moore's Law
I see many emerging technologies that promise further great progress in computing. Here are some of them. I wish some industry people here could post updates on their progress toward market. They may not literally prolong Moore's Law in terms of transistor counts, but they promise great performance gains, which is what really matters.
3D chips. As materials science and manufacturing precision advance, we will soon have multi-layered (starting with the few layers Samsung already has, but scaling up to thousands) or even fully 3D chips with efficient heat dissipation. This would put the components closer together and streamline the close-range interconnects. It also increases "computation per rack unit volume", simplifying some space-related aspects of scaling.
Memristors. HP is ready to produce the first memristor chips but is delaying them for business reasons (how sad is that!). Others are also preparing products. Memristor technology enables a new approach to computing, combining memory and computation in one place. Memristors are also quite fast (competitive with current RAM) and energy-efficient, which means easier cooling and a possible 3D layout.
Photonics. Optical buses are finding their way into computers, and network hardware manufacturers are looking for ways to perform some basic switching directly with light. Some day these two trends may converge to produce an optical computer chip free from the limitations of electrical resistance/heat and EM interference, one that could thus operate at a higher clock speed. It would be more energy efficient, too.
Spintronics. Probably further in the future, but potentially very high-density and low-power technology actively developed by IBM, Hynix and a bunch of others. This one would push our computation density and power efficiency limits to another level, as it allows performing some computation using magnetic fields, without electrons actually moving in electrical current (excuse me for my layman understanding).
Quantum computing. This could qualitatively speed up whole classes of tasks, potentially bringing AI and simulation applications to new levels of performance. The only commercial offering so far is D-Wave, and it's not a universal gate-model QC, but with so many labs working on the problem, results are bound to come soon.
I think Moore's Law is a steamroller. But, as with the genomic sequencing technology highlighted in that post on Nuit Blanche, there are improvements arriving just as fast as, or faster than, Moore's Law. The improvements from better algorithms can yield exponential speed-ups too. Here's a graph (from this report) depicting the orders-of-magnitude improvement in linear solver performance:
Couple these software improvements with continually improving hardware and things get pretty exciting. I'm happy to live in these interesting times!
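To make the algorithm side concrete, here's a toy comparison (my sketch, not from the report) on a 1D Poisson problem: Jacobi iteration needs on the order of n^2 sweeps while conjugate gradients needs on the order of n, and the gap widens as the grid is refined, which is the shape of the improvements in the graph.

import numpy as np
from scipy.sparse import diags

n = 100  # interior points of a 1D Poisson problem
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

# Jacobi: the error contracts by roughly 1 - O(1/n^2) per sweep
x = np.zeros(n)
D_inv = 1.0 / A.diagonal()
jacobi_its = 0
while True:
    r = b - A @ x
    if np.linalg.norm(r) <= 1e-8 * np.linalg.norm(b):
        break
    x = x + D_inv * r
    jacobi_its += 1

# conjugate gradients: converges in O(n) iterations on the same problem
x, r = np.zeros(n), b.copy()
p, rs = r.copy(), r @ r
cg_its = 0
while np.sqrt(rs) > 1e-8 * np.linalg.norm(b):
    Ap = A @ p
    alpha = rs / (p @ Ap)
    x, r = x + alpha * p, r - alpha * Ap
    rs_new = r @ r
    p = r + (rs_new / rs) * p
    rs = rs_new
    cg_its += 1

print(jacobi_its, "Jacobi iterations vs", cg_its, "CG iterations")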
Monday, July 23, 2012
Convergence for Falkner-Skan Solutions
There are some things about the paper that are not novel, and others that seem to be nonsense. It is well known that there can be multiple solutions at given parameter values (non-uniqueness) for this equation; see White. There is the odd claim that "the flow starts to create shock waves in the medium [above the critical wedge angle], which is a representation of chaotic behavior in the flow field." Weak solutions (solutions with discontinuities/shocks) and chaotic dynamics are two different things.

They use the fact that the method they chose does not converge when two solutions are possible as evidence of chaotic dynamics. Perhaps the iterates really do exhibit chaos, but this is purely an artifact of the method (i.e. there is no physical time in this problem, only the pseudo-time of the iterative scheme). By using a different approach you will get different "dynamics", and with a proper choice of method you can get convergence (spectral, even!) to any of the multiple solutions depending on the initial condition you give your iterative scheme.

They introduce a parameter, \(\eta_{\infty}\), for the finite value of the independent variable at "infinity" (i.e. the domain is truncated). There is nothing wrong with this (it's actually a commonly used approach for this problem), but it is not a good idea to solve for this parameter along with the shear at the wall in your Newton iteration. A more careful approach of mapping the boundary point "to infinity" as the grid resolution is increased (following one of Boyd's suggested mappings) removes the need to solve for this parameter, and gives spectral convergence for this problem even in the presence of non-uniqueness and the not-uncommon vexation of a boundary condition defined at infinity (all of external aerodynamics has this helpful feature).
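For what it's worth, convergence in the presence of the non-uniqueness is easy to demonstrate. Here's a sketch using SciPy's collocation BVP solver on a truncated domain (the quickest demonstration, not the spectral-plus-mapping approach described above): which branch you land on is controlled entirely by the initial guess you hand the solver.

import numpy as np
from scipy.integrate import solve_bvp

# Falkner-Skan: f''' + f*f'' + beta*(1 - f'^2) = 0,
# with f(0) = f'(0) = 0 and f'(eta -> infinity) = 1
beta = -0.05     # wedge parameter in the range where two solutions exist
eta_inf = 12.0   # truncated "infinity" (the paper's eta_infinity, fixed here)

def rhs(eta, y):  # y = [f, f', f'']
    return np.vstack((y[1], y[2], -y[0] * y[2] - beta * (1.0 - y[1]**2)))

def bc(y0, yinf):
    return np.array([y0[0], y0[1], yinf[1] - 1.0])

eta = np.linspace(0.0, eta_inf, 201)
# a monotone guess converges to the attached-flow branch; a guess with
# reversed flow near the wall picks out the other branch instead
guess = np.vstack((eta - 1.0 + np.exp(-eta), 1.0 - np.exp(-eta), np.exp(-eta)))
sol = solve_bvp(rhs, bc, eta, guess, tol=1e-10)
print("converged:", sol.status == 0, " wall shear f''(0) =", sol.sol(0.0)[2])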
Friday, August 5, 2011
Computational Explosive Astrophysics
Wednesday, October 20, 2010
NALPAL: Not A Livermore Physics Applications Language
Background and Motivation
We seem not to be able to use the machine, which we all believe is a very powerful tool for manipulating and transforming information, to do our own tasks in this very field. We have compilers, assemblers, monitors, etc. for others, and yet when I examine what the typical software person does, I am often appalled at how little he uses the machine in his own work.
Build productivity enhancing tools of broad applicability for the expert user so that efficient, special purpose PDE codes can be built reliably and quickly, rather than attempt to second guess the expert and build general purpose PDE codes (black box systems) of doubtful efficiency and reliability.
- Take as input a PDE description, along with boundary and initial condition definitions
- Discretize the PDE
- Analyze the result (e.g. for stability)
- Calculate the Jacobian (needed for Newton methods in implicit time-integration or nonlinear boundary value problems (BVPs))
- Generate code
- Manipulate the set of partial differential equations to cast them into a form that is amenable to numerical solution. For vector PDEs, this might include vector differential calculus operations and reexpression in scalar (component) form, and the application of a linearization approximation for non-linear PDEs.
- Discretize the time and space domain, and transform the partial differential operators in the PDEs into finite difference operators. This transforms the partial differential equations into a set of algebraic equations. A multitude of possible transformations for the differential operators are possible and the boundary conditions for the PDEs also must be appropriately handled. The resulting difference equations must be analyzed to see if they form an accurate and numerically stable approximation of the original equation set. For real world problems, this analysis is usually difficult and often intractable.
- After choosing a solution algorithm from numerical linear algebra, the finite difference equations and boundary conditions are coded in a programming language such as FORTRAN.
- The numerical algorithm is then integrated with code for file manipulations, operating system interactions, graphics output, etc. forming a complete computer program.
- The production program is then executed, and its output is analyzed, either in the form of numerical listings or computer-generated graphics.
With continuing advances in computer technology, the last step in this process has become easier. For a given class of problems, answers can be calculated more quickly and economically. More importantly, harder problems which require more computational resources can be solved. But the first four steps have not yet benefited from advances in computer performance; in fact, they are aggravated by it.
Taken together with the software described in other chapters, these tools allow the user to quickly generate a FORTRAN code, run numerical experiments, and discard the code without remorse if the numerical results are unsatisfactory.
PDE Code Gen Recipes
curvi : [xi, eta, zeta]; /* the curvilinear coordinates */
indep : [x, y, z]; /* the independent variables */
depends(curvi, indep);
depends(dep, curvi);
nn : length(indep);
eqn : sum(diff(sigma * diff(f, indep[i]), indep[i]), i, 1, nn);


/* Jacobian of the grid transformation */
J : zeromatrix(3,3);
for j : 1 thru 3 do (
for i : 1 thru 3 do (
J[i,j] : 'diff(indep[j],curvi[i])
)
);
K : zeromatrix(3,3);
for j : 1 thru 3 do (
for i : 1 thru 3 do (
K[i,j] : diff(curvi[j], indep[i])
)
);
grid_trans_subs : matrixmap("=", K, invert(J));
/* making substitutions from a list is easier than from a matrix */
grid_trans_sublis : flatten(makelist(grid_trans_subs[i],i,1,3));
/* Evaluation took 0.0510 seconds (0.0553 elapsed) using 265.164 KB. */
trans_eqn_factor : factor(trans_eqn) $
/* Evaluation took 2.4486 seconds (2.5040 elapsed) using 48.777 MB. */
- Define dependencies between independent and dependent (and possibly computational coordinates)
- Associate a list of indices with the coordinates
- Define rules that transform differential terms into difference terms with the appropriate index shifts and constant multipliers corresponding to the coordinate which the derivative is with respect to and the selected finite difference expression
- Apply the rules to the PDE to give a finite difference equation (FDE)
- Use Maxima’s simplification and factoring capabilities to simplify the FDE
- Output the FDE in FORTRAN format and wrap with subroutine boilerplate using text processing macros
defrule(deriv_subst_1, 'diff(fmatch,xmatch,1), diff_1(fmatch,xmatch)),
defrule(deriv_subst_2, 'diff(fmatch,xmatch,2), diff_2(fmatch,xmatch)),
mat_expr : apply1(mat_expr, deriv_subst_1),
mat_expr : apply1(mat_expr, deriv_subst_2),
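For comparison, the same derivative-to-difference rewriting and code emission is only a few lines in SymPy (a sketch of the recipe above, not NALPAL itself): substitute difference quotients for the derivative terms of the heat equation, solve for the new time level, and print the Fortran.

import sympy as sp

x, t, h, k = sp.symbols('x t h k', positive=True)
u = sp.Function('u')
# the PDE description: u_t = u_xx
pde = sp.Eq(sp.Derivative(u(x, t), t), sp.Derivative(u(x, t), x, 2))

# rule-like substitutions: forward difference in time, central in space
disc = pde.replace(
    sp.Derivative(u(x, t), t), (u(x, t + k) - u(x, t)) / k
).replace(
    sp.Derivative(u(x, t), x, 2), (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h**2
)

# solve for the new time level and emit the FORTRAN update expression
update = sp.solve(disc, u(x, t + k))[0]
print(sp.fcode(sp.simplify(update), standard=95))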
- reduce_cnst()
- gcfac()
- optimize()
- fortran()
- change()
- change variables; converts an arbitrary second-order differential equation in nn variables to an arbitrary coordinate frame in the variables xi[i]
- notate
- atomic notation for derivatives
- notation(exp,vari)
- primitive atomic notation
- scheme()
- introduce differences of unknowns
- difference(u,f,exp)
- primitive differences; scheme and difference collect the coefficients of the differences and calculate the stencil of the solver and the coordinate transformations
- myFORTRAN()
- write the FORTRAN code
Conclusion
References
Saturday, January 16, 2010
Chaotic Method of Manufactured Solutions: Lorenz 63
I quickly brushed over the method of manufactured solutions (MMS) aspects in the previous post on Lorenz 63; you might wonder why. Well, it turns out that applying MMS to chaotic systems involves an additional consideration that is not normally a concern. Here is the 'standard' sort of checklist for choosing manufactured solutions (taken from Verification of Computer Codes in Computational Science and Engineering, emphasis mine):
- Manufactured solutions should be sufficiently smooth on the problem domain so that the theoretical order-of-accuracy can be matched by the observed order-of-accuracy obtained from the test. A manufactured solution that is not sufficiently smooth may decrease the observed order-of-accuracy (however, see # 7 in this list).
- The solution should be general enough that it exercises every term in the governing equation. For example, one should not choose temperature T in the unsteady heat equation to be independent of time. If the governing equation contains spatial cross-derivatives, make sure the manufactured solution has a non-zero cross derivative.
- The solution should have a sufficient number of nontrivial derivatives. For example, if the code that solves the heat conduction equation is second order in space, picking T in the heat equation to be a linear function of time and space will not provide a sufficient test because the discretization error would be zero (to within round-off) even on coarse grids.
- Solution derivatives should be bounded by a small constant. This ensures that the solution is not a strongly varying function of space or time or both. If this guideline is not met, then one may not be able to demonstrate the required asymptotic order-of-accuracy using practical grid sizes. Usually, the free constants that are part of the manufactured solution can be selected to meet this guideline.
- The manufactured solution should not prevent the code from running successfully to completion during testing. Robustness issues are not a part of code order verification. For example, if the code (explicitly or implicitly) assumes the solution is positive, make sure the manufactured solution is positive; or if the heat conduction code expects time units of seconds, do not give it a solution whose units are nanoseconds.
- Manufactured solutions should be composed of simple analytical functions like polynomials, trigonometric, or exponential functions so that the exact solution can be conveniently and accurately computed. An exact solution composed of infinite series or integrals of functions with singularities is not convenient.
- The solution should be constructed in such a manner that the differential operators in the governing equations make sense. For example, in the heat conduction equation, the flux is required to be differentiable. Therefore, if one desires to test the code for the case of discontinuous thermal conductivity, then the manufactured solution for temperature must be constructed in such a way that the flux is differentiable. The resultant manufactured solution for temperature is nondifferentiable.
- Avoid manufactured solutions that grow exponentially in time to avoid confusion with numerical instability.
As emphasized in # 4, we often have to use the free constants in our manufactured solution to bound the derivatives in the domain. For chaotic systems we'll need to use not only the free constants, but the initial conditions as well, to achieve another end: ensuring that the chosen solution is in the convergent region at all times. This is a sufficient but not necessary condition for the source term to be in the basin of entrainment. When the source term is in the basin of entrainment we can expect to be able to verify the ordered convergence of our implementation as we would in the non-chaotic case. Otherwise, we can expect the truncation error to eventually lead to an arbitrarily large departure from the manufactured solution. This terminology comes from the nonlinear dynamics and control literature [1] [3]. A worked example for the Lorenz '63 system showing the boundary of the convergent region is in Chapter 6 of [2].
The convergent region is the region of phase space in which the eigenvalues of the system Jacobian all have negative real parts. As in any other application of MMS, we start by choosing some convenient solution consistent with the desiderata enumerated above. Trigonometric functions are convenient in this case:
[Equations (1)-(3): the trigonometric manufactured solutions for x, y, and z, with amplitudes bx, by, and bz]
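Constructing the source terms once a candidate solution is chosen is mechanical: plug the manufactured solution into the governing equations and keep the residual. Here's a SymPy sketch of that step (the trigonometric form below is an assumed stand-in for illustration, not necessarily the exact solution used here):

import sympy as sp

t, Pr, Ra, b = sp.symbols('t Pr Ra b')
bx, by, bz = sp.symbols('b_x b_y b_z')
# assumed trigonometric form, for illustration only
xm, ym, zm = bx * sp.cos(t), by * sp.sin(t), bz * sp.cos(t)

# MMS source terms: the residual of the Lorenz 63 equations evaluated
# at the manufactured solution
sx = sp.diff(xm, t) - Pr * (ym - xm)
sy = sp.diff(ym, t) - (Ra * xm - ym - xm * zm)
sz = sp.diff(zm, t) - (xm * ym - b * zm)
print(sp.simplify(sx), sp.simplify(sy), sp.simplify(sz), sep='\n')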
Now we need the Jacobian of the system (and then its eigenvalues); since our governing equations are defined in Maxima, that is easy enough:
/* compute the eigenvalues so we can select a useful IC/amplitude for our manufactured solutions */
sys_eqns : [eqn_1, eqn_2, eqn_3] $
J_cr : zeromatrix(3,3) $
for j : 1 thru 3 do
( for i : 1 thru 3 do
(
J_cr[i,j] : diff(rhs(sys_eqns[i]), unknowns[j])
)
) $
[J_cr_eigvals, J_cr_eigmult] : eigenvalues(J_cr) $
J_cr_eigvals : fullratsimp(J_cr_eigvals) $
As before, these expressions are output in Fortran 90 format for compilation. Here are the eigenvalues in the complex plane for bx = 1.0, by = -1.0, bz = 10.0.
We can immediately see the problem from Figure 1: our manufactured solution is in a region of phase space where one of the Jacobian's eigenvalues has a positive real part. We can adjust the initial conditions of our solution to move all of the eigenvalues into the left half-plane.
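If you don't have the Maxima-generated routines handy, the same check is a few lines of numpy (my sketch; the parameter values are the standard ones used in the companion Lorenz 63 post):

import numpy as np

Pr, Ra, b = 10.0, 28.0, 8.0 / 3.0

def lorenz_jacobian(x, y, z):
    # Jacobian of the Lorenz 63 right-hand side
    return np.array([[-Pr,    Pr,   0.0],
                     [Ra - z, -1.0, -x],
                     [y,      x,    -b]])

# evaluate at the candidate solution amplitudes (bx, by, bz)
eigs = np.linalg.eigvals(lorenz_jacobian(1.0, -1.0, 10.0))
print(eigs, "in convergent region:", bool(np.all(eigs.real < 0.0)))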
Being limited to only certain portions of the phase space for conducting verification is fine, because a single well-designed manufactured solution verifies the correctness of the entire implementation (as long as all of the terms are activated). This is the same reason that it is often emphasized in the V&V literature that there is no need for the manufactured solution to be physically meaningful.
References
[1] Jackson, E.A., "Controls of dynamic flows with attractors," Physical Review A, Vol. 44, No. 8, 1991.
[2] Wu, W., "Analytical and Numerical Methods Applied to Nonlinear Vessel Dynamics and Code Verification for Chaotic Systems," Dissertation, Virginia Polytechnic Institute and State University, December 2009.
[3] Wu, W., McCue, L.S., and Roy, C.J., "The method of manufactured solutions applied to chaotic systems," Nonlinear Dynamics, 2009, under review.
Friday, January 15, 2010
Lorenz 63 Ensemble
\( \frac{dx}{dt} = Pr \, (y - x) \)   (1)
\( \frac{dy}{dt} = Ra \, x - y - x z \)   (2)
\( \frac{dz}{dt} = x y - b z \)   (3)
depends(unknowns, t) $
/* define the governing equations: */
expr_1 : -Pr * x + Pr * y - diff(x,t) $
eqn_1 : solve(expr_1, diff(x,t))[1] $
expr_2 : -y + Ra * x - x * z - diff(y,t) $
eqn_2 : solve(expr_2, diff(y,t))[1] $
expr_3 : -b * z + x * y - diff(z,t) $
eqn_3 : solve(expr_3, diff(z,t))[1] $
/* Butcher tableau (A, b, c) for the two-stage implicit Runge-Kutta scheme: */
A : matrix([1/4, -1/4],
[1/4, 5/12]) $
bT : matrix([1/4, 3/4]) $
c : matrix([0], [2/3]) $
/* identity matrix for doing Kronecker products, dimension depends on
the size of the system you are trying to solve */
sys_I : diagmatrix(3, 1) $
K : transpose([k[1], k[2], k[3], k[4], k[5], k[6]]) $
Xn : transpose([x[n],y[n],z[n],x[n],y[n],z[n]]) $
tn : transpose([t[n], t[n], t[n], t[n], t[n], t[n]]) $
F : transpose([f[1], f[2], f[3], f[4], f[5], f[6]]) $
A_rk : kronecker_product(A, sys_I) $
bT_rk : kronecker_product(bT, sys_I) $
hc_rk : kronecker_product(t[n] + h*c, sys_I) $
/* argument to our system right-hand-side operator: */
nonlin_arg : Xn + h * A_rk . K $
/* set up the two-stage system: */
rk_sys : zeromatrix(6,1) $
q_sys : zeromatrix(6,1) $
rk_sys[1,1] : k[1] - subst(x=nonlin_arg[1,1], subst(y=nonlin_arg[2,1],
subst(z=nonlin_arg[3,1], rhs(eqn_1)))) $
rk_sys[2,1] : k[2] - subst(x=nonlin_arg[1,1], subst(y=nonlin_arg[2,1],
subst(z=nonlin_arg[3,1], rhs(eqn_2)))) $
rk_sys[3,1] : k[3] - subst(x=nonlin_arg[1,1], subst(y=nonlin_arg[2,1],
subst(z=nonlin_arg[3,1], rhs(eqn_3)))) $
rk_sys[4,1] : k[4] - subst(x=nonlin_arg[4,1], subst(y=nonlin_arg[5,1],
subst(z=nonlin_arg[6,1], rhs(eqn_1)))) $
rk_sys[5,1] : k[5] - subst(x=nonlin_arg[4,1], subst(y=nonlin_arg[5,1],
subst(z=nonlin_arg[6,1], rhs(eqn_2)))) $
rk_sys[6,1] : k[6] - subst(x=nonlin_arg[4,1], subst(y=nonlin_arg[5,1],
subst(z=nonlin_arg[6,1], rhs(eqn_3)))) $
J : zeromatrix(6,6) $
for j : 1 thru 6 do
( for i : 1 thru 6 do
(
J[i,j] : diff(rk_sys[i,1], k[j])
)
) $
time_update : factor(h * bT_rk . K) $
file_output_append : false $
fname : "rhs.f90" $
with_stdout(fname, print("! generated by lorentz63_mms.mac")) $
file_output_append : true $
with_stdout(fname, f90(f[1] = rk_sys[1,1])) $
/* ...much more boring file IO follows... */
import scipy as sp
from itertools import combinations
import L63  # module wrapping the generated Fortran (assumed built with f2py)

T = 40.0 # integration period
Pr = 10.0
Ra = 28.0
b = 8.0 / 3.0
# knobs for Newton's method:
tol = 1e-12
maxits = 20
nic = 7 # number of different IC's in our ensemble
nt = 40000 # number of time steps (assumed value; set elsewhere in the original)
h = T / float(nt)
x = sp.zeros((nt,nic), dtype=float, order='F')
y = sp.zeros((nt,nic), dtype=float, order='F')
z = sp.zeros((nt,nic), dtype=float, order='F')
t = sp.linspace(0.0, T, nt)
# some perturbations:
eps = 1e-16 * sp.array([0, 1, -1, 2, -2, 3, -3])
for i in xrange(nic):
x[0][i] = 1.0 + eps[i]
y[0][i] = -1.0 + eps[i]
z[0][i] = 10.0 + eps[i]
from scipy.misc import comb # (in newer SciPy this lives in scipy.special)
# calculate our ensemble of trajectories:
for i in xrange(nic):
L63.time_loop(x[:,i],y[:,i],z[:,i],t,Ra,Pr,b,h,tol,maxits)
# calculate all the pair-wise differences between trajectories (nic
# choose 2 of em):
npairs = comb(nic, 2, exact=1)
xdiffs = sp.zeros((nt,npairs), dtype=float, order='F')
ydiffs = sp.zeros((nt,npairs), dtype=float, order='F')
zdiffs = sp.zeros((nt,npairs), dtype=float, order='F')
for i, pair in enumerate(combinations(xrange(nic), 2)):
xdiffs[:,i] = x[:,pair[0]] - x[:,pair[1]]
ydiffs[:,i] = y[:,pair[0]] - y[:,pair[1]]
zdiffs[:,i] = z[:,pair[0]] - z[:,pair[1]]
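For readers who'd rather skip the Fortran build, here's a self-contained numpy sketch (my translation of the scheme, not the generated code) of one step of the two-stage implicit method: solve the six-equation stage system F(K) = 0 with Newton's method, then apply the \(h \, b^T K\) update.

import numpy as np

Pr, Ra, b = 10.0, 28.0, 8.0 / 3.0
A = np.array([[0.25, -0.25], [0.25, 5.0 / 12.0]])  # same tableau as above
bT = np.array([0.25, 0.75])

def f(u):
    x, y, z = u
    return np.array([Pr * (y - x), Ra * x - y - x * z, x * y - b * z])

def jac(u):
    x, y, z = u
    return np.array([[-Pr, Pr, 0.0], [Ra - z, -1.0, -x], [y, x, -b]])

def step(u, h, tol=1e-12, maxits=20):
    K = np.zeros((2, 3))  # stage derivatives k1, k2
    for _ in range(maxits):
        U = u + h * (A @ K)  # stage values
        F = K - np.array([f(U[0]), f(U[1])])
        if np.linalg.norm(F) < tol:
            break
        # 6x6 Newton matrix: block (i,j) is delta_ij*I - h*A[i,j]*J(U_i)
        J = np.zeros((6, 6))
        for i in range(2):
            Ji = jac(U[i])
            for j in range(2):
                J[3*i:3*i+3, 3*j:3*j+3] = (np.eye(3) if i == j else 0.0) - h * A[i, j] * Ji
        K = (K.ravel() - np.linalg.solve(J, F.ravel())).reshape(2, 3)
    return u + h * (bT @ K)

u = np.array([1.0, -1.0, 10.0])  # the ensemble's unperturbed IC
h = 4.0 / 4000
for n in range(4000):
    u = step(u, h)
print(u)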