Deep Neural Operators

By Prof. Seungchul Lee
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

Table of Contents

1. Deep Neural Operators¶

1.1. Limitations of PINN¶

PINNs have gained significant attention for their ability to solve forward and inverse problems governed by partial differential equations (PDEs). Their success has been demonstrated across numerous applications, including fluid dynamics, heat transfer, and material science. By embedding physics directly into the learning process, PINNs provide a data-efficient alternative to traditional numerical solvers.

However, despite their advantages, PINNs have clear limitations that hinder their broader applicability:

Single Instance Solver
- A trained PINN model is designed to solve a specific instance of a PDE problem.
- In other words, PINN models can accurately predict solutions for the given initial conditions (ICs), boundary conditions (BCs), and governing equations used during training.
Need for Re-training
- If the initial or boundary conditions change, the entire PINN model must be re-trained from scratch, making real-time or parametric predictions computationally expensive.

This raises a critical question:

How can we enable real-time predictions for varying conditions without retraining the model for each new scenario?

Transition to Deep Neural Operators

Deep Neural Operators (DNOs) address this challenge by learning solution mappings rather than solving individual instances of PDEs. Unlike PINNs, which approximate solutions for a fixed problem setup, DNOs learn function-to-function mappings between input conditions (ICs, BCs, PDE parameters) and their corresponding solutions. This allows them to generalize across a range of conditions, making them highly efficient for real-time predictions and multi-instance problem-solving.

By shifting from instance-specific learning (as in PINNs) to operator learning, Deep Neural Operators unlock new possibilities for solving parametric PDEs, enabling rapid and flexible predictions across varying physical conditions.

1.2. What is an Operator?¶

In mathematics and functional analysis, an operator is a mapping between functions, typically defined over infinite-dimensional spaces. Unlike traditional functions that map numbers to numbers, operators map functions to functions, making them fundamental in various areas of physics, engineering, and machine learning.

Examples of Operators

(1) Derivative Operator: Maps a function to its derivative.

$$f(x) \to \frac{d}{dx} f(x)$$

Example: If $ f(x) = x^2 $, then the operator gives $ f'(x) = 2x $.

(2) Integral Operator: Maps a function to its integral.

$$f(x) \to \int f(x) dx$$

Example: If $ f(x) = 2x $, then the operator yields $ F(x) = x^2 + C $.

(3) Projection Operator: Maps a function or vector onto a lower-dimensional subspace.

Example: Orthogonal projection in Hilbert spaces, used in signal processing and numerical methods.

(4) Fourier Transform and Laplace Transform: Map a function from the time/spatial domain to the frequency domain.

Fourier Transform:

$$ f(x) \quad \to \quad F(j\omega) = \int_{-\infty}^{\infty} f(x) e^{-j 2\pi \omega x} dx$$

Laplace Transform:

$$f(t) \quad \to \quad F(s) = \int_0^\infty f(t) e^{-st} dt$$

These transformations are widely used in signal processing, differential equations, and control theory.

Can Neural Networks Learn Operators?

Deep Neural Operators (DNOs) extend the concept of operators by learning mappings between function spaces. Their theoretical foundation is supported by the universal approximation theorem for operators, which ensures that neural networks can approximate function-to-function mappings with sufficient capacity.

Universal Approximation Theorem for Operators

The Universal Approximation Theorem for Operators states that a deep neural network can approximate any continuous operator that maps one function space to another.

Formally, let:

$ \mathcal{A}: \mathcal{X} \to \mathcal{Y} $ be an operator mapping functions in a Banach space $ \mathcal{X} $ to another Banach space $ \mathcal{Y} $.
A neural network can approximate $ \mathcal{A} $ arbitrarily well given a sufficient number of parameters and suitable architecture.

This means that neural networks are capable of learning entire function-to-function mappings, rather than just function values at specific points.

1.3. Key Papers in DNOs¶

For a deeper understanding of Deep Neural Operators, the following two papers are highly recommended:

(1) Deep Operator Network (DeepONet)

Published in Nature Machine Intelligence (2021)

(2) Fourier Neural Operator (FNO)

Paper: Fourier Neural Operator for Parametric PDEs (2021)

Although both methods offer valuable insights, we focus on DeepONet in this course for the following reasons:

Intuitive Architecture: DeepONet explicitly separates the input function (branch network) and the operator (trunk network), making it easier to understand and implement.
Connection to PINNs: DeepONet can be integrated with Physics-informed Neural Networks (PINNs), making it a natural transition from the previous topics.
Strong Theoretical Foundations: DeepONet is backed by universal approximation results for operators, making it a powerful tool for learning solution mappings in PDE problems.

By focusing on DeepONet, we can develop a solid foundation for neural operator learning while ensuring applicability to a wide variety of scientific and engineering problems.

2. Deep Operator Network (DeepONet)¶

Consider a flow field in a two-dimensional case. Suppose we are interested in identifying the pressure $P$ at $(y_1, y_2)$ given that the viscosity of the fluid is $u_1$. Our goal is to efficiently learn the mapping from viscosity and spatial coordinates to the corresponding pressure field.

Since it is known that $u_1, y_1, y_2 $ determine the pressure value, a fully connected neural network can serve as a potential candidate for modeling this relationship, where $u_1, y_1, y_2 $ are utilized as input features to predict the corresponding pressure field.

However, it has been observed that the approximation improves when the viscosity $ u_1 $ and the spatial coordinates $ (y_1, y_2) $ are processed separately through two distinct neural networks. This approach allows for more effective feature extraction and representation, leading to a better approximation of the pressure field.

The final output is obtained through an element-wise multiplication of the latent representations from two neural networks: the branch network and the trunk network. This operation ensures that the final output reflects the combined influence of the conditions and the specific evaluation point, facilitating the accurate approximation of complex operators.

Training

To train the two neural networks, it is necessary to obtain solution samples from either a numerical solver or real-world data. These samples are then used to train the model via supervised learning, ensuring that the networks accurately learn the mapping from input conditions to the corresponding pressure field.

While one might consider a physics-informed approach capable of solving such problems without data, for now, we will focus on this as a pure data-driven approach. Of course, we will later introduce and discuss the physics-informed DeepONet approach as well.

For now, our emphasis is on training neural networks using available datasets to establish accurate mappings from input conditions to pressure fields.

Evaluation

Once training is complete, given an unseen condition $ u_1 $, we can sweep across the spatial coordinates $ (y_1, y_2) $ to scan the entire domain and evaluate the corresponding pressure field. This allows us to assess the model's ability to generalize to new conditions and accurately predict pressure distributions across the domain.

In the above example, the viscosity $ u_1 $ represents a single condition. However, in many cases, multiple values are required to accurately characterize the conditions. For instance, the inlet velocity distribution can be represented as a vector $ (u_1, u_2, \ldots, u_5) $, capturing variations in flow properties over constant intervals. This multi-dimensional representation enables a more comprehensive and generalizable model for accurately capturing the underlying conditions.

Then, the corresponding neural network should be structured to accommodate multiple input conditions.

2.1. DeepONet Architecture¶

It is now time to formalize and generalize the DeepONet architecture.

(1) Branch Network

Encodes the input function values $ (u_1, u_2, \cdots, u_m) $
e.g., initial/boundary conditions or forcing terms of a PDE
Outputs a latent representation that captures condition information.

(2) Trunk Network

Takes spatial or parametric coordinates as inputs $ (y_1, y_2, \cdots, y_d) $.
Outputs the feature representation of the solution at given locations.

(3) Final Prediction

The DeepONet learns an operator by combining the outputs of the branch and trunk networks using a dot product:

$$G(u)(y) = \sum_{i=1}^{p} b_i(u) t_i(y)$$

$\quad$where:

$G(u)(y)$ is the predicted solution function,
$b_i(u)$ are the outputs from the branch network,
$t_i(y)$ are the outputs from the trunk network.

This formulation allows DeepONet to approximate operators efficiently.

2.2. From Function to Operator¶

A key idea behind DeepONet is the separation of networks to process the input function (conditions) and the evaluation point in the output domain. Unlike conventional neural networks that approximate functions, DeepONet is specifically designed to approximate operators - that is, mappings between functions. This raises an important question: How does a neural network approximate an operator? Does it require a fundamentally different architecture?

Function: Mapping from Vectors to Vectors

Standard feedforward neural networks, such as multilayer perceptrons (MLPs), are known to approximate functions that map vector inputs to vector outputs. This is the classical function approximation setting, where the network learns a pointwise mapping between input and output spaces.

Operator: Mapping from Functions to Functions

An operator maps a function to another function. One way to approximate an operator using neural networks is to represent input functions via sampled values or projections onto a set of basis functions. Given such a representation, an operator can be learned as a mapping from input coefficients (or sampled values) to outputs, which may themselves be reconstructed via output basis functions.

In this sense, neural networks can approximate operators provided that suitable basis functions are defined in both the input and output domains.

The key innovation of DeepONet is that it does not rely on hand-crafted basis functions. Instead, it learns both the latent representation of the input function and the output basis functions directly from data, allowing for more flexible and data-driven operator approximation.

2.3. DeepONet as Basis and Coefficients Representation¶

At first glance, the architecture of DeepONet may appear somewhat unexpected. However, this design is rooted in the universal approximation theorem for operators, which extends the classical universal approximation theorem from functions to operators. The DeepONet architecture consists of two key components: the branch network and the trunk network. The branch network is responsible for processing the input function, encoding its features into a latent representation. Meanwhile, the trunk network processes the evaluation points at which the operator is to be evaluated, generating another set of features.

To provide a clearer understanding, we will now present an insightful interpretation of this structure. By examining DeepONet through the lens of basis and coefficients representation, we can reveal its underlying mathematical foundation. In this interpretation, the trunk network effectively learns a set of basis functions defined over the domain, while the branch network computes the corresponding coefficients for these basis functions. This perspective not only clarifies the role of each network but also highlights how DeepONet leverages its architecture to efficiently approximate complex operators.

Mathematical Interpretation

Given an operator $ G $ that maps an input function $ u $ to an output function $ G(u) $, DeepONet approximates this mapping as:

$$G(u)(y) = \sum_{i=1}^{p} b_i(u) t_i(y)$$

where:

$ b_i(u) $ are the coefficients learned from the branch network, which encodes the input condition $ u $.
$ t_i(y) $ are the basis functions learned from the trunk network, which encodes the spatial or parametric coordinates $ y $.
The final output is obtained through a dot product between the branch and trunk network outputs.

Comparison with Classical Representations

The below figure visually represents DeepONet as a basis and coefficients representation, where the framework decomposes the solution into learned coefficients and learned basis functions.

2.4. Physics-informed DeepONet (PI-DeepONet)¶

The DeepONet is fundamentally a data-driven approach, which relies on solution samples obtained from either a numerical solver or real-world data to learn the operator.

In contrast, Physics-informed DeepONet (PI-DeepONet) enhances the original DeepONet framework by incorporating physical laws directly into the learning process. By embedding governing equations - such as conservation laws or partial differential equations (PDEs) - into the training pipeline, PI-DeepONet effectively leverages both data and physics. This integration leads to improved accuracy, efficiency, and generalization, enabling the model to handle scenarios with sparse or noisy data while maintaining physical consistency.

Ideally, Physics-informed DeepONet (PI-DeepONet) can be trained using only collocation points and governing equations, without requiring any data. However, in practice, it is often recommended to utilize a combination of both observed data and physical laws. This hybrid approach leverages the strengths of data-driven learning and physics-based constraints, leading to improved accuracy, robustness, and generalization, particularly in scenarios where data may be sparse or noisy.

So far, we have explored DeepONet and PI-DeepONet at a high level, covering their fundamental concepts. To gain a solid understanding and practical insights, we will now dive into various examples and applications, illustrating their capabilities and implementation in engineering problems.

3. Lab 1: Heat Transfer¶

Consider the simplest example of DeepONet, which consists of:

One neuron for the branch network (representing the condition).
One neuron for the trunk network (representing the evaluation location).

This minimal architecture captures the essence of DeepONet while maintaining a lightweight structure.

3.1. Problem Setting¶

Consider the steady-state heat conduction in a 1D rod with boundary conditions:

Left boundary: $T(0) = 0$
Right boundary: $T(1) = T_{BC}$

The governing equation for steady-state heat conduction without internal heat generation is:

$$\frac{d^2 T}{dy^2} = 0$$

Analytical Solution
Solving the above equation with the given boundary conditions:

$$T(y) = T_{BC} \cdot y$$

This linear temperature profile represents the heat conduction in a 1D rod.

This example can be used to train a simple DeepONet, where:

The branch network takes $u = T_{BC}$ as input.
The trunk network takes $y$ as input.
The output is $T(y)$, which the model learns from data.

3.2. Solve the Heat Transfer 1D Problem using DeepONet¶

Import Library

import numpy as np
import matplotlib.pyplot as plt
import random
import os
import warnings
warnings.filterwarnings("ignore")

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils import data
%matplotlib inline

CUDA

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print("Using device:", device)

Using device: cuda:0

Define Desired Ouput Locations

y_pts = np.linspace(0, 1, 25).reshape(-1, 1)

plt.figure(figsize = (6, 3))
plt.scatter(y_pts, np.zeros(25), s = 30)
plt.xlabel('')
plt.title('Collocation points')
plt.show()

Boundary Condition $u$ and its Ground Truth

We use 10 points with different $T_{\text{BC}}$ values for training, and one new data point for testing where $T_{\text{BC}} = 60$.

train_q = np.array([10, 20, 30, 40, 50, 70, 80, 90, 100, 110]).reshape(-1, 1)
test_q  = np.array([60]).reshape(-1, 1)

train_gt_list = []
for q in train_q:
    train_gt_list.append(q[0] * y_pts.reshape(-1))
train_gt = np.array(train_gt_list)

test_gt = 60 * y_pts

print('Loading Conditions for Train: {}'.format(train_q.shape))
print('Ground Truth for Train:       {}'.format(train_gt.shape))
print('Loading Conditions for Test:  {}'.format(test_q.shape))
print('Ground Truth for Test:        {}'.format(test_gt.shape))

Loading Conditions for Train: (10, 1)
Ground Truth for Train:       (10, 25)
Loading Conditions for Test:  (1, 1)
Ground Truth for Test:        (25, 1)

for i in range(10):
  plt.plot(y_pts, train_gt[i], '-o', color = 'b', markersize = 5, alpha = 0.3)

plt.plot(y_pts, test_gt, '-o', color = 'r', markersize = 5, label = 'test')

plt.grid('on')
plt.title('Temperature Distribution in the Rod (Ground Truth)')
plt.legend()
plt.xlabel('y')
plt.ylabel('Temperature (°C)')
plt.show()

As observed in the ground truth of the temperature distribution in the rod, the prediction for the test data can be easily obtained through the linear combination of temperature distributions from the training data. However, the primary objective of this example is to demonstrate the implementation of DeepONet, even though the output of the test data is quite straightforward.

C = train_q
Y = y_pts
GT = train_gt

print('Conditions:         {}'.format(C.shape))
print('Collocation points: {}'.format(Y.shape))
print('Ground Truths:      {}'.format(GT.shape))

Conditions:         (10, 1)
Collocation points: (25, 1)
Ground Truths:      (10, 25)

Data Batch Generator

During the training of DeepONet, data needs to be provided in batches. A Data Batch Generator efficiently feeds training data into the DeepONet model.

class DataBatchGenerator(data.Dataset):
    def __init__(self, c, y, gt, batch_size = 64):
        self.c = torch.tensor(c, dtype = torch.float32)
        self.y = torch.tensor(y, dtype = torch.float32)
        self.gt = torch.tensor(gt, dtype = torch.float32)
        self.N = self.c.shape[0] * self.y.shape[0]
        self.batch_size = batch_size

    def __len__(self):
        return (self.N // self.batch_size) + 1

    def __getitem__(self, index):
        c_idx = torch.randint(low = 0, high=self.c.shape[0], size = (self.batch_size,))
        y_idx = torch.randint(low = 0, high=self.y.shape[0], size = (self.batch_size,))
        c_sample = self.c[c_idx, :]
        y_sample = self.y[y_idx, :]
        gt_sample = self.gt[c_idx, y_idx].reshape(-1, 1)
        return c_sample, y_sample, gt_sample

train_dataset = DataBatchGenerator(C, Y, GT)

Define DeepONet

We first define BranchNet and TrunkNet, and then combine them to build the full DeepONet model.

BranchNet extracts features from the input function (e.g., discretized loading conditions).
TrunkNet extracts features from spatial locations where predictions are required.
The final DeepONet combines them via a dot product to generate the output.

class BranchNet(nn.Module):
    def __init__(self, input_dim = 1):
        super(BranchNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 16)
        )

    def forward(self, x):
        return self.linears(x.float())

class TrunkNet(nn.Module):
    def __init__(self, input_dim = 1):
        super(TrunkNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 16)
        )

    def forward(self, x):
        return self.linears(x.float())

class DeepONet(nn.Module):
    def __init__(self):
        super(DeepONet, self).__init__()
        self.branch_net = BranchNet(input_dim = 1)
        self.trunk_net = TrunkNet(input_dim = 1)

    def forward(self, c, y):
        branch_out = self.branch_net(c)
        trunk_out = self.trunk_net(y)
        out = torch.sum(branch_out * trunk_out, dim = 1).reshape(-1, 1)
        return out

model = DeepONet().to(device)

optimizer = optim.Adam(model.parameters(), lr = 5e-4)
Loss_func = nn.MSELoss(reduction = 'mean')

Train

train_iter = iter(train_dataset)
epoch = 0
model.train()

while epoch < 2001:
    u_batch, y_batch, gt_batch = next(train_iter)

    u_batch = u_batch.to(device)
    y_batch = y_batch.to(device)
    gt_batch = gt_batch.to(device)

    pred = model(u_batch, y_batch)
    loss = Loss_func(pred.float(), gt_batch.float())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 500 == 0:
        with torch.no_grad():
            print('Epoch: {} Loss: {:.6f}'.format(epoch, loss.item()))
    epoch += 1

Epoch: 0 Loss: 1636.793457
Epoch: 500 Loss: 2.654348
Epoch: 1000 Loss: 0.657815
Epoch: 1500 Loss: 0.176380
Epoch: 2000 Loss: 0.063038

Evaluation

def exact_solution(x):
    return 60 * x

with torch.no_grad():
    model.eval()
    test_u = torch.tensor(test_q, dtype = torch.float32).to(device)
    test_Y = torch.tensor(y_pts, dtype = torch.float32).to(device)

    preds = []
    for y in test_Y:
        y_input = y.reshape(-1, 1)
        pred = model(test_u, y_input)
        preds.append(pred)

    test_pred = torch.cat(preds, dim = 0)

    plt.figure(figsize = (6, 4))
    plt.plot(y_pts, test_pred.detach().cpu().numpy(), c = 'r', label = 'Predicted Temperature',
             linestyle = 'dashed', linewidth = 2)
    plt.plot(y_pts, exact_solution(y_pts), c = 'k', label = 'Exact Solution')
    plt.legend(fontsize = 10)
    plt.xticks(fontsize = 10)
    plt.yticks(fontsize = 10)
    plt.ylabel('Temperature (°C)')
    plt.title('Test Result for Heat Conduction Problem')
    plt.show()

4. Lab 2: Euler Beam¶

In the previous example, the condition, specifically the temperature at the right boundary, was represented by a single parameter. In this example, we extend the approach to consider a condition characterized by multiple discrete values.

4.1. Problem Setup¶

We will solve a Euler beam bending problem using DeepONet:

Problem properties

$$E = 1 \operatorname{pa}, \quad I = 1 \operatorname{kg\cdot m^2}, \quad l = 1 \operatorname{m}$$

Training dataset consists of 10 different loading conditions

The load distribution $q(x)$ is represented using 20 equally spaced points, denoted as $u_1, u_2, \cdots, u_{20}$.

The external loading $q(x)$ can be arbitrarily distributed. However, for simplicity, we limit ourselves to the case where $q(x)$ follows an affine form:

$$ q(x) = \omega x + \omega_0 $$

The exact solution can be easily calcuated. For example, when $q (x) = 1$

$$y(x) = -{1 \over 24}x^4 + {1 \over 6}x^3 - {1 \over 4}x^2$$

Assume that the desired output locations to evaluate are set to be 25 equally spaced evaluation points in $[0, 1]$

4.2. Solve the Euler Beam Problem with DeepONet¶

The corresponding DeepONet structure is as follows.

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

Import Library

import numpy as np
import random
import matplotlib.pyplot as plt
%matplotlib inline

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils import data

import warnings
warnings.filterwarnings("ignore")

CUDA

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cuda:0

Define Desired Ouput Locations

The desired output locations to evaluate are set to be 25 equally spaced evaluation points in $[0, 1]$

# Desired ouput locations

y_pts = np.linspace(0, 1, 25).reshape(-1, 1)

plt.figure(figsize = (6, 3))
plt.scatter(y_pts, np.zeros(25), s = 30)
plt.xlabel('y')
plt.title('25 Domain Points')
plt.show()

Load $q(x)$ and its Ground Truth

We use 10 distributed loads with different $\omega_0$ and $\omega_1$ for training, and one distributed load for testing, $q(x) = 1$.

Download

train_q =  np.load('/content/drive/MyDrive/DL/DL_data/q_train.npy')
test_q = np.load('/content/drive/MyDrive/DL/DL_data/q_test.npy')
train_gt = np.load('/content/drive/MyDrive/DL/DL_data/gt_train.npy')
test_gt = np.load('/content/drive/MyDrive/DL/DL_data/gt_test.npy')

print('Loading Condtions for Train  :{}'.format(train_q.shape))
print('Ground Truth for Train       :{}\n'.format(train_gt.shape))

print('Loading Condtions for Test   :{}'.format(test_q.shape))
print('Ground Truth for Test        :{}'.format(test_gt.shape))

Loading Condtions for Train  :(10, 20)
Ground Truth for Train       :(10, 25)

Loading Condtions for Test   :(1, 20)
Ground Truth for Test        :(1, 25)

External Load $q(x)$

Training dataset consists of 10 different loading conditions
The load distribution $q(x)$ is represented using 20 equally spaced points, denoted as $u_1, u_2, \cdots, u_{20}$.

plt.figure(figsize = (6, 4))

for i in range(10):
  plt.plot(np.linspace(0, 1, 20), train_q[i], '-o', color = 'b', markersize = 5, alpha = 0.3)

plt.plot(np.linspace(0, 1, 20), test_q[0], '-o', color = 'r', markersize = 5, label = 'test')

plt.grid('on')
plt.xlim([-0.1, 1.1])
plt.ylim([-0.1, 2.1])
plt.title('External Loading Conditions')
plt.legend()
plt.xlabel('y')
plt.show()

Again, the test loading condition can be expressed as a linear combination of the training loading conditions, and the Euler beam follows a linear system. As a result, the output for the test loading condition is quite predictable. Just keep in mind that the primary goal of this example is to demonstrate the implementation of DeepONet.

Corresponding Ground Truth for Bending Displacement

plt.figure(figsize = (6, 4))

for i in range(10):
  plt.plot(y_pts, train_gt[i], '-o', color = 'b', markersize = 5, alpha = 0.3)

plt.plot(y_pts, test_gt[0], '-o', color = 'r', markersize = 5, label = 'test')

plt.title("Deflection of Euler Beam")
plt.legend()
plt.grid('on')
plt.xlabel('y')
plt.show()

Data Batch Generator

During the training of DeepONet, data needs to be provided in batches. A Data Batch Generator efficiently feeds training data into the DeepONet model.

U = train_q
Y = y_pts
GT = train_gt

print('Loading conditions :{}'.format(U.shape))
print('desired output locs:{}'.format(Y.shape))
print('Ground truths      :{}'.format(GT.shape))

Loading conditions :(10, 20)
desired output locs:(25, 1)
Ground truths      :(10, 25)

# Data Batch Generator
class DataBatchGenerator(data.Dataset):
    def __init__(self, u, y, gt, batch_size = 64):
        'Initialization'
        self.u = torch.tensor(u, dtype = torch.float32)
        self.y = torch.tensor(y, dtype = torch.float32)
        self.gt = torch.tensor(gt, dtype = torch.float32)
        self.N = self.u.shape[0] * self.y.shape[0]
        self.batch_size = batch_size

    def __len__(self):
        return (self.N // self.batch_size) + 1

    def __getitem__(self, index):
        u_idx = torch.randint(low = 0, high = self.u.shape[0], size = (self.batch_size,))
        y_idx = torch.randint(low = 0, high = self.y.shape[0], size = (self.batch_size,))
        u = self.u[u_idx, :]
        y = self.y[y_idx, :]
        gt = self.gt[u_idx, y_idx].reshape(-1, 1)
        return u, y, gt

train_dataset = DataBatchGenerator(U, Y, GT)

Define DeepONet

We first define BranchNet and TrunkNet, and then combine them to build the full DeepONet model.

BranchNet extracts features from the input function (e.g., discretized loading conditions).
TrunkNet extracts features from spatial locations where predictions are required.
The final DeepONet combines them via a dot product to generate the output.

class BranchNet(nn.Module):
    def __init__(self, input_dim = 20):
        super(BranchNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 16)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class TrunkNet(nn.Module):
    def __init__(self, input_dim = 1):
        super(TrunkNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 32),
            nn.Tanh(),
            nn.Linear(32, 16)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class DeepONet(nn.Module):
    def __init__(self):
        super(DeepONet, self).__init__()
        self.branch_net = BranchNet(input_dim = 20)
        self.trunk_net = TrunkNet(input_dim = 1)

    def forward(self, u, y):
        branch_out = self.branch_net(u)
        trunk_out = self.trunk_net(y)
        out = torch.sum(branch_out * trunk_out, dim = 1).reshape(-1,1)
        return out

model = DeepONet().to(device)

optimizer = optim.Adam(model.parameters(), lr = 5e-3)
Loss_func = nn.MSELoss(reduction = 'mean')

Train

train_iter = iter(train_dataset)
epoch = 0

model.train()

while epoch < 2001:
  train_batch = next(train_iter)
  u_batch, y_batch, gt_batch = train_batch

  u_batch = u_batch.to(device)
  y_batch = y_batch.to(device)
  gt_batch = gt_batch.to(device)

  pred = model(u_batch, y_batch)

  loss = Loss_func(pred.float(), gt_batch.float())

  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

  if epoch % 500 == 0:
      with torch.no_grad():
          print('Epoch: {:4d} Loss: {:.6f} '.format(epoch, loss.item()))

  epoch += 1

Epoch:    0 Loss: 0.018297 
Epoch:  500 Loss: 0.000089 
Epoch: 1000 Loss: 0.000014 
Epoch: 1500 Loss: 0.000003 
Epoch: 2000 Loss: 0.000002

Evaluation

def exact_solution(y):
    return -(y ** 4) / 24 + y ** 3 / 6 - y ** 2 / 4

with torch.no_grad():
    model.eval()

    test_u = torch.tensor(test_q, dtype = torch.float32).to(device)
    test_Y = torch.tensor(y_pts, dtype = torch.float32).to(device)

    preds = []
    for y in test_Y:
      pred = model(test_u, y)
      preds.append(pred)

    test_pred = torch.cat(preds, dim = 0)

# plot
    plt.figure(figsize = (6, 4))
    plt.plot(y_pts, test_pred.detach().cpu().numpy(), c =  'r',
             label = 'Predict u', linestyle = 'dashed', linewidth = 2)
    plt.plot(y_pts, exact_solution(y_pts), c =  'k', label = 'Exact Solution')
    plt.legend(fontsize = 10)
    plt.xticks(fontsize = 10)
    plt.yticks(fontsize = 10)
    plt.xlabel('Time (s)', fontsize = 10)
    plt.ylabel('Displacement (m)', fontsize = 10)
    plt.title('Test Result', fontsize = 15)
    plt.show()

5. Lab 3: Elastic Deformation for Thin Plate (PI-DeepONet)¶

So far, we have demonstrated how to implement DeepONet with two simple examples. If physical laws are available, they can be incorporated into DeepONet, resulting in PI-DeepONet (Physics-informed DeepONet). This approach effectively combines PINNs with DeepONet, enabling the model to learn from both data and governing physical equations.

Next, we will explore how to implement PI-DeepONet using the example of elastic deformation in a thin plate.

This example is designed to highlight two key purposes:

Extending from 1D to 2D
Transitioning from DeepONet to PI-DeepONet

Here, we will demonstrate PI-DeepONet without relying on any data, focusing solely on the implementation aspect. Of course, the combination of data and physics within PI-DeepONet will be addressed in Lab 4.

5.1. Problem Setup¶

We will solve thin plate equations to find displacement and stress distribution of thin plate using PI-DeepONet

Only Physics-informed (PDE and BC)
No data

Based on Kirchhoff-Love plate theory, three hypotheses were used
- straight lines normal to the mid-surface remain straight after deformation
- straight lines normal to the mid-surface remain normal to the mid-surface after deformation
- the thickness of the plate does not change during a deformation

A non-uniform stretching force is applied to square elastic plate
Only one quarter of the plate is considered since the geometry and in-plane forces are symmetric (yellow domain)

Problem properties

$$ E = 50 \operatorname{Mpa}, \quad \nu = 0.3, \quad \omega = 20 \operatorname{mm}, \quad h = 1 \operatorname{mm}, \quad f \operatorname{Mpa} \in [0.5, 1.5] $$

Governing equations (Föppl-von Kármán equations) for the isotropic elastic plate:

$$ \begin{align*} &{E \over 1 - \nu^2}\left({\partial^2 u \over \partial x^2} + {1 - \nu \over 2}{\partial^2 u \over \partial y^2} + {1 + \nu \over 2}{\partial^2 v \over \partial x \partial y} \right) = 0\\\\ &{E \over 1 - \nu^2}\left({\partial^2 v \over \partial y^2} + {1 - \nu \over 2}{\partial^2 v \over \partial x^2} + {1 + \nu \over 2}{\partial^2 x \over \partial x \partial y} \right) = 0 \end{align*} $$

Two Dirichlet boundary conditions at $x = 0,\, y = 0\; (B.C.①, B.C.②)$:

$$ v(x,y) = 0 \qquad \text{at} \quad y = 0\\\\ u(x,y) = 0 \qquad \text{at} \quad x = 0 $$

Two free boundary conditions at $y = \omega / 2\; (B.C.③)$:

$$ \sigma_{yy} = 0,\quad \sigma_{yx} = 0 $$

Free boundary condition and in-plane force boundary condition at $x = \omega / 2\; (B.C.④)$:

$$ \sigma_{xx} = P \centerdot h,\quad \sigma_{xy} = 0 $$

Make a neural network and loss funcitons like below:

Numerical Solution

Numerical solution of a test case is illustrated in below figures:
- The case of $f$ = 1
- $x, y$ direction displacement $u$, $v$ and stress $\sigma_{xx}$, $\sigma_{yy}$
Solve this problem using PI-DeepONet and then compare with a numerical solution

5.2. Solve the Thin Plate Problem with PI-DeepONet¶

Load Library

import numpy as np
import math
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils import data
%matplotlib inline

CUDA

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cuda:0

Define Properties

# Properties

E = 50.0
nu = 0.3
a = 10.0
b = 10.0
h = 1.0

Generate Collocation Points

Nx = 50                                           # Number of x samples
Ny = 50                                           # Number of y samples
x = np.linspace(0, b, Nx)
y = np.linspace(0, b, Ny)

xy = np.meshgrid(x, y)
y_pts = np.concatenate((xy[0].reshape(-1, 1), xy[1].reshape(-1, 1)), 1)

top_x = np.linspace(0, 10, 50).reshape(-1, 1)
top_y = 10 * np.ones_like(top_x).reshape(-1, 1)

bottom_x = np.linspace(0, 10, 50).reshape(-1, 1)
bottom_y = np.zeros_like(bottom_x).reshape(-1, 1)

left_y = np.linspace(0, 10, 50).reshape(-1, 1)
left_x = np.zeros_like(left_y).reshape(-1, 1)

right_y = np.linspace(0, 10, 50).reshape(-1, 1)
right_x = 10 * np.ones_like(right_y).reshape(-1, 1)


top_pts = np.hstack((top_x, top_y))
bottom_pts = np.hstack((bottom_x, bottom_y))
left_pts = np.hstack((left_x, left_y))
right_pts = np.hstack((right_x, right_y))

plt.figure(figsize = (5, 5))
plt.scatter(y_pts[:, 0], y_pts[:, 1], s=4, color='gray', label='Interior', alpha = 0.3)
plt.scatter(bottom_pts[:, 0], bottom_pts[:, 1], s=4, color='red', label='Bottom')
plt.scatter(top_pts[:, 0], top_pts[:, 1], s=4, color='blue', label='Top')
plt.scatter(left_pts[:, 0], left_pts[:, 1], s=4, color='green', label='Left')
plt.scatter(right_pts[:, 0], right_pts[:, 1], s=4, color='purple', label='Right')

plt.legend()
plt.title('Collocation Points')
plt.xlabel('X')
plt.ylabel('Y')
plt.grid(True)
plt.show()

Load $P(x)$

# Define Condition parameter
train_P = np.load('/content/drive/MyDrive/DL/DL_data/tp_P_train.npy')
test_P = np.load('/content/drive/MyDrive/DL/DL_data/tp_P_test.npy')

print('Loading Condtions for Train: {}'.format(train_P.shape))
print('Loading Condtions for Test:  {}'.format(test_P.shape))

Loading Condtions for Train: (10, 250)
Loading Condtions for Test:  (1, 250)

Plot

plt.figure(figsize = (6, 4))
for i in range(10):
  plt.plot(np.linspace(0, 10, 250), train_P[i], '-o', color = 'b', markersize = 5, alpha = 0.3, markevery = (0, 10))
plt.plot(np.linspace(0, 10, 250), test_P[0], '-o', color = 'r', markersize = 5, label = 'test', markevery = (0, 10))

plt.title("External Loading Conditions")
plt.legend()
plt.grid('on')
plt.xlabel('y')
plt.show()

Generate Dataset

U = train_P
Y = y_pts

print('Loading train conditions  :{}'.format(U.shape))
print('Desired train output locs :{}'.format(Y.shape))

U_test = test_P
Y_test = y_pts

Y_top = top_pts
Y_bottom = bottom_pts
Y_left = left_pts
Y_right = right_pts

Loading train conditions  :(10, 250)
Desired train output locs :(2500, 2)

class DataBatchGenerator(data.Dataset):
    def __init__(self, u, y, batch_size = 2500):
        self.u = torch.tensor(u, dtype = torch.float32)
        self.y = torch.tensor(y, dtype = torch.float32)
        self.N = self.u.shape[0] * self.y.shape[0]
        self.batch_size = batch_size

    def __len__(self):
        return (self.N // self.batch_size) + 1

    def __getitem__(self, index):
        u_idx = torch.randint(low = 0, high = self.u.shape[0], size = (self.batch_size,))
        y_idx = torch.randint(low = 0, high = self.y.shape[0], size = (self.batch_size,))
        u = self.u[u_idx, :]
        y = self.y[y_idx, :]
        return u, y

train_dataset = DataBatchGenerator(U, Y)
train_top = DataBatchGenerator(U, Y_top)
train_bottom = DataBatchGenerator(U, Y_bottom)
train_left = DataBatchGenerator(U, Y_left)
train_right = DataBatchGenerator(U, Y_right)

Define Neural Networks and Hyper-parameter

class BranchNet(nn.Module):
    def __init__(self, input_dim = 250):
        super(BranchNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class TrunkNet(nn.Module):
    def __init__(self, input_dim = 2):
        super(TrunkNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class DeepONet(nn.Module):
    def __init__(self):
        super(DeepONet, self).__init__()
        self.branch_net = BranchNet(input_dim = 250)
        self.trunk_net = TrunkNet(input_dim = 2)

        self._initialize_weights()

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_normal_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, u, y):
        branch_out = self.branch_net(u)
        trunk_out = self.trunk_net(y)
        out = torch.sum(branch_out * trunk_out, dim = 1).reshape(-1,1)
        return out

model_w = DeepONet().to(device)
model_v = DeepONet().to(device)

optimizer = optim.Adam(list(model_w.parameters()) + list(model_v.parameters()), lr = 1e-4)
Loss_func = nn.MSELoss(reduction = 'mean')

Define Functions to Compute Physics Loss

def derivative(y, t):
    df = torch.autograd.grad(y, t, grad_outputs = torch.ones_like(y).to(device), create_graph = True)[0]
    df_x = df[:, 0:1]
    df_y = df[:, 1:2]
    return df_x, df_y

def requires_grad(x):
    return torch.tensor(x, requires_grad = True).to(device)

Define PDE with Boundary Conditions

Governing equations (Föppl-von Kármán equations) for the isotropic elastic plate:

$$ \begin{align*} &{E \over 1 - \nu^2}\left({\partial^2 u \over \partial x^2} + {1 - \nu \over 2}{\partial^2 u \over \partial y^2} + {1 + \nu \over 2}{\partial^2 v \over \partial x \partial y} \right) = 0\\\\ &{E \over 1 - \nu^2}\left({\partial^2 v \over \partial y^2} + {1 - \nu \over 2}{\partial^2 v \over \partial x^2} + {1 + \nu \over 2}{\partial^2 x \over \partial x \partial y} \right) = 0 \end{align*} $$

def PDE(U, Y_domain):
    w, v = model_w(U, Y_domain), model_v(U, Y_domain)

    dw_x, dw_y = derivative(w, Y_domain)
    dw_xx, dw_xy = derivative(dw_x, Y_domain)
    _, dw_yy =  derivative(dw_y, Y_domain)

    dv_x, dv_y = derivative(v, Y_domain)
    dv_xx, dv_xy = derivative(dv_x, Y_domain)
    _, dv_yy = derivative(dv_y, Y_domain)

    force_eq_x = (dw_xx + 0.5 * (1 - nu) * dw_yy + 0.5 * (1 + nu) * dv_xy) * E / (1 - nu**2)
    force_eq_y = (dv_yy + 0.5 * (1 - nu) * dv_xx + 0.5 * (1 + nu) * dw_xy) * E / (1 - nu**2)

    return force_eq_x.float(), force_eq_y.float()

Two Dirichlet boundary conditions at $x = 0,\, y = 0\; (B.C.① \text{ (left)}, B.C.② \text{ (bottom)})$:

$$ \begin{align*} v(x,y) &= 0 \qquad \text{at} \quad y = 0 \text{: (bottom)} \\ \\ u(x,y) &= 0 \qquad \text{at} \quad x = 0 \text{: (left)} \end{align*} $$

def BC_left(U_left, Y_left):
    # Left edge
    w = model_w(U_left, Y_left)
    return w.float()

def BC_bottom(U_bottom, Y_bottom):
    # Bottom edge
    v = model_v(U_bottom, Y_bottom)
    return v.float()

Two free boundary conditions at $y = \omega / 2\; (B.C.③ \text{ (top)})$:

$$ \sigma_{yy} = 0,\quad \sigma_{yx} = 0 $$

Free boundary condition and in-plane force boundary condition at $x = \omega / 2\; (B.C.④ \text{ (right)})$:

$$ \begin{align*} \sigma_{xx} &= P \cdot h \\ \sigma_{xy} &= 0 \end{align*} $$

Stress

$$ \begin{align*} \sigma_{xx} &= \frac{E}{1 - \nu^2} \left( \frac{\partial u}{\partial x} + \nu \frac{\partial v}{\partial y} \right) \\ \sigma_{yy} &= \frac{E}{1 - \nu^2} \left( \frac{\partial v}{\partial y} + \nu \frac{\partial u}{\partial x} \right)\\ \sigma_{xy} &= \sigma_{yx} = \frac{E}{2(1 + \nu)} \left( \frac{\partial v}{\partial x} + \frac{\partial u}{\partial y} \right) \end{align*} $$

def BC_top(U_top, Y_top):
   # Top edge
    w, v = model_w(U_top, Y_top), model_v(U_top, Y_top)

    dw_x, dw_y = derivative(w, Y_top)
    dv_x, dv_y = derivative(v, Y_top)

    sig_xy = (dv_x + dw_y) * E / (1 + nu) / 2
    sig_yy = (dv_y + nu * dw_x) * E / (1 - nu**2)

    return sig_xy.float(), sig_yy.float()

def BC_right(U_right, Y_right):
   # Right edge
    w, v = model_w(U_right, Y_right), model_v(U_right, Y_right)

    dw_x, dw_y = derivative(w, Y_right)
    dv_x, dv_y = derivative(v, Y_right)

    sig_xx = (dw_x + nu * dv_y) * E / (1 - nu**2)
    sig_xy = (dv_x + dw_y) * E / (1 + nu) / 2
    sig_ex = U_right[:, 0:1] * h * torch.cos(math.pi / (2 * b) * Y_right[:, 1:2]).reshape(-1, 1)

    sig_xx_ex = sig_xx.float() - sig_ex.float()

    return sig_xx_ex.float(), sig_xy.float()

Train

(This process may require a significant amount of time; therefore, we recommend not running it. However, we have provided a pretrained model for your convenience.)

train_iter = iter(train_dataset)
train_left_iter = iter(train_left)
train_right_iter = iter(train_right)
train_bottom_iter = iter(train_bottom)
train_top_iter = iter(train_top)

epoch = 0

while epoch < 150001:

    train_batch = next(train_iter)
    train_left_batch = next(train_left_iter)
    train_right_batch = next(train_right_iter)
    train_bottom_batch = next(train_bottom_iter)
    train_top_batch = next(train_top_iter)

    u_batch, y_batch = train_batch
    u_left_batch, y_left_batch = train_left_batch
    u_right_batch, y_right_batch = train_right_batch
    u_bottom_batch, y_bottom_batch = train_bottom_batch
    u_top_batch, y_top_batch = train_top_batch

    ## Requires grad
    u_batch, y_batch = requires_grad(u_batch), requires_grad(y_batch)
    u_left_batch, y_left_batch = requires_grad(u_left_batch), requires_grad(y_left_batch)
    u_right_batch, y_right_batch = requires_grad(u_right_batch), requires_grad(y_right_batch)
    u_bottom_batch, y_bottom_batch = requires_grad(u_bottom_batch), requires_grad(y_bottom_batch)
    u_top_batch, y_top_batch = requires_grad(u_top_batch), requires_grad(y_top_batch)

    w, v = model_w(u_batch, y_batch), model_v(u_batch, y_batch)

    ## PDE loss
    force_eq_x, force_eq_y = PDE(u_batch, y_batch)
    loss_PDE_x = Loss_func(force_eq_x, torch.zeros_like(force_eq_x))
    loss_PDE_y = Loss_func(force_eq_y, torch.zeros_like(force_eq_y))
    loss_PDE = loss_PDE_x + loss_PDE_y

    ## BC loss
    left_w = BC_left(u_left_batch, y_left_batch)
    right_sig_xx_ex, right_sig_xy = BC_right(u_right_batch, y_right_batch)
    bottom_v = BC_bottom(u_bottom_batch, y_bottom_batch)
    top_sig_xy, top_sig_yy = BC_top(u_top_batch, y_top_batch)

    loss_BC_left = Loss_func(left_w, torch.zeros_like(left_w))
    loss_BC_right_1 = Loss_func(right_sig_xx_ex, torch.zeros_like(right_sig_xx_ex))
    loss_BC_right_2 = Loss_func(right_sig_xy, torch.zeros_like(right_sig_xy))
    loss_BC_bottom = Loss_func(bottom_v, torch.zeros_like(bottom_v))
    loss_BC_top_1 = Loss_func(top_sig_xy, torch.zeros_like(top_sig_xy))
    loss_BC_top_2 = Loss_func(top_sig_yy, torch.zeros_like(top_sig_yy))

    loss_BC_right = loss_BC_right_1 + loss_BC_right_2
    loss_BC_top = loss_BC_top_1 + loss_BC_top_2
    loss_BC = loss_BC_left + loss_BC_bottom + loss_BC_right + loss_BC_top

    loss = loss_PDE + loss_BC

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 1000 == 0:
        print('Epoch: {} Loss: {:.6f} PDELoss: {:.6f} BCLoss: {:.6f}'.format(epoch, loss.item(), loss_PDE.item(), loss_BC.item()))

    epoch += 1

Test Pretrained Model

def check_stress(test_U, test_Y):
    w, v = test_model_w(test_U, test_Y), test_model_v(test_U, test_Y)

    dw_x, _ = derivative(w, test_Y)
    _, dv_y = derivative(v, test_Y)

    sig_xx = (dw_x + nu * dv_y) * E / (1 - nu**2)
    sig_yy = (dv_y + nu * dw_x) * E / (1 - nu**2)

    return sig_xx, sig_yy

test_model_w = torch.load('/content/drive/MyDrive/DL/DL_data/tp_model_w.pt')
test_model_v = torch.load('/content/drive/MyDrive/DL/DL_data/tp_model_v.pt')
test_model_w.eval()
test_model_v.eval()

Nx = 100                                            # Number of samples
Ny = 100                                            # Number of samples
x_test = torch.linspace(0, b, Nx)                   # Input data for x (Nx x 1)
y_test = torch.linspace(0, b, Ny)                   # Input data for y (Ny x 1)
Y_test = torch.meshgrid(x_test, y_test)

u_test = U_test[0]
Y =  torch.cat([Y_test[0].reshape(-1, 1), Y_test[1].reshape(-1, 1)], dim = 1)
U = u_test * np.ones((len(Y), 1))
U, Y = requires_grad(U), requires_grad(Y)

pred_w = test_model_w(U, Y)
pred_v = test_model_v(U, Y)
pred_sig_x, pred_sig_y = check_stress(U, Y)
test_results = torch.cat((pred_w, pred_v, pred_sig_x, pred_sig_y), dim = 1)

color_legend = [[0, 0.182], [-0.06, 0.011], [-0.0022, 1.0], [-0.15, 0.45]]
title = ['x-displacement ($u$)', 'y-displacement ($v$)', '$\sigma_{xx}$', '$\sigma_{yy}$']

plt.figure(figsize=(10, 8))
for idx in range(4):
    plt.subplot(2, 2, idx+1)
    plt.scatter(Y[:, 0].detach().cpu().numpy(),
                Y[:, 1].detach().cpu().numpy(),
                c = test_results.detach().cpu().numpy()[:, idx],
                cmap = 'rainbow',
                s = 7)
    plt.clim(color_legend[idx])
    plt.title(title[idx], fontsize = 13)
    plt.xlabel('x (mm)', fontsize = 12)
    plt.ylabel('y (mm)', fontsize = 12)
    plt.axis('square')
    plt.xlim(0, 10)
    plt.ylim(0, 10)
    plt.colorbar()
plt.tight_layout()
plt.show()

6. Lab 4: Flow Around a Cylinder¶

We will learn how to implement PI-DeepONet with the following example: Flow around a cylinder.

6.1. Problem Setup¶

We will solve the Naiver-Stokes equation with varying viscosity term with PI-DeepONet

Data from a numerical solver and
Physics (PDE and BC)

This approach combines PINNs with DeepONet, allowing the model to learn from both data and governing physical equations.

Problem properties

$$\rho = 1\operatorname{kg/m^3}, \quad \mu \operatorname{N\cdot s/m^2} \in [0.002, 0.02] , \quad D = 2h = 1\operatorname{m}, \quad L = 2\operatorname{m}, \quad u_{in} = 1\operatorname{m/s}, \quad \nu = \frac{\mu}{\rho}$$

2D Navier-Stokes Equations & boundary conditions (for steady state)

$$ \begin{align*} \rho \left(u{\partial u \over \partial x} + v{\partial u \over \partial y} + {1 \over \rho}{\partial p \over \partial x}\right) - \mu \ \left({\partial^2 u \over {\partial x^2}} + {\partial^2 u \over {\partial y^2}}\right) &= 0\\\\ \rho \left(u{\partial v \over \partial x} + v{\partial v \over \partial y} + {1 \over \rho}{\partial p \over \partial y}\right) - \mu \ \left({\partial^2 v \over {\partial x^2}} + {\partial^2 v \over {\partial y^2}}\right) &= 0\\\\ {\partial u \over \partial x} + {\partial v \over \partial y} &= 0 \end{align*} $$

Two Dirichlet boundary conditions on the plate boundary (no-slip condition)

$$u(x,y) = 0, \quad v(x,y) = 0 \qquad \text{at} \quad y = \frac{D}{2} \ \; \text{or} \; -\frac{D}{2}$$

Two Dirichlet boundary conditions at the inlet boundary (no-slip condition)

$$u(-1,y) = u_{\text{in}}, \quad v(-1,y) = 0$$

Two Dirichlet boundary conditions at the outlet boundary

$$p(1,y) = 0, \quad v(1,y) = 0$$

Two Dirichlet boundary conditions at the cylinder boundary

$$u(\text{cylinder}) = 0, \quad v(\text{cylinder}) = 0$$

Create a neural network and define loss functions as follows:

A single input neuron for the branch network
Two input neurons for the trunk network

6.2. Solve the Navier-Stokes Equations with PI-DeepONet¶

Load Library

import numpy as np
import matplotlib.pyplot as plt
import random
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils import data

%matplotlib inline

import os
import warnings
warnings.filterwarnings("ignore")

CUDA

device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
print(device)

cuda:0

Define Conditions

Train: 9 different viscosity values
Test: 2 unseen viscosity values

train_vis = np.array([0.002, 0.004, 0.008, 0.01, 0.02, 0.06, 0.08, 0.1, 0.2]).reshape(-1, 1)
test_vis = np.array([0.006, 0.04]).reshape(-1, 1)

To obtain the ground truth for the training data, a CFD simulation has been performed. The ground truth values are stored only at the mesh points in the CFD simulation.

Download

mesh_pts = np.load('/content/drive/MyDrive/DL/DL_data/viscosity_train_xy_domain.npy')
train_gt= np.load('/content/drive/MyDrive/DL/DL_data/viscosity_train_gt_domain.npy')
test_gt = np.load('/content/drive/MyDrive/DL/DL_data/viscosity_test_gt_domain.npy')

print("train XY shape: {}".format(mesh_pts[0].shape))
print("train gt shape: {}".format(train_gt.shape))
print("test gt shape:  {}".format(test_gt.shape))

train XY shape: (19692, 2)
train gt shape: (9, 19692, 3)
test gt shape:  (2, 19692, 3)

Note

19,692 mesh points
$(y_1, y_2)$ (or $(x,y)$) are the desired output locations
For each mesh point, the output values include:
- velocity in the $x$ direction
- velocity in the $y$ direction
- pressure

Collocation Points for PDE (Desired Output Locations)

All the mesh points in the CFD simulation are good candidates for collocation points in PI-DeepONet. These points ensure that the model adheres to physics-informed constraints.

y_pts = mesh_pts[0]

Collocation Points for Boundary Conditions

top_x = np.linspace(-0.5, 1.5, 200).reshape(-1, 1)
top_y = 0.5 * np.ones_like(top_x).reshape(-1, 1)

bottom_x = np.linspace(-0.5, 1.5, 200).reshape(-1, 1)
bottom_y = -0.5 * np.ones_like(bottom_x).reshape(-1, 1)

inlet_y = np.linspace(-0.5, 0.5, 100).reshape(-1, 1)
inlet_x = -0.5 * np.ones_like(inlet_y).reshape(-1, 1)

outlet_y = np.linspace(-0.5, 0.5, 100).reshape(-1, 1)
outlet_x = 1.5 * np.ones_like(outlet_y).reshape(-1, 1)

radius = 0.05
theta = np.linspace(0, 2 * np.pi, 200)
cylinder_x = (0 + radius * np.cos(theta)).reshape(-1, 1)
cylinder_y = (0 + radius * np.sin(theta)).reshape(-1, 1)

top_pts = np.hstack((top_x, top_y))
bottom_pts = np.hstack((bottom_x, bottom_y))
inlet_pts = np.hstack((inlet_x, inlet_y))
outlet_pts = np.hstack((outlet_x, outlet_y))
cylinder_pts = np.hstack((cylinder_x, cylinder_y))

wall_pts = np.concatenate((top_pts, bottom_pts, cylinder_pts), 0)

plt.scatter(y_pts[:, 0], y_pts[:, 1], s = 1)
plt.scatter(wall_pts[:, 0], wall_pts[:, 1], s = 5)
plt.scatter(inlet_pts[:, 0], inlet_pts[:, 1], s = 5)
plt.scatter(outlet_pts[:, 0], outlet_pts[:, 1], s = 5)

plt.title('Collocation Points')
plt.axis('scaled')
plt.xlabel('x')
plt.ylabel('y')
plt.show()

Data Batch Generator

U = train_vis
Y = y_pts
GT = train_gt

print('Loading conditions :{}'.format(U.shape))
print('desired output locs:{}'.format(Y.shape))
print('Ground truths      :{}'.format(GT.shape))

U_test = test_vis
Y_test = y_pts
GT_test = test_gt

Y_wall = wall_pts
Y_inlet = inlet_pts
Y_outlet = outlet_pts

Loading conditions :(9, 1)
desired output locs:(19692, 2)
Ground truths      :(9, 19692, 3)

# Data generator
class DataBatchGenerator(data.Dataset):
    def __init__(self, u, y, gt, batch_size = 64):
        'Initialization'
        self.u = torch.tensor(u, dtype = torch.float32)
        self.y = torch.tensor(y, dtype = torch.float32)
        self.gt = torch.tensor(gt, dtype = torch.float32)
        self.N = self.u.shape[0] * self.y.shape[0]
        self.batch_size = batch_size

    def __len__(self):
        return (self.N // self.batch_size) + 1

    def __getitem__(self,index):
        u_idx = torch.randint(low = 0, high = self.u.shape[0], size = (self.batch_size,))
        y_idx = torch.randint(low = 0, high = self.y.shape[0], size = (self.batch_size,))
        u = self.u[u_idx, :]
        y = self.y[y_idx, :]
        gt = self.gt[u_idx, y_idx, :]
        return u, y, gt

# BC generator
class BCGenerator(data.Dataset):
    def __init__(self, u, y, batch_size = 64):
        'Initialization'
        self.u = torch.tensor(u, dtype = torch.float32)
        self.y = torch.tensor(y, dtype = torch.float32)
        self.N = self.u.shape[0] * self.y.shape[0]
        self.batch_size = batch_size

    def __len__(self):
        return (self.N // self.batch_size) + 1

    def __getitem__(self,index):
        u_idx = torch.randint(low = 0, high = self.u.shape[0], size = (self.batch_size,))
        y_idx = torch.randint(low = 0, high = self.y.shape[0], size = (self.batch_size,))
        u = self.u[u_idx, :]
        y = self.y[y_idx, :]
        return u, y

train_dataset = DataBatchGenerator(U, Y, GT)
train_inlet = BCGenerator(U, Y_inlet)
train_wall = BCGenerator(U, Y_wall)
train_outlet = BCGenerator(U, Y_outlet)

Define DeepONet Structure

class BranchNet(nn.Module):
    def __init__(self, input_dim = 1):
        super(BranchNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class TrunkNet(nn.Module):
    def __init__(self, input_dim = 2):
        super(TrunkNet, self).__init__()
        self.linears = nn.Sequential(
            nn.Linear(input_dim, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100),
            nn.Tanh(),
            nn.Linear(100, 100)
        )

    def forward(self, x):
        a = x.float()
        return self.linears(a)

class DeepONet(nn.Module):
    def __init__(self):
        super(DeepONet, self).__init__()
        self.branch_net = BranchNet(input_dim = 1 )
        self.trunk_net = TrunkNet(input_dim = 2)

    def forward(self, u, y):
        branch_out = self.branch_net(u)
        trunk_out = self.trunk_net(y)
        out = torch.sum(branch_out * trunk_out, dim = 1).reshape(-1,1)
        return out

Since the output includes velocities in the $x$- and $y$-directions and pressure, three separate DeepONets are built - one for each output.

We could have designed a single DeepONet with three output neurons, but to keep the network as simple as possible, we chose to build three independent networks instead.

model_v1 = DeepONet().to(device)
model_v2 = DeepONet().to(device)
model_p = DeepONet().to(device)

optimizer = optim.Adam(list(model_v1.parameters()) +
                       list(model_v2.parameters()) +
                       list(model_p.parameters()),
                       lr = 5e-4)

Loss_func = nn.MSELoss(reduction = 'mean')

Define Functions to Compute Physics Loss

def derivative(y, t):
    df = torch.autograd.grad(y, t, grad_outputs = torch.ones_like(y).to(device), create_graph = True)[0]
    df_x = df[:, 0:1]
    df_y = df[:, 1:2]
    return df_x, df_y

def requires_grad(x):
    return torch.tensor(x, requires_grad = True).to(device)

Define PDE

2D Navier-Stokes Equations & boundary conditions (for steady state)

$$ \begin{align*} \rho \left(u{\partial u \over \partial x} + v{\partial u \over \partial y} + {1 \over \rho}{\partial p \over \partial x}\right) - \mu \ \left({\partial^2 u \over {\partial x^2}} + {\partial^2 u \over {\partial y^2}}\right) &= 0\\\\ \rho \left(u{\partial v \over \partial x} + v{\partial v \over \partial y} + {1 \over \rho}{\partial p \over \partial y}\right) - \mu \ \left({\partial^2 v \over {\partial x^2}} + {\partial^2 v \over {\partial y^2}}\right) &= 0\\\\ {\partial u \over \partial x} + {\partial v \over \partial y} &= 0 \end{align*} $$

def PDE(u_batch, y_batch):

    v1, v2, p = model_v1(u_batch, y_batch), model_v2(u_batch, y_batch), model_p(u_batch, y_batch)

    dv1_x, dv1_y = derivative(v1, y_batch)
    dv1_xx, _ = derivative(dv1_x, y_batch)
    _, dv1_yy =  derivative(dv1_y, y_batch)

    dv2_x, dv2_y = derivative(v2, y_batch)
    dv2_xx, _ = derivative(dv2_x, y_batch)
    _, dv2_yy = derivative(dv2_y, y_batch)

    dp_x, dp_y = derivative(p, y_batch)

    vis = u_batch
    pde_v1 = 1 * (v1 * dv1_x + v2 * dv1_y) + dp_x - vis * (dv1_xx + dv1_yy)
    pde_v2 = 1 * (v1 * dv2_x + v2 * dv2_y) + dp_y - vis * (dv2_xx + dv2_yy)
    pde_cont = dv1_x + dv2_y

    return pde_v1.float(), pde_v2.float(), pde_cont.float()

Boundary Conditions

Two Dirichlet boundary conditions on the plate boundary (no-slip condition),

$$u(x,y) = 0, \quad v(x,y) = 0 \qquad \text{at} \quad y = \frac{D}{2} \ \; \text{or} \; -\frac{D}{2}$$

Two Dirichlet boundary conditions at the cylinder boundary

$$u(\text{cylinder}) = 0, \quad v(\text{cylinder}) = 0$$

Two Dirichlet boundary conditions at the inlet boundary

$$u(-1,y) = u_{\text{in}}, \quad v(-1,y) = 0$$

Two Dirichlet boundary conditions at the outlet boundary

$$p(1,y) = 0, \quad v(1,y) = 0$$

def BC_wall(u_wall, y_wall):
    wall_v1 = model_v1(u_wall, y_wall)
    wall_v2 = model_v2(u_wall, y_wall)
    return wall_v1.float(), wall_v2.float()

def BC_inlet(u_inlet, y_inlet):
    inlet_v1 = model_v1(u_inlet, y_inlet)
    inlet_v1 = inlet_v1 - torch.ones_like(inlet_v1).to(device)
    inlet_v2 = model_v2(u_inlet, y_inlet)
    return inlet_v1.float(), inlet_v2.float()

def BC_outlet(u_outlet, y_outlet):
    outlet_v2 = model_v2(u_outlet, y_outlet)
    outlet_p = model_p(u_outlet, y_outlet)
    return outlet_v2.float(), outlet_p.float()

Train

(Again, this process may require a significant amount of time; therefore, we recommend not running it. However, we have provided a pretrained model for your convenience.)

epoch = 0

train_iter = iter(train_dataset)
train_wall_iter = iter(train_wall)
train_inlet_iter = iter(train_inlet)
train_outlet_iter = iter(train_outlet)

while epoch < 150001:

    train_batch = next(train_iter)
    train_wall_batch = next(train_wall_iter)
    train_inlet_batch = next(train_inlet_iter)
    train_outlet_batch = next(train_outlet_iter)

    u_batch, y_batch, gt_batch = train_batch
    u_wall_batch, y_wall_batch = train_wall_batch
    u_inlet_batch, y_inlet_batch = train_inlet_batch
    u_outlet_batch, y_outlet_batch = train_outlet_batch

    ## Reguires grad
    u_batch, y_batch, gt_batch = requires_grad(u_batch), requires_grad(y_batch), requires_grad(gt_batch)
    u_wall_batch, y_wall_batch = requires_grad(u_wall_batch), requires_grad(y_wall_batch) # BC wall
    u_inlet_batch, y_inlet_batch = requires_grad(u_inlet_batch), requires_grad(y_inlet_batch) # BC inlet
    u_outlet_batch, y_outlet_batch = requires_grad(u_outlet_batch), requires_grad(y_outlet_batch) # BC outlet

    v1, v2, p = model_v1(u_batch, y_batch), model_v2(u_batch, y_batch), model_p(u_batch, y_batch)

    ## Data loss
    loss_data_v1 = Loss_func(v1.float(), gt_batch[:, 0:1].float())
    loss_data_v2 = Loss_func(v2.float(), gt_batch[:, 1:2].float())
    loss_data_p = Loss_func(p.float(), gt_batch[:, 2:3].float())

    loss_data = loss_data_v1 + loss_data_v2 + loss_data_p

    ## PDE loss
    PDE_v1, PDE_v2, PDE_cont = PDE(u_batch, y_batch)

    loss_PDE_v1 = Loss_func(PDE_v1, torch.zeros_like(PDE_v1).to(device))
    loss_PDE_v2 = Loss_func(PDE_v2, torch.zeros_like(PDE_v2).to(device))
    loss_PDE_cont = Loss_func(PDE_cont, torch.zeros_like(PDE_cont).to(device))

    loss_pde = loss_PDE_v1 + loss_PDE_v2 + loss_PDE_cont

    ## BC loss
    wall_v1, wall_v2 = BC_wall(u_wall_batch, y_wall_batch)
    inlet_v1, inlet_v2 = BC_inlet(u_inlet_batch, y_inlet_batch)
    outlet_v2, outlet_p = BC_outlet(u_outlet_batch, y_outlet_batch)

    loss_BC_wall_v1 = Loss_func(wall_v1, torch.zeros_like(wall_v1).to(device))
    loss_BC_wall_v2 = Loss_func(wall_v2, torch.zeros_like(wall_v2).to(device))
    loss_BC_inlet_v1 = Loss_func(inlet_v1, torch.zeros_like(inlet_v1).to(device))
    loss_BC_inlet_v2 = Loss_func(inlet_v2, torch.zeros_like(inlet_v2).to(device))
    loss_BC_outlet_v2 = Loss_func(outlet_v2, torch.zeros_like(outlet_v2).to(device))
    loss_BC_outlet_p = Loss_func(outlet_p, torch.zeros_like(outlet_p).to(device))

    loss_BC_wall = loss_BC_wall_v1 + loss_BC_wall_v2
    loss_BC_inlet = loss_BC_inlet_v1 + loss_BC_inlet_v2
    loss_BC_outlet  =loss_BC_outlet_v2 + loss_BC_outlet_p
    loss_bc = loss_BC_wall + loss_BC_inlet + loss_BC_outlet

    ## Total loss
    loss = 10 * loss_data + 0.1 * loss_bc + 0.01 * loss_pde

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 1000 == 0:
        with torch.no_grad():
            print('Epoch: {} Loss: {:.4f} DATALoss: {:.4f} PDELoss: {:.4f} BCLoss: {:.4f}'\
                  .format(epoch, loss.item(), 10 * loss_data.item(), 0.01 * loss_pde.item(), 0.1 * loss_bc.item()))

    epoch += 1

6.3. Test Pre-trained Model¶

Since running the training process takes a long time, we are providing the pre-trained model, allowing you to use it for evaluating unseen data directly.

model_v1

with torch.no_grad():
    test_model_v1 = torch.load('/content/drive/MyDrive/DL/DL_data/model_v1.pt')

    u = U_test[0]
    Y = Y_test
    GT = GT_test[0]

    U = u * np.ones((len(Y), 1))
    U, Y = requires_grad(U), requires_grad(Y)

    pred = test_model_v1(U, Y)

    plt.figure(figsize = (9, 2))
    plt.subplot(1, 3, 1)
    plt.title('CFD')
    plt.scatter(Y[:, 0].detach().cpu().numpy(), Y[:, 1].detach().cpu().numpy(), c = GT[:, 0:1], s = 0.1, cmap = 'jet')
    plt.axis('scaled')

    plt.subplot(1, 3, 2)
    plt.title('Prediction')
    plt.scatter(Y[:, 0].detach().cpu().numpy(), Y[:, 1].detach().cpu().numpy(), c = pred.detach().cpu().numpy(), s = 0.1, cmap = 'jet')
    plt.axis('scaled')

    plt.subplot(1, 3, 3)
    plt.title('Error')
    plt.scatter(Y[:, 0].detach().cpu().numpy(), Y[:, 1].detach().cpu().numpy(), c = np.abs(pred.detach().cpu().numpy() - GT[:,0:1]), s = 0.1, cmap = 'jet')
    plt.axis('scaled')
    plt.tight_layout()
    plt.show()

6.4. Fast Computational Accelerator¶

We have demonstrated PI-DeepONet using the flow around a cylinder example. For an unseen viscosity value, the trained model can infer and predict the flow velocity field and pressure field.

This framework serves as a fast computational accelerator (or an AI-based surrogate model), significantly reducing the computational cost compared to traditional numerical simulations. The results below illustrate the predicted velocity and pressure fields across varying viscosity values in near real-time.

GIF 설명

%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')