AI for Mechanical Engineering: Manufacturing

By Dr. Bumsoo Park
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST

# 1. Tool Wear Prediction

• Taylor's equation for tool wear:

$$VT^n = C,$$
• where
• $V$: Cutting speed
• $T$: Tool life
• $n, C$: Empirical constants
• A typical problem involves predicting/evaluating the remaining tool life, given conditions ($V, n, C$)
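For instance, evaluating tool life at a given cutting speed is a direct rearrangement of Taylor's equation (a minimal sketch; the values of $n$ and $C$ below are assumptions for illustration, not from any dataset):

```python
# Taylor's equation: V * T**n = C  =>  T = (C / V)**(1 / n)
# n and C below are assumed values for illustration only.
def tool_life(V, n=0.25, C=300.0):
    """Predicted tool life T (min) at cutting speed V (m/min)."""
    return (C / V) ** (1.0 / n)

print(tool_life(100.0))  # (300/100)**4 = 81.0 min
```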

## 1.1 Prediction with Data

• What if we are only given data?

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy
import sklearn

In [ ]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive

In [ ]:
# Load the .npy file (file path below is an assumption)
data = np.load('/content/drive/MyDrive/tool_wear.npy')

# Convert the numpy array to a DataFrame
df_loaded = pd.DataFrame(data, columns=['Cutting Speed (V)', 'Tool Life (T)'])

In [ ]:
# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Data')
plt.xlabel('Cutting Speed (m/min)')
plt.ylabel('Tool Life (min)')
plt.legend()
plt.show()


### Option 1

• We know from $VT^n=C$ that $V$ and $T$ follow a power-law relationship
• Using this information, we first linearize by working in log-log scale

\begin{align*} \log{(VT^n)} &= \log{C}, \\ \log{(V)}+\log{(T^n)} &= \log{C}, \\ \log{(V)}+n\log{(T)} &= \log{C}, \\ \log{(T)} &= -\frac{1}{n}\log{(V)} + \frac{1}{n}\log{C} \end{align*}
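A quick numerical sanity check of this linearization, on synthetic noise-free data generated with assumed values of $n$ and $C$:

```python
import numpy as np

# Generate noise-free data from Taylor's equation with assumed parameters.
n_true, C_true = 0.25, 300.0
V = np.linspace(50, 150, 20)
T = (C_true / V) ** (1 / n_true)

# In log-log space, log T = -(1/n) log V + (1/n) log C, so a straight-line
# fit should recover the parameters (up to floating-point error).
slope, intercept = np.polyfit(np.log(V), np.log(T), 1)
n_rec = -1 / slope
C_rec = np.exp(intercept * n_rec)
print(n_rec, C_rec)  # ~0.25, ~300.0
```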

In [ ]:
df_loaded['log_V'] = np.log(df_loaded['Cutting Speed (V)'])
df_loaded['log_T'] = np.log(df_loaded['Tool Life (T)'])

# Plotting the original data and the linearized data
plt.figure(figsize=(12, 5))

# Original data
plt.subplot(1, 2, 1)
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'])
plt.xlabel('Cutting Speed')
plt.ylabel('Tool Life')
plt.title('Original Data')

# Linearized data
plt.subplot(1, 2, 2)
plt.scatter(df_loaded['log_V'], df_loaded['log_T'])
plt.xlabel('log(Cutting Speed)')
plt.ylabel('log(Tool Life)')
plt.title('Linearized Data')

plt.show()

• Next perform linear regression
In [ ]:
# Perform linear regression on the log-transformed data
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(df_loaded['log_V'], df_loaded['log_T'])

# Calculate n and C from the slope and intercept
n_estimated = -1 / slope
C_estimated = np.exp(intercept * n_estimated)

# Plotting the original data and the linearized data
plt.figure(figsize=(14, 6))

# Original data
plt.subplot(1, 2, 1)
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'])
plt.xlabel('Cutting Speed (V)')
plt.ylabel('Tool Life (T)')
plt.title('Original Data')

# Linearized data with fitted line
plt.subplot(1, 2, 2)
plt.scatter(df_loaded['log_V'], df_loaded['log_T'], label='Data')
plt.plot(df_loaded['log_V'], slope * df_loaded['log_V'] + intercept, color='red', label='Fitted Line')
plt.xlabel('log(Cutting Speed (V))')
plt.ylabel('log(Tool Life (T))')
plt.title('Linearized Data')
plt.legend()

plt.show()

• Then transform back to linear space
In [ ]:
# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Data')

# Fitted line
cutting_speeds_fit = np.linspace(50, 150, 500)
tool_life_fit = (C_estimated / cutting_speeds_fit) ** (1 / n_estimated)
plt.plot(cutting_speeds_fit, tool_life_fit, color='red', label='Fitted Line')

plt.xlabel('Cutting Speed (V)')
plt.ylabel('Tool Life (T)')
plt.title('Fitted Line in Linear Space')
plt.legend()
plt.show()

print(f"Estimated n: {n_estimated}")
print(f"Estimated C: {C_estimated}")

Estimated n: 0.24571963417534542
Estimated C: 283.6716060346621


### Option 2

• Use SciPy's optimize.curve_fit() function for optimization
In [ ]:
# Define Taylor's Tool Wear Equation for fitting
def taylors_equation(V, n, C):
    return (C / V) ** (1 / n)

# Initial guess for the parameters
initial_guess = [0.5, 100]

# Perform curve fitting: inputs (function, x, y, initial guess)
popt, pcov = scipy.optimize.curve_fit(taylors_equation,
                                      df_loaded['Cutting Speed (V)'],
                                      df_loaded['Tool Life (T)'],
                                      p0=initial_guess)

# Extract the fitted parameters
n_fitted, C_fitted = popt

# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Data')

# Fitted line
cutting_speeds_fit = np.linspace(50, 150, 500)
tool_life_fit = (C_fitted / cutting_speeds_fit) ** (1 / n_fitted)
plt.plot(cutting_speeds_fit, tool_life_fit, color='red', label='Fitted Line')

plt.xlabel('Cutting Speed (m/min)')
plt.ylabel('Tool Life (min)')
plt.title('Fitted Line in Linear Space')
plt.legend()
plt.show()

print(f"Estimated n: {n_fitted}")
print(f"Estimated C: {C_fitted}")

Estimated n: 0.2541470894413146
Estimated C: 308.90136308585073


# 2. Rotary Machinery Signal Analysis

• Assume we have time series vibration data from a rotary machine

• 3 types of signals

In [ ]:
# Load the data from the .npy files (file paths below are assumptions)
inner = np.load('inner.npy')
outer = np.load('outer.npy')
normal = np.load('normal.npy')

inner = inner.reshape(1800, -1)
outer = outer.reshape(1800, -1)
normal = normal.reshape(1800, -1)

# Print the shapes or some basic information about the data to confirm it's loaded correctly
print("Inner Split Data Shape:", inner.shape)
print("Normal Split Data Shape:", normal.shape)
print("Outer Split Data Shape:", outer.shape)

Inner Split Data Shape: (1800, 527)
Normal Split Data Shape: (1800, 527)
Outer Split Data Shape: (1800, 527)

• Plot the signals
In [ ]:
classes = ['Normal', 'Inner_fault', 'Outer_fault']
data = [normal, inner, outer]

for state, name in zip(data, classes):
    print('{} data shape: {}'.format(name, state.shape))

plt.figure(figsize = (9, 6))
for i in range(3):
    plt.subplot(3, 1, i + 1)
    plt.title('Class: {}'.format(classes[i]), fontsize = 15)
    plt.plot(data[i][0])
    plt.ylim([-4, 4])

plt.tight_layout()
plt.show()

Normal data shape: (1800, 527)
Inner_fault data shape: (1800, 527)
Outer_fault data shape: (1800, 527)


## 2.1. Time Series Classification

• Recall SVM

• Previously, the goal was to find the classification boundaries, given the features (data)
• For the data above, we first need to determine the features ourselves

• Types of features we can use for time series

• Basic features (e.g. mean, max, min, standard deviation)
• Derived features / features based on domain knowledge (e.g. Fourier transform, autocorrelation)
• Advanced statistical features (e.g. kurtosis, skewness)
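For instance, the higher-order statistical features can be computed with scipy.stats; in this sketch the random `signals` array simply stands in for the vibration windows:

```python
import numpy as np
from scipy.stats import kurtosis, skew

# Stand-in batch of signals: 5 windows of 527 samples each.
rng = np.random.default_rng(0)
signals = rng.normal(size=(5, 527))

# One row of features per signal: mean, std, skewness, kurtosis.
features = np.column_stack([
    signals.mean(axis=1),
    signals.std(axis=1),
    skew(signals, axis=1),       # asymmetry of the amplitude distribution
    kurtosis(signals, axis=1),   # impulsive bearing faults tend to raise kurtosis
])
print(features.shape)  # (5, 4)
```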

### Basic Feature Selection

• We will start by using the mean and median to represent each signal
• $n=2$ features ($\mathbb{R}^{527} \to \mathbb{R}^{2}$)
In [ ]:
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

In [ ]:
all_data = np.vstack((normal, inner, outer))
all_data = all_data / np.max(all_data)

In [ ]:
def simple_features(trainX):
    n_data = len(trainX)
    reduced = np.zeros((n_data, 2))
    for i in range(n_data):
        reduced[i, 0] = np.median(trainX[i, :])
        reduced[i, 1] = np.mean(trainX[i, :])

    return reduced

x = simple_features(all_data)
y = [0]*len(normal) + [1]*len(inner) + [2]*len(outer)
y = np.array(y)[:, np.newaxis]

scaler = StandardScaler()
train_X_scaled = scaler.fit_transform(x)
train_x, test_x, train_y, test_y = train_test_split(train_X_scaled, y,
                                                    test_size = 0.2,
                                                    random_state = 42)

In [ ]:

# Train SVM classifier
svm_classifier = SVC(kernel='linear')
svm_classifier.fit(train_x, train_y.ravel())

# Predict on the test set
pred_y = svm_classifier.predict(test_x)

# Evaluate the classifier
print("Accuracy:", accuracy_score(test_y, pred_y))
print("Classification Report:")
print(classification_report(test_y, pred_y))

# Plotting decision boundaries
plt.figure(figsize=(10, 6))
plt.scatter(test_x[:, 0], test_x[:, 1], c=test_y.ravel(), s=30, cmap=plt.cm.Paired, edgecolors='k', label='Test Points')

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = test_x[:, 0].min() - 1, test_x[:, 0].max() + 1
y_min, y_max = test_x[:, 1].min() - 1, test_x[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                     np.arange(y_min, y_max, 0.02))

Z = svm_classifier.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)

plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('SVM Decision Boundary with Test Points')
plt.legend()
plt.show()

Accuracy: 0.7814814814814814
Classification Report:
              precision    recall  f1-score   support

           0       0.71      0.72      0.71       390
           1       0.76      0.73      0.74       341
           2       0.89      0.90      0.89       349

    accuracy                           0.78      1080
   macro avg       0.78      0.78      0.78      1080
weighted avg       0.78      0.78      0.78      1080



### Feature Selection with Domain Knowledge (Frequency Domain)

• $n=2$ features ($\mathbb{R}^{527} \to \mathbb{R}^{2}$)

• The idea is to inspect the dominant frequencies of each signal (via the Fourier transform)

In [ ]:
# Example transformation (visualization): one signal from each class
signal1 = all_data[0, :]      # normal
signal2 = all_data[1800, :]   # inner fault
signal3 = all_data[3600, :]   # outer fault

# Compute FFT
fft_signal1 = np.fft.fft(signal1)
fft_signal2 = np.fft.fft(signal2)
fft_signal3 = np.fft.fft(signal3)

# Compute frequencies
freqs = np.fft.fftfreq(len(signal1))

# Plot the FFT results
fig, axs = plt.subplots(3, 1, figsize=(9, 6))

axs[0].plot(freqs, np.abs(fft_signal1))
axs[0].set_title('FFT of Signal 1')
axs[0].set_xlabel('Frequency')
axs[0].set_ylabel('Magnitude')

axs[1].plot(freqs, np.abs(fft_signal2))
axs[1].set_title('FFT of Signal 2')
axs[1].set_xlabel('Frequency')
axs[1].set_ylabel('Magnitude')

axs[2].plot(freqs, np.abs(fft_signal3))
axs[2].set_title('FFT of Signal 3')
axs[2].set_xlabel('Frequency')
axs[2].set_ylabel('Magnitude')

plt.tight_layout()
plt.show()
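Following the same recipe as the mean/median features above, one way to turn each spectrum into a 2-dimensional feature vector is to keep the dominant frequency bin and its magnitude. This is a sketch on synthetic demo signals, not the notebook's vibration data:

```python
import numpy as np

def fft_features(signals):
    """Reduce each row (signal) to [dominant frequency bin, its magnitude]."""
    spectra = np.abs(np.fft.rfft(signals, axis=1))
    spectra[:, 0] = 0.0                    # ignore the DC component
    dominant_bin = spectra.argmax(axis=1)  # index of the strongest frequency
    dominant_mag = spectra.max(axis=1)
    return np.column_stack([dominant_bin, dominant_mag])

# Demo on two synthetic signals with 5 and 40 cycles over 527 samples.
t = np.arange(527)
demo = np.vstack([np.sin(2 * np.pi * 5 * t / 527),
                  np.sin(2 * np.pi * 40 * t / 527)])
print(fft_features(demo)[:, 0])  # dominant bins: 5 and 40
```

The resulting feature matrix can be fed to the same SVM pipeline (scaling, train/test split, `SVC`) used for the mean/median features.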