AI for Mechanical Engineering: Manufacturing
By Dr. Bumsoo Park
http://iailab.kaist.ac.kr/
Industrial AI Lab at KAIST
1. Tool Wear Prediction
- Taylor's equation for tool wear:
$$VT^n = C, $$
- where
- $V$: Cutting speed
- $T$: Tool life
- $n, C$: Empirical constants
- A typical problem is to predict or evaluate the remaining tool life, given the conditions ($V$, $n$, $C$)
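- As a quick sanity check, here is a minimal sketch of that computation; the values $n = 0.25$, $C = 300$ and the query speed are hypothetical, chosen only for illustration
In [ ]:
# Hypothetical constants for illustration only (not from any dataset)
n, C = 0.25, 300.0
V = 120.0                  # cutting speed (m/min)
T = (C / V) ** (1 / n)     # from V T^n = C  =>  T = (C / V)^(1/n)
print(f"Predicted tool life at V = {V} m/min: {T:.1f} min")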
1.1 Prediction with Data
- What if we are only given data?
Download data here
In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats
import scipy.optimize
import sklearn
In [ ]:
from google.colab import drive
drive.mount('/content/drive')
In [ ]:
# Load the .npy file
data_loaded = np.load('/content/drive/MyDrive/AIME_Manufacturing/tool_wear_data.npy')
# Convert the numpy array to a DataFrame
df_loaded = pd.DataFrame(data_loaded, columns=['Cutting Speed (V)', 'Tool Life (T)'])
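- If the Drive file is unavailable, a rough stand-in can be synthesized instead; the parameters $n = 0.5$, $C = 350$ and the noise level below are arbitrary assumptions, used only so the rest of the notebook runs
In [ ]:
# Optional fallback: synthesize Taylor-like data (hypothetical n, C, and noise level)
# Uncomment to use in place of the Drive file:
# rng = np.random.default_rng(0)
# V_syn = rng.uniform(50, 150, 100)
# T_syn = (350.0 / V_syn) ** (1 / 0.5) * rng.lognormal(0, 0.05, 100)
# df_loaded = pd.DataFrame({'Cutting Speed (V)': V_syn, 'Tool Life (T)': T_syn})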
- Visualize loaded data
In [ ]:
# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Experimental Data')
plt.xlabel('Cutting Speed (m/min)')
plt.ylabel('Tool Life (min)')
plt.title('Loaded Data')
plt.legend()
plt.show()
Option 1
- We know there is a power-law relationship between $V$ and $T$ (from $VT^n=C$)
- Using this information, we first linearize the relationship in log-log scale
$$ \begin{align*} \log{(VT^n)} &= \log{C}, \\ \log{(V)}+\log{(T^n)} &= \log{C}, \\ \log{(V)}+n\log{(T)} &= \log{C}, \\ \log{(T)} &= -\frac{1}{n}\log{(V)} + \frac{1}{n}\log{C} \end{align*} $$
In [ ]:
df_loaded['log_V'] = np.log(df_loaded['Cutting Speed (V)'])
df_loaded['log_T'] = np.log(df_loaded['Tool Life (T)'])
# Plotting the original data and the linearized data
plt.figure(figsize=(12, 5))
# Original data
plt.subplot(1, 2, 1)
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'])
plt.xlabel('Cutting Speed')
plt.ylabel('Tool Life')
plt.title('Original Data')
# Linearized data
plt.subplot(1, 2, 2)
plt.scatter(df_loaded['log_V'], df_loaded['log_T'])
plt.xlabel('log(Cutting Speed)')
plt.ylabel('log(Tool Life)')
plt.title('Linearized Data')
plt.show()
- Next, perform linear regression on the log-transformed data
In [ ]:
# Perform linear regression on the log-transformed data
slope, intercept, r_value, p_value, std_err = scipy.stats.linregress(df_loaded['log_V'], df_loaded['log_T'])
# Calculate n and C from the slope and intercept
n_estimated = -1 / slope
C_estimated = np.exp(intercept * n_estimated)
# Plotting the original data and the linearized data
plt.figure(figsize=(14, 6))
# Original data
plt.subplot(1, 2, 1)
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'])
plt.xlabel('Cutting Speed (V)')
plt.ylabel('Tool Life (T)')
plt.title('Original Data')
# Linearized data
plt.subplot(1, 2, 2)
plt.scatter(df_loaded['log_V'], df_loaded['log_T'])
plt.plot(df_loaded['log_V'], slope * df_loaded['log_V'] + intercept, color='red', label='Fitted Line')
plt.xlabel('log(Cutting Speed (V))')
plt.ylabel('log(Tool Life (T))')
plt.title('Linearized Data')
plt.legend()
plt.show()
- Then transform back to linear space
In [ ]:
# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Experimental Data')
# Fitted line
cutting_speeds_fit = np.linspace(50, 150, 500)
tool_life_fit = (C_estimated / cutting_speeds_fit) ** (1 / n_estimated)
plt.plot(cutting_speeds_fit, tool_life_fit, color='red', label='Fitted Line')
plt.xlabel('Cutting Speed (V)')
plt.ylabel('Tool Life (T)')
plt.title('Fitted Line in Linear Space')
plt.legend()
plt.show()
print(f"Estimated n: {n_estimated}")
print(f"Estimated C: {C_estimated}")
Option 2
- Use SciPy's scipy.optimize.curve_fit() to fit $n$ and $C$ directly in linear space
In [ ]:
# Define Taylor's Tool Wear Equation for fitting
def taylors_equation(V, n, C):
    return (C / V) ** (1 / n)
# Initial guess for the parameters
initial_guess = [0.5, 100]
# Perform curve fitting: inputs(function, x, y, initial guess)
popt, pcov = scipy.optimize.curve_fit(taylors_equation, df_loaded['Cutting Speed (V)'],
df_loaded['Tool Life (T)'], p0=initial_guess)
# Extract the fitted parameters
n_fitted, C_fitted = popt
# Original data
plt.scatter(df_loaded['Cutting Speed (V)'], df_loaded['Tool Life (T)'], label='Experimental Data')
# Fitted line
cutting_speeds_fit = np.linspace(50, 150, 500)
tool_life_fit = (C_fitted / cutting_speeds_fit) ** (1 / n_fitted)
plt.plot(cutting_speeds_fit, tool_life_fit, color='red', label='Fitted Line')
plt.xlabel('Cutting Speed (m/min)')
plt.ylabel('Tool Life (min)')
plt.title('Fitted Line in Linear Space')
plt.legend()
plt.show()
print(f"Estimated n: {n_fitted}")
print(f"Estimated C: {C_fitted}")
2. Rotary Machinery Signal Analysis
- Assume we have time series vibration data from a rotary machine
(data from the CWRU Bearing Data Center: https://engineering.case.edu/bearingdatacenter/download-data-file)
In [ ]:
# Load the data from the .npy files
inner = np.load('/content/drive/MyDrive/AIME_Manufacturing/inner_split.npy')
inner = inner.reshape(1800, -1)
outer = np.load('/content/drive/MyDrive/AIME_Manufacturing/outer_split.npy')
outer = outer.reshape(1800, -1)
normal = np.load('/content/drive/MyDrive/AIME_Manufacturing/normal_split.npy')
normal = normal.reshape(1800, -1)
# Print the shapes to confirm the data loaded correctly
print("Inner Split Data Shape:", inner.shape)
print("Normal Split Data Shape:", normal.shape)
print("Outer Split Data Shape:", outer.shape)
- Plot the signals
In [ ]:
classes = ['Normal', 'Inner_fault', 'Outer_fault']
data = [normal, inner, outer]
for state, name in zip(data, classes):
    print('{} data shape: {}'.format(name, state.shape))
plt.figure(figsize = (9, 6))
for i in range(3):
    plt.subplot(3, 1, i+1)
    plt.title('Class: {}'.format(classes[i]), fontsize=15)
    plt.plot(data[i][0])
    plt.ylim([-4, 4])
    plt.tight_layout()
plt.show()
2.1. Time Series Classification
Recall SVM
- Previously, the goal was to find the classification boundaries, given the features (data)
For the raw signals above, we first need to extract the features ourselves
Types of features we can use for time series
- Basic features (e.g. mean, max, min, standard deviation)
- Derived features / features based on domain knowledge (e.g. Fourier transform, autocorrelation)
- Advanced statistical features (e.g. kurtosis, skewness)
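- As a minimal sketch, here is one feature from each category computed on a single signal; the specific feature choices are illustrative assumptions, not prescriptions
In [ ]:
# One example feature from each category, computed on the first normal signal
sig = normal[0]
basic = {'mean': np.mean(sig), 'std': np.std(sig)}            # basic statistics
fft_mag = np.abs(np.fft.rfft(sig))
derived = {'dominant_bin': int(np.argmax(fft_mag[1:]) + 1)}   # domain knowledge (skip the DC bin)
advanced = {'kurtosis': scipy.stats.kurtosis(sig),
            'skewness': scipy.stats.skew(sig)}                # higher-order statistics
print(basic, derived, advanced)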
Basic Feature Selection
- We will start by using the mean and median to represent each signal
- $n=2$ features ($\mathbb{R}^{500} \rightarrow \mathbb{R}^{2}$)
In [ ]:
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
In [ ]:
all_data = np.vstack((normal, inner, outer))
all_data = all_data / np.max(all_data)  # normalize by the global maximum
In [ ]:
# Reduce each 500-sample signal to two features: its median and its mean
def simple_features(trainX):
    n_data = len(trainX)
    reduced = np.zeros((n_data, 2))
    for i in range(n_data):
        reduced[i, 0] = np.median(trainX[i, :])
        reduced[i, 1] = np.mean(trainX[i, :])
    return reduced
x = simple_features(all_data)
y = [0]*len(normal) + [1]*len(inner) + [2]*len(outer)
y = np.array(y)[:, np.newaxis]
scaler = StandardScaler()
train_X_scaled = scaler.fit_transform(x)
train_x, test_x, train_y, test_y = train_test_split(train_X_scaled,
y,
test_size = 0.2,
random_state = 42)
In [ ]:
# Train SVM classifier
svm_classifier = SVC(kernel='linear')  # the decision-boundary plot below only needs predict()
svm_classifier.fit(train_x, train_y.ravel())
# Predict on the test set
pred_y = svm_classifier.predict(test_x)
# Evaluate the classifier
print("Accuracy:", accuracy_score(test_y, pred_y))
print("Classification Report:")
print(classification_report(test_y, pred_y))
# Plotting decision boundaries
plt.figure(figsize=(10, 6))
plt.scatter(test_x[:, 0], test_x[:, 1], c=test_y.ravel(), s=30, cmap=plt.cm.Paired, edgecolors='k', label='Test Points')
# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = test_x[:, 0].min() - 1, test_x[:, 0].max() + 1
y_min, y_max = test_x[:, 1].min() - 1, test_x[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
np.arange(y_min, y_max, 0.02))
Z = svm_classifier.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, alpha=0.3, cmap=plt.cm.Paired)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('SVM Decision Boundary with Test Points')
plt.legend()
plt.show()
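- If the linear kernel separates these basic features poorly, swapping in an RBF kernel is a one-line variation; hyperparameters are left at scikit-learn defaults here, which is an arbitrary choice
In [ ]:
# One-line variation: RBF kernel on the same two features
svm_rbf = SVC(kernel='rbf')
svm_rbf.fit(train_x, train_y.ravel())
print("RBF accuracy:", accuracy_score(test_y, svm_rbf.predict(test_x)))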
Feature Selection with Domain Knowledge (Frequency Domain)
- $n=2$ features ($\mathbb{R}^{500} \rightarrow \mathbb{R}^{2}$)
- The idea is to inspect the dominant frequencies of each signal (via the Fourier transform)
In [ ]:
# Example transformation (visualization)
signal1 = all_data[0, :]     # first Normal sample
signal2 = all_data[1800, :]  # first Inner_fault sample
signal3 = all_data[3600, :]  # first Outer_fault sample
# Compute FFT
fft_signal1 = np.fft.fft(signal1)
fft_signal2 = np.fft.fft(signal2)
fft_signal3 = np.fft.fft(signal3)
# Compute the (normalized) frequency bins; no sampling rate is specified here
freqs = np.fft.fftfreq(len(signal1))
# Plot the FFT results
fig, axs = plt.subplots(3, 1, figsize=(9, 6))
for i, fft_sig in enumerate([fft_signal1, fft_signal2, fft_signal3]):
    axs[i].plot(freqs, np.abs(fft_sig))
    axs[i].set_title('FFT of Signal {}'.format(i + 1))
    axs[i].set_xlabel('Frequency')
    axs[i].set_ylabel('Magnitude')
plt.tight_layout()
plt.show()
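- Following the same recipe as before, here is a sketch of how two frequency-domain features could feed the SVM pipeline; using the dominant frequency bin and its magnitude as the two features is an assumption, not the only option
In [ ]:
# Two frequency-domain features per signal: dominant frequency bin (excluding DC)
# and the magnitude at that bin
def freq_features(X):
    n_data = len(X)
    reduced = np.zeros((n_data, 2))
    for i in range(n_data):
        mag = np.abs(np.fft.rfft(X[i, :]))
        k = np.argmax(mag[1:]) + 1          # skip the DC component
        reduced[i, 0] = k
        reduced[i, 1] = mag[k]
    return reduced

x_freq = freq_features(all_data)
x_freq_scaled = StandardScaler().fit_transform(x_freq)
train_xf, test_xf, train_yf, test_yf = train_test_split(x_freq_scaled, y,
                                                        test_size=0.2, random_state=42)
svm_freq = SVC(kernel='linear')
svm_freq.fit(train_xf, train_yf.ravel())
print("Frequency-feature accuracy:", accuracy_score(test_yf, svm_freq.predict(test_xf)))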