There are many potential applications of MCMC in machine learning. In this article, we will implement the Metropolis algorithm for the Ising model and apply it to an image segmentation problem: we define an Ising energy over the gray values of an image, run the Metropolis sampler on it, and read off a binary segmentation from the resulting samples.
Since it is the middle of a cold winter in New York, we will use an image of ski jumping as our example. Let's first read the image and convert it to grayscale.
from google.colab import drive
drive.mount('/content/drive')
import numpy as np
import random
import cv2
from google.colab.patches import cv2_imshow
# Read an image and convert it to a grey-value matrix
img_file = '/content/drive/My Drive/jump.jpg'
img = cv2.imread(img_file)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(img_gray.shape)
cv2_imshow(img_gray)
The Metropolis algorithm is a classical Markov chain Monte Carlo (MCMC) method. Since it is a must-have in any textbook covering this topic, we do not repeat its derivation here; the detailed implementation can be found in the function Metropolis4Ising_2d() below.
In the Ising model, we assign the following expression to the energy
$$ J(\xi) = -\frac{1}{2} \sum_{i\ne j} c_{i,j}\,\xi_i \xi_j - \theta \sum_i b_i \xi_i $$
where $\xi_i \in \{-1, 1\}$ is the spin (label) of pixel $i$, $c_{i,j}$ is the coupling coefficient, which we take to be 1 for neighboring pixels and 0 otherwise, $b_i$ is the external field given by the normalized gray value of pixel $i$, and $\theta$ weights the data term.
The Gibbs distribution with this energy at temperature $T$ is $$ \pi_T(\xi)=\frac{1}{Z_T} e^{-\frac{1}{T} J(\xi)} $$ where $Z_T$ is a normalizing constant chosen so that $\pi_T$ is a probability mass function.
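To connect this distribution to the implementation below, it helps to write out what a single Metropolis step needs. Under the assumption above that $c_{i,j}=1$ exactly for neighboring pixels, flipping the spin at site $i$ from $\xi_i$ to $-\xi_i$ changes the energy by
$$ \Delta J = 2\,\xi_i \Big( \sum_{j \in \mathcal{N}(i)} \xi_j + \theta\, b_i \Big), $$
where $\mathcal{N}(i)$ denotes the neighbors of $i$, so the proposed flip is accepted with probability $\min\big(1, e^{-\Delta J / T}\big)$. This is the quantity Metropolis4Ising_2d() evaluates at each step; neither the global energy nor the constant $Z_T$ is ever needed.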
# Metropolis algorithm for the Ising model
# Key points:
#   - A configuration in {-1, 1}^N is represented simply as a matrix with entries in {-1, 1}.
#   - At each step we only evaluate the energy change caused by flipping one site,
#     never the full energy of a configuration.
def selection_function(u):
    # Metropolis acceptance function: accept with probability min(1, u).
    return min(1, u)
def Metropolis4Ising_2d(data_2d, X0, n_steps, n_samples, T, theta):
    """
    A Metropolis algorithm for the 2D Ising model.
    [INPUT]:
    data_2d: A 2D numpy array.
        The external field b, e.g. gray values normalized to [-1, 1].
    X0: A 2D numpy array of the same shape as data_2d.
        The initial configuration; each entry must be in {-1, 1}.
    n_steps: integer.
        The number of simulation steps.
    n_samples: integer.
        The number of samples (from the last steps) to return.
    T: float.
        The temperature of the Gibbs distribution.
    theta: float.
        The weight of the external-field (data) term.
    """
    M, N = X0.shape
    X = np.copy(X0)
    random.seed(0)
    i_step = 0
    samples = []
    while i_step < n_steps:
        # Propose to flip the spin at a uniformly random site (ix, iy).
        ix, iy = random.randint(0, M - 1), random.randint(0, N - 1)
        U = random.random()
        # Sum the spins of the 4-connected neighbors that lie inside the grid.
        neighbors = [(ix + dx, iy) for dx in (-1, 1) if 0 <= ix + dx < M] \
                  + [(ix, iy + dy) for dy in (-1, 1) if 0 <= iy + dy < N]
        affinity_of_neighbors = sum(X[ixp, iyp] for ixp, iyp in neighbors)
        # Energy change caused by flipping X[ix, iy]:
        #   delta_J = 2 * xi_i * (sum of neighbor spins + theta * b_i)
        energy_diff = 2 * X[ix, iy] * (affinity_of_neighbors + theta * data_2d[ix, iy])
        pdf_ratio = np.exp(-energy_diff / T)
        h_val = selection_function(pdf_ratio)
        # Reverse (flip) the spin at (ix, iy) with probability h_val:
        if U < h_val:
            X[ix, iy] = -X[ix, iy]
        # Whether or not the flip is accepted, save a copy if this step is
        # within the last n_samples steps.
        if i_step >= n_steps - n_samples:
            samples.append(np.copy(X))
        i_step += 1
    return samples
# A simple test function:
def test_Metropolis():
    M, N = 3, 3
    data_2d = np.random.random((M, N)) * 2. - 1.
    X0 = np.zeros_like(data_2d)
    for i in range(M):
        for j in range(N):
            if data_2d[i, j] > 0:
                X0[i, j] = 1
            else:
                X0[i, j] = -1
    n_steps = 20
    n_samples = 2
    T = 100
    theta = 1.0
    samples = Metropolis4Ising_2d(data_2d, X0, n_steps, n_samples, T, theta)
    for i, s in enumerate(samples):
        print(f"The {i}-th sample:")
        print(s)
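To sanity-check the sampler, we can simply call the test on the tiny 3x3 grid and verify that the returned samples are matrices with entries in {-1, 1}:
test_Metropolis()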
There are a few hyper-parameter choices to make before running the sampler on the real image. The important ones include the temperature $T$, the weight $\theta$ of the data term, the number of simulation steps, and the initial configuration $X_0$, which we obtain by thresholding the normalized gray values.
# Normalize the gray values to [-1, 1] so they can serve as the external field b.
data_2d = np.copy(img_gray).astype(float)
cv2.normalize(img_gray.astype(float), data_2d, alpha=-1, beta=1, norm_type=cv2.NORM_MINMAX)
print("The min/max gray value:", np.amin(data_2d), np.amax(data_2d))
# Initialize the configuration by thresholding the normalized gray values;
# entries must be in {-1, 1}.
X0 = np.copy(data_2d)
X0[X0 > 0] = 1
X0[X0 <= 0] = -1
n_steps = 10000
n_samples = 2
T = 1
theta = 0.001
samples = Metropolis4Ising_2d(data_2d, X0, n_steps, n_samples, T, theta)
# Rescale a returned sample from {-1, 1} to [0, 255] for display.
Xtoshow = np.copy(X0)
cv2.normalize(samples[-2], Xtoshow, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX)
cv2_imshow(Xtoshow)
After 10K steps, the algorithm gives us the segmentation shown in the figure above. It captures most of the athlete except the pants, whose brightness is close to that of the background.
We also observe some noise points in the result. As $\theta$ or $T$ increases, the noise in the segmentation becomes more pronounced.
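To see this effect directly, we can rerun the sampler for a few settings of $T$ and $\theta$ and compare the resulting segmentations side by side. The loop below is only a minimal sketch of such a sweep; the particular values are illustrative, not tuned recommendations.
# Illustrative sweep over the temperature T and the data weight theta to
# visualize how the amount of noise in the segmentation changes.
for T_try in [0.5, 1, 5]:
    for theta_try in [0.001, 0.01, 0.1]:
        samples_try = Metropolis4Ising_2d(data_2d, X0, n_steps, n_samples, T_try, theta_try)
        Xtoshow = np.copy(X0)
        cv2.normalize(samples_try[-1], Xtoshow, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX)
        print(f"T = {T_try}, theta = {theta_try}")
        cv2_imshow(Xtoshow)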
In this article, we implemented a Metropolis algorithm for the Ising model and applied it to image segmentation. As with other machine learning models, there is no magic that can replace the role of a proper objective function (the Ising model in this case) in a successful application of MCMC.