Data

A simple API for importing and preparing data for use. It mostly manipulates NumPy arrays to generate profiles and sections.

We treat 2D arrays as rasters: load any .csv, .txt, or other file into a NumPy array as you normally would. Each entry should be the height value for its respective pixel.

Let’s try out the simplest method: a text file containing an (M,N) array compatible with NumPy. If you’d like to try your own data, simply change the file below and the loading function (e.g. if you have a .csv, just change the delimiter in the np.loadtxt() call).
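
For example, if your data lived in a comma-separated file (the filename here is hypothetical), the call would become:

image = np.loadtxt('my_surface.csv', delimiter=',')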

file = 'BYGS008_top_segment_500samp_10cm_interp089.txt'
image = np.loadtxt(file)

Let’s have a look at the image:

plt.imshow(image)
plt.show()

image.shape
(501, 501)

It can be very useful to study how roughness parameters change with respect to their orientation. The following function helps produce a range of profiles rotated around the central point of the image or array.


source

gen_rot_prof

 gen_rot_prof (array, deg=180, increment=1)

Generates an array of rotational profiles from 0 through deg, in even increments of increment. Uses OpenCV and Imutils to rotate the array around the center of the array/raster/image, and extracts the middle row of each rotation.

Type Default Details
array 2D array of height values
deg int 180 Number of degrees to rotate through; i.e. 180 gives full 360° coverage, since each profile spans the whole diameter
increment int 1 deg/increment = number of evenly spaced profiles to calculate.
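
As a rough sketch of the idea, here is a minimal version using scipy.ndimage.rotate in place of the OpenCV/Imutils rotation the library actually uses; rot_prof_sketch is a hypothetical stand-in, not the library function:

import numpy as np
from scipy.ndimage import rotate

def rot_prof_sketch(array, deg=180, increment=1):
    # Rotate the raster about its center and pull out the middle row each time.
    # 180 degrees covers all orientations, since each profile spans the full diameter.
    profiles = []
    for angle in np.arange(0, deg, increment):
        rotated = rotate(array, angle, reshape=False, mode='nearest')
        profiles.append(rotated[rotated.shape[0] // 2])
    return np.stack(profiles)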

NumPy often wants the data in different forms for linear algebra; here is a helper to convert an (M,N) matrix into an (n,3) matrix of (X,Y,Z) coordinates.


source

image2xyz

 image2xyz (im)

Converts a 2D (M,N) image/array to XYZ coordinates. Used for plane levelling.


source

xyz2image

 xyz2image (xyz)

Helper to convert back from XYZ (n,3) arrays to (M,N) images/matrices.

Details
xyz (n,3) shape array
im_xyz = image2xyz(image)
im_xyz[:5]
array([[ 0.        ,  0.        , -0.89188266],
       [ 1.        ,  0.        , -0.8919338 ],
       [ 2.        ,  0.        , -0.89193225],
       [ 3.        ,  0.        , -0.89193505],
       [ 4.        ,  0.        , -0.89192402]])
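
For reference, a minimal sketch of what this pair of conversions might look like (the sketches are hypothetical; the library's own implementations may differ):

import numpy as np

def image2xyz_sketch(im):
    # Column and row indices become X and Y; heights become Z.
    ys, xs = np.indices(im.shape)
    return np.column_stack([xs.ravel(), ys.ravel(), im.ravel()])

def xyz2image_sketch(xyz):
    # Invert the flattening: place each Z back at its (Y, X) index.
    xs, ys = xyz[:, 0].astype(int), xyz[:, 1].astype(int)
    im = np.empty((ys.max() + 1, xs.max() + 1))
    im[ys, xs] = xyz[:, 2]
    return im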

Levelling and Form Removal

In order to perform roughness calculations it is recommended to level the data and remove the underlying form. This produces an S-F surface from the primary surface, in standards terms. Because surfaces are always digitized and discretized in some way, the actual surface has to be modelled using some function. ISO software standards recommend using a bicubic spline to remove the form. Because the chosen function is an assumption, the user should choose it based on their scientific knowledge of the surface and the goals of their research; multiple functions can be tested and the results compared. Here I provide a least-squares solution to the problem, computing the fitted form in the same shape as the original image and subtracting it.

With the underlying form modelled, the function can be sampled to generate a larger number of samples.


source

remove_form

 remove_form (im, degree=3, return_form=False)

Remove the form of the raster by fitting a polynomial of specified degree and subtracting it.

Type Default Details
im 2D Numpy array or array like
degree int 3 Polynomial degree to remove
return_form bool False Return the form/computed polynomial values instead of removing them from im
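
Conceptually, the least-squares fit might look something like the following sketch (remove_form_sketch is hypothetical; the library's implementation may differ in detail):

import numpy as np

def remove_form_sketch(im, degree=3, return_form=False):
    # Fit a 2D polynomial surface by least squares, then subtract it.
    ys, xs = np.indices(im.shape)
    x, y, z = xs.ravel(), ys.ravel(), im.ravel()
    # Design matrix with every term x**i * y**j for i + j <= degree.
    A = np.column_stack([x**i * y**j
                         for i in range(degree + 1)
                         for j in range(degree + 1 - i)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    form = (A @ coeffs).reshape(im.shape)
    return form if return_form else im - form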

source

plane_level

 plane_level (im, norm=True, return_form=False)

Level an (M,N) array by computing the best-fit plane and subtracting it. Thin wrapper around remove_form with degree=1.

Type Default Details
im Numpy array or array like
norm bool True Normalize the data by subtracting the mean
return_form bool False
w = np.array([[1,1,1],[0,0,0],[-1,-1,-1]])
u = np.array([[1,0,-1]]*3)
test_close(plane_level(w), np.zeros(w.shape))
test_close(plane_level(u), np.zeros(u.shape))
test_fail(plane_level, kwargs = dict(im=np.array([1])))     # invalid (1D) input should raise
test_fail(plane_level, kwargs = dict(im=np.array([[1,1]]))) # as should a degenerate 2D input
fig = plt.figure()
ax = fig.add_subplot(1, 2, 1)
imgplot = plt.imshow(plane_level(image))
ax.set_title('Levelled image')
ax = fig.add_subplot(1, 2, 2)
imgplot = plt.imshow(plane_level(image, return_form = True))
ax.set_title('Levelling plane')
Text(0.5, 1.0, 'Levelling plane')

image_f = remove_form(plane_level(image))
image_form = remove_form(plane_level(image), return_form = True)

fig = plt.figure()
ax = fig.add_subplot(1, 2, 1)
imgplot = plt.imshow(image_f)
ax.set_title('Formless Image')
ax = fig.add_subplot(1, 2, 2)
imgplot = plt.imshow(image_form)
ax.set_title('Polynomial')
Text(0.5, 1.0, 'Polynomial')

Noise and smoothing

Similarly, it is recommended to remove noise and attenuate high-frequency features. We achieve this with a Gaussian filter.


source

smooth_image

 smooth_image (array, sigma=None, alpha=None, cutoff=None, axis=None,
               **kwargs)

*Removes high-frequency (short-wavelength) features (‘noise’) by applying a Gaussian filter to the image. Thin wrapper of scipy.ndimage.gaussian_filter.

If sigma, alpha, and cutoff are all None, sigma defaults to alpha * cutoff, with alpha = np.sqrt(np.log(2)/np.pi) and cutoff = 1.

If sigma is not None, sigma takes priority over any alpha or cutoff provided.

Refer to ISO 11562:1997 for the reasoning behind alpha and cutoff.*

Type Default Details
array Numpy array or array like
sigma NoneType None Standard deviation for the Gaussian kernel. Useful for determining the wavelength of the low-pass filter.
alpha NoneType None Used in gaussian weighting function, defaults to np.sqrt(np.log(2)/np.pi)
cutoff NoneType None Cutoff wavelength, defaults to 1
axis NoneType None Axis along which to apply filter
kwargs
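
A sketch of how sigma might be derived from alpha and cutoff before handing off to scipy, assuming the defaults and priority rules described above (smooth_sketch is hypothetical):

import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_filter1d

def smooth_sketch(array, sigma=None, alpha=None, cutoff=None, axis=None, **kwargs):
    # Derive sigma from the Gaussian weighting parameters when it isn't given.
    if sigma is None:
        alpha = np.sqrt(np.log(2) / np.pi) if alpha is None else alpha
        cutoff = 1 if cutoff is None else cutoff
        sigma = alpha * cutoff
    if axis is not None:
        # Filter along a single axis only.
        return gaussian_filter1d(array, sigma, axis=axis, **kwargs)
    return gaussian_filter(array, sigma, **kwargs)
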
image_f_s = smooth_image(image_f,cutoff=1)

fig = plt.figure()
ax = fig.add_subplot(1, 2, 1)
imgplot = plt.imshow(image_f_s)
ax.set_title('Cutoff = 1')
ax = fig.add_subplot(1, 2, 2)
imgplot = plt.imshow(smooth_image(image_f,cutoff=10))
ax.set_title('Cutoff = 10')
Text(0.5, 1.0, 'Cutoff = 10')

Sections

It can be useful to study subsections of surfaces. The following helpers assist with this process. Otherwise, normal manipulation of numpy arrays is always possible.


source

gen_sections

 gen_sections (image, how='square', number=100)

Generates sections of the array/image, either in square, horizontal, or vertical sections. Useful for studying the change of parameters over the surface. Mostly wraps np.hsplit and np.vsplit. Note: if ‘number’ does not divide the array evenly, the remainder at the bottom/side will not be included.

Type Default Details
image 2D array (or arraylike) of height values
how str square How to subdivide the array, options are: ‘square’, ‘row’, ‘column’
number int 100 Number of sections to produce
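
A rough sketch of the splitting logic (gen_sections_sketch is hypothetical, and the 'square' branch assumes number is a perfect square):

import numpy as np

def gen_sections_sketch(image, how='square', number=100):
    if how == 'row':
        n = (image.shape[0] // number) * number  # trim so it divides evenly
        return np.array(np.vsplit(image[:n], number))
    if how == 'column':
        n = (image.shape[1] // number) * number
        return np.array(np.hsplit(image[:, :n], number))
    # 'square': split into a per_side x per_side grid of blocks.
    per_side = int(np.sqrt(number))
    m = (image.shape[0] // per_side) * per_side
    n = (image.shape[1] // per_side) * per_side
    strips = np.vsplit(image[:m, :n], per_side)
    return np.array([block for strip in strips for block in np.hsplit(strip, per_side)])
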
a_10000 = np.arange(100*100).reshape(100,100)
test_eq(gen_sections(a_10000)[0],a_10000[:10,:10])

a_523 = np.arange(523*523).reshape(523,523)
a_520 = np.arange(520*520).reshape(520,520)
test_eq(gen_sections(a_523).shape,gen_sections(a_520).shape)
test_sections = np.load('example_sections.npy')
test_sections.shape
(100, 50, 50)
image_sections =  gen_sections(image)

Now that we’ve applied all of our preprocessing steps to the original image, we can export it for later use. We should also save our profiles and sections. The sections must be saved in .npy format because they are 3D.

np.savetxt('example.txt', image_f_s)
np.savetxt('example_profiles.txt', gen_rot_prof(image_f_s))
np.save('example_sections.npy', image_sections)

And we can load them back in just to check.

profiles = np.loadtxt('example_profiles.txt')
plt.imshow(profiles)
plt.show()

Utilities

Various useful functions used elsewhere.


source

compute_parameters

 compute_parameters (array, parameter_list:list, valid_module=None,
                     to_df:bool=False, **kwargs)

Computes a set of parameters for a given array. Provide a list of parameters (as strings of their respective function names, e.g. [‘Ra’,‘Rms’]) and a module to verify against (this might require some module aliasing; see the CLI notebook for example use). Returns a list of results or a dataframe.

Type Default Details
array Input array to calculate parameters on
parameter_list list List of parameters to calculate as strings
valid_module NoneType None Module to generate functions from; used to check user input. See rough.cli:rough
to_df bool False Return the parameters as a pandas dataframe, with columns set as the parameter names
kwargs
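
In spirit, the string-to-function lookup works something like this sketch (compute_parameters_sketch is hypothetical and assumes the module is always supplied):

import pandas as pd

def compute_parameters_sketch(array, parameter_list, valid_module, to_df=False, **kwargs):
    # Verify each requested name exists on the module before calling anything.
    for name in parameter_list:
        if not hasattr(valid_module, name):
            raise ValueError(f'{name!r} is not defined in {valid_module.__name__}')
    results = [getattr(valid_module, name)(array, **kwargs) for name in parameter_list]
    # Optionally wrap the results in a one-row dataframe keyed by parameter name.
    return pd.DataFrame([results], columns=parameter_list) if to_df else results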

source

distance_matrix

 distance_matrix (shape:tuple, center:(int,int)=None, sections=False)

*Returns an (m,n) matrix containing distance values from the center coordinates.

If sections=True, returns (x,m,n), where x is the number of input sections.*

Type Default Details
shape tuple Shape of array, used to calculate center if not given
center (int, int) None Central point from which to calculate distances; if None, defaults to (x//2, y//2)
sections bool False If True, takes the first element of shape as the number of stack in image
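
The core computation is sketched below for the plain 2D case (distance_matrix_sketch is hypothetical; the sections=True stacking is omitted):

import numpy as np

def distance_matrix_sketch(shape, center=None):
    # Euclidean distance of every pixel from the center (or a supplied point).
    m, n = shape
    cy, cx = (m // 2, n // 2) if center is None else center
    ys, xs = np.indices((m, n))
    return np.hypot(ys - cy, xs - cx)
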
a25 = np.arange(25).reshape(5,5)
a25.shape
(5, 5)
a25t = np.tile(a25,(5,5,5))
test_eq(a25t[0],a25t[4])
test_shape = (101,101)
#eps = 1e-05
test_eq(distance_matrix(test_shape), np.rot90(distance_matrix(test_shape)))
test_eq(distance_matrix(test_shape), np.flipud(distance_matrix(test_shape)))
test_eq(distance_matrix(test_shape), np.fliplr(distance_matrix(test_shape)))

test_ne(distance_matrix(test_shape), np.zeros(test_shape))
test_ne(distance_matrix(test_shape), np.ones(test_shape))

source

normalize

 normalize (im, axis=1, how='center', feature_range=None)

*Normalize the input array along a given axis. Typically used to ‘center’ rows/columns/areas in order to calculate parameters. how can be:

- ‘center’: subtract the mean from the array along the axis
- ‘l1’
- ‘l2’
- ‘standardize’: subtract the mean and divide by the standard deviation along the given axis
- ‘minmax’: rescale to lie within ‘feature_range’ (see use in Sal)

Mostly a reimplementation of scalers from sklearn with explicit formulation.*

Type Default Details
im Array or stack of array to normalize
axis int 1 Axis along which to normalize
how str center normalization method: ‘center’, ‘l1’, ‘l2’, ‘standardize’, ‘minmax’
feature_range NoneType None Tuple containing the feature range for minmax
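
An explicit sketch of three of the methods (normalize_sketch is hypothetical; the default feature_range of (-1, 1) is assumed from the tests below, and 'l1'/'l2' are omitted):

import numpy as np

def normalize_sketch(im, axis=1, how='center', feature_range=None):
    if how == 'center':
        # Subtract the mean along the axis.
        return im - np.mean(im, axis=axis, keepdims=True)
    if how == 'standardize':
        # Subtract the mean and divide by the standard deviation.
        centered = im - np.mean(im, axis=axis, keepdims=True)
        return centered / np.std(im, axis=axis, keepdims=True)
    if how == 'minmax':
        # Rescale linearly into feature_range.
        lo, hi = feature_range if feature_range is not None else (-1, 1)
        mn = np.min(im, axis=axis, keepdims=True)
        mx = np.max(im, axis=axis, keepdims=True)
        return (im - mn) / (mx - mn) * (hi - lo) + lo
    raise ValueError(f'unknown method {how!r}')
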
#For testing normalize
from sklearn.preprocessing import MinMaxScaler
a_400 = np.arange(-200,200).reshape(-1,1)
a_rand = np.random.randint(-1000,1000,1000).reshape(-1,1)
scaler = MinMaxScaler((-1,1))

test_eq((a_400 - np.mean(a_400,axis = 0,keepdims = True)), normalize(a_400, axis = 0, how = 'center'))
test_eq((a_rand - np.mean(a_rand,axis = 0, keepdims = True)), normalize(a_rand,axis = 0, how = 'center'))

test_close(scaler.fit_transform(a_400),  normalize(a_400,axis = 0, how = 'minmax'), eps = 1e-10)
test_close(scaler.fit_transform(a_rand), normalize(a_rand,axis = 0, how = 'minmax'), eps = 1e-10)
image_f_s.shape
(501, 501)
The remainder of this section explores the 2D autocorrelation of the processed surface, which underpins parameters such as Sal.

from scipy.signal import correlate
# Autocorrelate the processed surface with itself
cor_result = correlate(image_f_s, image_f_s, mode='same')
plt.imshow(cor_result)
plt.show()

correlate(np.ones((3,3)), np.ones((3,3)), mode = 'same')
array([[4., 6., 4.],
       [6., 9., 6.],
       [4., 6., 4.]])
rand_arr = np.random.rand(501,501)
# Autocorrelation of pure noise: a single sharp central peak
rand_cor = correlate(rand_arr, rand_arr, mode='same')
plt.imshow(rand_cor)
plt.show()

(np.amax(rand_cor),np.amin(rand_cor),np.ptp(rand_cor))
(83755.5666617089, 15717.948579544685, 68037.61808216422)
# Rescale the autocorrelation to the range [-1, 1]
rand_ncor = normalize(rand_cor, axis=None, how='minmax')
(np.amax(rand_ncor),np.amin(rand_ncor),np.ptp(rand_ncor))
(1.0, -1.0, 2.0)
plt.imshow(rand_ncor)
plt.show()

rand_dist = distance_matrix(rand_ncor.shape)
# Distance from the center to the nearest point where the normalized autocorrelation has decayed to 0.2
rand_where = np.where(rand_ncor <= 0.2, rand_dist, np.nan)
np.nanmin(rand_where)
36.76955262170047
x = np.linspace(0, 20)
y = np.sin(x)
# Broadcast the 1D sine to a 50x50 array: an idealized anisotropic surface
sin_wave = np.broadcast_to(y, (50,50))
plt.imshow(sin_wave)
plt.show()

plt.imshow(sin_wave.T)
plt.show()

sin_cor = correlate(sin_wave,sin_wave,mode='same')
plt.imshow(sin_cor)
plt.show()

ncor_sin = normalize(sin_cor, axis = None, how = 'minmax')
plt.imshow(ncor_sin)
plt.show()

ncor_sin_2 = np.where(ncor_sin <= 0.2, ncor_sin, np.nan)
plt.imshow(ncor_sin_2)
plt.show()

sint_cor = correlate(sin_wave.T,sin_wave.T,mode='same')
plt.imshow(sint_cor)
plt.show()

sin_cor[-1,-1]
-302.59161125378563
(np.ptp(cor_result),np.max(cor_result),np.min(cor_result))
(0.12309696498814524, 0.10140944660491391, -0.021687518383231318)