Tutorial 4: Peak Matching

Peak matching relies on the mlgidMATCH package. First, create an mlgidBASE instance, then run detection and fitting:

from mlgidbase import mlgidBASE
filename = r'../../example/BA2PbI4.h5'
analysis = mlgidBASE(filename=filename)
analysis.run_detection()
analysis.run_fitting()
INFO - Loading model
INFO - Saved detected peaks to file: ../../example/BA2PbI4.h5, entry: entry_0000, frame: 0
INFO - Saved fitted peaks to file: ../../example/BA2PbI4.h5, entry: entry_0000, frame: 0

CIF preprocessing

CIF files must be preprocessed before they can be used for matching (see the full documentation):

import warnings
warnings.filterwarnings("ignore")

from mlgidmatch.preprocess.cif_preprocess import CifPattern
from pygidsim.experiment import ExpParameters

# path to the folder with CIF files
folder_path = '../../example/cifs/'

params = ExpParameters(q_xy_max=5, # maximum q_xy value (Å⁻¹)
                       q_z_max=5,  # maximum q_z value (Å⁻¹)
                       en=24000)   # X-ray beam energy (eV)

cif_prepr = CifPattern(
    params=params,
    folder_path=folder_path,
    create_all=True
)

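A short note on the en argument used above: a photon energy in eV maps to a wavelength through λ = hc / E. The sketch below works this out for 24000 eV with plain Python. The helper name energy_to_wavelength and the constant name are illustrative only and are not part of pygidsim; how ExpParameters performs the conversion internally is not shown here.

```python
import math

# h*c expressed in eV·Å (rounded value of the physical constant).
HC_EV_ANGSTROM = 12398.42


def energy_to_wavelength(energy_ev: float) -> float:
    """Return the X-ray wavelength in Å for a photon energy in eV."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_ANGSTROM / energy_ev


wavelength = energy_to_wavelength(24000)  # roughly 0.517 Å
k = 2 * math.pi / wavelength              # wavevector magnitude in Å⁻¹
q_limit = 2 * k                           # geometric upper bound on |q|

print(f"wavelength = {wavelength:.4f} Å, |q| upper bound = {q_limit:.1f} Å⁻¹")
```

At this energy the geometric bound on |q| is far above the q_xy_max=5 and q_z_max=5 limits chosen above, so in this example the q ranges are the limiting factor, not the beam energy.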
This step needs to be performed only once for a given set of CIF files. The CifPattern instance is then used during the matching stage. It can also be saved and reused across different samples to avoid repeated preprocessing:

import pickle

with open('../../example/prepr_cifs.pickle', 'wb') as file:
    pickle.dump(cif_prepr, file)
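Because preprocessing is slow, a "load if present, otherwise build and save" pattern can be convenient. The following is a minimal sketch of that pattern using only the standard library. The helper load_or_create is hypothetical and not part of mlgidBASE or mlgidMATCH; the factory argument stands for whatever call builds the object, for example the CifPattern constructor shown above.

```python
import pickle
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_create(path: str, factory: Callable[[], T]) -> T:
    """Load a pickled object from `path`, or build it with `factory` and save it."""
    file_path = Path(path)
    if file_path.exists():
        # Reuse the cached result. Only unpickle files you trust.
        with file_path.open("rb") as file:
            return pickle.load(file)

    obj = factory()  # the expensive step runs only when no cache exists
    file_path.parent.mkdir(parents=True, exist_ok=True)
    with file_path.open("wb") as file:
        pickle.dump(obj, file)
    return obj


# Hypothetical usage with the objects from this tutorial:
# cif_prepr = load_or_create(
#     "../../example/prepr_cifs.pickle",
#     lambda: CifPattern(params=params, folder_path=folder_path, create_all=True),
# )
```

Unpickling the file later requires the same packages to be importable, since pickle stores a reference to the class and not the class itself.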

Then run matching:


Minimal Code Example

analysis.run_matching(
    cif_prepr=r'../../example/prepr_cifs.pickle',
    peaks_type='segments',
)

Parameters

  • entry (str) — Data file entry to process. Defaults to None (process all entries). OPTIONAL

  • frame_num (int or List[int]) — Frame number(s) within each entry to process. Defaults to None (all frames). OPTIONAL

  • cif_prepr (CifPattern or str) — Preprocessed CIFs object (CifPattern) or path to a PICKLE file. REQUIRED

  • peaks_type (str) — Type of peaks used for matching: 'segments' (2D) or 'rings' (1D). Defaults to 'peaks'. REQUIRED

  • probability_threshold (float) — Matching threshold for peaks (0–1). Defaults to 0.5. OPTIONAL

  • intensity_threshold (float) — Minimum intensity of fitted peaks to be considered for matching. OPTIONAL

  • device (str) — Computation device ('cpu' or 'cuda'). Defaults to None (automatic detection). OPTIONAL



Description

You can process a single entry by passing its name, or all entries in the file by setting entry=None. The frame_num parameter accepts either a single integer or a list of frame indices.

The cif_prepr argument should be a CifPattern instance or a path to a saved PICKLE file. It can be set just once for the mlgidBASE instance to avoid repeating the same step.

The peaks_type parameter defines the type of data to match: either rings (1D matching) or segments (2D matching).
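To illustrate the geometric difference between the two modes: a segment has a position (q_xy, q_z) in the 2D pattern, whereas a ring is described by a single radial value |q| = sqrt(q_xy² + q_z²). The sketch below only shows this relationship. The function ring_radius and the sample peak positions are made up for illustration, and this is not a description of how mlgidMATCH builds its 1D patterns.

```python
import math


def ring_radius(q_xy: float, q_z: float) -> float:
    """Radial |q| (Å⁻¹) of a peak located at (q_xy, q_z)."""
    return math.hypot(q_xy, q_z)


# Illustrative 2D peak positions (q_xy, q_z) in Å⁻¹, not taken from real data.
segments = [(1.0, 0.0), (0.6, 0.8), (0.0, 2.0)]

# In 1D, distinct segments can fall on the same ring.
radii = [ring_radius(q_xy, q_z) for q_xy, q_z in segments]
for (q_xy, q_z), radius in zip(segments, radii):
    print(f"(q_xy={q_xy}, q_z={q_z}) -> |q| = {radius:.3f}")
```

The first two positions differ in 2D but share nearly the same |q|, which is why 1D matching carries less orientation information than 2D matching.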

probability_threshold (0–1) controls how strict the matching is, while intensity_threshold ignores fitted peaks with low intensity. The computation device can be specified via device or automatically detected.

# 2D matching on segments with a permissive threshold, computed on the GPU
analysis.run_matching(
    cif_prepr=r'../../example/prepr_cifs.pickle',
    peaks_type='segments',
    probability_threshold=0.1,
    intensity_threshold=0,
    device='cuda',
)

# 1D matching on rings with a strict threshold
analysis.run_matching(
    cif_prepr=r'../../example/prepr_cifs.pickle',
    peaks_type='rings',
    probability_threshold=0.9,
    intensity_threshold=0,
)
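When these arguments come from a configuration file or user input, it can help to check them before starting a long run. The following is a small sketch of such a check, based only on the value ranges listed in the Parameters section. The helper check_matching_args is hypothetical and not part of mlgidBASE, and run_matching may perform its own validation.

```python
from typing import Any, Dict, Optional

ALLOWED_PEAKS_TYPES = ("segments", "rings")
ALLOWED_DEVICES = ("cpu", "cuda")


def check_matching_args(
    peaks_type: str,
    probability_threshold: float = 0.5,
    intensity_threshold: Optional[float] = None,
    device: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate matching arguments and return them as a kwargs dict."""
    if peaks_type not in ALLOWED_PEAKS_TYPES:
        raise ValueError(f"peaks_type must be one of {ALLOWED_PEAKS_TYPES}")
    if not 0.0 <= probability_threshold <= 1.0:
        raise ValueError("probability_threshold must lie between 0 and 1")
    if device is not None and device not in ALLOWED_DEVICES:
        raise ValueError(f"device must be one of {ALLOWED_DEVICES} or None")

    kwargs: Dict[str, Any] = {
        "peaks_type": peaks_type,
        "probability_threshold": probability_threshold,
    }
    # Optional arguments are passed on only when explicitly set.
    if intensity_threshold is not None:
        kwargs["intensity_threshold"] = intensity_threshold
    if device is not None:
        kwargs["device"] = device
    return kwargs


# Hypothetical usage:
# analysis.run_matching(
#     cif_prepr=r'../../example/prepr_cifs.pickle',
#     **check_matching_args('rings', probability_threshold=0.9, intensity_threshold=0),
# )
```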

The results can be visualized using silx view or loaded from the saved file, as shown in Tutorial 8.