Calibrator

class rascal.calibrator.HoughTransform[source]

This handles the hough transform operations on the pixel-wavelength space.

add_hough_points(hp)[source]

Extending the Hough pairs with an externally supplied HoughTransform object. This can be useful if the arc lines are very concentrated in some wavelength ranges while nothing in available in another part.

Parameters

hp (numpy.ndarray with 2 columns or HoughTransform object) – An externally supplied HoughTransform object that contains hough_points.

bin_hough_points(xbins, ybins)[source]

Bin up data by using a 2D histogram method.

Parameters
  • xbins (int) – The number of bins in the pixel direction.

  • ybins (int) – The number of bins in the wavelength direction.

generate_hough_points(x, y, num_slopes)[source]

Calculate the Hough transform for a set of input points and returns the 2D Hough hough_points matrix.

Parameters
  • x (1D numpy array) – The x-axis represents peaks (pixel).

  • y (1D numpy array) – The y-axis represents lines (wavelength). Vertical lines (i.e. infinite gradient) are not accommodated.

  • num_slopes (int) – The number of slopes to be generated.

generate_hough_points_brute_force(x, y)[source]

Calculate the Hough transform for a set of input points and returns the 2D Hough hough_points matrix.

Parameters
  • x (1D numpy array) – The x-axis represents peaks (pixel).

  • y (1D numpy array) – The y-axis represents lines (wavelength). Vertical lines (i.e. infinite gradient) are not accommodated.

  • num_slopes (int) – The number of slopes to be generated.

load(filename='hough_transform', filetype='npy')[source]

Store the binned Hough space and/or the raw Hough pairs.

Parameters
  • filename (str (default: 'hough_transform')) – The filename of the output, not used if to_disk is False. It will be appended with the content type.

  • filetype (str (default: 'npy')) – The file type of the saved hough transform. Choose from ‘npy’ and ‘json’.

save(filename='hough_transform', fileformat='npy', delimiter='+', to_disk=True)[source]

Store the binned Hough space and/or the raw Hough pairs.

Parameters
  • filename (str) – The filename of the output, not used if to_disk is False. It will be appended with the content type.

  • format (str (default: 'npy')) – Choose from ‘npy’ and json’

  • delimiter (str (default: '+')) – Delimiter for format and content types

  • to_disk (boolean) – Set to True to save to disk, else return a numpy array object

Returns

hp_hough_points – only return if to_disk is False.

Return type

numpy.ndarray

set_constraints(min_slope, max_slope, min_intercept, max_intercept)[source]

Define the minimum and maximum of the intercepts (wavelength) and gradients (wavelength/pixel) that Hough pairs will be generated.

Parameters
  • min_slope (int or float) – Minimum gradient for wavelength/pixel

  • max_slope (int or float) – Maximum gradient for wavelength/pixel

  • min_intercept (int/float) – Minimum interception point of the Hough line

  • max_intercept (int/float) – Maximum interception point of the Hough line

class rascal.calibrator.Calibrator(peaks, spectrum=None)[source]

Initialise the calibrator object.

Parameters
  • peaks (list) – List of identified arc line pixel values.

  • spectrum (list) – The spectral intensity as a function of pixel.

add_atlas(elements, min_atlas_wavelength=None, max_atlas_wavelength=None, min_intensity=10.0, min_distance=10.0, candidate_tolerance=10.0, constrain_poly=False, vacuum=False, pressure=101325.0, temperature=273.15, relative_humidity=0.0)[source]

Adds an atlas of arc lines to the calibrator, given an element.

Arc lines are taken from a general list of NIST lines and can be filtered using the minimum relative intensity (note this may not be accurate due to instrumental effects such as detector response, dichroics, etc) and minimum line separation.

Lines are filtered first by relative intensity, then by separation. This is to improve robustness in the case where there is a strong line very close to a weak line (which is within the separation limit).

The vacuum to air wavelength conversion is deafult to False because observatories usually provide the line lists in the respective air wavelength, as the corrections from temperature and humidity are small. See https://emtoolbox.nist.gov/Wavelength/Documentation.asp

Parameters
  • elements (string or list of strings) – Chemical symbol, case insensitive

  • min_atlas_wavelength (float (default: None)) – Minimum wavelength of the arc lines.

  • max_atlas_wavelength (float (default: None)) – Maximum wavelength of the arc lines.

  • min_intensity (float (default: None)) – Minimum intensity of the arc lines. Refer to NIST for the intensity.

  • min_distance (float (default: None)) – Minimum separation between neighbouring arc lines.

  • candidate_tolerance (float (default: 10)) – toleranceold (Angstroms) for considering a point to be an inlier during candidate peak/line selection. This should be reasonable small as we want to search for candidate points which are locally linear.

  • constrain_poly (boolean) – Apply a polygonal constraint on possible peak/atlas pairs

  • vacuum (boolean) – Set to True if the light path from the arc lamb to the detector plane is entirely in vacuum.

  • pressure (float) – Pressure when the observation took place, in Pascal. If it is not known, we suggest you to assume 10% decrement per 1000 meter altitude.

  • temperature (float) – Temperature when the observation took place, in Kelvin.

  • relative_humidity (float) – In percentage.

add_pix_wave_pair(pix, wave)[source]

Adding extra pixel-wavelength pair to the Calibrator for refitting. This DOES NOT work before the Calibrator having fit for a solution yet: use set_known_pairs() for that purpose.

Parameters
  • pix (float) – pixel position

  • wave (float) – wavelength

add_user_atlas(elements, wavelengths, intensities=None, candidate_tolerance=10.0, constrain_poly=False, vacuum=False, pressure=101325.0, temperature=273.15, relative_humidity=0.0)[source]

Add a single or list of arc lines. Each arc line should have an element label associated with it. It is recommended that you use a standard periodic table abbreviation (e.g. ‘Hg’), but it makes no difference to the fitting process.

The vacuum to air wavelength conversion is deafult to False because observatories usually provide the line lists in the respective air wavelength, as the corrections from temperature and humidity are small. See https://emtoolbox.nist.gov/Wavelength/Documentation.asp

Parameters
  • elements (list/str) – Elements (required). Preferably a standard (i.e. periodic table) name for convenience with built-in atlases

  • wavelengths (list/float) – Wavelengths to add (Angstrom)

  • intensities (list/float) – Relative line intensities (NIST value)

  • candidate_tolerance (float (default: 15)) – toleranceold (Angstroms) for considering a point to be an inlier during candidate peak/line selection. This should be reasonable small as we want to search for candidate points which are locally linear.

  • constrain_poly (boolean) – Apply a polygonal constraint on possible peak/atlas pairs

  • vacuum (boolean) – Set to true to convert the input wavelength to air-wavelengths based on the given pressure, temperature and humidity.

  • pressure (float) – Pressure when the observation took place, in Pascal. If it is not known, assume 10% decrement per 1000 meter altitude

  • temperature (float) – Temperature when the observation took place, in Kelvin.

  • relative_humidity (float) – In percentage.

clear_atlas()[source]

Remove all the lines loaded to the Calibrator.

fit(max_tries=500, fit_deg=4, fit_coeff=None, fit_tolerance=5.0, fit_type='poly', candidate_tolerance=2.0, brute_force=False, progress=True)[source]

Solve for the wavelength calibration polynomial by getting the most likely solution with RANSAC.

Parameters
  • max_tries (int (default: 5000)) – Maximum number of iteration.

  • fit_deg (int (default: 4)) – The degree of the polynomial to be fitted.

  • fit_coeff (list (default: None)) – Set the baseline of the least square fit. If no fits outform this set of polynomial coefficients, this will be used as the best fit.

  • fit_tolerance (float (default: 5.0)) – Sets a tolerance on whether a fit found by RANSAC is considered acceptable

  • fit_type (string (default: 'poly')) – One of ‘poly’, ‘legendre’ or ‘chebyshev’

  • candidate_tolerance (float (default: 2.0)) – toleranceold (Angstroms) for considering a point to be an inlier

  • brute_force (boolean (default: False)) – Set to True to try all possible combination in the given parameter space

  • progress (boolean (default: True)) – True to show progress with tdqm. It is overrid if tdqm cannot be imported.

Returns

  • fit_coeff (list) – List of best fit polynomial fit_coefficient.

  • rms (float) – The root-mean-squared of the residuals

  • residual (float) – Residual from the best fit

  • peak_utilisation (float) – Fraction of detected peaks (pixel) used for calibration [0-1].

  • atlas_utilisation (float) – Fraction of supplied arc lines (wavelength) used for calibration [0-1].

get_pix_wave_pairs()[source]

Return the list of matched_peaks and matched_atlas with their position in the array.

Returns

pw_pairs – List of tuples each containing the array position, peak (pixel) and atlas (wavelength).

Return type

list

list_atlas()[source]

List all the lines loaded to the Calibrator.

load_hough_transform(filename='hough_transform', filetype='npy')[source]

Store the binned Hough space and/or the raw Hough pairs.

Parameters
  • filename (str (default: 'hough_transform')) – The filename of the output, not used if to_disk is False. It will be appended with the content type.

  • filetype (str (default: 'npy')) – The file type of the saved hough transform. Choose from ‘npy’ and ‘json’.

manual_refit(matched_peaks=None, matched_atlas=None, degree=None, x0=None)[source]

Perform a refinement of the matched peaks and atlas lines.

This function takes lists of matched peaks and atlases, along with user-specified lists of lines to add/remove from the lists.

Any given peaks or atlas lines to remove are selected within a user-specified tolerance, by default 1 pixel and 5 atlas Angstrom.

The final set of matching peaks/lines is then matched using a robust polyfit of the desired degree. Optionally, an initial fit x0 can be provided to condition the optimiser.

The parameters are identical in the format in the fit() and match_peaks() functions, however, with manual changes to the lists of peaks and atlas, peak_utilisation and atlas_utilisation are meaningless so this function does not return in the same format.

Parameters
  • matched_peaks (list) – List of matched peaks

  • matched_atlas (list) – List of matched atlas lines

  • degree (int) – Polynomial fit degree (Only used if x0 is None)

  • x0 (list) – Initial fit coefficients

Returns

  • fit_coeff (list) – List of best fit polynomial coefficients

  • matched_peaks (list) – List of matched peaks

  • matched_atlas (list) – List of matched atlas lines

  • rms (float) – The root-mean-squared of the residuals

  • residuals (numpy 1D array) – Residual match error per-peak

match_peaks(fit_coeff=None, n_delta=None, refine=False, tolerance=10.0, method='Nelder-Mead', convergence=1e-06, min_frac=0.5, robust_refit=True, fit_deg=None)[source]

** refine option is EXPERIMENTAL, use with caution **

Refine the polynomial fit fit_coefficients. Recommended to use in it multiple calls to first refine the lowest order and gradually increase the order of fit_coefficients to be included for refinement. This is be achieved by providing delta in the length matching the number of the lowest degrees to be refined.

Set refine to True to improve on the polynomial solution.

Set robust_refit to True to fit all the detected peaks with the given polynomial solution for a fit using maximal information, with the degree of polynomial = fit_deg.

Set both refine and robust_refit to False will return the list of arc lines are well fitted by the current solution within the tolerance limit provided.

Parameters
  • fit_coeff (list (default: None)) – List of polynomial fit fit_coefficients.

  • n_delta (int (default: None)) – The number of the lowest polynomial order to be adjusted

  • refine (boolean (default: True)) – Set to True to refine solution.

  • tolerance (float (default: 10.)) – Absolute difference between fit and model in the unit of nm.

  • method (string (default: 'Nelder-Mead')) – scipy.optimize.minimize method.

  • convergence (float (default: 1e-6)) – scipy.optimize.minimize tol.

  • min_frac (float (default: 0.5)) – Minimum fractionof peaks to be refitted.

  • robust_refit (boolean (default: True)) – Set to True to fit all the detected peaks with the given polynomial solution.

  • fit_deg (int (default: length of the input fit_coefficients)) – Order of polynomial fit with all the detected peaks.

Returns

  • fit_coeff (list) – List of best fit polynomial fit_coefficient.

  • peak_match (numpy 1D array) – Matched peaks

  • atlas_match (numpy 1D array) – Corresponding atlas matches

  • rms (float) – The root-mean-squared of the residuals

  • residual (numpy 1D array) – The difference (NOT absolute) between the data and the best-fit solution. * EXPERIMENTAL *

  • peak_utilisation (float) – Fraction of detected peaks (pixel) used for calibration [0-1].

  • atlas_utilisation (float) – Fraction of supplied arc lines (wavelength) used for calibration [0-1].

plot_arc(pixel_list=None, log_spectrum=False, save_fig=False, fig_type='png', filename=None, return_jsonstring=False, renderer='default', display=True)[source]

Plots the 1D spectrum of the extracted arc.

Parameters
  • pixel_list (array (default: None)) – pixel value of the of the spectrum, this is only needed if the spectrum spans multiple detector arrays.

  • log_spectrum (boolean (default: False)) – Set to true to display the wavelength calibrated arc spectrum in logarithmic space.

  • save_fig (boolean (default: False)) – Save an image if set to True. matplotlib uses the pyplot.save_fig() while the plotly uses the pio.write_html() or pio.write_image(). The support format types should be provided in fig_type.

  • fig_type (string (default: 'png')) – Image type to be saved, choose from: jpg, png, svg, pdf and iframe. Delimiter is ‘+’.

  • filename (string (default: None)) – Provide a filename or full path. If the extension is not provided it is defaulted to png.

  • return_jsonstring (boolean (default: False)) – Set to True to return json strings if using plotly as the plotting library.

  • renderer (string (default: 'default')) – Indicate the Plotly renderer. Nothing gets displayed if json is set to True.

  • display (boolean (Default: False)) – Set to True to display disgnostic plot.

Returns

  • Return json strings if using plotly as the plotting library and json

  • is True.

plot_fit(fit_coeff, spectrum=None, tolerance=5.0, plot_atlas=True, log_spectrum=False, save_fig=False, fig_type='png', filename=None, return_jsonstring=False, renderer='default', display=True)[source]

Plots of the wavelength calibrated arc spectrum, the residual and the pixel-to-wavelength solution.

Parameters
  • fit_coeff (1D numpy array or list) – Best fit polynomial fit_coefficients

  • spectrum (1D numpy array (N)) – Array of length N pixels

  • tolerance (float (default: 5)) – Absolute difference between model and fitted wavelengths in unit of angstrom.

  • plot_atlas (boolean (default: True)) – Display all the relavent lines available in the atlas library.

  • log_spectrum (boolean (default: False)) – Display the arc in log-space if set to True.

  • save_fig (boolean (default: False)) – Save an image if set to True. matplotlib uses the pyplot.save_fig() while the plotly uses the pio.write_html() or pio.write_image(). The support format types should be provided in fig_type.

  • fig_type (string (default: 'png')) – Image type to be saved, choose from: jpg, png, svg, pdf and iframe. Delimiter is ‘+’.

  • filename (string (default: None)) – Provide a filename or full path. If the extension is not provided it is defaulted to png.

  • return_jsonstring (boolean (default: False)) – Set to True to return json strings if using plotly as the plotting library.

  • renderer (string (default: 'default')) – Indicate the Plotly renderer. Nothing gets displayed if json is set to True.

  • display (boolean (Default: False)) – Set to True to display disgnostic plot.

Returns

  • Return json strings if using plotly as the plotting library and json

  • is True.

plot_search_space(fit_coeff=None, top_n_candidate=3, weighted=True, save_fig=False, fig_type='png', filename=None, return_jsonstring=False, renderer='default', display=True)[source]

Plots the peak/arc line pairs that are considered as potential match candidates.

If fit fit_coefficients are provided, the model solution will be overplotted.

Parameters
  • fit_coeff (list (default: None)) – List of best polynomial fit_coefficients

  • top_n_candidate (int (default: 3)) – Top ranked lines to be fitted.

  • weighted ((default: True)) – Draw sample based on the distance from the matched known wavelength of the atlas.

  • save_fig (boolean (default: False)) – Save an image if set to True. matplotlib uses the pyplot.save_fig() while the plotly uses the pio.write_html() or pio.write_image(). The support format types should be provided in fig_type.

  • fig_type (string (default: 'png')) – Image type to be saved, choose from: jpg, png, svg, pdf and iframe. Delimiter is ‘+’.

  • filename ((default: None)) – The destination to save the image.

  • return_jsonstring ((default: False)) – Set to True to save the plotly figure as json string. Ignored if matplotlib is used.

  • renderer ((default: 'default')) – Set the rendered for the plotly display. Ignored if matplotlib is used.

  • display (boolean (Default: False)) – Set to True to display disgnostic plot.

Returns

Return type

json object if json is True.

remove_atlas_lines_range(wavelength, tolerance=10)[source]

Remove arc lines within a certain wavelength range.

Parameters
  • wavelength (float) – Wavelength to remove (Angstrom)

  • tolerance (float) – Tolerance around this wavelength where atlas lines will be removed

remove_pix_wave_pair(arg)[source]

Remove fitted pixel-wavelength pair from the Calibrator for refitting. The positions can be found from get_pix_wave_pairs(). One at a time.

Parameters

arg (int) – The position of the pairs in the arrays.

save_hough_transform(filename='hough_transform', fileformat='npy', delimiter='+', to_disk=True)[source]

Save the HoughTransform object to memory or to disk.

Parameters
  • filename (str) – The filename of the output, not used if to_disk is False. It will be appended with the content type.

  • format (str (default: 'npy')) – Choose from ‘npy’ and json’

  • delimiter (str (default: '+')) – Delimiter for format and content types

  • to_disk (boolean) – Set to True to save to disk, else return a numpy array object

Returns

hp_hough_points – only return if to_disk is False.

Return type

numpy.ndarray

set_calibrator_properties(num_pix=None, pixel_list=None, plotting_library=None, seed=None, logger_name='Calibrator', log_level='warning')[source]

Initialise the calibrator object.

Parameters
  • num_pix (int) – Number of pixels in the spectral axis.

  • pixel_list (list) – pixel value of the of the spectrum, this is only needed if the spectrum spans multiple detector arrays.

  • plotting_library (string (default: 'matplotlib')) – Choose between matplotlib and plotly.

  • seed (int) – Set an optional seed for random number generators. If used, this parameter must be set prior to calling RANSAC. Useful for deterministic debugging.

  • logger_name (string (default: 'Calibrator')) – The name of the logger. It can use an existing logger if a matching name is provided.

  • log_level (string (default: 'info')) – Choose {critical, error, warning, info, debug, notset}.

set_hough_properties(num_slopes=None, xbins=None, ybins=None, min_wavelength=None, max_wavelength=None, range_tolerance=None, linearity_tolerance=None)[source]
Parameters
  • num_slopes (int (default: 1000)) – Number of slopes to consider during Hough transform

  • xbins (int (default: 50)) – Number of bins for Hough accumulation

  • ybins (int (default: 50)) – Number of bins for Hough accumulation

  • min_wavelength (float (default: 3000)) – Minimum wavelength of the spectrum.

  • max_wavelength (float (default: 9000)) – Maximum wavelength of the spectrum.

  • range_tolerance (float (default: 500)) – Estimation of the error on the provided spectral range e.g. 3000-5000 with tolerance 500 will search for solutions that may satisfy 2500-5500

  • linearity_tolerance (float (default: 100)) – A toleranceold (Ansgtroms) which defines some padding around the range tolerance to allow for non-linearity. This should be the maximum expected excursion from linearity.

set_known_pairs(pix=(), wave=())[source]

Provide manual pixel-wavelength pair(s), they will be appended to the list of pixel-wavelength pairs after the random sample being drawn from the RANSAC step, i.e. they are ALWAYS PRESENT in the fitting step. Use with caution because it can skew or bias the fit significantly, make sure the pixel value is accurate to at least 1/10 of a pixel. We do not recommend supplying more than a coupld of known pairs unless you are very confident with the solution and intend to skew with the known pairs.

This can be used for example for low intensity lines at the edge of the spectrum. Or saturated lines where peaks cannot be well positioned.

Parameters
  • pix (numeric value, list or numpy 1D array (N) (default: ())) – Any pixel value, can be outside the detector chip and serve purely as anchor points.

  • wave (numeric value, list or numpy 1D array (N) (default: ())) – The matching wavelength for each of the pix.

set_ransac_properties(sample_size=None, top_n_candidate=None, linear=None, filter_close=None, ransac_tolerance=None, candidate_weighted=None, hough_weight=None, minimum_matches=None, minimum_peak_utilisation=None, minimum_fit_error=None)[source]

Configure the Calibrator. This may require some manual twiddling before the calibrator can work efficiently. However, in theory, a large max_tries in fit() should provide a good solution in the expense of performance (minutes instead of seconds).

Parameters
  • sample_size (int (default: 5)) – Number of samples used for fitting, this is automatically set to the polynomial degree + 1, but a larger value can be specified here.

  • top_n_candidate (int (default: 5)) – Top ranked lines to be fitted.

  • linear (boolean (default: True)) – True to use the hough transformed gradient, otherwise, use the known polynomial.

  • filter_close (boolean (default: False)) – Remove the pairs that are out of bounds in the hough space.

  • ransac_tolerance (float (default: 1)) – The distance criteria (Angstroms) to be considered an inlier to a fit. This should be close to the size of the expected residuals on the final fit (e.g. 1A is typical)

  • candidate_weighted (boolean (default: True)) – Set to True to down-weight pairs that are far from the fit.

  • hough_weight (float or None (default: 1.0)) – Set to use the hough space to weigh the fit. The theoretical optimal weighting is unclear. The larger the value, the heavily it relies on the overdensity in the hough space for a good fit.

  • minimum_matches (int or None (default: 0)) – Set to only accept fit solutions with a minimum number of matches. Setting this will prevent the fitting function from accepting spurious low-error fits.

  • minimum_peak_utilisation (int or None (default: 0)) – Set to only accept fit solutions with a fraction of matches. This option is convenient if you don’t want to specify an absolute number of atlas lines. Range is 0 - 1 inclusive.

  • minimum_fit_error (float or None (default: 1e-4)) – Set to only accept fits with a minimum error. This avoids accepting “perfect” fit solutions with zero errors. However if you have an extremely good system, you may want to set this tolerance lower.

use_matplotlib()[source]

Call to switch to matplotlib.

use_plotly()[source]

Call to switch to plotly.

which_plotting_library()[source]

Call to show if the Calibrator is using matplotlib or plotly library (or neither).