LATLINE

By Vinzenz Unger and Anchi Cheng

This program performs fitting of curves to the experimental amplitude and phase data for individual lattice lines.

Fitting curves to a set of scattered input data allows the generation of discrete, uniformly sampled structure factors, which can be used to calculate a 3-D density map by Fourier summation. For a more detailed description of the underlying concepts, readers are encouraged to consult the original paper on the program (Agard, 1983) and a good description of the sampling theorem (e.g. Ch. 7 by Moody, p. 168-170 in “Biophysical Electron Microscopy”, Hawkes and Valdrè editors, Academic Press). The sampling theorem says that we can reconstruct the transform of a real-space object of finite width (D) by sampling the transform at 0, ±1/D, ±2/D, …, followed by a convolution of the sampled transform with the transform of an envelope function that has the shape of a square pulse (i.e., a uniform height for all points within width D and zero everywhere else). Accordingly, for the data from a 2-D crystal, the task of obtaining a lattice line can be broken down into two parts. The first is to estimate the width of the envelope function within which the map density will be confined, and the second is to determine the amplitudes and phases of the complex transform at 0, ±1/D, ±2/D, ….

LATLINE uses a least-squares fitting approach to minimize the errors between the data and the fitted curves by optimizing the amplitudes and phases of the sampled points. The success of the least-squares minimization depends largely on the width of the envelope function, which is defined manually by the user. Finding the proper width relies mainly on existing knowledge about the sample and sometimes on a trial-and-error search for the most reasonable fit. Too small a width will result in a fit that does not faithfully follow the most rapid changes in the transform. Too large a width, however, will produce a fit that attempts to follow the noisy scatter of the data too closely. While the latter will always result in smaller overall fitting errors, the final map will be noisier because less averaging takes place.
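
The reconstruction step described by the sampling theorem can be illustrated with a short sketch. It is not LATLINE's code: the function name, the sample values, and the assumption of an untapered square-pulse envelope centered on z = 0 are all chosen here for illustration. Under that assumption the transform of the envelope is proportional to sinc(D·z*), so the continuous lattice line is a sum of sinc functions centered on the sample positions k/D.

```python
import numpy as np

def lattice_line_from_samples(samples, k_indices, width, zstar):
    """Interpolate a continuous lattice line from samples taken at k/width.

    Illustrative assumption: a square-pulse envelope of width `width`
    (in Angstrom) centered on z = 0, with no tapering of the edges.
    """
    samples = np.asarray(samples, dtype=complex)
    k = np.asarray(k_indices, dtype=float)
    zstar = np.atleast_1d(np.asarray(zstar, dtype=float))
    # np.sinc(x) is sin(pi*x)/(pi*x): each sample contributes a kernel that
    # is 1 at its own position k/width and 0 at every other sample position.
    kernel = np.sinc(width * zstar[:, None] - k[None, :])
    return kernel @ samples

# Hypothetical sampled structure factors for an envelope width of 60 Angstrom.
width = 60.0
k = np.arange(-3, 4)
samples = (10.0 - np.abs(k)) * np.exp(1j * np.deg2rad(30.0 * k))

# Evaluate the continuous curve on a fine z* grid (in 1/Angstrom).
fine = lattice_line_from_samples(samples, k, width, np.linspace(-0.05, 0.05, 101))
```

Evaluating the curve at z* = k/D returns the sampled values themselves, which is why the fit only needs to optimize the amplitudes and phases at those points.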

The following parameters need to be adjusted:

  • IPG: LATLINE considers the symmetry constraints of lattice lines in its fitting. For instance, based on the plane group, flags are set to consider the inversion center, or whether data along a line should adopt all real or all imaginary values. Note that in these cases LATLINE will ignore the quality of the input data and will force the fit to obey the phase constraints. Furthermore, a figure of merit of 100 (perfect data point) will be assigned regardless of the scatter of the input data. This should be kept in mind when evaluating the “true” quality and reliability of the data set. The plane group numbers are the same as those used in ORIGTILT.
  • IPAT: This option determines whether the observed amplitudes and phases (IPAT=0) or the intensities (IPAT=1) are fitted. The latter is used, where applicable, to fit data derived from electron diffraction alone.
  • AK, IWF, IWP: Because the phases and amplitudes of the lattice lines are coupled in a complex manner, the error function minimized by the least-squares procedure considers the error in the amplitude as well as in the phase. AK is the relative weight of the overall phase error with respect to the amplitude error. A value >1 should be given if image amplitudes are used, because in this case the uncertainty in the amplitude measurements is significantly larger than that for the phases. Typical settings for AK with image-derived amplitudes range from 1.5 to 3, i.e. the phase values have more weight in the fit than the amplitudes. IWF and IWP determine how the errors of the individual data points are weighted. For example, the merged, prescaled, and formatted data provided by the previous MRC programs allow the use of individual sigmas so that the weight = 1/sigma² (IWF=-1 and IWP=-2).
  • ALAT: This parameter sets the third unit cell dimension, c, in Å which determines the final sampling of the structure factors at 1/ALAT and is not to be confused with the width of the envelope function. It is desirable to choose ALAT larger than the width of the envelope function because it avoids the situation of having any density at the edges of the unit cell.
  • ZMIN, ZMAX: These two parameters, in Å⁻¹, define the range in which the observed z* values are expected and should match the cutoff chosen in LATLINPRESCALE (and DLATLPREP if diffraction data are used). Furthermore, ZMIN and ZMAX are also used in creating the Fourier transform of the envelope function. Therefore, the following condition must also be met to avoid distortion of the transform:

ZMIN ≤ 1/RMIN and ZMAX ≥ 1/RMAX

where RMIN and RMAX are the limits of the real-space envelope function (see below).

  • DELPLT: This defines the z* interval in Å⁻¹ for plotting purposes. A small value should be used to obtain a smooth-looking lattice line.
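
The role of AK and of the 1/sigma² weights described above can be illustrated with a small sketch. The residual below is a hypothetical stand-in and not LATLINE's actual error function: the function name, the argument names, and the simple sum of an amplitude term and an AK-scaled phase term are assumptions made only to show how the weights interact.

```python
import numpy as np

def combined_residual(amp_obs, amp_fit, sig_amp,
                      phase_obs_deg, phase_fit_deg, sig_phase_deg, ak=2.0):
    """Hypothetical weighted residual: amplitude term + ak * phase term."""
    amp_obs, amp_fit, sig_amp = map(np.asarray, (amp_obs, amp_fit, sig_amp))
    phase_obs_deg, phase_fit_deg, sig_phase_deg = map(
        np.asarray, (phase_obs_deg, phase_fit_deg, sig_phase_deg))
    # Individual weights of 1/sigma^2, as in the example given for IWF/IWP.
    w_amp = 1.0 / np.square(sig_amp)
    w_phs = 1.0 / np.square(sig_phase_deg)
    # Wrap the phase difference into [-180, 180) so 350 vs 10 counts as 20.
    dphi = (phase_obs_deg - phase_fit_deg + 180.0) % 360.0 - 180.0
    amp_term = np.sum(w_amp * np.square(amp_obs - amp_fit))
    phs_term = np.sum(w_phs * np.square(dphi))
    # ak > 1 gives the phases more influence than the amplitudes.
    return amp_term + ak * phs_term

# One made-up observation: amplitude 100 +/- 5 fitted as 90,
# phase 350 +/- 10 degrees fitted as 10 degrees.
value = combined_residual([100.0], [90.0], [5.0], [350.0], [10.0], [10.0], ak=2.0)
```

In this sketch, raising AK increases the penalty for phase disagreement relative to amplitude disagreement, which is the behaviour the text recommends when the amplitudes come from images.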

The five parameters in the following input card, DELPRO, RMIN, RMAX, RCUT, and PFACT, define the shape and size of the envelope function. As explained earlier, the envelope function contributes to the lattice line as a convolution of its transform with the discretely sampled structure factors. Hence, its effects are found in all parts of the lattice line, and the least-squares fitting procedure does not “correct” for the effects of a wrongly chosen envelope function. Accordingly, a “trial-and-error” approach may be necessary to find the best values based on the overall fitting statistics.

  • RMIN, RMAX, RCUT, PFACT: These parameters define the boundaries of the envelope function, within which the function may take non-zero values. To produce a proper envelope function for a 2-D crystal, no density should lie beyond the boundaries. The width of the envelope, RMAX-RMIN, should therefore be chosen as the thickness of the specimen. Since the thickness is not precisely known in most cases, it is desirable to overestimate the envelope width somewhat to minimize the problems arising from the limited resolution of the sample and, therefore, of the lattice lines (Agard, 1983). For the same reason, the edges of the envelope function are often softened or tapered. RCUT is the distance in Å from either boundary to the point where tapering starts. Two different types of taper can be applied. If PFACT ≤ 0, a linear drop-off is applied and the envelope function becomes zero at RMIN and RMAX. If PFACT > 0, the tapered edge has the shape of half a Gaussian function, where PFACT is the value of the Gaussian remaining at RMIN and RMAX. For this purpose, PFACT = 0.1 is a good choice, which corresponds to 10% of the height of the function. In most cases, however, a linear drop-off will be sufficient.
  • DELPRO: This is the real-space or Patterson-space sampling interval in Å for the envelope function. If DELPRO is chosen much larger than RCUT, the envelope function is effectively redefined, so that the transform calculated by the program will differ from the transform of the continuous function defined by RCUT and PFACT. Sampling more finely than necessary, i.e., with a smaller DELPRO, has less impact than sampling the profile too coarsely. However, the program dimensions limit the profile to a maximum of 250 sampling points over the whole width of the envelope. Therefore DELPRO cannot be smaller than 1/250 of the envelope width, RMAX-RMIN. Usually, a larger value is used to improve the efficiency of the program.
  • IGUESS, BINSIZE: The least-squares fitting procedure requires an initial guess for the amplitudes and phases of the sampled points. This guess is either provided by an input file (IGUESS=0) or estimated internally from the raw input data (IGUESS=1). To generate the guess from the data, BINSIZE has to be chosen in Å⁻¹ for binning the observed data. The amplitudes and phases in each bin are averaged and interpolated to the required sampling points to give the initial guess for the refinement. BINSIZE should therefore be small enough that the variation within a bin is small, but large enough to provide sufficient data for reliable averaging. Values of 0.002 to 0.005 are usually acceptable. However, BINSIZE must be decreased when a thicker crystal is analyzed, since proper interpolation cannot be done if the bin is much larger than the sampling required for lattice line fitting, i.e., 1/thickness.
  • NCYCLS: This defines the number of refinement cycles to perform. If NCYCLS is set to a negative value, the guess values are output; they can then be modified and used as the new initial guess by setting IGUESS=0. Such a procedure is useful when some of the internally generated guesses are biased and cannot be refined to a reasonable value. For a normal refinement, NCYCLS=25-50 is reasonable.
  • MPRINT: This parameter controls diagnostic output. The variable and the conjugated variable matrices used to minimize the error gradient during the least-squares refinement are printed out if MPRINT > 0.
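
The way RMIN, RMAX, RCUT, PFACT, and DELPRO shape the envelope can be sketched as follows. This is an illustration built only from the description above, not LATLINE's own routine: the function name, the exact way the sampling points are counted against the 250-point limit, and the choice of a Gaussian that equals 1 where the taper starts are assumptions.

```python
import numpy as np

def envelope_profile(rmin, rmax, rcut, pfact, delpro, max_points=250):
    """Sample a tapered envelope on [rmin, rmax] at interval delpro (Angstrom)."""
    width = rmax - rmin
    if width <= 0 or delpro <= 0:
        raise ValueError("need rmax > rmin and delpro > 0")
    if rcut < 0 or 2 * rcut > width:
        raise ValueError("rcut must lie between 0 and half the envelope width")
    if pfact >= 1:
        raise ValueError("pfact must be below 1 for a Gaussian taper")
    # Assumed counting rule: points at rmin, rmin + delpro, ... up to rmax.
    n = int(np.floor(width / delpro + 1e-9)) + 1
    if n > max_points:
        raise ValueError(f"{n} points exceed the limit of {max_points}")
    z = rmin + delpro * np.arange(n)
    # Distance of every sample from the nearest envelope boundary.
    edge = np.clip(np.minimum(z - rmin, rmax - z), 0.0, None)
    env = np.ones(n)
    if rcut > 0:
        taper = edge < rcut
        if pfact <= 0:
            # Linear drop-off: 1 where the taper starts, 0 at the boundary.
            env[taper] = edge[taper] / rcut
        else:
            # Half Gaussian: 1 where the taper starts, pfact at the boundary.
            a = -np.log(pfact) / rcut**2
            env[taper] = np.exp(-a * (rcut - edge[taper]) ** 2)
    return z, env

# Hypothetical 60 Angstrom envelope with a 10 Angstrom taper, sampled every 1 Angstrom.
z_lin, env_lin = envelope_profile(-30.0, 30.0, 10.0, 0.0, 1.0)
z_gau, env_gau = envelope_profile(-30.0, 30.0, 10.0, 0.1, 1.0)
```

With these made-up values the profile has 61 points. Reducing DELPRO to 0.1 Å would require 601 points and is rejected, which mirrors the 250-point restriction described for DELPRO.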