By Vinzenz Unger and Anchi Cheng
This program is used to retrieve and analyze raw phase and amplitude data from the Fourier transform of an image. Setting IOUT≥1 produces a listing of H, K, amplitude, phase, IQ, background and a dummy column that is later used by CTFAPPLY to add the CTF value for each reflection. The logfiles contain valuable statistics that can help to assess the quality and resolution of the image (see below).
If your images require tight boxing of relatively small areas, make sure you use version 3.00 (MMBOXA) instead of any of the previous editions of MMBOX. The older versions calculate the background in a way that differs from the calculation of the signal, and they assume that the transform pixels are independent. Since this is not true if the image is boxed, MMBOXA has been changed so that the background around each spot is now calculated in the same way as the peak itself. This more correct treatment gives slightly lower peak-to-background ratios than the older versions and efficiently removes spots that in reality are only noise. Hence, at first sight a result obtained by MMBOXA will always look less impressive than the output of MMBOX if a boxed image was used. Still, the user should be aware that any spots reported by MMBOXA (even the weak ones) are likely to be real data points, whereas many apparently “good” spots found by MMBOX may not mean anything at all.
However, if part of a data set has already been created by MMBOX, then switching to MMBOXA for the remainder of the images may not be advisable, since it could bias the data towards the part that was obtained by MMBOX. The reason is that all averaging and lattice line fitting procedures, as well as the determination of the common phase origins, weight the data in one way or another according to the apparent “signal-to-noise” ratio of the input measurements. Hence, in a “mixed” data set the data obtained by MMBOXA will carry less weight than those generated with MMBOX. If all data were generated with MMBOXA, this does not apply. On the contrary, since MMBOXA should have eliminated a significant proportion of the noise, one would expect better agreement between the data from different images.
A script for running MMBOXA is included in the protocol for JOBA. Over time this program has matured, but the input parameters for the original options, such as specifying coordinates and cutoffs in mm rather than in grid units and Å, are still present. However, only the operating mode using grid units (GU=Y), the automatic grid generator (GENGRID=Y) and IRAD=1 is described here because it is by far the easiest way to run the program. GENGRID=Y invokes the lattice generator, which uses the parameters NOH and NOK (maximum indices allowed for H and K) to create a list of spots. The transform coordinates of these spots are calculated from the actual unit cell parameters ($cell) for the (1,0) and (0,1) reflections. For each reflection a resolution check is then carried out using the fixed real space unit cell dimensions for the specimen ($abax, WIDTH and $abang). Reflections that are either beyond the resolution limit of the transform or outside the chosen resolution band are rejected and a rejection statement is written to the logfile. For IRAD=1 the resolution cutoffs ($res) are specified in Å. Note that the real space parameter WIDTH is a “historical remnant” which is never used inside the program, hence no separate input variable has been assigned to it in the current protocol.
Once the appropriate spots have been generated, the program reads the transform values within a box of NHOR by NVERT pixels around each calculated spot position. These numbers are used to calculate “sinc” function (i.e. sin(πx)/(πx)) weighted phase and amplitude values for each reflection, as well as the amount of background correction for the amplitudes and a final appraisal of the goodness of the measurement known as the IQ value of a spot. The signal-to-noise ratio (S/N) that is encoded in the IQ value can be estimated to a first approximation as S/N=8/IQ, i.e. an IQ1 reflection is at least 8x stronger than the background, while an IQ8 spot is above background but only by a margin that is within the standard deviation of the scatter in the background pixel values themselves. The NHOR * NVERT box (maximum 20*20 pixels) should be chosen large enough that the peak is well separated from the perimeter, so that no part of the signal contributes to the background calculation. In practice, the settings given in the protocol should be sufficient for most applications.
A detailed printout of all transform values within the NHOR * NVERT boxes and some additional diagnostics can be obtained by setting the parameter NUMSPOT to any number up to the number of spots that were generated. If analysed for the higher resolution reflections, this can be useful to see whether the latter fall onto the predicted lattice positions. This will usually be the case if the lattice was refined to errors of less than 0.5 pixel as suggested in the main text, and hence NUMSPOT=0 has been chosen for the current protocol. The accuracy of the lattice parameters is crucial to obtain good returns from MMBOXA because the “sinc-fit” that generates the amplitude and phase values only uses the central 2×2 pixels around the calculated spot position. This feature causes poorer IQ grades (i.e. numerically higher IQ values) if spots are split or just slightly off the lattice, even in cases where the spot can be detected by eye in the optical or calculated diffraction pattern. Generally, any reflection that is visible by eye in the optical diffraction pattern should come out as an IQ1 or IQ2 spot, apart from the rare exceptions just mentioned. Consequently, if a significant number of visible spots obtain IQ values of 3 or worse, one should check a more detailed diagnostic printout (by adjusting NUMSPOT) to find out why.
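The lattice generation and resolution check described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MMBOXA source: the function names, the index ranges (H from 0 to NOH, K from -NOK to NOK, Friedel mates on the H=0 row skipped) and the parameter max_radius (the largest distance from the transform origin, in grid units, still covered by the transform) are hypothetical. The resolution of a reflection is taken from the standard reciprocal-lattice relation for a two-dimensional cell with edges a and b and included angle gamma.

```python
import math


def d_spacing(h, k, a, b, gamma_deg):
    """Resolution in Angstrom of reflection (h, k) for a 2D cell a, b, gamma."""
    g = math.radians(gamma_deg)
    inv_d2 = (h * h / (a * a) + k * k / (b * b)
              - 2.0 * h * k * math.cos(g) / (a * b)) / math.sin(g) ** 2
    return 1.0 / math.sqrt(inv_d2)


def generate_spots(noh, nok, cell, a, b, gamma_deg, res_low, res_high, max_radius):
    """Build the spot list and apply the resolution checks.

    cell = (x10, y10, x01, y01): transform coordinates, in grid units, of the
    (1,0) and (0,1) reflections. res_low > res_high, both in Angstrom.
    The index ranges and max_radius are assumptions made for this sketch.
    """
    x10, y10, x01, y01 = cell
    accepted, rejected = [], []
    for h in range(0, noh + 1):
        for k in range(-nok, nok + 1):
            if h == 0 and k <= 0:
                continue  # skip the origin and Friedel mates on the h = 0 row
            # Spot position as an integer combination of the two lattice vectors.
            x = h * x10 + k * x01
            y = h * y10 + k * y01
            d = d_spacing(h, k, a, b, gamma_deg)
            if math.hypot(x, y) > max_radius:
                rejected.append((h, k, "beyond transform limit"))
            elif not (res_high <= d <= res_low):
                rejected.append((h, k, "outside resolution band"))
            else:
                accepted.append((h, k, x, y, d))
    return accepted, rejected
```

Calling generate_spots once per resolution band, with the same cell and unit cell values, yields one spot list per band; the rejected list plays the role of the rejection statements in the logfile.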
The origin ($scalex $scaley) should be set to the centre of the image to avoid phase gradients across the peaks. However, if the image was boxed or if the very best area is not centered about the middle then the apparent centre of gravity of the very best region should be chosen instead.
In some cases the centre of gravity may be outside the box. For convenience, the protocol for JOBA uses the properly scaled centre coordinates that were used for calculating the reference area and cross-correlation map. Depending on the actual case, this may not be the best choice for extracting data after boxing an image. Hence, in each pass one should check and, where applicable, adjust the centre to agree with the centre of gravity of the very best part of the image. Note that the coordinates may need to be scaled appropriately if pixel averaging was involved at any stage of the processing.
In some applications it may be necessary to box part of the image even before the start of the cross-correlation procedure. In these cases one should check that the program used for the boxing step does not float the image, which is usually associated with a shift of the image origin. If a change of origin has been introduced for any reason, MMBOX(A) will return an explicit warning that the effective shift will not place the origin at the coordinates specified by $scalex and $scaley, thereby defeating the purpose of setting these parameters in the first place. In these cases it is strongly recommended to reprocess the image with its origin (0,0) at the bottom left corner.
To make the best use of MMBOXA we recommend performing sequential runs that are limited to certain resolution ranges. For instance, looking at the scaled average intensity (which is part of the statistical output) in resolution bands 80-15Å, 15-9Å, 10-7Å etc. gives valuable information about the resolution, the completeness and the statistical reliability of the data from each image. The transform of the image contains significant data as long as there is a clear maximum for the average intensity. At higher resolutions an average intensity of ~10 in the centre of the square, or centered about it (if the image was tightly boxed), will be significant if no other pixel closer to the perimeter of the NHOR * NVERT box displays the same or a larger value. Keeping a record of this information can be very helpful at later stages such as defocus refinements or when choosing an individual resolution cutoff for each image.
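The significance criterion for the scaled average intensity can be written down as a small check. The sketch below is one possible reading of that criterion, not code taken from MMBOXA: the function name peak_is_central and the parameter core_halfwidth (how many pixels around the geometric centre still count as "centered about it") are assumptions introduced for illustration, and the box is assumed to be a list of rows copied from the statistical output.

```python
def peak_is_central(box, core_halfwidth=1):
    """Return True if the largest value in the central region of the box
    is strictly greater than every value outside that region.

    box: list of rows holding the scaled average intensities.
    core_halfwidth: assumed size of the central region, in pixels.
    """
    n_rows, n_cols = len(box), len(box[0])
    # Geometric centre; falls between pixels when a dimension is even.
    cy, cx = (n_rows - 1) / 2.0, (n_cols - 1) / 2.0
    core, outside = [], []
    for r, row in enumerate(box):
        for c, value in enumerate(row):
            if abs(r - cy) <= core_halfwidth and abs(c - cx) <= core_halfwidth:
                core.append(value)
            else:
                outside.append(value)
    if not outside:
        return True  # the whole box counts as central; nothing to compare
    return max(core) > max(outside)
```

A pixel near the perimeter that merely equals the central maximum already makes the check fail, which mirrors the wording "the same or a larger value". Recording the result for each resolution band, together with the central intensity itself, gives the kind of record suggested above.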