How to choose the right point group for merging

A common question from our users is how to choose the correct symmetry for merging, i.e. the correct -y option. It's actually not that difficult, but it does touch on several areas of crystallography theory. This document aims to be a gentle introduction to the process, introducing the concepts step by step. For a somewhat terse explanation, see section 6 of the following paper:

T. A. White, A. Barty, F. Stellato et al "Crystallographic data processing for free-electron laser sources" Acta Cryst. D69 (2013), p1231–1240. doi:10.1107/S0907444913013620

Another useful article is the following:

M. Nespolo, M. I. Aroyo and B. Souvignier "Crystallographic shelves: space-group hierarchy explained" J. Applied Cryst. 51 (2018) p1481-1491 doi:10.1107/S1600576718012724

Step 1: Temporarily forget about space groups

To merge your reflection data, CrystFEL needs to know which reflections are symmetrically equivalent. This information is given by the point group. The 230 space groups can be classified into 32 categories, each corresponding to a single point group. If you know the space group for your crystals in advance, that's a big advantage. However, for now you only need to know the point group.

If you're working on an unknown structure, don't get ahead of yourself! Many crystallographic data processing programs start suggesting possible space groups very early in the process, such as when the patterns are indexed. The space group is only a hypothesis until the structure is solved, so you always need to take these early suggestions with a pinch of salt. CrystFEL's design philosophy is not to deal with space group determination at all. CrystFEL will never ask you to tell it the space group of your structure ahead of time, nor will it suggest one automatically for your structure [1].

The following table shows the point group corresponding to each of the space groups. To keep things simple, the table only contains the Sohncke space groups, which are the ones relevant to biological structures. The point groups are given in exactly the form you will type them into CrystFEL:

Point group    Space groups
1              P1
2              P2, P21, C2 (pay special attention to step 3 below)
222            P222, P2221, P21212, P212121, C2221, C222, F222, I222, I212121
4              P4, P41, P42, P43, I4, I41
422            P422, P4212, P4122, P41212, P4222, P42212, P4322, P43212, I422, I4122
32_R           R32 (rhombohedral axes, pay special attention to step 6)
3_R            R3 (rhombohedral axes, pay special attention to step 6)
3_H            H3 (hexagonal axes, pay special attention to step 6), P3, P31, P32
321_H          H32 (hexagonal axes, pay special attention to step 6), P321, P3121, P3221
312_H          P312, P3112, P3212
6              P6, P61, P62, P63, P64, P65
622            P622, P6122, P6222, P6322, P6422, P6522
23             P23, F23, I23, P213, I213
432            P432, P4232, F432, F4132, I432, P4332, P4132, I4132

Notice that, in most cases, the correct point group can easily be recognised from the space group, without memorizing the entire table.

If you are in the fortunate situation of knowing the space group for your sample before processing the data, look up the point group in the table above and keep it in mind as you read the next sections. If you can't find your space group in the table (for example, A112), your source of information is using a non-standard setting. Everything should become clear in step 3.

If you don't know the space group, no problem: we will work everything out in the steps below.
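If you do know the space group, the lookup in the table above can be sketched in a few lines of Python. This is only an illustration: the dictionary is transcribed from the table, and the names `POINT_GROUP_TABLE` and `point_group_for` are made up for this example, not part of CrystFEL.

```python
# Point group -> Sohncke space groups, transcribed from the table above.
# The keys are the symbols exactly as they would be typed into CrystFEL.
POINT_GROUP_TABLE = {
    "1": ["P1"],
    "2": ["P2", "P21", "C2"],
    "222": ["P222", "P2221", "P21212", "P212121", "C2221", "C222",
            "F222", "I222", "I212121"],
    "4": ["P4", "P41", "P42", "P43", "I4", "I41"],
    "422": ["P422", "P4212", "P4122", "P41212", "P4222", "P42212",
            "P4322", "P43212", "I422", "I4122"],
    "32_R": ["R32"],
    "3_R": ["R3"],
    "3_H": ["H3", "P3", "P31", "P32"],
    "321_H": ["H32", "P321", "P3121", "P3221"],
    "312_H": ["P312", "P3112", "P3212"],
    "6": ["P6", "P61", "P62", "P63", "P64", "P65"],
    "622": ["P622", "P6122", "P6222", "P6322", "P6422", "P6522"],
    "23": ["P23", "F23", "I23", "P213", "I213"],
    "432": ["P432", "P4232", "F432", "F4132", "I432", "P4332",
            "P4132", "I4132"],
}

# Invert the table so that we can look up by space group.
SPACE_TO_POINT = {sg: pg for pg, sgs in POINT_GROUP_TABLE.items() for sg in sgs}


def point_group_for(space_group):
    """Return the point group symbol, or None if the space group is not
    in the table (e.g. a non-standard setting such as A112)."""
    return SPACE_TO_POINT.get(space_group.replace(" ", ""))
```

A result of `None` does not mean that the space group is wrong, only that it is written in a setting the table does not list.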

Step 2: Determine the apparent symmetry

The orientation of each crystal in your dataset was determined by the indexing procedure inside indexamajig. There's a choice of indexing algorithms which work in many different ways, but they all share one thing in common: they only look at the positions of the Bragg peaks, not the intensities.

As you should know from basic diffraction theory, the positions of the Bragg peaks are determined by the translational symmetry of the structure (the lattice), whereas the intensities are determined by the contents of the unit cell.

This leads to a problem for some symmetry classes. If the overall crystal structure, taking into account the unit cell contents, has lower symmetry than the lattice, there will be an indexing ambiguity. In these cases, the Bragg peak positions don't provide enough information to correctly determine the orientation of the crystal. The result will be an equal mixture of correctly indexed patterns and ones where the Miller indices for the reflections are wrong. But we're getting ahead of ourselves...
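To see why peak positions alone can leave the orientation undetermined, consider a cell with a=b and all angles 90°. The sketch below uses the standard d-spacing formula for a cell with 90° angles; the function name and the cell values are chosen for illustration only. Swapping h and k gives exactly the same spacing, so the lattice cannot tell the two labellings apart, whereas the intensities of the two reflections may differ, depending on the contents of the unit cell.

```python
import math


def d_spacing_orthogonal(h, k, l, a, b, c):
    """d-spacing (same units as a, b, c) for a cell whose angles are all 90 degrees."""
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)


# Illustrative cell with a = b (lengths in Angstroms)
a = b = 123.0
c = 44.0

d_123 = d_spacing_orthogonal(1, 2, 3, a, b, c)
d_213 = d_spacing_orthogonal(2, 1, 3, a, b, c)

# The two reflections sit at the same resolution, so their positions
# carry no information about which label is the right one.
same_position = math.isclose(d_123, d_213)
```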

Just by looking at the parameters of the lattice (the unit cell parameters), we can determine the symmetry that the merged dataset will exhibit. This is the symmetry that the indexing algorithm is able to discern (by looking at the Bragg positions only), and it therefore determines which reflections should be considered symmetrically equivalent. This is the point group that we will give to process_hkl or partialator.

The following table shows the possible cases and the point group to use in each case. Use the row furthest down the table that is compatible with your data. For example, if the axis lengths are all equal (a=b=c) and the angles are all 90°, you should use 432, even though 32_R, 422, 222, 2 and 1 would all fit.

For this step, what matters are the approximate symmetries of the lattice. You should consider an angle to be equal to 90° if it's within about 1° of that value, and axis lengths to be equal if they're within about 1% of the same length. If indexamajig gets confused between the axes (shown by double peaks in the cell_explorer histograms), then they should be considered equal. Conversely, if indexamajig was able to tell the axes apart (clear, single peaks for each axis length, with significantly different lengths and angles), then you can consider them distinct.

The centering of the cell (P, A, B, C, I, F, R or H) is irrelevant at this step, unless you have "H-centering", which is a special case that we will come to later.

Unit cell parameters                  Point group for merging
No restrictions                       1
alpha=beta=90°                        2
H-centering and a=b and gamma=120°    321_H
a=b and gamma=120°                    622
a=b=c                                 32_R
All angles 90°                        222
All angles 90° and a=b                422
All angles 90° and a=b=c              432

Perhaps your cell parameters resemble one of the cases, but with the axes "re-named". For example, you might have beta=gamma=90°, alpha≠90°, and all axes different. This is the same as point group 2 above, but with the axes a,b,c re-labelled as b,c,a. We can deal with that, as described in the next step.
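The table and the tolerances described above can be sketched as a small Python function. Treat this as an illustration under stated assumptions, not as anything CrystFEL does internally: the function names are invented, the axes must already be labelled as in the table (no re-naming is attempted), H-centered cells are handled as a special case ahead of the other rows, and the rhombohedral and hexagonal rows also check the angles using the definitions given in step 6.

```python
def close_len(x, y, rel_tol=0.01):
    """Axis lengths count as equal if they agree to within about 1%."""
    return abs(x - y) <= rel_tol * max(x, y)


def close_ang(x, y, tol=1.0):
    """Angles count as equal if they agree to within about 1 degree."""
    return abs(x - y) <= tol


def apparent_point_group(a, b, c, al, be, ga, centering="P"):
    """Return the point group for merging, following the table row by row."""
    al_be_90 = close_ang(al, 90) and close_ang(be, 90)
    all_90 = al_be_90 and close_ang(ga, 90)
    a_eq_b = close_len(a, b)
    all_len_eq = a_eq_b and close_len(b, c) and close_len(a, c)
    all_ang_eq = close_ang(al, be) and close_ang(be, ga) and close_ang(al, ga)
    hexagonal = a_eq_b and al_be_90 and close_ang(ga, 120)

    # Assumption: an H-centered hexagonal cell always takes its own row.
    if centering == "H" and hexagonal:
        return "321_H"

    rows = [
        (True, "1"),
        (al_be_90, "2"),
        (hexagonal, "622"),
        (all_len_eq and all_ang_eq, "32_R"),
        (all_90, "222"),
        (all_90 and a_eq_b, "422"),
        (all_90 and all_len_eq, "432"),
    ]

    # "Use the furthest down row that is compatible with your data"
    result = None
    for matches, symbol in rows:
        if matches:
            result = symbol
    return result
```

For example, `apparent_point_group(123, 123, 44, 90, 90, 90)` walks past the rows for 1, 2 and 222 and stops at 422. A monoclinic cell with unique axis b would only match the first row here, which is exactly the "re-named axes" situation dealt with in the next step.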

Step 3: Make sure the "unique axis" is correct

Let's say your point group is 2. In this case, there is a single twofold axis of rotational symmetry. The symmetry axis can be along the a, b or c direction of the lattice - these letters are just the names we use to refer to the axes. In theory, you can define the unit cell any way you like, and CrystFEL will be able to cope (with one exception, mentioned below). However, some possibilities are more "conventional" than others, and it can help to avoid problems if you follow the established conventions. For example, not all software can handle all of the possibilities smoothly. It's also easier to compare structures when they're described in the same way.

You can tell the direction of the twofold rotation axis, because it has to be along the axis perpendicular to the angle that isn't 90°. For example, the following cell parameters show that the twofold rotation axis is along b. We refer to b as the unique axis:

a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°

The following cell has unique axis a:

a=92 Å, b=74 Å, c=34 Å, alpha=128°, beta=gamma=90°

However, a as the unique axis is a very unconventional situation. It would make things easier for you to change your target unit cell to make b or c the unique axis, and re-run indexamajig [2].

If you've been told that the space group is simply "P2", check carefully to make sure which convention is being used, because unique axes b and c are considered equally acceptable.

If your non-90° angle is very close to 90°, then you should instead be using point group 222. As mentioned above, what matters are the approximate symmetries that can be discerned by the indexing algorithm.

Other types of unit cell have a 'unique' axis, as well. For example, a tetragonal cell has all angles 90°, two axes with the same length and one different. The different length axis could be labelled as a, b or c. However, in this case, anything other than unique axis c is highly unconventional. Nevertheless, check carefully here as well.

When you tell process_hkl or partialator the symmetry, you'll need to tell it the unique axis. By default, CrystFEL programs assume that the unique axis is c. If you have anything else, append _uaa, _uab or _uac to the point group symbol (from the tables above) to indicate which is the 'unique' axis. For the first example from above, we would use 2_uab:

a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°

For the tetragonal unit cell parameters shown below, we would use 422, which is a synonym for 422_uac since the unique axis is assumed to be c:

a=123 Å, b=123 Å, c=44 Å, alpha=beta=gamma=90°
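The suffix rule can be written down as a short Python sketch. Both function names are hypothetical, and the 1° tolerance is the same rule of thumb used in step 2. The first function finds the monoclinic unique axis from the angles (alpha lies between b and c, so a non-90° alpha means unique axis a, and so on), and the second appends the suffix, leaving the symbol alone for unique axis c because that is the default.

```python
def monoclinic_unique_axis(al, be, ga, tol=1.0):
    """Return 'a', 'b' or 'c' for the axis whose angle is not 90 degrees.

    Returns None unless exactly one angle differs from 90 degrees by
    more than the tolerance."""
    off = [axis for axis, angle in zip("abc", (al, be, ga))
           if abs(angle - 90.0) > tol]
    return off[0] if len(off) == 1 else None


def with_unique_axis(point_group, axis):
    """Append _uaa or _uab; unique axis c is the default and needs no suffix."""
    if axis == "c":
        return point_group
    return f"{point_group}_ua{axis}"


# First example cell from above: alpha=gamma=90, beta=131
axis = monoclinic_unique_axis(90.0, 131.0, 90.0)
symbol = with_unique_axis("2", axis)
```

Here `symbol` comes out as `2_uab`, matching the first example, while `with_unique_axis("422", "c")` simply returns `422`.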

Step 4: Add an inversion center to merge Friedel pairs

Remember that the point group tells CrystFEL which reflections to consider as symmetrically equivalent. The point group you have, at this point, will not include an inversion center, i.e. it will not consider reflections h,k,l and -h,-k,-l as equivalent. This means that the merging process will preserve any anomalous signal present in your data.
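As a toy illustration of what "equivalent" means here, the Python sketch below groups a handful of made-up measurements, first treating h,k,l and -h,-k,-l as distinct (as in point group 1), then treating them as the same reflection (as in point group -1). The helper `friedel_key` is invented for this example.

```python
def friedel_key(hkl):
    """Pick one representative of the pair {hkl, -hkl}: the larger tuple."""
    h, k, l = hkl
    return max((h, k, l), (-h, -k, -l))


# Made-up list of measured reflections
measurements = [(1, 2, 3), (-1, -2, -3), (1, 2, 3), (2, 0, -1), (-2, 0, 1)]

# Without an inversion center, Friedel mates are kept apart
unique_without_inversion = {hkl for hkl in measurements}

# With an inversion center, each Friedel pair collapses to one reflection
unique_with_inversion = {friedel_key(hkl) for hkl in measurements}
```

The five measurements fall into four unique reflections in the first case and two in the second, so each unique reflection ends up with more measurements behind it.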

If you don't expect (or want) an anomalous signal, you can get better results by merging Friedel pairs of reflections. This doubles the number of measurements per symmetrically unique reflection, which can make a large improvement! To do this, simply add the missing inversion center to the point group. This will change the point group symbol in a way that's not immediately logical. The following table shows the results of adding an inversion symmetry to each of the point groups, so you just have to look up your case.

Point group    Point group with inversion center
1              -1
2              2/m
222            mmm
422            4/mmm
32_R           -3m_R
321_H          -3m1_H
622            6/mmm
432            m-3m

The point group symbols in the table above look quite strange. If you need to look up one of these symbols in a crystallographic textbook, you just need to know that the minus signs are supposed to indicate a "bar" over the following digit. However, there's usually no need to worry about that.

If you've added a unique axis suffix, add the same suffix to your new point group. For example, 622_uab goes to 6/mmm_uab (although either of these cases would be considered very unconventional).
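The lookup, including the rule about carrying over the unique axis suffix, can be sketched like this. The dictionary is copied from the table above; the function name `add_inversion` is an assumption made for this example.

```python
# Point group -> point group with inversion center, from the table above
ADD_INVERSION = {
    "1": "-1",
    "2": "2/m",
    "222": "mmm",
    "422": "4/mmm",
    "32_R": "-3m_R",
    "321_H": "-3m1_H",
    "622": "6/mmm",
    "432": "m-3m",
}


def add_inversion(symbol):
    """Add an inversion center, keeping any _uaa/_uab/_uac suffix."""
    base, sep, axis = symbol.partition("_ua")
    return ADD_INVERSION[base] + sep + axis
```

With this, `add_inversion("2_uab")` gives `2/m_uab` and `add_inversion("32_R")` gives `-3m_R`. A symbol that is not in the table raises a `KeyError`, which is deliberate: it is better to stop than to guess.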

Step 5: Worry about indexing ambiguities

At this point, you're in a position to merge your data. If your prior information about the point group from step 1 agreed with what you determined in step 2, then everything is OK and you're finished already! Simply give the point group symbol to partialator or process_hkl with the -y argument (or via the CrystFEL GUI). For example: -y 4/mmm.

However, maybe something is still not right. Perhaps the structure solution software is complaining about "twinned data", strange statistics or "poor" L-test results. Or, perhaps your prior information about the structure doesn't match the point group you determined in the previous steps. In this case, you may be facing an indexing ambiguity, where the true symmetry is lower than what can be distinguished by the indexing algorithm.

An indexing ambiguity is when the positions of the Bragg peaks do not give sufficient information to uniquely identify the orientation of the crystal. Instead, there are a small number (usually 2) of possible orientations which give the same Bragg peak positions. The correct orientation can be determined by looking at the peak intensities, so it requires a separate processing step after indexing and integration.

Indexing ambiguities can be resolved in CrystFEL using ambigator. This program takes a stream (from indexamajig), works out the correct indexing assignments, and writes a new stream with the incorrectly assigned reflections re-labelled with their correct indices. Here, "correct" means "consistent with the other patterns in the dataset" - you should keep in mind that the indexing ambiguity allows separately-processed datasets to have inconsistent labels.

The mechanics of running ambigator will be described in a separate document. However, you will need to know the "real" and "apparent" point groups. The apparent point group is the one we already determined. The real point group is so far unknown (unless you have prior information!), but there are a small number of possibilities. Here they are:

Apparent point group        Real point group
422                         4
32_R (rhombohedral axes)    3_R (rhombohedral axes)
432                         23
622                         3_H (hexagonal axes) - double ambiguity, see below
622                         6
622                         312_H (hexagonal axes)
622                         321_H (hexagonal axes)

Notice that structures with hexagonal lattices (apparent point group 622) are particularly problematic, with quite a large number of real point groups giving the apparent 622 symmetry. One of those cases, point group 3_H exhibits a double ambiguity where there are four indexing possibilities for each pattern, not just the usual two.
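If you want to keep the candidates to hand while trying things out, the table can be held in a small Python structure. This is just a transcription of the table and the paragraph above, with invented names; it says nothing about how ambigator itself should be run.

```python
# Apparent point group -> possible real point groups, from the table above
REAL_CANDIDATES = {
    "422": ["4"],
    "32_R": ["3_R"],
    "432": ["23"],
    "622": ["3_H", "6", "312_H", "321_H"],
}


def possible_real_point_groups(apparent):
    """Candidate real point groups, or an empty list if none are listed."""
    return REAL_CANDIDATES.get(apparent, [])


def indexing_possibilities(real):
    """Number of indexing choices per pattern for a listed real point group:
    four for the double ambiguity of 3_H, otherwise the usual two."""
    return 4 if real == "3_H" else 2
```

For an apparent point group of 622, this gives four candidates to test, one of which (3_H) has four indexing possibilities per pattern.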

Step 6: Extra information about "H cells"

A rhombohedral unit cell (all axes the same length, all angles the same but not equal to 90°) can be represented in two ways. The first way is using the axes exactly as just described. In this case, we talk about "rhombohedral axes" and use space group symbols R3 and R32. The second way is to embed the rhombohedral cell inside a hexagonal unit cell (a=b≠c, alpha=beta=90°, gamma=120°) while having multiple lattice points (i.e. extra copies of the crystal structure) within the unit cell. In this case, we talk about "hexagonal axes" and use space group symbols H3 and H32.

You will find both representations in space group tables, for example in the International Tables Volume A. Rhombohedral axes are easier to think about, but hexagonal axes are commonly used for protein structures. If you've downloaded a rhombohedral structure from the PDB, it's probably (but not always!) using hexagonal axes.

Different software packages use different conventions for labelling these cells. For example, you might also encounter R3:h and R3:r for hexagonal and rhombohedral axes respectively. Unfortunately, sometimes you might even encounter programs which use R3 to refer to hexagonal axes, and H3 for rhombohedral axes! However, you can always tell the difference by looking at the unit cell parameters. For some more discussion, including a useful diagram, see this classic article.

The most important thing to keep in mind is that representing the unit cell in a different way will never change any of the physical properties. If the symmetry is R3 or H3, there's an indexing ambiguity, and if it's R32 or H32 then there's no ambiguity. The R3 and H3 cases are the same thing, as are the R32 and H32 cases. In both cases, the number of symmetry equivalents for each reflection is the same. If there's a strange accidental indexing ambiguity for one version (see step 7), the same accidental indexing ambiguity applies to the other version as well.

However, you need to tell CrystFEL which representation you're using. For all trigonal point groups - that is, anything with a rhombohedral lattice, or a hexagonal lattice but no sixfold symmetry - you will need to append either _H or _R to the point group symbol. For example, for point group 3 on rhombohedral axes, use 3_R. For hexagonal axes, use 3_H.

You cannot use the unique axis and axis definition suffixes together, for example 321_H_uab. Always use unique axis c for trigonal cells on hexagonal axes.

There's a further complication. There are actually two ways that the rhombohedral cell can be "embedded" into the hexagonal cell. The two ways are called obverse and reverse. The International Tables uses the obverse representation [3], and so does all the software that I know about. This complication affects the point group symbol that you must use for space group R32/H32 (it makes no difference for R3/H3). Here are all the cases for R32/H32:

Axes            Setting    Point group as given to CrystFEL    Comment
Rhombohedral    n/a        32_R
Hexagonal       Obverse    321_H
Hexagonal       Reverse    312_H                               Don't use this one

Just "for fun", here's the same table for R3/H3:

Axes            Setting    Point group as given to CrystFEL    Comment
Rhombohedral    n/a        3_R
Hexagonal       Obverse    3_H
Hexagonal       Reverse    3_H                                 Same as for obverse

As you can see, your life will be much easier if you just use rhombohedral axes all the time. However, due to the prevalence of hexagonal axes in deposited structures, this is likely to mean that you have to convert from one representation to the other. Converting atomic locations (i.e. a structural model) is outside the scope of CrystFEL, but CrystFEL can convert just the unit cell parameters. For example, given an "H-centered" unit cell file:

CrystFEL unit cell file version 1.0

lattice_type = hexagonal
centering = H
unique_axis = c

a = 66.2 A
b = 66.2 A
c = 150.2 A

al = 90.0 deg
be = 90.0 deg
ga = 120.0 deg

CrystFEL's cell_tool can calculate the rhombohedral representation:

$ cell_tool -p example.cell --uncenter
Input unit cell: cell-example.cell
------------------> The input unit cell:
hexagonal H, unique axis c, right handed.
a      b      c            alpha   beta  gamma
 66.20  66.20 150.20 A     90.00  90.00 120.00 deg
------------------> The primitive unit cell:
rhombohedral R, right handed.                                <<-----------
a      b      c            alpha   beta  gamma               <<-----------  Look here!
 62.99  62.99  62.99 A     63.40  63.40  63.40 deg           <<-----------
------------------> The centering transformation:
[    1    0    1 ]
[   -1    1    1 ]
[    0   -1    1 ]
------------------> The un-centering transformation:
[  2/3 -1/3 -1/3 ]
[  1/3  1/3 -2/3 ]
[  1/3  1/3  1/3 ]
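If you'd like to check the numbers by hand, the rhombohedral cell follows from simple geometry. Writing a_h and c_h for the hexagonal axis lengths, the rhombohedral axis length is a_r = sqrt(a_h²/3 + c_h²/9) and the rhombohedral angle satisfies sin(alpha_r/2) = a_h/(2 a_r). The Python sketch below applies these two formulas; the function name is made up, and this is an independent cross-check rather than a description of how cell_tool does the calculation.

```python
import math


def hexagonal_to_rhombohedral(a_h, c_h):
    """Convert an H-centered hexagonal cell (a_h = b_h, c_h) to the
    primitive rhombohedral cell. Returns (a_r, alpha_r in degrees)."""
    a_r = math.sqrt(a_h ** 2 / 3.0 + c_h ** 2 / 9.0)
    alpha_r = 2.0 * math.degrees(math.asin(a_h / (2.0 * a_r)))
    return a_r, alpha_r


# The example cell from above: a = b = 66.2 A, c = 150.2 A
a_r, alpha_r = hexagonal_to_rhombohedral(66.2, 150.2)
summary = f"{a_r:.2f} A, {alpha_r:.2f} deg"
```

For the example cell, the result agrees with the 62.99 Å and 63.40° reported by cell_tool.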

Step 7: "It still isn't working!"

The ambiguities described in step 5 are the most common cases, but there are more possibilities. Sometimes, the lattice parameters "accidentally" give rise to indexing ambiguities. As noted above, it's the apparent symmetries of the lattice that matter here. For example, unless the indexing is very accurate (within 1/20 of a degree), the following unit cell will need to be merged with point group 222 (or mmm to merge Friedel pairs), even though it is technically monoclinic:

a=63 Å, b=82 Å, c=95 Å, alpha=gamma=90°, beta=90.04°

In this case, there will be an indexing ambiguity, because the true symmetry is 2 (unique axis b), but the apparent symmetry is 222.

Things can get even more complicated than this, and some very "interesting" ambiguities have turned up over the years. CrystFEL's cell_tool utility can analyse your unit cell and spot possible ambiguities. See the manual for details.

Crystal structures seem to have a way of finding new ways to cause trouble. So, if things are still not working, or if you're just confused, we're happy to help. Just send an email! See the contact page on the CrystFEL website for details.

Good luck, and may all your indexing be unambiguous!

Footnotes

[1] There are a couple of small exceptions here, when the data is exported to XScale or MTZ format. These formats require a space group to be nominated, because of the aforementioned reliance on early space group nomination. Here, CrystFEL chooses the lowest-symmetry space group that reflects the point symmetry according to which the merging was performed. The "downstream" structure solution software should be clever enough to assign the correct space group, regardless of what's in the data file.
[2] It's also possible to change the indexing assignments in the stream without re-running indexing, but this could be considered "advanced" usage. As mentioned above, it's also possible to continue using the non-standard setting, at least as far as CrystFEL is concerned. However, in that case you can expect to have difficulty with other software or when depositing the structure.
[3] If you're interested, this is made explicit in section 2.1.3.6.6 of International Tables Volume A (2016 edition; subscription required to read it online).