-
Notifications
You must be signed in to change notification settings - Fork 1
/
NEWS
403 lines (309 loc) · 14.9 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
===============================================================================
2014-08-08 version 1.9.0
===============================================================================
2014-08-07: James Osborn
Added local Xa_dot_X and X_dot_Xa and global Xa_dot_X routines.
2014-02-28: James Osborn
Made testsuite run with varying alignments.
Corrected non-aligned sse spin project/reconstruct code.
2014-01-16: James Osborn
Made generic cmath macros follow QLA_Precision.
===============================================================================
2013-10-26 version 1.8.0
===============================================================================
2013-10-13: James Osborn
Added virtual Nc=1 library.
2013-05-15: James Osborn
Added precision specific versions of some macros for C99 complex.
===============================================================================
2013-05-10 version 1.7.3
===============================================================================
2013-05-10: James Osborn
Fixed special cases of matrix functions.
===============================================================================
2013-05-04 version 1.7.2
===============================================================================
2013-05-04: James Osborn
Added a few more matrix functions and optimized many for Nc=2,3.
2013-03-26: James Osborn
Added more matrix and scalar fill functions.
2013-03-07: James Osborn
Added M_inverse_V function.
===============================================================================
2013-02-01 version 1.7.1
===============================================================================
2013-02-01: James Osborn
Added QLA_PrecisionInt macro.
Added experimental multi-source functions.
Added some QPX routines.
Added precision generic headers.
===============================================================================
2012-01-11 version 1.7.0-a7 (7th alpha release of 1.7.0)
===============================================================================
2012-01-03: James Osborn
Added float versions of cmath functions.
2011-10-19: James Osborn
Added invsqrt_M function.
Added alignment hints to types.
2011-10-18: James Osborn
Added some optimized C routines for BG/P.
Added int and real type min and max macros.
2011-09-24: James Osborn
Changed gaussian random number routine to optionally match MILC.
===============================================================================
2011-01-22 version 1.7.0-a6 (6th alpha release of 1.7.0)
===============================================================================
2011-01-21: James Osborn
Added OpenMP support for all routines.
2010-08-24: James Osborn
Fixed OpenMP support.
2010-04-22: James Osborn
Fixed mem & flops for QLA_D_vpeq_spproj_D benchmark.
===============================================================================
2010-04-22 version 1.7.0-a5 (5th alpha release of 1.7.0)
===============================================================================
2010-04-20: James Osborn
Added more functions, removed C_eqop_conj_C, added temporary pointers
for indexed varibles to avoid XLCv8 bug.
2010-04-09: James Osborn
Added more times functions.
2010-04-04: James Osborn
Fix _div_c macros to use safer temporary name.
===============================================================================
2010-04-04 version 1.7.0-a4 (4th alpha release of 1.7.0)
===============================================================================
2010-04-04: James Osborn
Many additions:
r_div_c macro
QLA_version_str()
QLA_version_int()
QLA_C_eq_det_M()
QLA_M_eq_inverse_M()
QLA_M_eq_exp_M()
QLA_M_eq_sqrt_M()
QLA_M_eq_log_M()
QLA_X_eq_R_times_X()
QLA_X_eq_C_times_X()
(for X = V,H,D,M,P)
2 & N color example benchmarks
===============================================================================
2009-11-05 version 1.7.0-a3 (3rd alpha release of 1.7.0)
===============================================================================
2009-09-28: James Osborn
Added missing qla_complex_c99.h
===============================================================================
2009-09-10 version 1.7.0-a2 (2nd alpha release of 1.7.0)
===============================================================================
2009-08-30: James Osborn
Numerous code generation improvements and fixes.
2009-08-24: James Osborn
Fixed bug in sse code.
===============================================================================
2009-05-20 version 1.7.0-a1 (first alpha release of 1.7.0)
===============================================================================
2009-05-02: James Osborn
Added more configure/code generation options mainly related to BG/L&P.
2009-04-20: James Osborn
Changed 'N' color types to be any size and only as large as needed.
The QLA_[DF]N_... type is now a macro taking Nc and the variable
as arguments, i.e.
QLA_FN_ColorMatrix(10,foo);
declares 'foo' as a 10x10 color matrix. To create a new type of a
specific Nc you can do
typedef QLA_FN_ColorMatrix(12,CM12);
CM12 foo;
CM12 *bar = (CM12 *) malloc(nmat*sizeof(CM12));
2008-10-06: James Osborn
Added extern "C" to qla.h.
===============================================================================
2008-09-22 version 1.6.3
===============================================================================
2008-09-22: James Osborn
Added source-stamp to clean targets.
2008-02-18: James Osborn
Added AC_PROG_CC_C99 to configure.ac.
2008-02-06: James Osborn
Added OpenMP directives in some places.
2008-02-05: James Osborn
Fixed Makefiles to allow parallel makes.
2007-11-28: James Osborn
Fixed "expr" expression in configure.ac to work with Macs.
===============================================================================
2006-10-10 version 1.6.2
===============================================================================
2006-10-10: James Osborn
Removed extra loop in gamma multiplication functions.
Added checksum to benchmark routine.
===============================================================================
2006-08-03 version 1.6.1
===============================================================================
2006-06-29: James Osborn
Added 440d optimization option for BG/L.
2006-06-25: James Osborn
Added another SSE routine.
===============================================================================
2006-06-25 version 1.6.0
===============================================================================
2006-06-25: James Osborn
*** Changed gamma matrix conventions. ***
The spin projection and gamma multiplication routines now agree with
the conventions of QDP++. Programs using these must be changed
accordingly. Updated documentation with conventions.
===============================================================================
2006-05-24 version 1.5.4
===============================================================================
2006-05-24: James Osborn
Really fix compilation on Mac OS/X.
Lower case types now have a '1' afterward in the filename.
===============================================================================
2006-05-23 version 1.5.3
===============================================================================
2006-05-23: James Osborn
Fix compilation on Mac OS/X.
Source file names are now mangled to prevent clashes due to case.
Lower case types now have the abbreviation doubled in the file name.
The API is not affected.
2006-04-03: James Osborn
Added some more SSE routines.
===============================================================================
2006-03-13 version 1.5.2
===============================================================================
2006-03-13: James Osborn
Added qla_sse.h to include/Makefile.am.
===============================================================================
2006-03-10 version 1.5.1
===============================================================================
2006-03-10: James Osborn
Added TEST_CC and TEST_CFLAGS variables to set compiler and flags
used for the test suite.
2006-03-07: James Osborn
Reworked SSE routines to work with both gcc and icc. This is only
for sse not sse2 which will still try to use gcc.
Added some more SSE routines.
The new routines check for alignment and then call the appropriate
routine. The checking is not perfect and there is still the
possibility of an alignment error on poorly aligned data.
===============================================================================
2005-11-30 version 1.5.0
===============================================================================
Some functions involving the eqm, peq, and meq varieties of global sums
have been removed. Use the eq form and then accumulate separately.
2005-11-29: Removed optimized C routine since generated code is similar now.
2005-11-26: Updated test suite to include all remaining new functions.
Increased tolerance on tests so that fewer errors are reported.
2005-11-26: Removed unnecessary functions that weren't being tested.
This involves only the eqm, peq and meq varieties of global sums.
2005-11-25: Improved perl generated source for Power architectures.
===========================================================================
2005-11-19 version 1.4.1
===========================================================================
2005-11-19: Added the optimized C routine: QLA_F3_V_vpeq_M_times_pV.c
2005-11-19: Removed restrict from the headers so C99 isn't necessary to link.
===========================================================================
2005-11-10 version 1.4
===========================================================================
Note that the internal layout for some fields has changed so all codes
linking to QLA should be recompiled for this version.
2005-11-10: Changed layout of spin objects to better accommodate SSE.
2005-10-08: Improved perl generated source.
2005-10-03: Added some functions to help with Wilson type fermions.
===========================================================================
2005-06-15 version 1.3.1
===========================================================================
Improved some of the SSE routines.
===========================================================================
2005-04-27 version 1.3
===========================================================================
Rearranged directories to have a top level lib and include.
Now available from Jlab CVS.
===========================================================================
2003-09-04 version 1.2
===========================================================================
This is the third beta release of the QLA C library. Again it has passed
many tests and is coming closer to having a stable interface.
The major changes are:
Conversion to use autoconf and automake.
Added macro QLA_Ns in qla_types.h set to the number of spins (default 4).
Added functions QLA_S_eq_S with all indexing varieties.
Inclusion of a few optimized routines for specific architectures.
===========================================================================
2003-02-04 version 1.1 released
===========================================================================
This is a second beta release of the QLA C library. Like version 1.0
it has passed a battery of tests, but should still be used with caution.
1. Dropped the equivalent Gauge and StaggeredPropagator types and
consolidated to a "ColorMatrix" type with type code "M" (which
was previously used for StaggeredPropagator).
2. Changed StaggeredFermion to ColorVector.
3. Introduced enhanced precision for the result of global reductions.
A new long double type with code "Q" has been introduced.
Its use is limited to returning results from global reductions
and to type conversion back to double.
Global reductions may now produce results in the same precision
as the source, or in the next higher precision.
e.g. QLA_QD3_m_eq_sum_M gives result as type QLA_Q3_ColorMatrix
The addition of long double results requires the possibility of
conversion from long double back to double. e.g. QLA_DQ3_M_veq_M
This change adds new libraries and headers with identifiers qla_dq,
qla_dq3, qla_dq2, qla_dqn, qla_q, qla_q2, qla_q3, qla_qn, and new
accessors, e.g. QLA_Q_elem_M.
4. New elementary unary functions:
ceil, floor, cosh, sinh, tanh, log10
5. New elementary binary functions added to mod, min, max
ldexp, pow, atan2
The naming conventions may seem a bit idiosyncratic...
QLA_F_R_eq_R_atan2_R, QLA_F_R_eq_R_pow_R, QLA_F_R_eq_R_ldexp_I
6. Local squared norm and local inner product
Rather than performing a reduction, these operations generate
a Real or Complex array. e.g.
QLA_F3_R_xeq_norm2_M, QLA_D2_C_veq_D_dot_D
We also include the real part of the inner product, as in
QLA_D3_R_veq_re_V_dot_V
The allowed source types now include matrices as well as vectors,
as illustrated. The meaning of inner product for matrices A, B is
the natural choice: Tr (A^\dagger B)
We have tried to maintain the parallelism in reduction operations:
QLA_D3_r_veq_re_D_dot_D
The change to lowercase r distinguishes global from local
operation.
7. Real to integer conversion...
We now distinguish truncation and rounding. Previously we had
only truncation, so we change the name
QLA_F_I_eqop_R -> QLA_F_I_eqop_trunc_R
and introduce
QLA_F_I_eqop_round_R
Since "round" was first introduced in the standard C library in C99,
we provide a version of "round" for older compilers. The source
code is in the subdirectory c99-src and the library is libqla_c99.a
(just one module for now!).
8. Other added functions
M_eqop_spintrace_P
T_eqop_i_T (multiplication by i for all complex types)
C_eq_C_divide_C
I_eq_I_divide_I
9. Integer reductions now produce only a Real type result:
i_eqop_sum_I -> r_eqop_sum_I
i_eqop_I_dot_I -> r_eqop_I_dot_I
i_eqop_norm2_I -> r_eqop_norm2_I
10. A new real part of the global inner product
e.g. QLA_F3_r_veq_re_D_dot_D
11. Miscellaneous name changes
plus_i_times_R -> plus_i_R
realtrace_M -> re_trace_M
imagtrace_M -> im_trace_M
Re -> re
Im -> im
12. New complex macros for computing the real and imaginary parts
of products:
e.g. QLA_r_peq_Re_ca_times_c for Re(a^* * b)
===========================================================================
2002-11-10 Initial release - version 1.0
===========================================================================
This is the first beta release of the QLA C library. It has passed a
battery of tests, but should be used with caution in production running
until we have had more experience with it.
Code generation is done through Perl scripts. The most
straightforward workable code is produced with no particular effort
made to help the compiler with optimization.
Future releases will feature selected performance tests.
===========================================================================