High-performance computing technologies in partial wave analysis in particle physics

Tokareva V.A., Denisenko I.I.

Laboratory of Nuclear Problems, Joint Institute for Nuclear Research, 6 Joliot-Curie St., Dubna, 141980, Russia, +7 (49621) 6-50-59, tokareva@jinr.ru

Partial wave analysis is a fundamental technique for extracting hadron spectra and studying hadron scattering properties. It is employed in current particle physics experiments such as BES-III, LHCb, and COMPASS, and will be used in future ones such as PANDA. The analysis is typically performed using the event-by-event maximum likelihood method with the objective function \begin{equation} s = -\ln L. \tag{1} \end{equation} Here $L$ is the likelihood of observing the experimental events with their measured momenta: $L = \prod_i P_i$, where $P_i$ is the probability for collected event $i$. In turn, the probability $P_i$ is proportional to the squared modulus of the decay amplitude: $P_i \propto |A_i|^2$.
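Because $L$ is a product over events, the objective function (1) reduces to a sum of per-event terms, $s = -\sum_i \ln P_i$. A minimal sketch of this evaluation is given below, assuming that the amplitudes $A_i$ are already available as complex numbers and that the proportionality constant is supplied as a single factor `norm`. The function name and the handling of the normalization are illustrative assumptions, not the interface of the framework.

```python
import math


def negative_log_likelihood(amplitudes, norm=1.0):
    """Return s = -ln L = -sum_i ln P_i with P_i = |A_i|^2 / norm.

    `amplitudes` is an iterable of complex decay amplitudes A_i, one per
    event. `norm` is a hypothetical normalization constant standing in for
    the proportionality factor between P_i and |A_i|^2.
    """
    s = 0.0
    for a in amplitudes:
        p = abs(a) ** 2 / norm  # per-event probability
        s -= math.log(p)        # summing logs avoids underflow of the product
    return s
```

Working with the sum of logarithms rather than the product itself keeps the computation numerically stable for large event samples.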

The large amount of data already collected and the planned increase in data flow make it especially important to develop software capable of analyzing large data sets in a short time, since existing software is either not designed to scale with growing data volumes or has significant limitations in its capabilities. Fortunately, the computation of the objective function (1) can be naturally parallelized, since the amplitude contributions can be calculated independently for each event.
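The event-level independence can be illustrated by splitting the sample into chunks, computing a partial sum for each chunk, and reducing the partial sums. The sketch below shows only this decomposition, using Python threads as a stand-in; it is not the OpenMP implementation described in this work and is not expected to show a comparable speedup. The names `partial_sum`, `parallel_objective`, and `n_workers` are hypothetical, and the normalization of $P_i$ is omitted as a constant.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """Sum -ln|A_i|^2 over one chunk of events, independently of other chunks."""
    return sum(-math.log(abs(a) ** 2) for a in chunk)


def parallel_objective(amplitudes, n_workers=4):
    """Evaluate s = -sum_i ln|A_i|^2 by distributing events across workers."""
    amplitudes = list(amplitudes)
    size = max(1, math.ceil(len(amplitudes) / n_workers))
    chunks = [amplitudes[i:i + size] for i in range(0, len(amplitudes), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each chunk is processed without reference to the others;
        # the partial results are then reduced to a single value.
        return sum(pool.map(partial_sum, chunks))
```

The same pattern (independent per-event work followed by a sum reduction) is what a parallel loop with a reduction expresses in OpenMP.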

In this work the development of a highly scalable parallel framework for partial wave analysis is described. Its open architecture allows users to extend it with their own resonance models, minimizers, etc. The framework is intended to overcome the limitations of existing analogs and to accelerate the calculations by employing the OpenMP parallel computing technology, high-performance computing optimizations such as vectorization and aligned memory access, and offloading of computationally intensive parts of the code to massively parallel co-processors. The implementation of the framework has been tested on both multi-core CPUs and multi-core Intel Xeon Phi co-processors in the offload execution mode.

At present the architecture and performance of the framework are being optimized using the $J/\psi \to K^+K^-\pi^0$ decay (a typical process that can be observed in the BES-III experiment [1]) in the isobar model $J/\psi \to R_{KK}\pi^0 (R_{KK} \to KK)$ and $J/\psi \to R_{K\pi^0}K (R_{K\pi^0} \to K\pi^0)$, where $R_{KK}$ ($R_{K\pi^0}$) is the intermediate resonance in the $KK$ ($K\pi^0$) kinematic channel.
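In an isobar description the amplitude of an event is a coherent sum of contributions from the intermediate resonances in each channel. The sketch below illustrates this structure only. It assumes a constant-width relativistic Breit-Wigner line shape, omits spin and angular factors, and uses a single $K\pi^0$ invariant mass for brevity; none of these choices is stated in this work. The function names and the tuple layout of `resonances` are hypothetical, and any parameter values passed in are placeholders rather than physical ones.

```python
def breit_wigner(s, mass, width):
    """Constant-width relativistic Breit-Wigner: 1 / (m^2 - s - i*m*Gamma)."""
    return 1.0 / complex(mass ** 2 - s, -mass * width)


def isobar_amplitude(s_kk, s_kpi, resonances):
    """Coherent sum of isobar terms for one event.

    `s_kk` and `s_kpi` are the squared invariant masses in the KK and K pi0
    channels. `resonances` is a list of (channel, coupling, mass, width)
    tuples with channel "KK" or "Kpi"; `coupling` may be complex.
    """
    total = 0j
    for channel, coupling, mass, width in resonances:
        s = s_kk if channel == "KK" else s_kpi
        total += coupling * breit_wigner(s, mass, width)
    return total
```

The per-event probability then follows as the squared modulus of the returned amplitude, so each event can be evaluated without reference to any other.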

The computations were performed using the resources of the HybriLIT heterogeneous cluster [2]. The results on calculation speedup and efficiency as well as a comparative analysis of the developed parallel implementations are presented.

1. BES-III experiment [Electronic resource]: http://bes3.ihep.ac.cn/ (accessed 31.10.2016)

2. HybriLIT cluster, official site [Electronic resource]: http://hybrilit.jinr.ru/ (accessed 31.10.2016)
