### abstract ###
We describe a novel approach to statistical learning from particles tracked while moving in a random environment
The problem consists in inferring properties of the environment from recorded snapshots
We consider here the case of a fluid seeded with identical passive particles that diffuse and are advected by a flow
Our approach rests on efficient algorithms to estimate the weighted number of possible matchings among particles in two consecutive snapshots, the partition function of the underlying graphical model
The partition function is then maximized over the model parameters, namely diffusivity and velocity gradient
A Belief Propagation (BP) scheme is the backbone of our algorithm, providing accurate results for the flow parameters we want to learn
The BP estimate is additionally improved by incorporating Loop Series (LS) contributions
For the weighted matching problem, LS is compactly expressed as a Cauchy integral, accurately estimated by a saddle point approximation
Numerical experiments show that the quality of our improved BP algorithm is comparable to the one of a fully polynomial randomized approximation scheme, based on the Markov Chain Monte Carlo (MCMC) method, while the BP-based scheme is substantially faster than the MCMC scheme
### introduction ###
Graphical model approaches to statistical learning and inference are widespread in many fields of science, ranging from machine learning to bioinformatics, statistical physics and error-correction
Such applications often require evaluation of a weighted sum over an exponentially large number of configurations --- a formidable  SYMBOL -hard problem in the majority of cases
In this paper we focus on one such difficult problem, which occurs when tracking identical particles moving in a random environment
As long as particles are sufficiently dilute, their tracking in two consecutive frames is rather straightforward
When the density of particles and/or the acquisition time increase, many possible sets of trajectories become statistically compatible with the acquired data and multiple matchings of the particles in two consecutive snapshots are likely
Despite of these uncertainties, one expects that reliable estimates of the properties of the environment should still be possible if the number  SYMBOL  of tracked particles is sufficiently large
This is the problem that we want to address here
The nature of the moving particles and their environment are not subject to particular restrictions, eg they might move actively, such as living organisms, or passively
Here, we shall consider the case of a fluid seeded with passive particles, a problem arising in the context of fluid mechanics experiments
Given a statistical model of the fluid flow with unknown parameters, along with the positions of  SYMBOL  indistinguishable particles in two subsequent snapshots, one aims at predicting the most probable values of the model parameters
This task is formally stated in Section~ as searching for the maximum of a weighted sum over all possible matchings between particles in the two snapshots
The problem turns out to be equivalent to computing the permanent of a non-negative matrix, known to be a  SYMBOL -complete problem~ CITATION
The main contribution of this paper is  an efficient and accurate algorithm of Belief   Propagation (BP) type for calculating the permanent  for the class of weight matrices arising from the particle tracking problem
The BP algorithm seeks a minimum of the Bethe Free Energy~ CITATION  for a suitable graphical model
The graphical model is a fully connected bipartite graph: nodes are associated with the measured particles, edges are weighted according to the model of the flow transporting the particles and constraints enforce the condition that exactly one edge per node is active
It is known that BP gives the exact result for the maximum likelihood version of the problem (finding a maximum weight matching) in spite of multiple loops characterizing the graphical model~ CITATION
The BP algorithm for the matching problem is derived and discussed in Section~
BP equations could be understood as a re-parametrization, or gauge transformation, of factor functions in the graphical model~ CITATION
Furthermore, BP solutions also provide an explicit representation of the exact partition function in terms of the so-called Loop Series~ CITATION
Our main technical result is  the derivation of a compact expression and efficient approximation for the Loop Series in the problem of weighted particle matching
This is done in Section~, where the Loop Series is expressed in terms of an  SYMBOL -th order mixed derivative of an explicit functional, reduced to  SYMBOL -dimensional Cauchy integral and finally estimated by a saddle-point approximation
Section~ describes empirical results demonstrating the performance of bare BP and the saddle-point improved BP in comparison with a (simplified) fully polynomial randomized approximation scheme for computing the permanent~ CITATION
Our improved BP achieves comparable accuracy, with significant gains in terms of speed
As the number of particles tracked in experiments is typically large (order tens of thousands) we argue that our approach is both useful and promising for applications
                                                                                                                                                                                                                                                                                                                                                                                                                                                                        loops
