### Abstract:

One of the fundamental problems in Statistics is the identification of dependencies between random
variables. Standard tests of dependence, such as Pearson's correlation, cannot identify all possible
non-linear dependencies. The main difficulty in the design of effective tests of independence is the
wide variety of association patterns that can be encountered in the data.
In this work we will address this problem using three different approaches:
In a first approach, mean embedding is used to map a probability distribution onto an element of a
Reproducing Kernel Hilbert Space (RKHS). Since such a space is endowed with a metric, to perform an
independence test one simply needs to compute the distance between the element in the RKHS that
corresponds to the joint distribution of the random variables considered, and the element onto which
the product of the marginals is mapped. Since the joint distribution of two independent random
variables is precisely the product of their marginals, a non-zero distance implies that the variables are dependent.
If the RKHS is sufficiently rich, a zero value of this distance also characterizes independence. This
independence test is referred to as the Hilbert-Schmidt Independence Criterion (HSIC).
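As an illustration, the biased empirical HSIC statistic can be sketched in a few lines of NumPy. The Gaussian kernel and the fixed bandwidth `sigma` are simplifying assumptions made here; in practice the bandwidth is tuned (e.g. by the median heuristic) and the statistic is calibrated against a permutation null.

```python
import numpy as np

def gaussian_kernel(x, sigma=1.0):
    # Gram matrix of the Gaussian kernel for a 1-D sample.
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    # Biased empirical HSIC estimate: trace(K H L H) / n**2,
    # where H is the centering matrix. Values near zero are
    # consistent with independence; large values signal dependence.
    n = len(x)
    K = gaussian_kernel(x, sigma)
    L = gaussian_kernel(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2
```

For dependent samples (e.g. `y = x ** 2`) this statistic is typically orders of magnitude larger than for independent draws of the same size.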
In a second approach, random non-linear projections are used to represent the data in a high-dimensional
space. Provided that some mathematical conditions are fulfilled by the non-linear projections,
linear correlations in this extended space of random features characterize non-linear dependencies
in the original one. Taking advantage of this property, it is possible to design a test for independence,
which is known in the literature as the Randomized Dependence Coefficient (RDC) test.
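A minimal sketch of this idea is given below, assuming rank-based copula transforms, random sinusoidal projections, and canonical correlation analysis as the linear-correlation step; the number of features `k`, the projection scale `s`, and the small ridge term are illustrative choices, not prescribed values.

```python
import numpy as np

def rdc(x, y, k=5, s=1.0, seed=0):
    # Randomized Dependence Coefficient (sketch):
    # 1. map each sample to its empirical copula via ranks,
    # 2. apply k random sinusoidal projections,
    # 3. return the largest canonical correlation between the
    #    two sets of random features.
    rng = np.random.default_rng(seed)
    n = len(x)
    # Rank transform to (0, 1), plus a constant column for a bias term.
    u = np.column_stack([np.argsort(np.argsort(x)) / n, np.ones(n)])
    v = np.column_stack([np.argsort(np.argsort(y)) / n, np.ones(n)])
    # Random sine features of the copula-transformed data.
    fx = np.sin(s * (u @ rng.normal(size=(2, k))))
    fy = np.sin(s * (v @ rng.normal(size=(2, k))))
    # Canonical correlation analysis on the joint covariance matrix.
    C = np.cov(np.hstack([fx, fy]).T)
    Cxx, Cyy = C[:k, :k], C[k:, k:]
    Cxy, Cyx = C[:k, k:], C[k:, :k]
    eps = 1e-8 * np.eye(k)  # small ridge for numerical stability
    M = np.linalg.solve(Cxx + eps, Cxy) @ np.linalg.solve(Cyy + eps, Cyx)
    rho2 = np.max(np.real(np.linalg.eigvals(M)))
    return float(np.sqrt(np.clip(rho2, 0.0, 1.0)))
```

Because only ranks and a handful of random features are used, the cost is dominated by a small eigenproblem, which is what makes the RDC attractive computationally.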
Finally, in a third approach, one computes a specific distance between the cumulative distribution
functions of the random variables. We will show that, for dimensions greater than one, this distance
becomes difficult to compute directly. Therefore, this quantity is expressed in terms of a distance between
the corresponding characteristic functions, which are in general more tractable. This method is
referred to as the Distance Covariance (DCOV) test.
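In the sample, the characteristic-function distance reduces to a simple computation on double-centered pairwise distance matrices. The following is a minimal NumPy sketch for one-dimensional variables using the V-statistic form; a full test would again calibrate the statistic against a permutation null.

```python
import numpy as np

def dcov(x, y):
    # Sample distance covariance (V-statistic) for 1-D samples:
    # double-center the pairwise distance matrices and average
    # their elementwise product. In the population, this quantity
    # is zero if and only if the variables are independent.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return np.sqrt((A * B).mean())

def dcor(x, y):
    # Distance correlation: dcov normalized to lie in [0, 1].
    denom = np.sqrt(dcov(x, x) * dcov(y, y))
    return dcov(x, y) / denom if denom > 0 else 0.0
```

The double-centering identity is what makes the statistic tractable: it avoids any explicit integration over characteristic functions, at the cost of O(n^2) memory for the distance matrices.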
In this work, a set of Python tools has been developed to compute these different dependence
measures, so that independence tests based on them can be carried out. The properties of these tests
are then analyzed and compared in an exhaustive set of experiments. From this evaluation, we conclude
that the RDC test is both effective and efficient in most of the cases considered.