Dark Energy Survey Year 3 Results: Point-Spread Function Modeling

We introduce a new software package for modeling the point-spread function (PSF) of astronomical images, called Piff (PSFs In the Full FOV), which we apply to the first three years (known as Y3) of the Dark Energy Survey (DES) data. We describe the relevant details about the algorithms used by Piff to model the PSF, including how the PSF model varies across the field of view (FOV). Diagnostic results show that the systematic errors from the PSF modeling are very small over the range of scales that are important for the DES Y3 weak lensing analysis. In particular, the systematic errors from the PSF modeling are significantly smaller than the corresponding results from the DES year one (Y1) analysis. We also briefly describe some planned improvements to Piff that we expect to further reduce the modeling errors in future analyses.


INTRODUCTION
The Dark Energy Survey (DES Collaboration 2016, DES) has already produced very precise constraints on cosmology (Abbott et al. 2018) using just the first year of data (Y1).The Y1 weak lensing (WL) cosmic shear measurements alone were able to constrain the combination σ8 (Ωm/0.3) 0.5 to 3.5% uncertainty (Troxel et al. 2018).Such precise constraints require that systematic uncertainties be controlled to levels smaller than the statistical uncertainties.mjarvis@physics.upenn.edu The first three years of DES data (collectively referred to as Y3) thus require even better control of the various systematic effects that impact shear measurements.
One of the most significant systematic uncertainties in the DES Y1 cosmic shear analysis was the estimation of the pointspread function (PSF) at the location of each galaxy.PSF estimation is difficult because the PSF varies both spatially across the field of view (FOV) and temporally from one exposure to the next.The FOV of the Dark Energy Camera (DECam; Flaugher et al. 2015) is quite large, covering a diameter of 2.2 degrees and containing 62 2k×2k CCDs (charge coupled devises).These CCDs are not per-fectly coplanar, which means that the PSF variation is moderately discontinuous at the edges of each CCD.Furthermore, the PSF can be directly measured only at the locations of stars, which are essentially point sources, so their surface brightness profiles are direct measurements of the PSF at those locations.Since the galaxies are observed at different locations, the PSF must be interpolated to the locations of the galaxies.
In the Y1 analysis effort, it was clear that while the PSF estimation was sufficiently accurate to not significantly bias the Y1 cosmic shear results, we would need to make some improvements for the Y3 analysis, which has smaller statistical uncertainties.Other recent cosmic shear experiments have similarly montioned PSF modeling errors as a significant source of systematic uncertainty (Hamana et al. 2020;Hildebrandt et al. 2020).
The Y1 PSF estimation used the software package PSFEX (Bertin 2011).This package estimates the PSF at the location of each star using a linear combination of basis vectors.In our case, the basis used was simply the pixel values on a grid.The coefficients of this model are interpolated using a polynomial in the chip coordinates, x and y, across the area of each CCD image.
This method worked very well and is quite robust.However, Zuntz & Sheldon et al., (2018) detected some aspects that could potentially be improved for future DES analyses.First, the mean measured size of the PSF models showed a non-negligible offset from the mean measured size of the input stars.Second, a map of both size and shape residuals vs location on the focal plane exhibited spatial patterns that were clearly related to features of the astrometric distortion solutions for DECam.
To address both of these issues, we developed new PSF estimation software, named PIFF (PSFs In the Full FOV), which we describe in this paper.The principal goals for how to extend beyond the capabilities of the tried-and-true PSFEX software were: • Description of the PSF in sky coordinates rather than pixel coordinates to better account for the effects of astrometric distortion.
• Potential to fit the PSF simultaneously across multiple detectors in the FOV in cases where continuous behavior across detectors could be exploited for better fitting.
• Ability to use astrometric maps that are not expressible in the FITS (Flexible Image Transport System) standards1 .
• Code written in or easily accessible from Python.
• Modular design to enable new (potentially non-linear) PSF models, outlier rejection algorithms, and interpolation schemes.
The version of PIFF used for the DES Y3 WL analysis is 0.2.4,but PIFF has been in continuous development since the Y3 PSF models were finalized.In this paper, we demonstrate diagnostics for the DES Y3 PSF model based on version 0.2.4,but we will also point out various features that have already been improved upon in later versions.Throughout the paper, we will mention the choices used for the DES Y3 WL analysis where we discuss the various options enabled by PIFF.The complete configuration file that was used for this analysis is given in Appendix B.
We give an overview of the procedure for building a PSF model for an exposure in §2.We describe the options in PIFF for the PSF surface brightness model, interpolation schemes, and how it finds the overall solution in §3-5.We describe the DES Y3 data in §6 and show our tests of the PSF solution on these data in §7.In §8, we discuss some potential future improvements to PIFF cur-rently in development, and we conclude in §9.Detailed instructions for using PIFF can be found in the online code documentation2 .

SCHEME OF OPERATION
At any particular location in the field of view, the point-spread function is a two-dimensional function describing the mapping from a delta function (a point) in the sky at coordinates (u0, v0) to a surface brightness profile measured by the detector at coordinates (x0, y0) as Iimage(x − x0, y − y0).The functional form of this mapping is called the "model" in PIFF.
In PIFF, we usually express the model in sky coordinates, for which we use the notation (u, v), rather than image (pixel) coordinates, for which we use (x, y).The features of the PSF created by atmospheric and optical distortions tend to vary more smoothly across the field of view when considered in sky coordinates than in pixel coordinates.This is especially true in the presence of highfrequency components of astrometric distortion such as "tree rings" (Kotov et al. 2010;Plazas et al. 2014a) which originate in the detector.While detector-based components of the PSF (such as charge diffusion) may or may not have this property, the DES modeling has been found to be overall better-behaved when using sky coordinates.Fitting in pixel coordinates remains an option in PIFF for applications where this is not true.A mixed model is left to future development.
The sky coordinates u and v are defined in the local tangent plane projection of the sky, as seen from Earth, around a nominal position of the source.Positive v is to the north, and positive u is west.Converting the surface brightness profile between image and sky coordinates is straightforward given the knowledge of the world coordinate system (WCS), which defines the functions u(x, y) and v(x, y).
Iimage(x, y) = I sky (u, v) du dx dv dx du dy dv dy (2.1) where the last factor is the determinant of the Jacobian of the coordinate transformation, which we identify as the pixel area, Apix.
We normalize this function to have unit flux3 , du dv I sky (u, v) = dx dy Iimage(x, y) = 1. (2.2) Henceforth, we will dispense with the "sky" label and merely use I(u, v) as the surface brightness profile in sky coordinates.
The function I(u, v) here is taken to include a convolution by the pixel response.This is sometimes referred to as the "effective PSF" (ePSF; Bernstein 2002), which we take to be continuous even though it is only sampled at the pixel centers.It is an implementation detail of the various models in PIFF whether the underlying description is natively the ePSF or the PSF profile without the pixel.Different models handle this differently, but all models know how to properly draw themselves onto an image including the correct application of the pixel response.
The data for star i consist of the (sky-subtracted) counts diα in all pixels indexed by α in a region around star i.If the pixels each subtend an area Apix on the sky, then the model for the observed surface brightness of star i with flux fi and sky position (ui, vi) is diα = fiApixI(uiα−ui, viα−vi). (2. 3) The likelihood of obtaining the data given the model is given by where we have assumed that the model and data are in units of photoelectrons, so that the total variance of a pixel is the sum of the read/background variance σ 2 iα and the Poisson variance from the expected counts, diα.
The PSF model will always be a function of some vector of parameters p.We will denote as pi the parameters selected to fit the data for star i.When we refer to individual parameters, we will use k to index them: pik .
We must also define some interpolation scheme to provide a function p(u, v, c), which can return p at an arbitrary location (u, v) in the sky domain of the exposure and potentially depend on additional properties of the target, which we denote as c since this is most likely to be a color.The interpolated function will be somehow trained on the particular fits pi measured at the (ui, vi, ci) of the PSF stars.
One basic sequence of operations for a PIFF run for a given exposure (or portion thereof) is: (i) Read as input the list of stellar sky coordinates (ui, vi) for candidate PSF-fitting stars i = 1, . . ., N , pixel data di for the regions around each star, and noise levels σiα at each pixel.We may also have other potential data on the stars upon which the PSF can depend, in particular a color ci.
(ii) Specify a parametric form I(u, v|p) for the PSF model.
(iii) Specify a method to interpolate the PSF parameters p k from the positions (and perhaps colors) of the stars to an arbitrary test point (u, v, c) in the domain.
(iv) Execute a fit of the model to the pixel data at each star to yield a maximum-likelihood estimate pi at the location of the star.
(v) Train or fit the interpolation with the pi vectors.
(vi) Possibly identify and excise outlier stars that are deemed to be poor exemplars of the PSF according to some metric.
(vii) Iterate from step (iii) to refit and reject outliers until convergence is reached.
An alternative approach, described in §4.4 and also implemented in PSFEX, is to merge steps (iii) and (iv) into a direct solution for the parameters of the interpolating function, bypassing the single-star solutions.This has a higher level of computational complexity, but enables the use of PSF models that are not fully specified by a single star, e.g if the model has significant power below the Nyquist scale for the given pixel size, or if any stars have missing data.
We will detail the components of the procedure in the following sections.First, though, we note that there is a subtlety to the process: there is a degeneracy between the choice of center (ui, vi) for a star and the functional form of I(u, v).We can always shift the nominal center, insert a countering shift into the model, and obtain the same fit.There are two possible means to break this degeneracy.PIFF allows one to choose whichever is more appropriate, based on the nature of the input centers.
The first mode, which we label "centered-PSF," is to force the model to be centered, i.e. du dv {u, v}I(u, v) = 0, and treat the positions (ui, vi) of the stars as free parameters.This is appropriate when the stellar positions and/or the WCS solution are not trusted to be extremely accurate, so offsets are more likely indicative of errors in the input rather than real PSF centroid motion.In this mode, the centroid parameters of the PSF model are forced to zero and are not included in the interpolation.It would be left to other processing steps to evaluate any variable astrometric displacements across the exposure.This option is the default in PIFF, or it can be explicitly specified by setting centered=True when defining a Model.This is the mode that we used for the DES Y3 WL analysis.
The second mode, which we will call "fixed-star," is to trust the input positions (ui, vi) of the stars, and assign the shift freedom to the parameters of the PSF function I(u, v).This would be appropriate if the input positions and WCS are both known to be extremely accurate.In this case the centroid offset can be taken to be due to stochastic atmospheric refraction or other instrumental effects, which can then be interpolated along with the other PSF parameters.This option is enabled by setting centered=False in PIFF when defining a Model.

PSF MODEL
PIFF provides a number of possible choices for the functional form of the model, I(u, v), with different advantages and disadvantages in terms of simplicity and realism.

Analytic radial profiles
The simplest PSF models in PIFF are based on isotropic radial functions of the intensity.This radial function may be sheared and dilated, but is otherwise fixed to a given functional form:.
where g1, g2, and s are free parameters in the fit, which effect the shear and dilation4 .Normally, these analytic profiles are taken to describe the PSF profile without the pixel response, so the full ePSF profile, which we've been calling I(u, v), would be where * denotes convolution and P (u, v) is the square pixel response projected into sky coordinates according to the WCS.This convolution is handled automatically by GALSIM5 (Rowe et al. 2015) when the profile is being drawn.However, there is also an option to tell GALSIM not to include the pixel convolution when drawing the PSF by setting in-clude_pixel=False, in which case I(u, v) = Ip(u, v).This is not normally recommended (not least because the pixel response is not a radial function), but it can be useful in some very simple scenarios for testing purposes.
There are three available options in PIFF for the radial function, f (r): • Gaussian uses the radial function where σ is taken to be 1 arcsecond, since it is degenerate with the overall dilation that is allowed by the fit.Gaussian profiles are not usually particularly good matches for real PSFs, but they are very simple, and therefore are sometimes useful for simulations.
• Moffat uses the radial function where r0 is degenerate with the dilation so is essentially arbitrary at this point6 .The β parameter controls the concentration of the profile and must be explicitly specified.We do not yet have the capability to fit for β as part of the fit, although this could be added if someone has a use case that requires it.
• Kolmogorov is defined in Fourier space (Racine 1996) where r0 here is the Fried parameter (Fried 1966), a property of the atmospheric conditions at the time of the observation.The realspace function is the Hankel transform of F Kolmogorov (k): In all cases, the overall amplitude A is set such that the integrated flux is unity.Because the radial functions are naturally centered, implementing the centered-PSF mode is simple.To induce fixed-star mode, we need to add two additional parameters (uc, vc) to the set of model parameters to specify how the center is offset from the nominal position.These are subtracted from (u, v) prior to the application of the shear and dilation (equation 3.3).
The maximum likelihood solution is found for 6 parameters: , g1, g2, s, uc, vc] (3.9) using non-linear least squares (with scipy).The estimated covariance matrix of the solution is computed from the Jacobian matrix J returned by scipy: which may be used by the interpolator if needed (currently only GPInterp can use it; cf.§4.3).Then the final parameter vector for the PSF model is a subset of the full maximum likelihood solution: [g1, g2, s, uc, vc] fixed-star (3.11)where the flux of the star A is ignored.For centered-PSF mode, the fitted uc, vc values are used to update the nominal position of the star.

Pixel grid
The PixelGrid model is much more general than the analytic profiles.It models the PSF profile as a two-dimensional grid of points, smoothed into a continuous function by a 1d kernel function K(x): (3.12) There are Npix free parameters p k in the model, where Npix is the total number of pixels in the model grid.The center (u k , v k ) of each pixel k are set onto a regular square grid of chosen spacing and dimension.The grid orientation is parallel to the u and v directions.
For the kernel function, we use the Lanczos interpolation kernels, K(x) = Ln(x), where and n is a free (integer) parameter, whose default in PIFF is n = 3.The grid of pixels where the model is defined does not need to be the same size or orientation as the pixels in the observed image.Indeed, since the model is constructed in sky coordinates, there is always some WCS transformation from model space to the data, which means that in general an interpolation would always be required to the constrain the model parameters defined in (u, v) coordinates from data in (x, y) coordinates.This is the model we used for the DES Y3 WL analysis.Specifically, we used a 17 × 17 grid of pixels with a pixel scale of 0.30 arcseconds.These are about 15% larger than the data pixels (0.263 arcsec), which we found helped improve the robustness of the fit compared to using model pixels nearly the same size as the data pixels.
The coefficients p k for a given star can be constrained by minimizing where the sum on α is over the observed data pixels and the star has some flux f and centroid (uc, vc).Minimizing this leads straightforwardly to a design matrix for the coefficients for which the maximum likelihood solution is Since the model has translational freedom, the centered-PSF mode described in §2 is not as simple as it was for radial profiles.The centroid of the pixel grid profile is a derived property based on all Npix parameters-as is the overall flux constraint.To implement this mode, the current version of PIFF starts with an initial estimate of the flux and centroid based on simple (0th and 1st order) moments of the data, diα.The model centroid is not forced to zero during the fit, and the flux is not forced to unity, so the solution can have non-zero centroid and non-unit flux.Then, at the start of each iteration, the position of each star is updated such that the best match to the current model would have zero centroid.Similarly, the model is renormalized to have unit flux, and the flux of the star is updated to match the best fit to this model.This is essentially a projected gradient descent algorithm on these three parameters.This algorithm tends to converge quickly to models with zero centroid and unit flux in almost all cases.The fixed-star mode uses the same pattern, but only updating the flux, which tends to converge even faster.
The version of PIFF that we used for the Y3 analysis (version 0.2.4) used a different algorithm for the centered-PSF mode.The details of this algorithm are given in Appendix A. However, we found that this could sometimes lead to numerical instabilities in the solutions during the interpolation step, leading to spurious checkerboard patterns when extrapolated to locations not near any constraining stars (cf. Figure 1).This failure mode was mitigated by using a model grid size that was somewhat larger than the data pixel size (0.3 arcseconds vs 0.263 arcseconds), which made the checkerboard failures rare.We believe the blacklist procedure described in §6.4 removed most of the CCDs that were still affected by it.The algorithm described above does not suffer from this failure mode, but we did not discover this solution until after the Y3 PSF solutions were finalized.We expect that future applications of PIFF will be able to use a grid size commensurate or even smaller than the data pixels, although this has yet to be extensively tested.

PSF INTERPOLATION
The profile of the PSF is not constant across the field of view.We have measurements of the PSF at the locations of the stars, but we generally need to know the PSF at other locations, such as where galaxies are observed.
PIFF has a number of potential methods for doing the interpolation, which can typically be matched with any of the various PSF models described in §3.The information about the model is parameterized by a vector p with measurements (or constraints) taken at the locations of the stars pi(ui, vi).
All currently-implemented interpolation schemes except the BasisPolynomial ( §4.4) follow the separated-solution scheme of §2, whereby a maximum-likelihood solution pi is first derived for each star for the specified model.Then a second solution is derived for the parameters of the interpolated function p(u, v, c) by fitting to or training on the pi vectors.

Simple polynomial interpolation
The simplest possible interpolation scheme is to take the model as constant across the field of view.This is not usually a particularly good approximation to reality, but it can be useful in some cases, such as a simulation that really does have a constant PSF model.This simplistic interpolation scheme is called Mean in PIFF.
Slightly more complicated, but still rather simple, is Polynomial interpolation.In this scheme, each coefficient in the vector p, p k , is interpolated according to an arbitrary polynomial in u and v (and possibly other parameters assigned to each star, such as color).
where Ki is a basis vector giving the relevant polynomial terms at the location of star i: Ki = {1, ui, vi, u 2 i , uivi, v 2 i , . . .}.Note that the basis may include other factors, such as ones involving a color term ci, if desired.Terms in the u and v parameters are included up to a maximum total m+n.The order of the polynomial may be different for each parameter if desired.The coefficients Q km are found by a maximum-likelihood fit to the pik values.

K-nearest neighbors
The kNNInterp class in PIFF implements a k-nearest neighbor regression (Altman 1992) on the parameter vectors pi.The estimated parameter vector at an arbitrary location (u, v) is taken to be the weighted average of the k constraining stars nearest to that location.
The implementation is based on the SCIKIT-LEARN7 class KNeighborsRegressor, which can weight the k points either uniformly or inversely by their distance to the interpolation location.The default weight is uniform, which we find usually gives better results for PSF interpolation, since it smooths over typical noise in the measurements of the PSF vectors.
This interpolation scheme can potentially give better performance for PSF patterns that have complicated functional forms, which are not well modeled by a polynomial.The appropriate value of k to choose (default is 15) depends on both the stellar density of the observations and the expected scale length of variations in the true PSF pattern.

Gaussian processes
The GPInterp class in PIFF implements a Gaussian process regression (Rasmussen & Williams 2006) on the parameter vectors pi.Gaussian process regression, also known as "Kriging," assumes that the parameter vector p at every location is drawn from a multidimensional Gaussian random field across the (u, v) space.It requires an estimate of the spatial covariance function of p, commonly referred to as the kernel.The interpolation estimate at an arbitrary location (u, v) is the minimum variance unbiased estimate from the Gaussian distribution at that location conditioned on the values pi measured at all the PSF stars.
The GPInterp class is implemented using the TREEGP module8 .It can use a number of possible kernels from SCIKIT-LEARN to define the covariance matrix along with a few custom options, which we use in PIFF.The default kernel is the so-called "squared  3) the residuals from a second order polynomial fit on each CCD, and (4) the residuals from an anisotropic Gaussian process fit over the whole FOV.Each panel shows "whiskers" whose length is proportional to the ellipticity of the PSF, and whose orientation matches the orientation of longest axis of the PSF shape.In both cases, the residuals are shown only for a set of reserve stars, which were not used to fit the PSF model.
exponential" or "radial basis function" kernel, known as RBF in SCIKIT-LEARN.
The TREEGP module has several methods for how to optimize the kernel's hyper-parameters, which are available in PIFF via the optimizer parameter: (i) likelihood is the traditional maximum likelihood optimization.It finds the hyper-parameters that maximize the likelihood of the Gaussian process solution.This is similar to the optimization done by the GaussianProcessRegressor in SCIKIT-LEARN, although technically TREEGP has a custom implementation, which uses scipy.optimizefor its back end.
(ii) isotropic uses a direct measurement of the two-point correlation function of the pik values using TREECORR9 (Jarvis et al. 2004;Jarvis 2015).This is the default optimizer, since it is typically much faster and may be somewhat more accurate than the the maximum likelihood method.
(iii) anisotropic is similar to isotropic except that it uses the TwoD binning option in TREECORR, which measures the correlation function in two spatial directions, rather than just radially.This is particularly useful if the PSF pattern is significantly anisotropic (e.g.due to a predominant wind direction).In this case, the resulting kernel is anisotropic as well.
(iv) none does no optimization of the hyper-parameters.This is probably only useful for tests where you may know the true kernel and don't want any optimization to be performed.
Typically when Gaussian processes are used on noisy data, one should include a "white noise kernel" representing noise in the individual measurements pik .PIFF (by default) accounts for this using the estimated variance from the model fits.All models in PIFF calculate variance estimates along with the best fit values, which are usually sufficient for this purpose.However, there is an option to include an additional white noise kernel if one thinks that these estimates are insufficient to completely describe the measurement noise.
Gaussian process regression makes the assumption that the expectation value of the parameters p k at each location is zero (or some function that can be modeled with hyperparameters).If this is not true, e.g. because there is some static pattern imposed on top of the Gaussian variation, then one would need to subtract off this static pattern before running the Gaussian process interpolation.PIFF includes a mechanism to do this, called meanify, which computes the mean values of the measured parameters over a number of exposures to estimate the static pattern.This can then be subtracted before using the Gaussian process interpolation.
In simulations of atmospheric PSF variation, we have found that Gaussian process interpolation works very well when the true PSF only includes an atmospheric component.Figure 2 shows an example where we simulated a purely atmospheric PSF pattern.The first panel shows the true pattern of PSF ellipticities without noise.The second shows the ellipticities with measurement noise.We fit this pattern both with a second order polynomial across each CCD and with a Gaussian process (using the anisotropic optimizer) across the full FOV.In both cases, we reserved 20% of the stars to use for validation.The residuals on the reserve stars in the two cases are shown in the third and fourth panels.The Gaussian process is clearly superior, leaving much smaller residuals than the polynomial, although to be fair, this is experiment is slightly circular, since the simulation was made with a Gaussian process for the atmosphere.
We have not used this method for DES Y3 data though, because it does not work very well when the real PSF includes an optical component, including discontinuities at the chip boundaries.We are still working on a PSF mode that can properly account for optical effects (including the chip boundaries) coupled to a Gaussian process interpolation for the atmospheric component (cf.§8.2).

Basis-function polynomial interpolation
An alternative to the simple polynomial interpolation implemented in Polynomial is to delay the full solution of the p k values at the location of each star until the code is ready to also fit for the interpolation coefficients.This scheme is implemented in the Ba-sisPolynomial class.This is the interpolation scheme we used in the DES Y3 WL analysis to interpolate the PixelGrid model.
For the previous interpolation schemes, the maximumlikelihood estimate pi is derived using an iterative process, since the model equation 2.3 is not linear in the fluxes and centroids of the stars, and may not be linear in the parameters p ki .Furthermore it is important to note that the likelihood in equation 2.4 contains the model in the noise estimate.The iteration step consists of finding the least-squares solution for the differential parameter shift δpi to the equation Aiδpi = bi (4.2) PIFF computes Ai and bi of the design equation using the values of pi and diα derived in the previous iteration.The derived δpi are then added to the solution to obtain the parameters for the next iteration.After the first pass, the shifts are generally relatively small adjustments in the fit.
For the BasisPolynomial interpolation, we instead directly model pi in terms of the interpolation coefficients Q km via where Ki is a basis vector giving the relevant polynomial terms at the location of star i: Ki = {1, ui, vi, u 2 i , uivi, v 2 i , . . .}.The design equation for the iteration step of an individual star can be rewritten as The BasisPolynomial algorithm essentially concatenates the design equations for all the stars i into a single system which can be solved for δQ, thus using the information from all of the stars at once.One advantage of the BasisPolynomial interpolation scheme over the simpler Polynomial scheme is that it is less affected by missing data such as from cosmic rays or bad columns.The design matrix for a star with missing data can omit those pixels from the constraint, in which case the solution for the coefficients Q km will use the other stars to constrain any degrees of freedom that involve those pixels.This feature is especially important for the PixelGrid model, where missing data generally leads to singular (or at least poorly conditioned) matrices for those stars.With an interpolation scheme that requires the solution pi be completed for each star, these stars would have to be excluded.But using Ba-sisPolynomial, such stars can still provide the information that is available, while leaving out the pixels that cannot provide any constraining power.
Another nice feature of this scheme is that it is straightforward to properly include the uncertainties.This scheme directly uses the variance in the pixel counts without requiring the intermediate calculation of Cov (p) and propagation of that through the interpolation fit.

SOLVING FOR THE PSF
When solving for the full PSF solution, PIFF uses an iterative approach to successively improve the solution.This has a number of advantages over a single-pass direct solution.First, it allows for the possibility of rejecting outlier stars after each iteration if there are some stars that are not good exemplars of the PSF (perhaps because they are binary or have a neighbor contaminating the image around the star).Second, it makes it easier to handle missing data (e.g.bad pixels or columns); we may, if we wish, use the solution from the other stars to backfill the missing pixels for a particular star, although this is not necessary when using BasisPolynomial.Third, as noted above, the iterative process allows us to use the model diα rather than the measurement diα to estimate each star's Poisson contribution to the noise.Using the Poisson shot noise estimated from the observed signal is necessarily biased, since pixels that happen to scatter high will have too high an estimated variance and be under-weighted, and those that scatter low will have too low an estimate and be over-weighted.
The overall procedure starts by making a small cutout image of the observed pixels from the full image around each star, centered as nearly as possible at the star's input position.By default these cutout images are 32 × 32 pixels square, but this can be changed if desired.There are options to remove some stars at this point based on measurements of the star on these cutout images.
Then for each input star, PIFF initializes the solution with a rough guess for the model.Different models do this in slightly different ways, but generally they start with something like a Gaussian profile matched to the stars' measured second moments.This defines the initial model vectors pi.
During each iteration of the solution phase, PIFF either solves for the shifts δpi according to the chosen model, as described in §3, or computes the design equation for these shifts (equation 4.2) if using BasisPolynomial for the interpolation.
Then PIFF solves for the interpolation coefficients according to the chosen interpolation scheme, as described in §4.This finishes the solution for this iteration.The solution is evaluated at the location of each input star as the starting point for the next iteration.
At the end of each iteration, PIFF can look for outliers and remove them from consideration for the next pass.It also computes the overall χ 2 and degrees of freedom (dof) of the solution.If there were no outliers removed and the change in χ 2 is below a userset threshold or if it has reached a user-set maximum number of iterations, then the process stops, and the solution is written to an output file.

Input data
PIFF has a number of options for specifying the input data.First, it needs the image (or images) making up the field for which to solve for the PSF.One can run PIFF on a single CCD image or on all the images from a single exposure (or some fraction thereof).Each image is typically input as a FITS file, including a weight or variance map, the world coordinate system (WCS), and possibly a bad pixel mask.The user can specify specific bits in the bad pixel mask to exclude from consideration.
One can also specify a different WCS to use rather than the one in the FITS file if desired.We used this feature in the DES Y3 WL analysis to use the improved PIXMAPPY solution (cf.§6.2), which includes tree rings and other subtle effects that cannot be expressed in the regular FITS WCS specification.
The other required input is a list of stars and their positions on the image(s).These can be given either as x and y pixel values or as right ascension and declination.PIFF does not currently have any ability to determine which objects are stars automatically, but it can use a flag column to select only a portion of the given input catalog if appropriate.
There are also options to remove some stars based on features of their images.For instance, one may specify a saturation threshold to exclude stars that have any pixels above this value.If the star's cutout image is partially off the image, it is normally excluded, but one can optionally keep such stars.One can also exclude stars whose measured size is an extreme outlier compared to the other stars (e.g.due to neighbors or image artifacts) or whose measured signal-to-noise ratio (SNR) is smaller than a given value.For the DES Y3 WL analysis, we excluded stars with SNR < 20.
In addition to removing stars that are deemed bad for some reason, there is also an option to reserve some stars from being used for the PSF solution to serve as fair test stars for diagnostics.These stars are not used for any part of the iterative solution, but they are included in the output catalog, marked with a flag indicating that they are reserve stars.For DES Y3, we reserved 20% of the input stars.
The weight map image is used to determine the measurement noise on the pixel values.The weight map is typically the inverse variance of each pixel, but PIFF can also read a (not-inverse) variance map.The input weight should ideally include only the estimated variance from the sky, read noise, dark current, etc., not including the Poisson shot noise of the signal itself.(This is the case for the processed images produced by DES data management.)As noted above, estimating the source shot noise from the measured counts will induce a bias on the measured PSF and fluxes.If the input weight (or variance) map includes the variance of the signal as well as the uniform sources of noise, then PIFF has an option to remove it using the gain (either provided or fit for by PIFF from the variance map and the image).
Finally, another option that we found to be important in the DES Y3 WL analysis is to down-weight the brightest stars.If each star keeps its real noise estimate, then most of the weight comes from just the few brightest stars in the image.This tends to bias the interpolation solution to overfit the modes that pass through these stars.Therefore, we have an option to effectively limit the nominal SNR of the bright stars to a given maximum.PIFF doesn't actually add noise to achieve this nominal SNR; rather, it just decreases the weight map to the level that would give bright stars this SNR value.The default for this parameter is 100, which was used for DES Y3.This choice was found to produce good results (cf.§7), but the results were not very sensitive to changing this by a factor of 2 or so.
Note that while Piff can be used on co-added images, rather than single-epoch images, it is not recommended if the dithering strategy includes large offsets.The PSF in the co-add is discontinuous at the location of every chip edge from the single-epoch exposures.These cause problems when trying to interpolate the PSF across the co-add image.Additionally, the current implementation assumes the noise is uncorrelated across pixels, which would not be true in general for co-added images.

Outliers
At the end of each iteration, there is an option to remove stars that are determined to be outliers and thus are probably not good exemplars of the PSF.Currently there is only one outlier method, although the code is written to accommodate the addition of other algorithms for identifying stars to remove.
The Chisq outlier method looks for stars whose measured χ 2 is very large.Specifically, for each star i, it sums over pixels α in the cutout of the star: where the sum is over all of the pixels in the cutout image for that star.As discussed above, the total noise in each pixel is taken to be the sum of the read/background variance σ 2 iα and the Poisson variation of the expected counts, diα.If the χ 2 value is more than some threshold, then the star is removed from consideration for subsequent iterations.
The usual way to specify the threshold is in terms of an effective number of "sigma".Given a specified nsigma value, PIFF calculates the corresponding probability that a Gaussian distribu-tion could exceed this many sigma.10For instance, nsigma=2 corresponds to p = 0.05, nsigma=3 corresponds to p = 0.003, etc.If preferred, users can also input the probability directly.
Then for each star, PIFF calculates the threshold for which this is the probability that the measured χ 2 would exceed the threshold purely from statistical noise, given the number of degrees of freedom for that star11 .For the DES Y3 WL analysis, we used nsigma=5.5,which corresponds to p = 4 × 10 −8 .
One can also specify a maximum number of stars to reject in each iteration.This is generally a good idea, since a small number of outliers can potentially skew the solution to the point where almost all of the stars have a bad χ 2 value.For DES Y3, we limited the rejection to at most 1% of the stars in the exposure (rounded up), which typically translated into either 1 or 2 stars per iteration.

Output files
Once the iteration has converged, PIFF writes the final solution into an output FITS file.The file format is rather complicated, using many HDUs (header/data units) to store the various kinds of information in a modular way.For instance each Model class and each Interp class stores different kinds of information, so each uses one or more HDUs in a class-specific way.
Users do not need to know anything about this file format however, since the piff Python module has code to read the output file and reconstruct a PSF object that can compute the correct PSF profile at an arbitrary location.PIFF is designed to serve the roles both of solving for the PSF solution and of using that solution for further analysis.
In addition to the FITS file containing the final PSF solution, one can also choose to have PIFF generate a number of diagnostic output files.These include a number of plots including residuals as a function of position in the field, diagnostic statistics, and more.These plots are not very useful for characterizing the quality of the fits for a large data set (such as DES Y3), since they are made for one exposure at a time, but they can be very useful for simple sanity checks when trying out different configuration choices for a particular data set.
One can also have PIFF output a catalog with measurements of the size and shape of both the PSF model and the actual star images.These catalogs are more useful for large scale diagnostics, since one can generate statistics for many images by combining these data.The plots in §7 are made from these residual measurements. 126 DATA The first three years of data from the Dark Energy Survey (DES Y3) covers nearly 5000 square degrees of the (mostly) southern sky, and includes close to 40,000 exposures reaching an i-band limiting magnitude (10σ detection, 2" aperture) of 23.4 (DES Collaboration 2018).We refer to Sevilla-Noarbe et al. ( 2020) and Morganson et al. (2018) for most of the details about the data reduction, including flat fielding, sky-subtraction, noise characterization, and object detection, but we mention some relevant points here.

Brighter-fatter correction
One of the most important improvements in the data reduction process compared to the DES Y1 reduction is that a correction was applied to remove the "brighter-fatter effect" (BFE; Antilogus et al. 2014;Guyonnet et al. 2015).The BFE is a natural consequence of Coulomb's Law for the electrons being accumulated in the detector.The electrons accumulated in high-flux pixels repel some of the other electrons arriving later in the exposure, which would have been expected to fall in these pixels.The later-arriving electrons tend to be pushed outward from the centers of bright objects, causing these objects to appear somewhat larger than they would have appeared in the absence of BFE.The enlargement is more significant for brighter and more compact sources (i.e. the ones with the highest surface brightness).
The impact of BFE is quite significant in DECam images, noticeably affecting the sizes of the brightest three magnitudes of stars on any given image (Melchior et al. 2015;Jarvis et al. 2016;Zuntz et al. 2018).In the Y1 analysis, we were thus forced to remove these stars from our sample of PSF stars to avoid biasing the PSF size.
The correction procedure applied to the Y3 images was originally proposed by Antilogus et al. (2014) and quantified for DE-Cam by Gruen et al. (2015).It involves moving some of the flux observed on the image back to where it would have fallen in the absence of BFE.This correction is applied directly to the pixel values early in the data reduction process (Morganson et al. 2018).
As we will see below ( §7.2), the correction does not work perfectly, but it corrects for about 90% of the full effect.We found that there was still a non-negligible bias in the sizes and shapes of stars within 1.2 magnitudes of saturation.Therefore, we still needed to remove these very bright stars from our PSF sample before running PIFF.However, this means that we included almost two magnitudes more stars in our PSF sample than in Y1, which significantly helped improve the Y3 PSF solutions.

World Coordinate System
Another important detector effect seen in earlier DES analyses is a circular pattern most easily seen in the flat field images.This effect, known as "tree rings", is due to lateral electric fields in the CCDs due to impurities in the silicon.The impurities are deposited as the silicon crystals are grown, which leads to a ring pattern around the initial crystal seed location.Plazas et al. (2014b) showed that the tree rings are primarily an astrometric effect causing the effective pixel size to vary radially around the centers of the silicon crystals.
For the DES Y3 WL analysis, we used an astrometric solution that included the tree ring information as part of the model, fit from the positions of objects on overlapping images (Bernstein et al. 2017).The model also included other astrometric effects including the telescope distortion pattern, edge distortion at the edges of each CCD, and adjustments in the precise positions of each CCD, which were determined to move slightly each time the camera warmed up.
The full solution to the world coordinate system (WCS) is implemented by the PIXMAPPY13 software package.We used this WCS solution for each exposure rather than the simpler TAN-PV WCS solutions stored in the FITS files.Since PIFF models the PSF in sky coordinates, the astrometric effects of the tree rings and other distortion effects are accounted for by transforming the image data of each star to sky coordinates.The led to a significant improvement in the PSF shapes compared to using the native FITS-based WCS solutions.

Selection of PSF Stars
The selection of stars to use for PSF modeling in DES Y3 used the same algorithm as has been used in both of the previous DES analyses: Science Verification (Jarvis et al. 2016) and Y1 (Zuntz & Sheldon et al., 2018).We refer to those papers for details, but we provide a brief summary here.
For the initial catalog of objects, we used SEXTRACTOR (Bertin & Arnouts 1996).We then used a size-magnitude diagram to identify stars as a locus of points with constant size separating from the larger cloud of galaxies.For the magnitude, we used the SEXTRACTOR measurement MAG_AUTO.For the size, we used the scale size, σ, of the best-fitting elliptical Gaussian profile using an adaptive moments algorithm.The locus is easy to identify by eye at bright magnitudes.Figure 3 shows an example size-magnitude diagram for a representative DES Y3 image.The pink and green points were identified as stars, and the black points are other objects (both galaxies and stars that may be too faint or noisy to be identified as such).
The algorithm we used to automate this identification starts with the brightest 10% of the objects (excluding saturated objects) and finds a tight locus at small size for the stars and a broad locus of galaxies with larger sizes.Then the algorithm proceeds to fainter magnitudes, building up both loci, until the stellar locus and the galaxy locus start to merge.The precise magnitude at which this happens is a function of the seeing as well as the density of stars and galaxies in the particular part of the sky being observed.The faintend magnitude of the resulting stellar sample thus varies among the different exposures.The green and pink points in Figure 3 were identified as stars by this algorithm.
As discussed above ( §6.1), the initial data processing included a correction for the brighter-fatter effect.However, we found that the brightest stars still showed some significant size residuals.We therefore removed the identified stars within 1.2 magnitudes of saturation to avoid these stars biasing the inferred PSF.We also removed stars that were close to the edge of the CCD or near the DECam "tape bumps" (Flaugher et al. 2015).We also removed a random 20% of the remaining stars as "reserve" stars for the diagnostic shown below in §7.The pink points in Figure 3 show the stars that were removed for one of these reasons.The green points represent the final stellar sample for this CCD, which was input into PIFF.

Blacklist
Immediately upon completion of a PIFF model, we perform some basic sanity checks to make sure the model seems plausible.If a PSF model for a particular CCD is considered suspect for any of the following reasons, we enter it into a "blacklist" and exclude this CCD from any further analysis.
• Too few stars: We flag images for which fewer than 25 stars survived the outlier rejection.
• Too many stars: We flag images where more than 30% of the objects were considered stars.This is unusual and generally indicates a problem with the star selection, rather than a truly dense stellar field.
• Outlier size: If the mean size of the PSF solution for one CCD was very different (4 sigma outlier) from the others on the same exposure, we assume either the fit or the star selection for that image failed.
• Large spread in the PSF model sizes: If the standard deviation of the sizes of the final PSF stars is more than 20% of the mean size, then this tends to indicate a bad PSF solution.
• Errors when running PIFF or the stellar locus codes: These were rare but happened occasionally.
• No PIXMAPPY WCS solution: A few exposures were taken during periods with insufficient calibration information to produce a reliable PIXMAPPY WCS solution.
A little fewer than half the exposures had at least one CCD blacklisted by these checks.Of these, the average number of CCDs removed from consideration of the downstream analysis was close to two.This blacklisting procedure thus led to the removal of about 2% of the Y3 data.

Tests of Stellar Purity
To produce an accurate PSF model, the selection of PSF stars should be very pure.If there are any galaxies included in the sample, then this will bias the resulting PSF size to be slightly larger than the true size.We therefore performed two tests to check whether and to what extent any galaxies were erroneously included in the stellar sample.First, we follow a test presented in Amon et al. (2018, Appendix C) for the Kilo-Degree Survey (KiDS) to match our catalog of PSF stars to the Gaia Data Release 2 (DR2) stellar catalog (Gaia Collaboration 2018).These are believed to be a very pure sample of stars (Bailer-Jones et al. 2019), but they do not extend as faint as our data.Thus, the matching is only expected to be close to complete at brighter magnitudes.
The left side of Figure 4 shows histograms of our stellar sample as a function of magnitude including the portion that matches the Gaia catalog (blue) and the portion that does not (red).The match is very close to complete at bright magnitudes; more than 98% of the PSF stars match a Gaia star over at least 2 magnitudes in all bands.At the fainter end, the Gaia catalog becomes incomplete, and the DES data include more stars, which are not matched to any Gaia stars.The g and Y bands also show an unmatched population at the bright end, since some stars brighter than the Gaia sample are unsaturated in DES data.Within the range of magnitudes where the Gaia catalog is complete, we find that essentially all the selected PSF stars are matched to Gaia stars.This implies that there are very few galaxies being included in the sample at these magnitudes.
In Figure 5, we show the colors of our selected stars with (left) and without (right) matches in the Gaia DR2 high purity stellar catalog.The objects without Gaia matches follows the same stellar locus as is seen for the objects with Gaia matches.In particular, this is true both for the bright unmatched objects (shown as individual points) and for the fainter objects (shown as the blue intensity scaling).There does not seem to be any significant sub-population of the bright or faint selected stars with different color properties than those of the high-confidence Gaia stars.
At fainter magnitudes, where the Gaia catalog is incomplete, we also looked at the size distribution of the stellar sample.The right side of Figure 4 shows size-magnitude diagrams of the stars, color coded in the same way as the histograms on the left.This test is less clear than where a direct match is possible, but there does not appear to be any significant population of objects with sizes at faint magnitudes that differ noticeably from the sizes of the bright stars, with the possible exception of g band.For g band, there does seem to be a cloud of red points with larger sizes than would be expected from the blue points.It is thus possible that there are some galaxies being included in the stellar population for g band.We chose not to .Histograms of the sizes (cf.equation 7.4) for PSF stars in the VHS overlap region according to whether they were successfully matched to the VHS catalog, and if so, whether they had colors consistent with being a star (gray) or not (red).The unmatched stars are blue.The outliers show a slightly higher mean size than the high-confidence stars, but we calculate that this bias is not significant.use the g band for any Y3 weak lensing analysis, in part because of this apparent contamination14 .
Another test of the stellar purity comes from the infrared (IR) colors of the objects.In particular, if we include K-band colors from the VISTA Hemisphere Survey (McMahon et al. 2019, VHS), then we find a simple color selection, provides a very good discrim-ination between stars and galaxies: This selection is shown in Figure 6.We consider the points that appear above this line (i.e. the > condition in equation 6.1) as likely to be non-stellar.These "color outliers" are thus likely to be poor objects to include in the PSF catalog.Some motivation for the use of IR bands for star-galaxy discrimination was given by Jarrett et al. (2000), who point out that galaxy light is dominated by old stellar populations with significant flux at 2 microns and that their redshifts tends to push additional light into IR bands.Similar color cuts using a combination of optical and near-IR colors have been found to be effective by, e.g., Ivezić et al. (2002); Baldry et al. (2010); Sevilla-Noarbe et al. (2018).
Unfortunately, the VHS catalog only covers about half of the DES Y3 footprint, and 2MASS (The Two Micron All Sky Survey, Skrutskie et al. 1997) and WISE (The Wide-field Infrared Survey Explorer, Wright et al. 2010) are both too shallow for this purpose.Therefore, we cannot apply this cut to our entire input sample.Additionally, this idea for selection was proposed after our Y3 PSF catalogs were finalized, so we did not even apply it for the portion where we had VHS overlap.Rather, we use this test to quantify how much the interloping galaxies might be biasing the size of the PSF.
Figure 7 shows the distribution of observed sizes of PSF stars in a single representative exposure according to their infrared colors.Blue shows the stars that were not matched to VHS observations, and so do not have a K-band magnitude.Gray shows the matched stars that fall below the condition in equation 6.1, and thus are expected to be true stars.Red shows the matched stars that fall above this condition, called color outliers, which are likely to be galaxies.For this exposure, the color outliers constitute less than 1% of our PSF stars, and their mean size is larger than the highconfidence stars by about 4%.This means the interloping galaxies may be inducing a fractional bias in the size of about 4 × 10 −4 for this particular exposure.
Figure 8 shows the fraction of color outliers and their mean fractional shift in measured size, ∆T / T , for a random sample of Y3 exposures in the i band with matching VHS data.While the specific values vary somewhat from exposure to exposure, there is no evidence of any exposures with particularly bad stellar identification.Nearly all exposures have less than 3% outlier fraction with a fractional size shift less than 0.05.This implies that the fractional bias in the PSF size is nearly always less than 1.5 × 10 −3 .This bias is small enough not to be important for Y3 weak lensing analyses, but it is not completely negligible, so we will try to improve upon this for the Y6 PSF modeling.

PSF DIAGNOSTICS
We have performed a number of diagnostic tests of the quality of the DES Y3 PIFF solutions.Among the various uses of the PSFs, weak lensing shear estimation generally has the strictest requirements on the quality of the PSF estimates.We have therefore primarily focused our attention on tests of errors in the PSF solutions that could adversely impact weak lensing shear estimates.
All of the diagnostics are calculated using a set of "reserve" stars, which were not used to constrain the PSF solutions.We reserved 20% of the selected stars (cf.§6.3) at random to constitute this reserve set and removed them from the list of stars passed to PIFF.These stars thus provide an unbiased estimate of the errors at random locations in the image.
We calculated the PSF solution at the location of each of these reserve stars to compare to the actual observed surface brightness profile of the star.We used NGMIX (Sheldon 2015) to measure the second moments of both the observed stars and the PIFF models drawn at these locations: These moments are not computed by direct summation, which is somewhat unstable in practice.Rather NGMIX finds the best fitting elliptical Gaussian profile to the observed flux distribution.The above moments are then taken to be the analytic second moments of this profile.The size T is defined as the trace of the moment matrix, and the complex ellipticity can be calculated as

Residuals in the field of view
Figure 9 shows the residuals of the the shape (e1 and e2) and size (T ) measurements for the reserve stars in riz bands as a function of position on the DECam focal plane.All three show a noticeable oscillatory pattern consistent with a 4th order polynomial on each CCD.This is due to the fact that our interpolation scheme is only at 3rd order, so the smallest order of variation not captured by our PSF model is at 4th order15 .
The size residuals show some additional, much smaller, circular patterns, which are more prominent on some chips than others.The most obvious example of this is found in the lower of the two left-most CCDs, but it appears quite significantly on several others as well.These patterns are very similar to the tree-ring patterns in the astrometry, implying that our procedure of measuring and interpolating the PSF in sky coordinates was not sufficient to fully remove the effects of the tree rings on the PSF size.
We investigated switching to using chip coordinates rather than sky coordinates to model the PSF, and the tree ring patterns in the residuals became significantly worse.The same was true when we used the simpler WCS solutions in the FITS files rather than using PIXMAPPY.Thus we know that using the PIXMAPPY WCS is working to reduce the impact of the tree rings; it just isn't sufficient to fully remove all of the effect.
We believe that at least part of the reason for this is that the total PSF size includes a component due to electron diffusion in the CCD.This component of the PSF is explicitly generated in chip coordinates, not sky coordinates, so modeling that part of the PSF in sky coordinates means that the WCS (including tree rings) is being applied where it should not be, leading to a signature of the WCS in the size residuals.However, we also note that Magnier et al. (2018) have identified variations in the charge diffusion size itself associated with the same doping variations that cause the astrometric tree-ring effect in the Pan-STARRS1 CCDs.This effect may be present in DECam CCDs as well, which could be contributing to the residuals seen in Figure 9.
When we develop the full optical plus atmospheric PSF model (cf.§8.2), we plan to include the possibility of having a Gaussian component in CCD coordinates applied at the end to better model this effect.

Residuals by magnitude
As discussed above in §6.1, an important feature of CCD data is the "brighter-fatter effect" (Antilogus et al. 2014;Guyonnet et al. 2015), where the charge of electrons accumulated in high-flux pixels repels some of the other electrons arriving later in the exposure, causing bright objects to appear somewhat larger than they would otherwise have appeared.Without any correction, this leads to an obvious trend of PSF size with magnitude.The effect is also anisotropic, which leads to a strong trend in the e1 shape residuals as well (Zuntz & Sheldon et al., 2018).As we have discussed, the Y3 data reduction process included a correction for this effect, which we can test by looking at PSF size and shape residuals as a function of magnitude.
The black points in Figure 10 show size residuals (upper panel), fractional size residuals (second panel), and e1 and e2 shape residuals (lower two panels) of the riz reserve stars as a function of their magnitude.The size residual is well below the level measured in Y1, and shows very little trend with magnitude, remaining below 0.5% over the entire range.The shape residuals also show no significant trend with magnitude.This implies that the brighter-fatter correction we applied, along with the bright star cut, is sufficient to remove the impact of this effect on the PSF solutions.

Trends with color
There are several physical effects that are expected to cause the PSF to be wavelength-dependent (Plazas & Bernstein 2012;Meyers & Burchat 2015).The PSF size from Kolmogorov seeing is expected to vary as λ −0.2 (Hardy 1998, p. 92) or even steeper when taking into account the so-called outer scale (Xin et al. 2018).Differential chromatic diffraction (DCR) causes the PSF to spread along the direction towards zenith, affecting bluer stars more than redder stars.Diffraction and other optical aberrations generally scale proportionally with λ.There are a few refractive elements, which have a non-trivial wavelength dependence.And the conversion depth of photons in the silicon increases with wavelength, which affects the PSF size, due to the fast beam leading to a shallow depth of focus.In each case the "PSF" subscript refers to the true PSF, as estimated from direct measurements of images of reserve stars, and "model" refers to the PIFF estimate of the PSF at those locations.The black points show the average values for all reserve stars in the r, i, and z bands, which are the bands used for cosmic shear measurements in the DES Y3 WL analysis.Cyan, orange, and pink points show the average values in thin slices of r − z color, ranging from blue to red.To reduce the impact of the brighter-fatter effect bright stars are excluded from our PSF models; the exact cut-off varies among CCD exposures but the shaded grey region shows a typical example.
We do not explicitly include any of these effects in the PSF modeling for the Y3 analysis (although see §8.4 for discussion of how we plan to include color dependence in the future).We thus expect the size residuals to be a function of the color of the stars.
The colored points in Figure 10 show the size and shape residuals for three thin slices in r − z color.Cyan shows the bluest stars, pink the reddest, and orange in between.The size residuals show a very clear trend with color.The blue stars are larger than the average model, and the red stars are smaller.This implies that the atmosphere (which causes red stars to appear smaller) is probably dominating over other effects (which mostly cause the size to increase with λ).
This can thus cause a bias in the inferred shapes of galaxies if the mean galaxy color is significantly different from the mean color of the stars used to constrain the PSF.See Gatti & Sheldon et al., (2020) for further discussion of the impact of this on the sample of galaxies used for the Y3 weak lensing shear catalog.
The e1 residuals (third panel of Figure 10) show very little color dependence, but the e2 residuals (fourth panel of Figure 10) do show a significant color dependence.The redder stars have a smaller than average e2 shape, and the bluer stars have a larger than average e2.
This trend seems to be primarily due to DCR.We have calculated the expected direction and magnitude of the DCR effect across the DES Y3 footprint, quantified by two numbers: where z is the zenith angle and q is the parallactic angle.We calculate the weighted mean of these numbers for each location on the sky, based on the observations that contributed to the co-add images (Sevilla-Noarbe et al. 2020), using HEALSPARSE16 for efficient access.Figure 11 shows the mean shape residuals binned by these DCR quantities for just the r-band observations where the DCR effect is strongest.The three sets of points correspond to the same r − z colors as in Figure 10.The DES observing history (mostly the particular history of the hour angle of each observation) happens to have favored negative values of both quantities.The binning scheme in Figure 11 is such that each point includes equal numbers of stars.The mean residuals for the red and blue color slices are close to zero when the DCR quantities are near zero, and they are large when the DCR quantities are large (in absolute value).The points do not perfectly follow the linear fits, but it is clear that DCR is the dominant effect driving the difference between the residuals for red stars and blue stars.The corresponding plots for i-band (not shown) are similar, but with smaller amplitude.The ones for z-band show almost no effect at all.These are what would be expected, since the magnitude of the DCR effect decreases quickly with increasing wavelength.Figure 13.The ρ statistics for the PSF residuals in the g band.These statistics are much worse that than the ones for the riz bands shown in Figure 12.Possible reasons for this are given in the text.These results contributed to the decision not to use the g band for DES Y3 WL analysis.

Rho statistics
Rowe (2010) introduced the original two rho statistics for quantifying the spatial correlations of errors in PSF models, ρ1(θ) and ρ2(θ).Jarvis et al. (2016) developed three more related statistics that appear at the same order in their potential impact on weak lensing two-point correlation functions: where ePSF is the observed ellipticity of the PSF stars, TPSF is the observed size of the PSF stars, δePSF is the difference between the measured ellipticity of the observed stars and the ellipticity of the PIFF models at the same locations, and δTPSF is the difference in the sizes of the observed stars and the PIFF models.These statistics, if non-zero, imply some systematic errors in the weak lensing shear correlation function, ξ+.There are corresponding statistics for ξ− contamination, but we find these to be negligible, so we focus our attention henceforth on these five rho statistics.
The five rho statistics for the riz bands are shown in Figure 12.The left panel shows ρ1, ρ3, and ρ4, which represent direct systematic errors in ξ+ with a leading coefficient of order unity (Jarvis et al. 2016).The right panel shows ρ2 and ρ5, whose impact on ξ+ is mediated by a coefficient, α, describing the amount of "PSF leakage" that occurs during the shear measurement process.
All of the rho statistics are small enough that they are not expected to cause significant systematic errors in the cosmic shear measurements.In particular, ρ1 is about a factor of 10 smaller at large scales and a factor of 4 smaller at small scales that what was achieved for the Y1 PSF solution (Zuntz & Sheldon et al., 2018, Figure 9).The impact of this on ξ+ had been one of the largest systematic uncertainties in the Y1 analysis.
At large scales, ρ2 is not much smaller than we found for the Y1 analysis, but its impact is also expected to be small because our shear measurement method has very little PSF leakage.The estimated value of the leakage parameter, α, for the DES Y3 WL analysis is of order 10 −2 .See Gatti & Sheldon et al., (2020) for details.
The rho statistics for each of r, i, and z bands separately (not shown) are somewhat noisier, but none of them show any particular problems.On the other hand, the g-band rho statistics, shown in Figure 13 are an order of magnitude or more larger for all five statistics, particularly at large scales.This could be due to the increased impact of DCR in the g band, or the non-stellar contamination in g band discussed in §6.5, or possibly other factors.Regardless of the cause, these rho statistics were considered unacceptably high, and we decided to exclude the g-band data from our Y3 weak lensing analysis.

PLANNED FUTURE IMPROVEMENTS
We have several plans for how the PSF solution could be further improved for the final DES year six analysis (Y6) as well as for other surveys with similar or smaller statistical errors.These features are all currently being developed for release in a future version of PIFF.

DECam Optical Model
We have been developing an optical model of the Blanco telecope and DECam focal plane to directly predict the effects of the op-tics on the PSF pattern.A preliminary version of this model was described in Davis et al. (2016).The model is based on measurements of images of stars in out-of-focus exposures, which produce "donut" images.These reference wavefront images allow for very precise estimation of the Zernike decomposition of the optical wavefront as a function of position in the focal plane, with only a small number of degrees of freedom for how these Zernike patterns change from one exposure to another.This model has now been implemented in PIFF, with either 10 or 11 17 free parameters that are fit from a given in-focus image.Most of the Zernike coefficients are obtained from the reference wavefront images and are not fitted using the in-focus image.We generally include coefficients up to Zernike order 38 in the reference information, but even higher order can be accommodated if desired.
Three of the degrees of freedom are currently the average size and shape of the atmospheric component of the PSF, which are not part of the optical model.This makes the current implementation of the optical model very accurate on average, but not particularly accurate in the variation across any single exposure, since the atmospheric component is the dominant contributor to this variation.This leads us to our next planned improvement.

Composite PSF
We have started to develop a composite PSF class in PIFF with multiple components convolved together to produce the complete PSF solution.Each component will be able to be separately fit, using different models and different interpolation schemes as appropriate for each.
This composite class is primarily intended to allow for the convolution of an optical model and an atmospheric model.These two components of the PSF are constrained in very different ways.The optical model, as described above, has very few free parameters to be fit for a given in-focus exposure, but its model of the surface brightness is quite complex including very high order Zernike aberrations.
The atmospheric component, on the other hand, is fairly simple in its description at the location of a single star, being well approximated by an elliptical Kolmogorov or von Karman profile.However, the variation of the PSF parameters across the field of view is quite complicated.Gaussian process interpolation is generally found to work well, which involves a large matrix inversion to solve for the relevant parameters.
We also expect that a third Gaussian component, modeling charge diffusion inside the CCD, will also be important to include, since this component interacts differently with the WCS (especially the tree rings) than the other two components (cf.§7.1).This component would probably only require a single fitted parameter for the entire focal plane, being the average scale size of the Gaussian diffusion.
Development of a composite model that combines all three components in a single iteration cycle is still ongoing.We are hopeful that we will have this working in time to use it for DES Y6 analysis. 17The amplitude of the spherical aberration may be either fixed or fitted.

Using Gaia stars as input
We have seen in §6.5 that the Gaia stellar catalog is somewhat shallower than the input stellar catalog we used for the DES Y3 WL analysis.Using this as our input stellar catalog would result in losing about half of the potential stars for constraining the PSF.
We are nonetheless considering switching to using that as our input catalog for Y6.The most obvious advantage to this is that it would avoid any concerns about galaxies leaking into the stellar sample (especially in g band where this was a particular problem).The Gaia catalog constitutes an extremely pure stellar sample.Also, the stars that would be included are mostly high signalto-noise; the stars we would be losing have less useful information about the PSF.
However, the bigger advantage is that the Gaia stars include extremely precise astrometric information.We could therefore switch to the "fixed-star" mode described in §2, where we would fix the true positions of the stars at the Gaia positions (taking into account parallax and proper motion to find the position at the time of each DES exposure) and allow the observed PSF profile to include a small centroid offset.
These centroid offsets are expected from the effects of atmospheric seeing.The atmospheric component of the PSF is integrated over a finite exposure time, and the net integrated pattern includes a small shift in the observed centroid.Furthermore, we expect these centroid offsets to be well fit by the same kind of Gaussian process interpolation that is effective for interpolating the size and shape parameters.
We thus plan to investigate including this centroid offset as part of the atmospheric component of the PSF to see if it can improve the overall astrometric modeling of the galaxies.

Including chromatic dependence
We saw in §7.2 that the size and shape residuals vary with the color of the stars.This is completely expected, since several of the physical effects that make up the PSF pattern are wavelength-dependent.
For the Y6 PSF solution, we plan to include a single color parameter (e.g.g−r) in the fit to account for the wavelength dependence on the PSF to first order.Then when using the PSF model for galaxies, the color of the galaxy would need to be provided to produce an estimate of the correct PSF for that specific galaxy.We are hopeful that including this color dependence will improve the g-band PSF solution sufficiently to allow it to be used for weak lensing in the Y6 analysis.
We designed the PIFF API to allow for this kind of color term to be included in the interpolation scheme, anticipating this use case.However, we have not yet tried using this functionality on real data, so there will likely be some development required to get it working on DES Y6 data.

SUMMARY
We have presented a new software package for PSF estimation of astronomical images, called PIFF, which was developed primarily for the DES Y3 WL analysis.The PIFF PSF models were tested on the Y3 data and show significantly smaller residuals than had been seen in the Y1 data.Most notably, the ρ1 statistic is more than an order of magnitude smaller at large scales than what was found for the Y1 PSF model.This had been one of the most significant sources of systematic uncertainty in the Y1 cosmic shear analysis.Development of PIFF is still very active.We have described in §8 several potential improvements we hope to include in the DES Y6 analysis.In addition, as part of the Dark Energy Science Collaboration (DESC) for the Vera Rubin Observatory Legacy Survey of Space and Time (LSST), we have been testing PIFF on simulated LSST images.Results of these tests will be reported separately, but we fully expect that PIFF will prove to be useful for LSST and other surveys in addition to DES.Indeed, the design of PIFF is very general, so we expect that it will be useful to many other current and future surveys who need accurate PSF modeling based on observations of stars.

Figure 1 .
Figure 1.An example of a "checkerboard" failure mode that can sometimes happen with PixelGrid solutions with version 0.2.4 of PIFF.On the right is the model for the PSF in one of the Y3 DES images, extrapolated to a location far from any of the constraining stars.Notice the checkerboard pattern, especially near near the upper right corner of the model.On the left is the PSF for the same location of the same image using PIFF version 0.3.0.The checkerboard pattern is no longer present.

Figure 2 .
Figure2.A simulation of a purely atmospheric PSF pattern.From left to right the panels show (1) the true PSF shapes without noise, (2) the shapes with measurement noise, (3) the residuals from a second order polynomial fit on each CCD, and (4) the residuals from an anisotropic Gaussian process fit over the whole FOV.Each panel shows "whiskers" whose length is proportional to the ellipticity of the PSF, and whose orientation matches the orientation of longest axis of the PSF shape.In both cases, the residuals are shown only for a set of reserve stars, which were not used to fit the PSF model.

Figure 3 .
Figure3.An example size-magnitude diagram for a single CCD image, used to identify stars.The size T = 2σ 2 is based on the scale size of the best-fitting elliptical Gaussian.The pink and green points are the objects initially identified as stars.The green points are the ones that pass our selection criteria outlined in §6.3, most notably the magnitude cut to avoid objects contaminated by the brighter-fatter effect.These objects are then used to constrain the PSF model.

Figure 4 .Figure 5 .
Figure 4. On the left are histograms of the PSF stars by magnitude for each band: g, r, i, z, and Y .The blue region are the PSF stars that were also in the Gaia DR2 catalog.Red indicates objects that we consider stars, which are not in the Gaia catalog.(Purple is all PSF stars.)On the right are plots showing the density of objects in size-magnitude space, using the same color scheme.

Figure 6 .
Figure 6.A color-color plot using VHS K band for one of the colors, motivating our VHS-based color selection test, equation 6.1.A random sample of 0.5 million matched DES-VHS objects is plotted.Morphological classifications indicated by the colorbar come from the Y3 Gold catalog (Sevilla-Noarbe et al. 2020).

Figure 8 .
Figure8.The fraction of identified PSF stars found to be color outliers (cf.Figure7) and their mean fractional shift in measured size, compared to the stars that fall below the condition in equation 6.1 and are thus highconfidence stars.Each point represents a single exposure.Results are plotted for a random sample of 1685 exposures in the i band each having at least 100 matched VHS objects (median of 7000 matched objects).

Figure 9 .
Figure9.Maps of the two components of the residual shape (δe 1 and δe 2 ) and the fractional residual size (δT /T psf ) as a function of position in the focal plane for riz bands.The shape components have a slight radial pattern, corresponding to high spatial frequency modes that our 3rd order interpolation across each CCD was unable to completely model.The size residuals show noticeable tree ring patterns in many CCDs.These patterns are much smaller with the PIXMAPPY astrometric solution than when we use the native FITS WCS solutions, but they are still significantly non-zero.

Figure 10 .
Figure10.The PSF residual size (top), fractional size (middle) and shape (bottom two) of stars as a function of their magnitude.In each case the "PSF" subscript refers to the true PSF, as estimated from direct measurements of images of reserve stars, and "model" refers to the PIFF estimate of the PSF at those locations.The black points show the average values for all reserve stars in the r, i, and z bands, which are the bands used for cosmic shear measurements in the DES Y3 WL analysis.Cyan, orange, and pink points show the average values in thin slices of r − z color, ranging from blue to red.To reduce the impact of the brighter-fatter effect bright stars are excluded from our PSF models; the exact cut-off varies among CCD exposures but the shaded grey region shows a typical example.

Figure 11 .
Figure11.The PSF shape residuals in r-band as a function of the mean direction and magnitude expected for differential chromatic refraction (DCR).The three sets of points correspond to the same three thin slices of r − z color as shown in Figure10.

Figure 12 .
Figure 12.The ρ statistics for the PSF residuals (equations 7.8 -7.12) in the riz bands.Negative values are shown in absolute value as dotted lines.