Friday, March 14, 2014

Kepler False Positives: Separating the True from the False

The Problem: Electronic False Positives in Kepler Results

As described in my last post, the Kepler mission’s data pipeline has a flaw that generates electronic false positives – signals that look like a planet but correspond to nothing in the real world at all. My purpose here is to characterize that flaw in order to better determine which of the Threshold Crossing Events (TCEs) detected by Kepler are real.

One of the challenges for Kepler’s transiting method is determining when temporary dimming events in a star’s brightness are caused by a real planet transiting its star and when they are due to random fluctuations, called noise. This determination begins with the calculation of a TCE's signal-to-noise ratio (SNR), the strength of the dimming signals divided by the size of the variation that is typical in measurements of that star’s brightness. When SNR is high, the likelihood increases that the observation was of a real event, and not noise.

Observations take place over time. For a planet orbiting a sunlike star in an earthlike orbit, the signal – a real transit – is up to about 12 hours long, whereas the noise is calculated over a period of up to 38 days. A problem with this approach is that changes in the performance of the electronics in Kepler’s instrument cause noise to vary significantly on time scales shorter than 38 days, so the actual noise at the time of a dimming event can be significantly higher than the average over 38 days. This means that noise is underestimated, and therefore SNR is overestimated. That is the root cause of an excess of electronic false positives: a modest discrepancy in SNR corresponds to a large increase in the probability of spurious detections. For example, a 20% overestimate in SNR, given Gaussian-distributed noise and a detection threshold of SNR>7.1, increases the false positive rate by a factor of 10,800 (from 6.2×10⁻¹³ to 6.7×10⁻⁹).
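The arithmetic in that example can be checked with a few lines of Python. This is only a sketch under two assumptions of mine: that the false positive rate is the one-sided tail probability of a standard Gaussian, and that a "20% overestimate" means the true SNR is 0.8 times the reported value (so a reported 7.1 is really 5.68). That reading is the one that reproduces the figures quoted above. The function and variable names are illustrative.

```python
import math


def gaussian_tail(snr):
    """One-sided tail probability P(Z > snr) for a standard Gaussian."""
    return 0.5 * math.erfc(snr / math.sqrt(2.0))


THRESHOLD = 7.1  # detection threshold quoted in the text

# Assumption: the true SNR is 0.8 times the reported SNR, so a pure-noise
# event only has to reach 5.68 sigma in order to be reported as 7.1.
effective_threshold = 0.8 * THRESHOLD

p_nominal = gaussian_tail(THRESHOLD)
p_actual = gaussian_tail(effective_threshold)
ratio = p_actual / p_nominal

print(f"nominal rate: {p_nominal:.2e}")
print(f"actual rate:  {p_actual:.2e}")
print(f"increase:     {ratio:,.0f}x")
```

The point of the exercise is how steep the Gaussian tail is at these thresholds: a shift of less than one and a half sigma changes the rate by four orders of magnitude.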

The Kepler data processing pipeline is sophisticated, and the factors cited above are generally known to the mission team. Nevertheless, they are not fully accounted for. What I offer here is not a complete, analytical correction of the flaw in the pipeline, but a demonstration that unaccounted-for variations in noise produce the overwhelming majority of false positives in the TCE pipeline, and a starting point for accounting for that flaw.

Seasonal Noise

Noise varies widely among Kepler targets, and for each target it varies over time. One measure of noise in Kepler light measurements, called Combined Differential Photometric Precision (CDPP), has been published for each Kepler target, for each quarter in which it was observed. A formula that predicts the rate of Annual TCE false positives is as follows:

For a Kepler target, t, observed in three quarters x, x+4, and x+8, with the lowest three quarters of CDPP for t occurring in quarters a, b, and c, we define the seasonal noise, NS(t, x) as:

NS(t, x) = (CDPP(t, x) + CDPP(t, x+4) + CDPP(t, x+8))/(CDPP(t, a)  + CDPP(t, b)  + CDPP(t, c))

For example, for a Kepler target, t, with CDPP = 40 in quarters 1, 5, and 9, and CDPP = 20 in all other quarters,
NS(t, 1) = 120/60 = 2, and
NS(t, 2) = NS(t, 3) = NS(t, 4) = 60/60 = 1.
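
The definition translates directly into code. The sketch below assumes one target's quarterly CDPP values are held in a dictionary keyed by quarter number; that layout, and the names `seasonal_noise` and `cdpp_example`, are my own choices for illustration and not part of any Kepler data product.

```python
def seasonal_noise(cdpp, x):
    """Seasonal noise NS(t, x) for one target.

    cdpp: dict mapping quarter number -> CDPP for that target (assumed layout).
    x:    first quarter of the season {x, x+4, x+8}.
    """
    season_total = cdpp[x] + cdpp[x + 4] + cdpp[x + 8]
    # Baseline: the sum of the three lowest quarterly CDPP values.
    baseline_total = sum(sorted(cdpp.values())[:3])
    return season_total / baseline_total


# The worked example from the text: CDPP = 40 in quarters 1, 5, and 9,
# and CDPP = 20 in every other quarter of Q1-Q12.
cdpp_example = {q: (40.0 if q in (1, 5, 9) else 20.0) for q in range(1, 13)}
ns_by_season = {x: seasonal_noise(cdpp_example, x) for x in range(1, 5)}
print(ns_by_season)
```

For the example target this gives NS = 2 for the season starting in quarter 1 and NS = 1 for the other three, matching the values above.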

This formula is motivated by two inferences: that the minimum noise over three quarters defines a useful baseline for a target's noise, and that a quarter's excess over that baseline will correlate with the excess noise at the time of a false transit relative to the ~38-day time scales over which the Kepler pipeline calculates noise. These inferences are justified by the empirical results given below, which show that the formula is excellent at predicting the circumstances in which false positives occur.

Results: Noise and False Positives, All Kepler Targets

113,860 Kepler targets were observed each quarter during Q1-Q12. Each such Kepler target was therefore observed during four seasons, where a season consists of three quarters, {x, x+4, x+8}. Each target-season combination has an associated value of seasonal noise, according to the definition given above.

For each Kepler target in this sample, we can rank its four seasons in terms of NS. If NS does not correlate with the detection of Annual TCEs, then we would expect its four seasons to have, on average, the same number of detected Annual TCEs. However, the set of all targets' lowest-NS seasons yielded only 8 Annual TCEs, whereas the set of highest-NS seasons yielded 866 Annual TCEs, 108 times higher than the lowest-NS seasons, and 88.4% of the total detected in all four seasons. We also see a strong effect of NS on the detection of multiple Annual TCEs. There were 168 targets with multiple Annual TCEs observed in the same season, and for 157 of those (93.5%), it was in the target's season of maximum NS.
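
The tally behind those numbers follows a simple recipe: rank each target's four seasons by NS, then add up the Annual TCEs that fall at each rank across all targets. Here is a minimal sketch of that bookkeeping. The input layout and the toy values are invented purely to show the mechanics; they are not Kepler data and do not reproduce the counts above.

```python
from collections import Counter


def tally_by_ns_rank(targets):
    """Sum Annual TCE counts by each season's NS rank within its target.

    targets: iterable of (ns, tces) pairs, where ns maps season -> NS and
             tces maps season -> number of Annual TCEs (assumed layout).
    Returns a Counter mapping rank (0 = lowest NS) -> total Annual TCEs.
    """
    totals = Counter()
    for ns, tces in targets:
        # Seasons ordered from quietest to noisiest; ties keep dict order.
        ranked = sorted(ns, key=ns.get)
        for rank, season in enumerate(ranked):
            totals[rank] += tces.get(season, 0)
    return totals


# Toy input, made up for illustration only.
toy_targets = [
    ({1: 2.0, 2: 1.0, 3: 1.1, 4: 1.2}, {1: 3}),
    ({1: 1.0, 2: 1.5, 3: 1.3, 4: 1.1}, {2: 1, 4: 1}),
]
toy_totals = tally_by_ns_rank(toy_targets)
print(dict(toy_totals))
```

If NS had no bearing on detections, the four totals would come out roughly equal over a large sample; the lopsided split reported above is what this tally looks like when NS matters a great deal.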

The number of real planets transiting stars is unrelated to variations in Kepler's performance, so the excess seen in noisy quarters is wholly due to electronic false positives, which make up the great majority of all Annual TCEs. These results make it clear that seasonal noise is overwhelmingly the governing factor in the generation of false positives in the Kepler pipeline.

Results: Noise and False Positives, Sunlike Stars

To focus on finding earthlike planets around sunlike stars, we narrow the focus to stars that are about as hot as the Sun (classes K, G, and F) and about the same size (log g, a measure of the star's surface gravity, between 4.0 and 4.9), and we only consider Annual TCEs with periods that are earthlike (300-425 days). We also loosen one constraint from the previous analysis by including stars that were observed for as few as six quarters out of Q1-Q12.

It is easier to comprehend the data if we put the count of Annual TCEs in terms of the number of planets per star implied if all the detections are real. To do this, we need to estimate the completeness of the data: For every such planet that actually exists, how many could Kepler possibly detect? For the aforementioned earthlike/sunlike conditions, only 0.47% of such planets will have orbits aligned to allow a transit, and only 23.3% of those planets will happen to have had three transits recorded while Kepler was making observations, so only about 0.11% of all such planets that exist could possibly have been detected. It's actually worse than that if we take into account the fact that small planets can be lost in a star's noise, whereas for large planets like Jupiter and Saturn, that's rarely a concern. Here, we'll ignore that and note that any results we get could be considerably higher if we took into account planet size. We will call the number of planets in earthlike orbits per sunlike star ƒE, equalling the count of Annual TCEs per Kepler target divided by 0.11%, and remember that our measure of ƒE ignores the undercount of small planets.
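
The scaling is simple enough to write down as a short calculation. In this sketch the two fractions come from the paragraph above, while the function name `f_e` and the sample counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
TRANSIT_GEOMETRY_FRACTION = 0.0047  # orbits aligned to allow a transit (0.47%)
THREE_TRANSIT_FRACTION = 0.233      # of those, three transits recorded (23.3%)

# Combined completeness, about 0.11% as stated in the text.
completeness = TRANSIT_GEOMETRY_FRACTION * THREE_TRANSIT_FRACTION


def f_e(annual_tce_count, target_count, completeness_fraction=0.0011):
    """Implied planets per star if every Annual TCE were a real planet.

    Uses the rounded 0.11% completeness by default, as the text does, and
    ignores the undercount of small planets.
    """
    return annual_tce_count / target_count / completeness_fraction


print(f"completeness: {completeness:.4%}")

# Hypothetical counts, not Kepler results: 11 Annual TCEs among 10,000 targets.
example_f_e = f_e(11, 10_000)
print(f"implied planets per star: {example_f_e:.2f}")
```

Because the completeness is so small, even a handful of detections per ten thousand stars implies roughly one such planet per star, which is why a modest number of false positives distorts ƒE so badly.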


Recalling the observation that certain areas of the Kepler instrument’s detector are responsible for an excess of Annual TCE detections, we call those areas, shaded pink at right, Hot Zones, and the remainder of the detector surface Cool Zones. In the graph below, we show ƒE as calculated for the Cool Zones in blue, and for the Hot Zones in red. In addition, we show the incidence of systems with multiple Annual TCEs, scaled in the same way, as dashed lines, also separated into Cool and Hot zones using the same color coding.



We see, as in the previous section, the significant impact of NS on the detection of Annual TCEs. For higher levels of NS, even in the Cool Zones, the detection rate is approximately 30 times that of the lowest values of NS. Hot Zones show higher rates of detection than Cool Zones, which indicates that NS alone does not account for all of the factors governing false positive generation. I would suggest that this is because quarterly CDPP correlates strongly, but imperfectly, with noise fluctuations over the shorter ~38-day time scale of the noise calculation. We also see that the proportion of targets showing multiple Annual TCEs is low, but not zero, in the lowest-NS condition, and nearly equal to the number of single-detection cases for the noisiest cases at right. The NS metric identifies the leading cause of electronic false positives, and further refinement that takes into account shorter time scales might be able to clean up the results considerably.

Real planets in earthlike orbits


It is intriguing to consider the cases of least noise (the solid blue line, furthest at left). This is the condition with the smallest number of electronic false positives. (Besides electronic false positives, there are also astrophysical false positives, which are larger bodies that only appear to be the size that Kepler's data suggest, but that is another story.) The analysis here doesn’t show that the rate of false positives is zero for the lowest values of NS, but it is lower than elsewhere. In my next post, I’ll examine the 87 TCEs that resemble earthlike planets orbiting sunlike stars and use NS as a filter to begin examination of the ones most likely to be real.

Tuesday, March 4, 2014

Kepler and Earthlike Planets: Lost in the Noise?


From 2009 to 2013, the Kepler space telescope monitored over 150,000 stars in its search for extrasolar planets. Kepler used the transiting method, watching its target stars almost continuously for four years, searching for signs of a possible planet as it transits in front of its star, blocking some of the star's light and therefore making the star appear to dim temporarily at regular intervals. As of this writing, Kepler's data has been used to find over 3,600 planet candidates, of which 961 have been classified as definite planets, while the rest comprise a mixture of real planets and false positives with statuses yet to be determined.

With the transiting method, it is easier to find planets that are large – like Jupiter in our solar system – and close to their star – closer in many cases than Mercury is to the Sun. Accordingly, Kepler found many planets that are larger than Earth, and many that are closer to their stars than Earth, and are therefore hot. Kepler also found some planets Earth-sized or smaller and some other planets about the same temperature as Earth, but it is not clear yet if it detected any planets that were earthlike in terms of both size and temperature, which would seemingly make them good candidates for supporting earthlike habitats and perhaps life. Unfortunately, that combination of properties also makes earthlike planets hard for Kepler to detect, even if they are in fact common. This is because a small planet with an earthlike temperature would create transit signals that are relatively weak and, during Kepler's four years in service, relatively few in number. Searching for planets like these depends upon an excellent understanding of the noise, or random variation (not literally sound like we hear), in Kepler's measurements of a star's brightness.

The following points are particularly relevant:

1) Kepler discovered that stars are noisier than was expected. This is a scientific discovery in its own right, but unfortunately, it means that finding earthlike planets is more difficult than expected.
2) Kepler’s instrument exhibited significant discrepancies in noise levels across the surface of its detector.

Making the situation more complicated, mission operations required Kepler to rotate 90° clockwise in one maneuver at the end of each three-month quarter, so the noisy areas of its electronics moved from one place to another in Kepler's observed sky field, returning to the same place every four quarters. In addition, the level of electronic noise depended upon the temperature of the electronics, and so the extent of the problem varied from quarter to quarter and even within quarters. Analyzing Kepler observations without taking into account these seasonal noise variations leads to the detection of electronic false positives, anomalies where purely coincidental fluctuations in a star's brightness are misinterpreted as the transits of a real planet. 

What makes this particularly insidious is that false positives caused by Kepler's noisy electronics often fit this profile: Three relatively minor dimming events that occur four quarters apart as the same noisy region on Kepler's detector surface goes through four 90° rotations to come back around to report that a star experienced three dimming events with a period of about a year. That is very much like what a real earthlike planet orbiting a sunlike star would look like in the data. Therefore, any real earthlike planets in Kepler's data are lost in a much larger number of false positives.

This problem showed up profoundly in a December 2012 release of Kepler results. That analysis used the first three years of observations to produce a list of 18,406 possible planets. These entities, called Threshold Crossing Events (TCEs), are results at an early stage in the data processing, with further checks needed to validate which of the TCEs might be planets and which are false positives.

If we define “earthlike” as planets that have radius 50–125% of the Earth’s and receive radiative heating from their star at levels between that of Venus and Mars, then the TCE release contained 87 possible earthlike planets. This news produced a great deal of excitement because it would mean that at least a few earthlike planets had been found among those 87 even if only a few percent of them were real. However, because of the noise problem, it is credible that each of the 87 was an electronic false positive (caused by noise) or astrophysical false positive (the detection of a real astronomical object that only seems to be an earthlike planet).

A dramatic way to illustrate the problem of false positives caused by electronic noise is to look only at those TCEs which were detected as a result of three total dimming events occurring four quarters apart. This means that the same region of detector surface was used to detect all three dimming events. Because the time between dimming events was about a year, we call these Annual TCEs. They are displayed in Figure 1, below.

Figure 1: Annual TCE: TCEs detected as three possible transits at four-quarter intervals in the same location of Kepler’s detector surface. 1a: Annual TCEs plotted according to sky coordinates, color-coded by season. 1b: Annual TCEs plotted according to instrument surface coordinates, color-coded by number of Annual TCEs per star, with black=1, blue=2, red≥3. Shaded boxes indicate selected regions of elevated incidence of Annual TCEs.
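
For readers who want to reproduce the selection, the Annual TCE criterion can be sketched in a few lines. This assumes each TCE comes with the list of quarters in which its dimming events were recorded. That input format and both function names are my own, for illustration, and are not the format of the TCE release.

```python
def is_annual_tce(event_quarters):
    """True if a TCE rests on exactly three events spaced four quarters apart."""
    quarters = sorted(event_quarters)
    return (
        len(quarters) == 3
        and quarters[1] - quarters[0] == 4
        and quarters[2] - quarters[1] == 4
    )


def season_index(quarter):
    """Season of a quarter, 0-3; quarters 1, 5, 9, and 13 share index 0."""
    return (quarter - 1) % 4


print(is_annual_tce([1, 5, 9]))   # three events, all in the same season
print(is_annual_tce([2, 3, 10]))  # three events, but not four quarters apart
print(season_index(13))
```

Because Kepler returns to the same orientation every four quarters, three events that pass this test were necessarily recorded by the same region of the detector.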

Figure 1a shows the Annual TCEs as they appear in the sky with the number for the corresponding instrument module during Spring observation seasons (Quarters 1, 5, 9, and 13). In each successive quarter, Kepler rotated 90° clockwise around the center of this field so that, for example, Module 2 spent Summer quarters observing the same patch of sky that Module 10 watched during Spring quarters. The most obvious motif in Figure 1a is a collection of three prominent square regions seen in sectors 7, 9, and 17, which are densely packed with TCEs as a result of the same anomalous electronics hosted in Module 17 rotating throughout the seasons, creating a high number of false positives in all of the seasons except Winter. This anomaly is seen more prominently in Figure 1b, which plots the Annual TCEs in instrument coordinates, so all four seasons' TCEs associated with the same area of the electronic detector surface appear atop one another. This shows the unique nature of the Module 17 anomaly; although there are other areas of anomalously many Annual TCEs, none is as large and densely packed as this one. Modules 9 and 18 contain two of the next-most prominent anomalous regions. Overall, the anomalies are most prevalent in the upper right and lower left quadrants of this display.

It may seem tempting, at this point, to try to create a purified data set by identifying all combinations of detector area and season that produce an undue number of false positives, and excluding them from further consideration. In the ideal case, we might hope to identify a sufficient number of anomalies to exclude, keep the rest of the observations, and thus arrive at a set of TCEs relatively cleansed of electronic false positives.

Achieving this goal is not as straightforward as a casual glance might seem to indicate. While the areas shaded in pink certainly contain many of the total false positives, that does not imply that the areas outside them lack false positives. Note the number of blue and red points plotted in Figure 1b. Each of these is a Kepler object where multiple TCEs with a period of about one year were detected. In many of these cases, the data, if all such TCEs were planets, would show that the star hosted multiple planets with about the same orbital period, and in many cases implausibly close to one another at the time of detection. These orbits are not plausible, so most blue points and essentially all red ones indicate electronic false positives. These systems are concentrated in the most anomalous zones, but are scattered all around the detector surface, indicating that the problem is not confined to a limited set of anomalous zones. To produce a clean set of possible planets, we need to account for the specific conditions that cause electronic false positives.

The basic problem described above has led Geoffrey Marcy and Erik Petigura of UC Berkeley to create their own data processing pipeline, called TERRA, which accounts for the same systematic sources of noise I described above. In my next post, I’ll provide my description of the noise and how it can be accounted for in order to focus on which of those 87 earthlike TCEs might be real planets.