Lightcurve

class pyke.lightcurve.LightCurve(time, flux, flux_err=None, meta={})[source]

Implements a simple class for a generic light curve.

Attributes

time (array-like) Time measurements
flux (array-like) Data flux for every time point
flux_err (array-like) Uncertainty on each flux data point
meta (dict) Free-form metadata associated with the LightCurve.

Methods

bin([binsize, method]) Bins a lightcurve using a function defined by method on blocks of samples of size binsize.
cdpp([transit_duration, savgol_window, …]) Estimate the CDPP noise metric using the Savitzky-Golay (SG) method.
flatten([window_length, polyorder, return_trend]) Removes low frequency trend using scipy’s Savitzky-Golay filter.
fold(period[, phase]) Folds the lightcurve at a specified period and phase.
normalize() Returns a normalized version of the lightcurve.
plot([ax, normalize, xlabel, ylabel, title, …]) Plots the light curve.
remove_nans() Removes cadences where the flux is NaN.
remove_outliers([sigma, return_mask]) Removes outlier flux values using sigma-clipping.
stitch(*others) Stitches LightCurve objects.
to_csv([path_or_buf]) Writes the LightCurve to a csv file.
to_pandas() Export the LightCurve as a Pandas DataFrame.
to_table() Export the LightCurve as an AstroPy Table.
bin(binsize=13, method='mean')[source]

Bins a lightcurve using a function defined by method on blocks of samples of size binsize.

Parameters:

binsize : int

Number of cadences to include in every bin.

method: str, one of ‘mean’ or ‘median’

The summary statistic to return for each bin. Default: ‘mean’.

Returns:

binned_lc : LightCurve object

Binned lightcurve.

Notes

  • If the ratio between the lightcurve length and the binsize is not a whole number, then the remainder of the data points will be ignored.
  • If the original lightcurve contains flux uncertainties (flux_err), the binned lightcurve will report the root-mean-square error. If no uncertainties are included, the binned curve will return the standard deviation of the data.
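
The block-wise behaviour described in the first note can be illustrated with a small standalone sketch. It is not pyke's implementation: bin_flux is a hypothetical helper that works on plain Python lists and leaves out the flux_err handling described in the second note.

```python
import statistics


def bin_flux(flux, binsize=13, method="mean"):
    """Summarise consecutive blocks of `binsize` samples (hypothetical helper)."""
    if method not in ("mean", "median"):
        raise ValueError("method must be 'mean' or 'median'")
    summarise = statistics.fmean if method == "mean" else statistics.median
    # Integer division drops the remainder, so trailing samples are ignored.
    n_bins = len(flux) // binsize
    return [
        summarise(flux[i * binsize:(i + 1) * binsize])
        for i in range(n_bins)
    ]


flux = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
print(bin_flux(flux, binsize=3))  # [2.0, 5.0]; the seventh sample is dropped
```
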
cdpp(transit_duration=13, savgol_window=101, savgol_polyorder=2, sigma_clip=5.0)[source]

Estimate the CDPP noise metric using the Savitzky-Golay (SG) method.

A common estimate of the noise in a lightcurve is the scatter that remains after all long term trends have been removed. This is the idea behind the Combined Differential Photometric Precision (CDPP) metric. The official Kepler Pipeline computes this metric using a wavelet-based algorithm to calculate the signal-to-noise of the specific waveform of transits of various durations. In this implementation, we use the simpler “sgCDPP proxy algorithm” discussed by Gilliland et al. (2011ApJS..197....6G) and Van Cleve et al. (2016PASP..128g5002V).

The steps of this algorithm are:
  1. Remove low frequency signals using a Savitzky-Golay filter with window length savgol_window and polynomial order savgol_polyorder.
  2. Remove outliers by rejecting data points which are separated from the mean by more than sigma_clip times the standard deviation.
  3. Compute the standard deviation of a running mean with a configurable window length equal to transit_duration.

We use a running mean (as opposed to block averaging) to strongly attenuate the signal above 1/transit_duration whilst retaining the original frequency sampling. Block averaging would set the Nyquist limit to 1/transit_duration.
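
Steps 2 and 3 can be sketched without any dependencies, as below. This is an illustration under stated assumptions rather than the library's code: flat_flux is assumed to be already detrended and normalized to about 1 (step 1 is not shown), the clipping is done in a single pass, and the helper names running_mean and sg_cdpp_tail are hypothetical.

```python
import statistics


def running_mean(values, window):
    """Mean of every full window of `window` consecutive samples."""
    if len(values) < window:
        raise ValueError("need at least `window` samples")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one sample: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


def sg_cdpp_tail(flat_flux, transit_duration=13, sigma_clip=5.0):
    """Steps 2 and 3 only; `flat_flux` is assumed to be detrended already."""
    mean = statistics.fmean(flat_flux)
    std = statistics.pstdev(flat_flux)
    # Step 2: single-pass rejection of points far from the mean.
    kept = [f for f in flat_flux if abs(f - mean) <= sigma_clip * std]
    # Step 3: scatter of the running mean, converted from a fraction to ppm.
    smoothed = running_mean(kept, transit_duration)
    return statistics.pstdev(smoothed) * 1e6
```

Because the running mean keeps one output per input position (minus the window edges), the smoothed series has the same sampling as the input, which is the property the paragraph above contrasts with block averaging.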

Parameters:

transit_duration : int, optional

The transit duration in cadences. This is the length of the window used to compute the running mean. The default is 13, which corresponds to a 6.5 hour transit in data sampled at 30-min cadence.

savgol_window : int, optional

Width of the Savitzky-Golay filter in cadences (odd number). Default value 101 (2.0 days in Kepler Long Cadence mode).

savgol_polyorder : int, optional

Polynomial order of the Savitzky-Golay filter. The recommended value is 2.

sigma_clip : float, optional

The number of standard deviations to use for clipping outliers. The default is 5.

Returns:

cdpp : float

Savitzky-Golay CDPP noise metric in units of parts per million (ppm).

Notes

This implementation is adapted from the Matlab version used by Jeff van Cleve but lacks the normalization factor used there: svn+ssh://murzim/repo/so/trunk/Develop/jvc/common/compute_SG_noise.m

flatten(window_length=101, polyorder=3, return_trend=False, **kwargs)[source]

Removes low frequency trend using scipy’s Savitzky-Golay filter.

Parameters:

window_length : int

The length of the filter window (i.e. the number of coefficients). window_length must be a positive odd integer.

polyorder : int

The order of the polynomial used to fit the samples. polyorder must be less than window_length.

return_trend : bool

If True, the method will return a tuple of two elements (flattened_lc, trend_lc) where trend_lc is the removed trend.

**kwargs : dict

Dictionary of arguments to be passed to scipy.signal.savgol_filter.

Returns:

flatten_lc : LightCurve object

Flattened lightcurve.

If return_trend is True, the method will also return:

trend_lc : LightCurve object

Trend in the lightcurve data

fold(period, phase=0.0)[source]

Folds the lightcurve at a specified period and phase.

This method returns a new LightCurve object in which the time values range from -0.5 to +0.5. Data points which occur exactly at phase, or at phase + n*period for any integer n, have time value 0.0.

Parameters:

period : float

The period upon which to fold.

phase : float, optional

Time reference point.

Returns:

folded_lightcurve : LightCurve object

A new LightCurve in which the data are folded and sorted by phase.
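
The folding arithmetic can be sketched as follows. This is a minimal illustration, assuming that phase is a reference time in the same units as time (as the parameter description suggests); the standalone fold function here is hypothetical and is not the method's source.

```python
def fold(time, flux, period, phase=0.0):
    """Return (folded_time, flux), both sorted by folded time."""
    # Shift so that `phase` maps to 0, wrap into [0, 1), then recentre
    # on zero so the result lies in [-0.5, 0.5).
    folded = [((t - phase) / period + 0.5) % 1.0 - 0.5 for t in time]
    order = sorted(range(len(folded)), key=folded.__getitem__)
    return [folded[i] for i in order], [flux[i] for i in order]


folded_time, folded_flux = fold([0.5, 1.5, 2.0], [10, 20, 30], period=2.0)
print(folded_time)  # [-0.25, 0.0, 0.25]
print(folded_flux)  # [20, 30, 10]
```
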

normalize()[source]

Returns a normalized version of the lightcurve.

The normalized lightcurve is obtained by dividing flux and flux_err by the median flux.

Returns:

normalized_lightcurve : LightCurve object

A new LightCurve in which flux and flux_err are divided by the median.
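
As a small worked illustration of the division described above (a sketch on plain lists with a hypothetical helper name, not the method itself):

```python
import statistics


def normalize(flux, flux_err):
    """Divide flux and flux_err by the median flux."""
    median = statistics.median(flux)
    return [f / median for f in flux], [e / median for e in flux_err]


norm_flux, norm_err = normalize([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(norm_flux)  # [0.5, 1.0, 1.5]
print(norm_err)   # [0.25, 0.5, 0.75]
```
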

plot(ax=None, normalize=True, xlabel='Time - 2454833 (days)', ylabel='Normalized Flux', title=None, color='#363636', linestyle='', fill=False, grid=True, **kwargs)[source]

Plots the light curve.

Parameters:

ax : matplotlib.axes._subplots.AxesSubplot

A matplotlib axes object to plot into. If no axes is provided, a new one will be generated.

normalize : bool

If True, normalize the lightcurve before plotting.

xlabel : str

Plot x axis label

ylabel : str

Plot y axis label

title : str

Plot title

color: str

Color to plot flux points

fill: bool

Shade the region between 0 and flux

grid: bool

Add a grid to the plot

**kwargs : dict

Dictionary of arguments to be passed to matplotlib.pyplot.plot.

Returns:

ax : matplotlib.axes._subplots.AxesSubplot

The matplotlib axes object.

remove_nans()[source]

Removes cadences where the flux is NaN.

Returns:

clean_lightcurve : LightCurve object

A new LightCurve from which NaN fluxes have been removed.

remove_outliers(sigma=5.0, return_mask=False, **kwargs)[source]

Removes outlier flux values using sigma-clipping.

This method returns a new LightCurve object from which flux values are removed if they are separated from the mean flux by more than sigma times the standard deviation.

Parameters:

sigma : float

The number of standard deviations to use for clipping outliers. Defaults to 5.

return_mask : bool

Whether or not to return the mask indicating which data points were removed. Entries marked as True are considered outliers.

**kwargs : dict

Dictionary of arguments to be passed to astropy.stats.sigma_clip.

Returns:

clean_lightcurve : LightCurve object

A new LightCurve in which outliers have been removed.
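
The clipping rule in the description can be sketched in a single pass, as below. This is only an illustration of that rule: astropy.stats.sigma_clip, to which the real method forwards **kwargs, may iterate and use a different centre estimate, and the standalone function here is hypothetical.

```python
import statistics


def remove_outliers(flux, sigma=5.0):
    """Return (clean_flux, mask); mask entries that are True mark outliers."""
    mean = statistics.fmean(flux)
    std = statistics.pstdev(flux)
    mask = [abs(f - mean) > sigma * std for f in flux]
    clean = [f for f, is_outlier in zip(flux, mask) if not is_outlier]
    return clean, mask


flux = [1.0] * 20 + [100.0]
clean, mask = remove_outliers(flux, sigma=3.0)
print(len(clean))  # 20
```

Note that a single extreme point inflates the standard deviation it is compared against, which is why the example uses sigma=3.0: with 21 samples, one outlier can sit at most about 4.5 standard deviations from the mean.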

stitch(*others)[source]

Stitches LightCurve objects.

Parameters:

*others : LightCurve objects

Light curves to be stitched.

Returns:

stitched_lc : LightCurve object

Stitched light curve.

to_csv(path_or_buf=None, **kwargs)[source]

Writes the LightCurve to a csv file.

Parameters:

path_or_buf : string or file handle, default None

File path or object, if None is provided the result is returned as a string.

**kwargs : dict

Dictionary of arguments to be passed to pandas.DataFrame.to_csv().

Returns:

csv : str or None

Returns a csv-formatted string if path_or_buf=None, returns None otherwise.

to_pandas()[source]

Export the LightCurve as a Pandas DataFrame.

Returns:

dataframe : pandas.DataFrame object

A dataframe indexed by time and containing the columns flux and flux_err.

to_table()[source]

Export the LightCurve as an AstroPy Table.

Returns:

table : astropy.table.Table object

An AstroPy Table with columns ‘time’, ‘flux’, and ‘flux_err’.

class pyke.lightcurve.KeplerLightCurve(time, flux, flux_err=None, centroid_col=None, centroid_row=None, quality=None, quality_bitmask=None, channel=None, campaign=None, quarter=None, mission=None, cadenceno=None, keplerid=None)[source]

Defines a light curve class for NASA’s Kepler and K2 missions.

Attributes

time (array-like) Time measurements
flux (array-like) Data flux for every time point
flux_err (array-like) Uncertainty on each flux data point
centroid_col, centroid_row (array-like, array-like) Centroid column and row coordinates as a function of time
quality (array-like) Array indicating the quality of each data point
quality_bitmask (int) Bitmask specifying quality flags of cadences that should be ignored
channel (int) Channel number
campaign (int) Campaign number
quarter (int) Quarter number
mission (str) Mission name
cadenceno (array-like) Cadence numbers corresponding to every time measurement
keplerid (int) Kepler ID number

Methods

bin([binsize, method]) Bins a lightcurve using a function defined by method on blocks of samples of size binsize.
cdpp([transit_duration, savgol_window, …]) Estimate the CDPP noise metric using the Savitzky-Golay (SG) method.
correct([method]) Corrects a lightcurve for motion-dependent systematic errors.
flatten([window_length, polyorder, return_trend]) Removes low frequency trend using scipy’s Savitzky-Golay filter.
fold(period[, phase]) Folds the lightcurve at a specified period and phase.
normalize() Returns a normalized version of the lightcurve.
plot([ax, normalize, xlabel, ylabel, title, …]) Plots the light curve.
remove_nans() Removes cadences where the flux is NaN.
remove_outliers([sigma, return_mask]) Removes outlier flux values using sigma-clipping.
stitch(*others) Stitches LightCurve objects.
to_csv([path_or_buf]) Writes the LightCurve to a csv file.
to_fits()
to_pandas() Export the LightCurve as a Pandas DataFrame.
to_table() Export the LightCurve as an AstroPy Table.
correct(method='sff', **kwargs)[source]

Corrects a lightcurve for motion-dependent systematic errors.

Parameters:

method : str

Method used to correct the lightcurve. Right now only ‘sff’ (Vanderburg’s Self-Flat Fielding) is supported.

kwargs : dict

Dictionary of keyword arguments to be passed to the function defined by method.

Returns:

new_lc : KeplerLightCurve object

Corrected lightcurve

to_fits()[source]
class pyke.lightcurve.KeplerLightCurveFile(path, quality_bitmask=1130927, **kwargs)[source]

Defines a class for a given light curve FITS file from NASA’s Kepler and K2 missions.

Attributes

path (str) Directory path or url to a lightcurve FITS file.
quality_bitmask (str or int) Bitmask specifying quality flags of cadences that should be ignored. If a string is passed, it has the following meaning:
  • default: recommended quality mask
  • hard: removes more flags, known to remove good data
  • hardest: removes all data that has been flagged
kwargs (dict) Keyword arguments to be passed to astropy.io.fits.open.
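
To make the integer form of quality_bitmask concrete, here is a minimal sketch of how a bitmask can select cadences to ignore. It assumes the usual convention that a cadence is ignored when its quality value shares at least one set bit with the mask; good_cadences is a hypothetical helper, and the meaning of individual bits is not covered here.

```python
DEFAULT_BITMASK = 1130927  # default value from the class signature


def good_cadences(quality, bitmask=DEFAULT_BITMASK):
    """True where none of the flags selected by `bitmask` is set."""
    return [(q & bitmask) == 0 for q in quality]


# 16 has only bit 4 set, which 1130927 does not include, so it is kept;
# 1, 17 and 128 each share a bit with the mask and are rejected.
print(good_cadences([0, 1, 16, 17, 128]))
# [True, False, True, False, False]
```
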

Methods

compute_cotrended_lightcurve([cbvs]) Returns a LightCurve object after cotrending the SAP_FLUX against the cotrending basis vectors.
get_lightcurve(flux_type[, centroid_type])
header([ext]) Header of the object at extension ext
plot([plottype]) Plot all the flux types in a light curve.
PDCSAP_FLUX

Returns a KeplerLightCurve object for PDCSAP_FLUX

SAP_FLUX

Returns a KeplerLightCurve object for SAP_FLUX

cadenceno

Cadence number

campaign

Campaign number

channel

Channel number

compute_cotrended_lightcurve(cbvs=[1, 2], **kwargs)[source]

Returns a LightCurve object after cotrending the SAP_FLUX against the cotrending basis vectors.

Parameters:

cbvs : list of ints

The list of cotrending basis vectors to fit to the data. For example, [1, 2] will fit the first two basis vectors.

kwargs : dict

Dictionary of keyword arguments to be passed to KeplerCBVCorrector.correct.

Returns:

lc : LightCurve object

CBV flux-corrected lightcurve.

get_lightcurve(flux_type, centroid_type='MOM_CENTR')[source]
header(ext=0)[source]

Header of the object at extension ext

mission

Mission name

plot(plottype=None, **kwargs)[source]

Plot all the flux types in a light curve.

Parameters:

plottype : str or list of str

List of FLUX types to plot. Default is to plot all available.

quarter

Quarter number

time

Time measurements

class pyke.lightcurve.KeplerCBVCorrector(lc_file, likelihood=<class 'oktopus.likelihood.LaplacianLikelihood'>, prior=<class 'oktopus.prior.LaplacianPrior'>)[source]

Remove systematic trends from Kepler light curves by fitting cotrending basis vectors.

\[\arg \min_{\bm{\theta} \in \Theta} \sum_{t}|f_{SAP}(t) - \sum_{j=1}^{n}\theta_j v_{j}(t)|^p, p>0, p \in \mathbb{R}\]
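
The quantity being minimized can be written out directly. The sketch below evaluates the cost for a given coefficient vector; it is an illustration of the formula only, with hypothetical argument names, and it does not perform the optimization. With p = 1 it corresponds to the L1 norm.

```python
def cbv_cost(theta, sap_flux, basis_vectors, p=1.0):
    """Sum over t of |f_SAP(t) - sum_j theta_j * v_j(t)| ** p."""
    cost = 0.0
    for t, f in enumerate(sap_flux):
        # Linear combination of the basis vectors at time index t.
        model = sum(th * v[t] for th, v in zip(theta, basis_vectors))
        cost += abs(f - model) ** p
    return cost


sap_flux = [1.0, 2.0, 3.0]
basis_vectors = [[1.0, 1.0, 1.0], [0.0, 1.0, 2.0]]
print(cbv_cost([1.0, 1.0], sap_flux, basis_vectors))  # 0.0
print(cbv_cost([0.0, 0.0], sap_flux, basis_vectors))  # 6.0
```
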

Examples

>>> import matplotlib.pyplot as plt
>>> from pyke import KeplerCBVCorrector, KeplerLightCurveFile
>>> fn = ("https://archive.stsci.edu/missions/kepler/lightcurves/"
...       "0084/008462852/kplr008462852-2011073133259_llc.fits") 
>>> cbv = KeplerCBVCorrector(fn) 
Downloading https://archive.stsci.edu/missions/kepler/lightcurves/0084/008462852/kplr008462852-2011073133259_llc.fits [Done]
>>> cbv_lc = cbv.correct() 
Downloading http://archive.stsci.edu/missions/kepler/cbv/kplr2011073133259-q08-d25_lcbv.fits [Done]
>>> sap_lc = KeplerLightCurveFile(fn).SAP_FLUX 
>>> plt.plot(sap_lc.time, sap_lc.flux, 'x', markersize=1, label='SAP_FLUX') 
>>> plt.plot(cbv_lc.time, cbv_lc.flux, 'o', markersize=1, label='CBV_FLUX') 
>>> plt.legend() 

Attributes

lc_file (KeplerLightCurveFile object or str) An instance of KeplerLightCurveFile or a path to the .fits file of a NASA Kepler/K2 light curve.
likelihood (oktopus.Likelihood subclass) A class that describes a cost function. The default is oktopus.LaplacianLikelihood, which is tantamount to the L1 norm.

Methods

correct([cbvs, method, options]) Correct the SAP_FLUX by fitting a number of cotrending basis vectors cbvs.
get_cbv_url()
get_cbvs_list([method]) Returns the subsequence of subsequent CBVs that maximizes Bayes’ factor [R1].
coeffs

Returns the fitted coefficients.

correct(cbvs=[1, 2], method='powell', options={})[source]

Correct the SAP_FLUX by fitting a number of cotrending basis vectors cbvs.

Parameters:

cbvs : list of ints

The list of cotrending basis vectors to fit to the data. For example, [1, 2] will fit the first two basis vectors.

method : str

Numerical optimization method. See scipy.optimize.minimize for the full list of methods.

options : dict

Dictionary of options to be passed to scipy.optimize.minimize.

get_cbv_url()[source]
get_cbvs_list(method='bayes-factor')[source]

Returns the subsequence of subsequent CBVs that maximizes Bayes’ factor [R1].

Returns:

cbv_list : list

Subsequence of subsequent CBVs that maximizes the Bayes’ factor.

References

[R1] https://en.wikipedia.org/wiki/Bayes_factor
lc_file
opt_result

Returns the result of the optimization process.

class pyke.lightcurve.SPLDCorrector[source]

Implements the simple first order Pixel Level Decorrelation (PLD) proposed by Deming et al. [R2] and Luger et al. [R3], [R4].

Notes

This code serves only as a quick look into the PLD technique. Users are encouraged to check out the GitHub repos everest and everest3.

References

[R2] Deming et al. Spitzer Secondary Eclipses of the Dense, Modestly-irradiated, Giant Exoplanet HAT-P-20b using Pixel-Level Decorrelation.
[R3] Luger et al. EVEREST: Pixel Level Decorrelation of K2 Light Curves.
[R4] Luger et al. An Update to the EVEREST K2 Pipeline: short cadence, saturated stars, and Kepler-like photometry down to K_p = 15.

Methods

correct(time, tpf_flux[, window_length, …])
correct(time, tpf_flux, window_length=None, polyorder=2)[source]
Parameters:

time : array-like

Time array

tpf_flux : array-like

Pixel values series

window_length : int

polyorder : int

class pyke.lightcurve.SFFCorrector[source]

Implements the Self-Flat-Fielding (SFF) systematics removal method.

This method is described in detail by Vanderburg and Johnson (2014). Briefly, the algorithm implemented in this class can be described as follows

  1. Rotate the centroid measurements onto the subspace spanned by the eigenvectors of the centroid covariance matrix
  2. Fit a polynomial to the rotated centroids
  3. Compute the arclength of that polynomial
  4. Fit a BSpline of the raw flux as a function of time
  5. Normalize the raw flux by the fitted BSpline computed in step (4)
  6. Bin and interpolate the normalized flux as function of the arclength
  7. Divide the raw flux by the piecewise linear interpolation done in step (6)
  8. Set raw flux as the flux computed in step (7) and repeat

Methods

arclength(x1, x) Compute the arclength of the polynomial used to fit the centroid measurements.
bin_and_interpolate(s, normflux, bins, sigma)
breakpoints(campaign) Return a break point as a function of the campaign number.
correct(time, flux, centroid_col, centroid_row) Returns a systematics-corrected LightCurve.
fit_bspline(time, flux[, s]) s describes the “smoothness” of the spline
rotate_centroids(centroid_col, centroid_row) Rotate the coordinate frame of the (col, row) centroids to a new (x,y) frame in which the dominant motion of the spacecraft is aligned with the x axis.
arclength(x1, x)[source]

Compute the arclength of the polynomial used to fit the centroid measurements.

Parameters:

x1 : float

Upper limit of the integration domain.

x : ndarray

Domain at which the arclength integrand is defined.

Returns:

arclength : float

Result of the arclength integral from x[0] to x1.
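
For a curve y = P(x), the arclength from x[0] to x1 is the integral of sqrt(1 + P'(x)^2). The sketch below estimates it with the trapezoid rule. It is an illustration under assumptions: x is taken to be sorted in ascending order, and the derivative is passed in as a callable dydx, which is a hypothetical argument and not part of the method's signature.

```python
import math


def arclength(x1, x, dydx):
    """Trapezoid estimate of the arclength from x[0] up to x1."""
    pts = [xi for xi in x if xi <= x1]
    integrand = [math.sqrt(1.0 + dydx(xi) ** 2) for xi in pts]
    return sum(
        0.5 * (integrand[i] + integrand[i + 1]) * (pts[i + 1] - pts[i])
        for i in range(len(pts) - 1)
    )


# A straight line with slope 1 has length sqrt(2) per unit of x.
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
length = arclength(1.0, grid, lambda xi: 1.0)
```
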

bin_and_interpolate(s, normflux, bins, sigma)[source]
breakpoints(campaign)[source]

Return a break point as a function of the campaign number.

The intention of this function is to implement a smart way to determine the boundaries of the windows on which the SFF algorithm is applied independently. However, this is not implemented yet in this version.

correct(time, flux, centroid_col, centroid_row, polyorder=5, niters=3, bins=15, windows=1, sigma_1=3.0, sigma_2=5.0)[source]

Returns a systematics-corrected LightCurve.

Parameters:

time : array-like

Time measurements

flux : array-like

Data flux for every time point

centroid_col, centroid_row : array-like, array-like

Centroid column and row coordinates as a function of time

polyorder : int

Degree of the polynomial which will be used to fit one centroid as a function of the other.

niters : int

Number of iterations of the aforementioned algorithm.

bins : int

Number of bins to be used in step (6) to create the piece-wise interpolation of arclength vs flux correction.

windows : int

Number of windows to subdivide the data. The SFF algorithm is run independently in each window.

sigma_1, sigma_2 : float, float

Sigma values which will be used to reject outliers in steps (6) and (2), respectively.

Returns:

corrected_lightcurve : LightCurve object

Returns a corrected lightcurve object.

fit_bspline(time, flux, s=0)[source]

s describes the “smoothness” of the spline

rotate_centroids(centroid_col, centroid_row)[source]

Rotate the coordinate frame of the (col, row) centroids to a new (x,y) frame in which the dominant motion of the spacecraft is aligned with the x axis. This makes it easier to fit a characteristic polynomial that describes the motion.
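
One way to picture this rotation is with the closed-form orientation of the leading eigenvector of a 2x2 covariance matrix. The sketch below is an illustration only and may differ from the library's implementation: the standalone function is hypothetical, and subtracting the mean before rotating is a choice made for this sketch.

```python
import math


def rotate_centroids(centroid_col, centroid_row):
    """Rotate (col, row) so the direction of largest variance lies along x."""
    n = len(centroid_col)
    mean_c = sum(centroid_col) / n
    mean_r = sum(centroid_row) / n
    dc = [c - mean_c for c in centroid_col]
    dr = [r - mean_r for r in centroid_row]
    var_c = sum(d * d for d in dc) / n
    var_r = sum(d * d for d in dr) / n
    cov = sum(a * b for a, b in zip(dc, dr)) / n
    # Angle of the principal axis of the 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2.0 * cov, var_c - var_r)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    x = [cos_a * a + sin_a * b for a, b in zip(dc, dr)]
    y = [-sin_a * a + cos_a * b for a, b in zip(dc, dr)]
    return x, y


# Motion purely along the diagonal ends up entirely on the x axis.
x, y = rotate_centroids([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0])
```
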

Implements a brute force search to find transit-like periodic events. This function fits a “box” model defined as:

\[\Pi (t) = \left\{ \begin{array}{ll} a, & t < t_o, \\ a - d, & t_o \leq t < t_o + w, \\ a, & t \geq t_o + w \end{array} \right.\]

to a list of nperiods periods between min_period and max_period. It’s assumed that the best period is the one that maximizes the posterior probability of the fit.
Parameters:

lc : LightCurve object

An object from KeplerLightCurve or LightCurve. Note that flattening the lightcurve beforehand does aid the quest for the transit period.

min_period : float

Minimum period to search for. Units must be the same as lc.time.

max_period : float

Maximum period to search for. Units must be the same as lc.time.

nperiods : int

Number of periods to search between min_period and max_period.

prior : oktopus.Prior object

Prior probability on the parameters of the box function, namely, amplitude, depth, to (time of the first discontinuity), and width.

Returns:

log_posterior : list

Log posterior (up to an additive constant) of the fit. The “best” period is therefore the one that maximizes the log posterior probability.

trial_periods : numpy array

List of trial periods.

best_period : float

Best period.
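
The box model used in this search can be written as a small function. This is a sketch of the model only, using the parameter names from the prior description (amplitude, depth, to, width); the function name box is hypothetical and the fitting and period search are not shown.

```python
def box(t, amplitude, depth, to, width):
    """Box model: amplitude outside the dip, amplitude - depth inside it."""
    if to <= t < to + width:
        return amplitude - depth
    return amplitude


times = [0.5, 1.0, 1.25, 1.5, 2.0]
model = [box(t, amplitude=1.0, depth=0.25, to=1.0, width=0.5) for t in times]
print(model)  # [1.0, 0.75, 0.75, 1.0, 1.0]
```
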
