GSoC - Week 5-6

Project Taking Shape

Posted by Gaurav Joshi on July 11, 2023 · 7 mins read

Making a Demo

I had recently made some changes to my GP and GPResult class on the advice of my mentors and I realised that I am not making the tool keeping the user in mind.

See, in contemporary data analysis, Gaussian Processes GP’s are very useful to fit and extrapolate the data. If we have some data points, we can use any suitable kernel function and a 0 mean and we will get a nicely fitting GP, which will also allow us to know function values at new points with error bars. But this is not what the users need in astronomy. They do not usually need to extrapolate the data to get values at new points, rather, they want to make a compare models with different charachteristics, signals, and identify which signal is present in the time series.

Thus making the demo notebook gave me an extrinsic view of the problem, and it made me change the arrangement of a lot of classes and functions in the code.

New implimenatations

GPR class

The big new implimentation that I made, was to entirely change the feature from having a GP class for data fitting and a GPResult class for bayesian inference to just the GPResult class for inferencing and plotting.

I changed the tool to just a GPResult class, that takes a suitable prior and log likelihood function samples out the posterior parameters. This also has a lot of new plotting features which are helpul to visualise the posterior parameter.

class GPResult:
    """
    Makes a GPResult object which takes in a Stingray.Lightcurve and samples parameters of a model
    (Gaussian Process) based on the given prior and log_likelihood function.
    """
    def __init__(self, Lc: Lightcurve) -> None:
        self.lc = Lc
        self.time = Lc.time
        self.counts = Lc.counts
        self.Result = None

    def sample(self, prior_model=None, likelihood_model=None, **kwargs):
        """ Makes a Jaxns nested sampler over the Gaussian Process, given the
            prior and likelihood model """
        self.prior_model = prior_model
        self.likelihood_model = likelihood_model

        NSmodel = Model(prior_model=self.prior_model, log_likelihood=self.likelihood_model)
        NSmodel.sanity_check(random.PRNGKey(10), S=100)

        self.Exact_ns = ExactNestedSampler(NSmodel, num_live_points=500, max_samples=1e4)
        Termination_reason, State = self.Exact_ns(
            random.PRNGKey(42), term_cond=TerminationCondition(live_evidence_frac=1e-4)
        )
        self.Results = self.Exact_ns.to_results(State, Termination_reason)
        print("Simulation Complete")

    def get_evidence(self):
        """ Returns the log evidence of the model """

    def print_summary(self):
        """ Prints a summary table for the model parameters """

And many other plotting functions like

  • plot_diagnostics

  • plot_cornerplot

  • get_parameters_name

  • get_max_posterior_parameters

  • get_max_likelihood_parameters

  • posterior_plot

  • weighted_posterior_plot

  • corner_plot

Get Prior function

Earlier the get_prior function was a big function which took in your kernel and mean type, and gave you the prior function, but I had to write a separate function for each comibination making it a gignantic function of unecessary repeated code, also the prior ranges and distribution types (uniform, cauchy etc) was fixed according to what I had hard-coded without any way to change it. If someon wanted to implement a prior with even a small change, it was not possible.

So My mentor suggested making the function, such that the user inputs the tensorflow priors based on their needs, and we just make a jaxns prior for it. This led to the new code:-

def get_prior(params_list, prior_dict):
    """
    A prior generator function based on given values
    """
    def prior_model():
        prior_list = []
        for i in params_list:
            if isinstance(prior_dict[i], tfpd.Distribution):
                parameter = yield Prior(prior_dict[i], name=i)
            else:
                parameter = yield prior_dict[i]
            prior_list.append(parameter)
        return tuple(prior_list)

    return prior_model

Get likelihood function:

Similarly I changed the likelihood functin so that it takes a parmeter array, and the kernel, mean type and gives us a log_likelihood function which calculates and gets the likelihood of the data being fitted for the parameters.

def get_likelihood(params_list, kernel_type, mean_type, **kwargs):
    """
    A likelihood generator function based on given values
    """
    @jit
    def likelihood_model(*args):
        dict = {}
        for i, params in enumerate(params_list):
            dict[params] = args[i]
        kernel = get_kernel(kernel_type=kernel_type, kernel_params=dict)
        mean = get_mean(mean_type=mean_type, mean_params=dict)
        gp = GaussianProcess(kernel, kwargs["Times"], mean_value=mean(kwargs["Times"]))
        return gp.log_probability(kwargs["counts"])

    return likelihood_model

Jit issues