Notes from CalStats WG Discussion, 2020 Dec 1

The meeting agenda is at https://iachec.org/calibration-statistics/#2020dec1. Questions and answers are paraphrased by Vinay Kashyap. Comments from the chat window are denoted with "Name :"; verbal discussions are from the video transcript.

Statistical Topics in X-ray Polarimetry: Herman Marshall (MIT)

Vinay Kashyap : I have a dumb q: what makes IXPE good at measuring these tracks that, e.g., ROSAT/PSPC could not do?
Herman Marshall -- basically requires much higher spatial resolution, of ~mm scale, in the detector
Keith Arnaud -- like putting the HRI below the PSPC
Diab Jerius -- Do you have time data or just an image?
HM -- time for each event
DJ -- how do you incorporate systematic errors into the analysis?
HM -- systematic errors are still much smaller than statistical
DJ -- what about known biases?
HM -- await the paper to be published
KA -- what about correlations between parameters?
HM -- in an ideal instrument they are uncorrelated, and in practice we don't see them, so it is a reasonable assumption that there is no significant cross term
Diab Jerius : How did you calibrate the simulations?
HM -- by comparing the tracks you measure with simulations against observations at different energies, and comparing modulation factors

Handling model uncertainties by means of comparison densities: Sara Algeri (Minnesota)

VK -- how practical is it to implement this in XSPEC or Sherpa?
Sara Algeri -- there exists a Python package too. Sharp departures from the nominal background are hard to implement
KA -- how large should datasets be for asymptotic results to hold?
SA -- a few hundred photons is OK. But there is no such limitation if you use the resampling method, which does not rely on asymptotics

Modeling the background: a case study with Suzaku XIS and N132D: Eric Miller (MIT)

VK : Q for Eric and Sara: Would the comparison density approach be useful in modeling the residuals in Eric's NXB model in slide 8?
SA -- if Poisson, then modeling smooth changes is possible
Eric Miller -- yes, the feature near O Kα is persistent
Andy B : Are your xcm files available so we can see how you set up the data and models in xspec?
EM -- Yes, will make them available afterwards
EM (following up) -- I've assembled a tarball of XSPEC command files and example spectra and responses showing how I fit the background to Suzaku XIS data: https://wikis.mit.edu/confluence/download/attachments/75637206/suzaku_background_fit_example.tar.gz?version=1&modificationDate=1606839425248&api=v2

Likelihood selection: Guillaume Belanger (ESA)

VK -- what is the decision flowchart you would use to come up with what the likelihood should be?
Guillaume Belanger -- start with histograms, trace each step, try Monte Carlo, and get some kind of empirical likelihood function
Aneta Siemiginowska : How much bias do you expect in fitting the periodograms with the incorrect likelihood?
GB -- Unable to say yet. But likely a smaller error with exponential likelihoods than in the Gaussian case because of the longer tail.
AB : Xspec lets you choose the Whittle statistic when fitting power spectra
J. Michael Burgess : Basically generative modeling?
JMB -- like in forward modeling software, e.g., Stan
GB -- following coordinate transformations in a general way can indeed be useful for complex problems
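For reference, the Whittle statistic mentioned above treats each periodogram power I_j as exponentially distributed about the model spectrum S_j(theta), giving -2 ln L = 2 * sum_j [ ln S_j(theta) + I_j / S_j(theta) ]. Below is a minimal Python sketch of fitting a periodogram this way; the power-law-plus-constant model and the simulated data are illustrative assumptions, not anything shown in the talk.

    import numpy as np
    from scipy.optimize import minimize

    def whittle_neg2loglike(theta, freq, pgram):
        # -2 ln(Whittle likelihood): each periodogram power is
        # exponentially distributed about the model spectrum S(f)
        norm, slope, const = theta
        model = norm * freq**(-slope) + const  # assumed: power law + white noise
        if np.any(model <= 0):
            return np.inf
        return 2.0 * np.sum(np.log(model) + pgram / model)

    # illustrative data: an exponentially distributed periodogram
    rng = np.random.default_rng(1)
    freq = np.linspace(1e-3, 0.5, 512)
    true_spec = 0.01 * freq**-1.5 + 2.0
    pgram = rng.exponential(true_spec)

    fit = minimize(whittle_neg2loglike, x0=[0.01, 1.0, 1.0],
                   args=(freq, pgram), method="Nelder-Mead")
    print(fit.x)  # recovered (norm, slope, constant)

This is the exponential likelihood GB contrasts with the Gaussian case: large periodogram excursions are expected under the exponential's long tail, and do not drag the fit the way they would under least squares.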
Using the Kaastra cstat goodness-of-fit in Sherpa: Vinay Kashyap (CfA)

Ivan Valtchanov : How much would the C-stat statistic change if one assumes zero (or constant) background?
JMB : Not for Poisson
VK -- Yes, the cstat calculation must include the background part too. If fit simultaneously, it will essentially double the dof.
KA : My understanding is that one should not use the same statistic for goodness-of-fit as one uses for determining parameters.
VK -- True about estimating parameters from the goodness check, but not sure about this. Would like guidance from statisticians.
JMB : Maybe a provocative comment, but should we be using goodness of fit at all? Why not posterior predictive checks?
AS : Yes, good point Michael!
G. Belanger : Yes, goodness of fit seems to be a little passé
Kristin Madsen : I recall that several IACHEC meetings ago, perhaps when we were in the UK, someone (Andy Pollock?) recommended exactly what Keith said, that we can use the chi-square test to compare models as long as you didn't use it to fit
HM : Does the Kaastra variance method rely on a symmetric error distribution? Is there a "skew" test?
VK -- yes, it assumes the distribution is symmetric
JMB : There is also this paper: https://arxiv.org/abs/1012.3754
IV : Can this simple example be expanded to also include the suggested idea of using posterior predictive checks?
GB : Andy was using C-stat to fit, and Chi2 as goodness of fit. But I think we are converging to the view that model comparison is the way to go, leaving behind goodness of fit altogether.
Brian W. Grefenstette : @Michael Out of ignorance, is there an astronomer-friendly paper showing posterior predictive checking? Is there a large difference between that and doing a Monte Carlo based on your estimated model parameters and comparing the resulting fit statistics (whichever fit statistics you prefer)?
JMB : But Bayesians also do this as well
SA : @Michael Yes, but it has some issues attached to it. For instance, the model fits one gets over the n datasets with n-1 observations are going to be very similar and highly correlated. As a result, the resulting measure of GOF considered will have large variance due to such correlation.
AS : Protassov et al. 2002 has an example
JMB : https://projecteuclid.org/euclid.ssu/1356628931
AS : of the ppp calculations
AS : https://arxiv.org/abs/astro-ph/0201547
JMB : But for example LOO uses the same dataset, no?
AS : here is the ppp and model selection for line spectra in more detail: https://arxiv.org/abs/0808.3164
JMB : @aneta ^ That is a great paper
Doug Burke (following up) -- if you changed the plot style, you would run into trouble with this script
AS -- Yes, should have a robust approach.
VK -- yes, best to be able to hook into Sherpa internal arrays. The tool doesn't handle background or simultaneous fits right now. Also, the idea was merely to get people familiar with this concept. Lots of limitations, but better than nothing.
AB (following up) -- a future session could be devoted to the choice of fit statistic, or goodness-of-fit estimators -- ideally with input from Aneta, Keith, and Michael, with worked examples like you showed for Sherpa.
VK -- yes, plus statisticians
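To make the discussion concrete: Kaastra (2017) gives the expected value and variance of the C statistic for a given model, so a fitted cstat can be checked against E[C] +/- sqrt(Var[C]) without simulations. Below is a minimal Python sketch of that check, computing the per-bin moments by direct summation over the Poisson pmf; the model values are made up, and this is not the actual Sherpa tool shown in the talk.

    import numpy as np
    from scipy.stats import poisson

    def cstat_per_bin(k, mu):
        # Cash/cstat contribution 2*(mu - k + k*ln(k/mu)); the k=0 term is 2*mu
        k = np.asarray(k, dtype=float)
        out = 2.0 * (mu - k)
        pos = k > 0
        out[pos] += 2.0 * k[pos] * np.log(k[pos] / mu)
        return out

    def cstat_moments(mu, kmax=None):
        # E[C] and Var[C] for one bin with Poisson mean mu,
        # by direct summation over the Poisson pmf
        if kmax is None:
            kmax = int(mu + 10.0 * np.sqrt(mu) + 20)
        k = np.arange(kmax + 1)
        p = poisson.pmf(k, mu)
        c = cstat_per_bin(k, mu)
        ec = np.sum(p * c)
        return ec, np.sum(p * c**2) - ec**2

    # made-up example: best-fit model predictions in each spectral bin
    mu_model = [0.5, 2.0, 5.0, 12.0, 30.0]
    ec, vc = np.sum([cstat_moments(m) for m in mu_model], axis=0)
    print(f"expected C = {ec:.2f} +/- {np.sqrt(vc):.2f}")
    # a fitted cstat far outside this interval flags a poor fit

Comparing the observed C to this interval treats its reference distribution as Gaussian, which is the symmetry assumption HM asks about; a skew test could in principle be built the same way from the third moment of the per-bin contributions.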
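And a minimal sketch of the posterior predictive check that JMB and AS advocate (see the Protassov et al. 2002 link above). It also addresses BWG's question: the Monte Carlo check based on the best-fit parameters is the special case in which every posterior draw is replaced by the single best-fit model. The function names and the choice of the Cash statistic below are illustrative assumptions.

    import numpy as np

    def cash(counts, mu):
        # Cash statistic; the k*ln(k/mu) term vanishes where k == 0
        c = np.asarray(counts, dtype=float)
        m = np.asarray(mu, dtype=float)
        term = np.zeros_like(c)
        pos = c > 0
        term[pos] = c[pos] * np.log(c[pos] / m[pos])
        return 2.0 * np.sum(m - c + term)

    def ppp(obs_counts, posterior_mu, stat=cash, rng=None):
        # posterior predictive p-value: posterior_mu holds one row of
        # model-predicted Poisson means per posterior draw of the parameters
        rng = rng or np.random.default_rng()
        t_obs, t_rep = [], []
        for mu in posterior_mu:
            rep = rng.poisson(mu)               # replicate dataset
            t_rep.append(stat(rep, mu))         # statistic on the replicate
            t_obs.append(stat(obs_counts, mu))  # statistic on the real data
        return np.mean(np.array(t_rep) >= np.array(t_obs))

    # hypothetical usage, with draws from an MCMC run:
    #   p = ppp(observed_counts, posterior_mu)
    # p near 0 or 1 flags misfit; feeding in identical copies of the
    # best-fit mu instead recovers the plain parametric bootstrap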
Concordance Q&A: Yang Chen (Michigan) and Herman Marshall (MIT)

BG -- If you achieve the goal of Concordance, what is the next step?
HM -- The output goes to calibration scientists; it is up to them to figure out how to handle that
VK -- There is adjusting effective areas, and then there is figuring out source intensities given the EA, which may be something for the general user to look into when using this kind of technique
KM -- What if all instruments are wrong, and there is a common bias to all measurements?
AB -- new missions calibrate against old missions, and they assume the old versions are correct
HM -- Tough noogies, that would be uncorrectable. Ground calibration is done at different places, so it should average out. It all comes down to basic physics calibration, and NIST.
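To illustrate the kind of adjustment Concordance produces, here is a toy penalized least-squares version (not the actual hierarchical model of Chen et al.): measured log fluxes y[i, j] for instrument i and source j are modeled as an instrument offset b[i] (a log effective-area correction) plus a source log flux g[j], with the offsets shrunk toward zero. All names and the prior variance tau2 are illustrative assumptions.

    import numpy as np

    def concordance_sketch(y, sigma2, tau2=0.05**2):
        # toy shrinkage fit: y[i, j] = b[i] + g[j] + noise(sigma2[i, j]),
        # with a zero-mean Gaussian prior of variance tau2 on the offsets b
        ninst, nsrc = y.shape
        npar = ninst + nsrc
        A = np.zeros((npar, npar))
        rhs = np.zeros(npar)
        w = 1.0 / sigma2
        for i in range(ninst):
            for j in range(nsrc):
                idx = (i, ninst + j)
                for p in idx:
                    rhs[p] += w[i, j] * y[i, j]
                    for q in idx:
                        A[p, q] += w[i, j]
        A[:ninst, :ninst] += np.eye(ninst) / tau2  # prior pins down b vs. g
        sol = np.linalg.solve(A, rhs)
        return sol[:ninst], sol[ninst:]  # offsets b, source log fluxes g

    # illustrative usage, with y an (n_instruments, n_sources) log-flux matrix:
    #   b, g = concordance_sketch(y, sigma2=np.full_like(y, 0.01))

Exponentiating b gives multiplicative effective-area adjustments. KM's concern is visible in this toy version: a bias common to every instrument is absorbed into the source fluxes g rather than the offsets b, which is why HM calls it uncorrectable from the measurements alone.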