waveform_fit_rpt_1

David Seckel - 2/24/02

This report describes some tests of Paul's technique for waveform fitting in the DOM, and some proposed enhancements.

Contents:

1. Orthogonal functions.

2. Paul: Standard basis or a basis which recognizes data?

3. Data based designer basis functions

4. Discussion

5. Gallery of reconstructions

Comments welcome.

Orthogonal functions and data description

We anticipate generation of waveforms in the DOM with 128 samples. Most of these waveforms will describe impulsive events in the tank, and should have very similar shapes, apart from an overall amplitude. We would like to characterize the waveforms by a small set of numbers to reduce the data traffic to the DOM hub. If such a data reduction takes place, we also need to design an in-DOM filter to recognize when the characterization is adequate and when it is not.

Paul's idea is to characterize the waveforms by projecting onto a set of orthogonal basis functions. For example, one could use discrete Fourier functions, Chebyshev polynomials, or any other basis set. Viewed in this way, the waveform time series is just a special case, where the basis set is a set of discrete delta functions at the sample points. Since the waveforms are fairly smooth, it seems reasonable that a few smooth functions should be able to describe the waveforms more succinctly than stating all 128 time values.

Let the waveform be denoted as [Graphics:Images/waveform_fit_rpt_1_gr_1.gif], and suppose there are n basis functions [Graphics:Images/waveform_fit_rpt_1_gr_2.gif], with the property that [Graphics:Images/waveform_fit_rpt_1_gr_3.gif]. Then the coefficients [Graphics:Images/waveform_fit_rpt_1_gr_4.gif] form a description of the waveform. If the [Graphics:Images/waveform_fit_rpt_1_gr_5.gif] are complete, then the [Graphics:Images/waveform_fit_rpt_1_gr_6.gif] can reconstruct [Graphics:Images/waveform_fit_rpt_1_gr_7.gif] exactly; otherwise [Graphics:Images/waveform_fit_rpt_1_gr_8.gif] is approximated by [Graphics:Images/waveform_fit_rpt_1_gr_9.gif]. A measure of how good the approximation is can be formed from the variance [Graphics:Images/waveform_fit_rpt_1_gr_10.gif]. The size of the variance can be used as a filter for accepting [Graphics:Images/waveform_fit_rpt_1_gr_11.gif] or deciding to send the complete waveform back to the DOM hub.
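
As a minimal sketch of the bookkeeping described above, the following Python fragment projects a waveform onto an orthonormal basis, reconstructs it, and uses the residual as an accept/reject filter. The function names, the toy 4-sample basis, and the threshold value are illustrative assumptions. The residual is taken here to be a plain sum of squared differences, which may differ in normalization from the variance defined above.

```python
def dot(u, v):
    """Inner product of two equally sampled sequences."""
    return sum(a * b for a, b in zip(u, v))

def project(waveform, basis):
    """Coefficients c_i = <b_i, f>; assumes the basis is orthonormal."""
    return [dot(b, waveform) for b in basis]

def reconstruct(coeffs, basis):
    """Approximation sum_i c_i * b_i, evaluated sample by sample."""
    return [sum(c * b[k] for c, b in zip(coeffs, basis))
            for k in range(len(basis[0]))]

def residual(waveform, approx):
    """Sum of squared differences (the normalization is an assumption)."""
    return sum((f - g) ** 2 for f, g in zip(waveform, approx))

def fit_or_fallback(waveform, basis, threshold):
    """Return (accepted, coeffs, residual).

    accepted is False when the residual exceeds the threshold, i.e. when
    the full waveform would have to be sent instead of the coefficients.
    """
    coeffs = project(waveform, basis)
    r = residual(waveform, reconstruct(coeffs, basis))
    return r <= threshold, coeffs, r

# Toy orthonormal basis on 4 samples (hypothetical, for illustration only).
basis = [[0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, -0.5, -0.5]]

# A waveform inside the span of the basis is reproduced exactly ...
good = fit_or_fallback([3.0, 3.0, 1.0, 1.0], basis, threshold=1.0)
# ... while one outside the span leaves a residual and is rejected.
bad = fit_or_fallback([3.0, 1.0, 3.0, 1.0], basis, threshold=1.0)
```

In this toy case the first waveform is accepted with two coefficients, while the second fails the threshold and would be sent in full.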

The effectiveness of this strategy depends on finding a small basis set that does a good job. Paul's first attempt was to use the first 8 Chebyshev polynomials. The Chebyshev polynomial of order n is an nth degree polynomial defined over a set range usually taken to be 0-1. The left panel shows a plot of the first 8 CH's sampled at 128 points. Although these are in fact the first 8 CH's, the plot was generated by applying the GramSchmidt orthogonalizing procedure to the 8 linearly independent functions shown in the right hand panel. Those 8 functions are simply t^n, sampled at 128 points for n=0-7.

[Graphics:Images/waveform_fit_rpt_1_gr_12.gif]
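
The orthogonalization step can be sketched as follows. This is a plain-Python Gram-Schmidt applied to t^n sampled at 128 points on 0-1, not the Mathematica code used for the plots. The function names are illustrative, and the unit normalization of each output function is an assumption.

```python
import math

def dot(u, v):
    """Inner product of two equally sampled sequences."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(functions):
    """Orthonormalize a list of linearly independent sampled functions.

    Each input has the components along all previously accepted basis
    functions removed, and is then scaled to unit norm.
    """
    basis = []
    for f in functions:
        v = list(f)
        for b in basis:
            c = dot(b, v)
            v = [vk - c * bk for vk, bk in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vk / norm for vk in v])
    return basis

n_samples = 128
t = [k / (n_samples - 1) for k in range(n_samples)]   # sample points on 0-1

# Input functions: t^n for n = 0..7, as in the right hand panel.
monomials = [[tk ** n for tk in t] for n in range(8)]

# Output: 8 mutually orthogonal polynomials of increasing degree.
basis = gram_schmidt(monomials)
```

Because each input is only ever combined with lower powers of t, the nth output is still an nth degree polynomial, which is why the procedure recovers a polynomial family from the monomials.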

When applied to a sample waveform taken from data, this set of 8 functions is not particularly spectacular. Here is an example. The x-axis is labeled by sample number. The samples are 2 ns apart, being averages of adjacent points in the raw data. The y-axis is in mV.

[Graphics:Images/waveform_fit_rpt_1_gr_13.gif]

A better basis set

Paul's insight was that one ought to do better if one started with a basis function that looked more like the data. Specifically, he suggested replacing the 0th basis function (the constant) with a [Graphics:Images/waveform_fit_rpt_1_gr_14.gif] that looks like the average 'muon'. He determined the average muon from 81 waveforms (channel 1), originally generated by Albrecht, cleaned by Bai, and filtered by Dave. As suggested in mu_response_rpt_3, they may or may not be muons. In any event, they are a sample of similar looking things, and Paul used a simple average of them to play with. After substituting this [Graphics:Images/waveform_fit_rpt_1_gr_15.gif] into his function set, he performed GramSchmidt. The new orthogonal basis functions are on the left; the input to GramSchmidt is on the right.

[Graphics:Images/waveform_fit_rpt_1_gr_16.gif]
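
A sketch of this substitution, assuming the mean is a simple sample-by-sample average and that the remaining inputs are still t^n for n = 1-7. The Gaussian toy pulses below are hypothetical stand-ins for the 81 measured waveforms, which are not reproduced here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(functions):
    """Orthonormalize sampled functions, in the order given."""
    basis = []
    for f in functions:
        v = list(f)
        for b in basis:
            c = dot(b, v)
            v = [vk - c * bk for vk, bk in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vk / norm for vk in v])
    return basis

def mean_waveform(waveforms):
    """Simple sample-by-sample average of a list of waveforms."""
    count = len(waveforms)
    return [sum(column) / count for column in zip(*waveforms)]

n_samples = 128

def toy_pulse(amplitude, center, width=8.0):
    """Hypothetical Gaussian pulse standing in for a measured waveform."""
    return [amplitude * math.exp(-0.5 * ((k - center) / width) ** 2)
            for k in range(n_samples)]

# Stand-in data set: three pulses differing in amplitude and arrival time.
waveforms = [toy_pulse(90.0, 58.0), toy_pulse(100.0, 60.0), toy_pulse(110.0, 62.0)]
f0 = mean_waveform(waveforms)

# Replace the constant (n = 0) by the mean waveform, keep t^1 .. t^7.
t = [k / (n_samples - 1) for k in range(n_samples)]
inputs = [f0] + [[tk ** n for tk in t] for n in range(1, 8)]
basis = gram_schmidt(inputs)
```

Since the mean waveform is the first input, the first output function is just the mean waveform scaled to unit norm; only the later functions are changed by the orthogonalization.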

With this basis set, waveform reconstruction looks much better.

[Graphics:Images/waveform_fit_rpt_1_gr_17.gif]

The waveform just above is just one example, of course, so we ran the reconstruction algorithm on the rest of the 81 waveforms. Many look as good as the original test, but a few do not. The two samples below are typical of the variation. On inspection, the event shown on the left is typical of the case where the pulse arrives a little later than the average pulse, and the event on the right is one where the pulse arrives before the average pulse. Although the mean muon is characterized by the new [Graphics:Images/waveform_fit_rpt_1_gr_18.gif], the rest of the CH's do not do a particularly good job of describing the deviations from the mean.

[Graphics:Images/waveform_fit_rpt_1_gr_19.gif]

Putting the physics back in ...

There are two points to be learned from Paul's effort: a) using basis functions which are directly related to the data works, and b) using GramSchmidt allows flexibility in the choice of basis functions beyond what one might find in a book on mathematical physics. With these thoughts in mind, the response of a phototube to a short impulse may be described by amplitude and arrival time. As the impulse widens, the pulse width will increase, until eventually the pulse shape is distorted beyond recognition. Within this model, a phototube pulse can be described by

[Graphics:Images/waveform_fit_rpt_1_gr_20.gif]

where [Graphics:Images/waveform_fit_rpt_1_gr_21.gif] is the gain, [Graphics:Images/waveform_fit_rpt_1_gr_22.gif] is the arrival time relative to the mean arrival time, and [Graphics:Images/waveform_fit_rpt_1_gr_23.gif] describes the deviation of the pulse width from its nominal value. For modest deviations from the mean, we may expand

[Graphics:Images/waveform_fit_rpt_1_gr_24.gif]

where [Graphics:Images/waveform_fit_rpt_1_gr_25.gif], [Graphics:Images/waveform_fit_rpt_1_gr_26.gif], and [Graphics:Images/waveform_fit_rpt_1_gr_27.gif]. This suggests that to capture the essence of describing a phototube pulse one should use an orthogonal basis generated from the three functions [Graphics:Images/waveform_fit_rpt_1_gr_28.gif]. Such a basis is shown on the left, with the input functions on the right. A couple of comments. The black points [Graphics:Images/waveform_fit_rpt_1_gr_29.gif] are the same in both panels. The red points are almost the same, due to the happenstance that [Graphics:Images/waveform_fit_rpt_1_gr_30.gif] is nearly orthogonal to [Graphics:Images/waveform_fit_rpt_1_gr_31.gif] anyway. The next remark is that although [Graphics:Images/waveform_fit_rpt_1_gr_32.gif] is fairly smooth, the numerical derivative appears to be fairly noisy outside the peak. Presumably this would get better with better statistics.

[Graphics:Images/waveform_fit_rpt_1_gr_33.gif]
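
The idea can be illustrated with a toy calculation. The sketch below assumes the three input functions are the mean pulse, its numerical time derivative, and that derivative multiplied by time measured from the peak; this is a guess at the form of the expansion, and the functions actually used may differ in detail. The Gaussian pulse, the central-difference derivative, and all names are hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(functions):
    """Orthonormalize sampled functions, in the order given."""
    basis = []
    for f in functions:
        v = list(f)
        for b in basis:
            c = dot(b, v)
            v = [vk - c * bk for vk, bk in zip(v, b)]
        norm = math.sqrt(dot(v, v))
        basis.append([vk / norm for vk in v])
    return basis

def derivative(f):
    """Central-difference derivative per sample, one-sided at the ends."""
    d = [0.0] * len(f)
    for k in range(1, len(f) - 1):
        d[k] = 0.5 * (f[k + 1] - f[k - 1])
    d[0] = f[1] - f[0]
    d[-1] = f[-1] - f[-2]
    return d

def residual_with(waveform, basis):
    """Sum of squared differences after projecting onto the given basis."""
    coeffs = [dot(b, waveform) for b in basis]
    approx = [sum(c * b[k] for c, b in zip(coeffs, basis))
              for k in range(len(waveform))]
    return sum((f - g) ** 2 for f, g in zip(waveform, approx))

n_samples = 128

def toy_pulse(center, width=8.0, amplitude=100.0):
    """Hypothetical Gaussian pulse standing in for the mean waveform."""
    return [amplitude * math.exp(-0.5 * ((k - center) / width) ** 2)
            for k in range(n_samples)]

f0 = toy_pulse(60.0)                                  # stand-in mean pulse
df0 = derivative(f0)                                  # sensitive to arrival time
peak = max(range(n_samples), key=lambda k: f0[k])
tdf0 = [(k - peak) * d for k, d in enumerate(df0)]    # sensitive to width (assumed form)

basis = gram_schmidt([f0, df0, tdf0])

# A pulse arriving 2 samples late, fitted with 1, 2 and 3 functions.
late = toy_pulse(62.0)
r1 = residual_with(late, basis[:1])
r2 = residual_with(late, basis[:2])
r3 = residual_with(late, basis)
```

For this toy pulse the residual drops sharply once the derivative function is included, which is the behavior the expansion argument predicts for small arrival time shifts.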

So, how does this ratty looking diminutive basis set do? Surprisingly well. The three panels below show reconstructions for the three events used above: the nominal on top, and the early and late events left to right on the bottom.

[Graphics:Images/waveform_fit_rpt_1_gr_35.gif]

Comparison with Paul's [Graphics:Images/waveform_fit_rpt_1_gr_36.gif] + CH basis set shows a distinct improvement. This can be made somewhat more quantitative by calculating the residual [Graphics:Images/waveform_fit_rpt_1_gr_37.gif] for the two cases. The scatterplot below shows the residual for each of the 81 waveforms. The black points are for Paul's basis, the red points for the new basis. Pretty clearly, the new basis does better. Averaging over 81 waveforms to get a single statistic gives a mean residual of 99.07 for the 8 function set and 85.4 for the 3 function set. As an indication of the importance of the three parameters, utilizing just [Graphics:Images/waveform_fit_rpt_1_gr_38.gif] gives a residual of 143, using functions characterizing amplitude and arrival time gives 95, while amplitude, arrival time and width gives the stated 85.4. I suspect these are all meaningful improvements, but have not attempted any statistical tests yet.

[Graphics:Images/waveform_fit_rpt_1_gr_39.gif]

Discussion

Encouraged by the above result, I tried to improve the basis functions. Here is a list of what I tried.

Trim the noise: Setting [Graphics:Images/waveform_fit_rpt_1_gr_40.gif] = 0 for samples below 25 and above 115 reduces the residual in the prepulse and postpulse regions. Presumably this could also be accomplished with more data to determine [Graphics:Images/waveform_fit_rpt_1_gr_41.gif]. Residual = 83.1.

Adjust baseline: Bai has got the baseline subtracted out pretty well, but adding a constant offset as a fourth function reduces the residual a bit. Trim + offset residual = 82.1. Here is the trimmed and offset, data based, orthogonal basis, which is my best effort to date.

[Graphics:Images/waveform_fit_rpt_1_gr_42.gif]

Tried analytic forms for [Graphics:Images/waveform_fit_rpt_1_gr_43.gif]: I tried various parameters in a phototube waveform generator until I got to the following. Red is the analytic form, black is the data driven [Graphics:Images/waveform_fit_rpt_1_gr_44.gif]. It's bang on at the peak and rise, but misses a bit for samples between 70 and 110. The analytic [Graphics:Images/waveform_fit_rpt_1_gr_45.gif] produces a 3 function set which performs worse than the data [Graphics:Images/waveform_fit_rpt_1_gr_46.gif]. Residual = 88.4.

[Graphics:Images/waveform_fit_rpt_1_gr_47.gif]

Analytic form for [Graphics:Images/waveform_fit_rpt_1_gr_48.gif], with data based values pasted in for samples 70-110. This performs similarly to the noise trimmed and offset best effort.

Basis set consisting of offset analytic pulses: Instead of using [Graphics:Images/waveform_fit_rpt_1_gr_49.gif] to characterize arrival time, I tried using input functions consisting of explicit pulses just offset to arrive at different times. This works better than the [Graphics:Images/waveform_fit_rpt_1_gr_50.gif] + CH solution, but not as well as [Graphics:Images/waveform_fit_rpt_1_gr_51.gif].

Tried the technique on channel 2: The mean waveform was derived from channel 1 data. Does it work on channel 2 data? In this case, not particularly well. The reason is that the arrival times of channel 1 and channel 2 are offset, as in the figure, and this offset is too big for [Graphics:Images/waveform_fit_rpt_1_gr_52.gif] to deal with.

[Graphics:Images/waveform_fit_rpt_1_gr_53.gif]

To correct for this, I tried offsetting the channel 2 waveforms by 13 samples. Here is a typical reconstruction. The poor shape near the peak is typical. Apparently at least one more shape parameter is needed. More practically, the whole approach here suggests an independent calibration of the DOM waveform fitter for each DOM. The mean residual for channel 2 data, offset and reconstructed with the channel 1 basis set, is 108.

[Graphics:Images/waveform_fit_rpt_1_gr_54.gif]
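
Two of the simpler manipulations in this list, trimming the mean waveform and offsetting a waveform by a fixed number of samples, can be sketched as below. Whether sample numbers start at 0 or 1, whether samples 25 and 115 themselves are kept, the direction of the 13 sample offset, and zero padding at the ends are all assumptions of this sketch, and the function names are illustrative.

```python
def trim(waveform, first_kept=25, last_kept=115):
    """Zero every sample below first_kept or above last_kept (0-based)."""
    return [v if first_kept <= k <= last_kept else 0.0
            for k, v in enumerate(waveform)]

def shift(waveform, n):
    """Move a waveform later by n samples (earlier if n < 0).

    Samples pushed past either end are dropped and the vacated samples
    are filled with zeros.
    """
    size = len(waveform)
    out = [0.0] * size
    for k, v in enumerate(waveform):
        j = k + n
        if 0 <= j < size:
            out[j] = v
    return out

# A ramp makes it easy to see which samples ended up where.
ramp = [float(k) for k in range(128)]

trimmed = trim(ramp)          # zero outside samples 25..115
earlier = shift(ramp, -13)    # e.g. pull a late channel 13 samples earlier
```

With real data the shifted waveform would then be projected onto the channel 1 basis exactly as before; only the alignment step is shown here.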

And here are some things I haven't tried: creating a mean waveform for channel 2; cleaning up the mean waveform (for example, the pulses clearly have some arrival time fluctuations, which presumably makes the mean pulse wider than individual pulses); looking at the statistics of the projection coefficients; trying an analytic form that more closely follows the data.

Compilation of 81 reconstructions

Here are the 81 reconstructions, using the best effort basis. They are in essence three parameter reconstructions, since the baseline offset parameter carries very little weight. It is important that this can all be done in the DOM. In each case, the black dots are what the FPGA would get from the ATWD. After projection onto 4 basis functions, the DOM would send just the coefficients to the hub. The red line is what would be reconstructed in software by the filter/analysis computers at the backend of the DAQ. One caveat: I haven't attempted to quantize the coefficients. TBD.

[Graphics:Images/waveform_fit_rpt_1_gr_55.gif]


Converted by Mathematica      February 24, 2002