A Study of the Clinical Viability of a Prototype Compton Camera for Prompt Gamma Imaging Based Proton Beam Range Verification

We present Compton camera (CC) based prompt gamma (PG) imaging for proton range verification at clinical dose rates. PG emission from a tissue-equivalent phantom during irradiation with clinical proton beams was measured with a prototype CC. Images were reconstructed from the raw measured data and from data processed with a neural network (NN) trained to identify "true" and "false" PG events. From these images, we determined whether PG images produced by the prototype CC could provide clinically useful information about the in vivo range of the proton beams delivered during proton beam radiotherapy. NN processing of the data was found to be necessary to allow identification of the proton beam path from the PG images. Furthermore, to localize the end of the proton beam range with a precision of ≤ 3 mm with the prototype CC, ~1 × 10^9 protons would need to be delivered, which is on the order of magnitude delivered for a standard proton radiotherapy treatment field. To obtain higher precision in beam range determination and to allow imaging of a single proton pencil beam delivered within the full treatment field, further improvements in PG detection rates by the CC, NN data processing, and image reconstruction algorithms are needed.


Introduction
Compton cameras (CC) are multistage detectors that use the principles of Compton scattering to produce 2-dimensional (2D) and 3-dimensional (3D) images of gamma ray and x-ray sources [1-3]. CCs measure the energy deposition and position for each interaction of a gamma as it scatters in the different detection stages of the camera. From the kinematics of the Compton scattering process, the energy deposition and position data for each scatter of the gamma can be used to determine the gamma's incident energy and the angle of its initial scatter in the detector [4,5].
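As an illustration of the kinematic relationship described above, the following minimal sketch computes the initial scatter angle from the standard Compton formula, cos θ = 1 − m_e c² (1/E' − 1/E0), where E0 is the incident energy and E' = E0 − E1 is the energy remaining after the first interaction deposits E1. It assumes that the incident energy is already known; the function name and the example energies are hypothetical and are not taken from the PJ3 software.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.510999  # m_e c^2 in MeV


def compton_scatter_angle(e_incident_mev, e_deposited_mev):
    """Return the first-scatter angle (radians) from Compton kinematics.

    e_incident_mev: gamma energy before the first interaction (assumed known)
    e_deposited_mev: energy deposited in the first interaction
    """
    e_scattered = e_incident_mev - e_deposited_mev
    if e_scattered <= 0:
        raise ValueError("deposited energy must be less than incident energy")
    cos_theta = 1.0 - ELECTRON_REST_ENERGY_MEV * (
        1.0 / e_scattered - 1.0 / e_incident_mev
    )
    if not -1.0 <= cos_theta <= 1.0:
        raise ValueError("energies are not consistent with one Compton scatter")
    return math.acos(cos_theta)


# Illustrative values only: a 4.44 MeV gamma depositing 1.0 MeV.
theta = compton_scatter_angle(4.44, 1.0)
print(f"scatter angle: {math.degrees(theta):.1f} degrees")
```

The angle defines the opening of a cone on which the gamma's origin must lie; an event whose energies fall outside the physically allowed range is rejected here with an exception, which is one simple design choice among several.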
CCs have been widely studied as a tool to image secondary prompt gammas (PG) emitted along the proton beam path, as one potential method for verifying the range of the proton beam within the patient during proton radiotherapy (RT) treatment delivery [6]. The use of CCs for proton beam range verification is of particular interest due to their ability to reconstruct full 3D images of PG emission. These 3D images could, in principle, be registered and overlaid onto the patient's CT dataset for visual (and analytical) comparison to the planned treatment dose.
The potential for 3D imaging of the in vivo path of the proton beam has led to a wide range of studies into the application of CCs to PG imaging [7,8]. While 3D image reconstruction of PG emission with a CC during proton beam delivery has been proven feasible [9,10], doing so at full clinical proton RT dose rates and under full clinical treatment conditions has thus far not been possible. Several studies of prototype CCs with high energy accelerator beams and clinical proton beams have shown rather poor performance for detecting the "true" double-scatter (DS; a single PG interacting twice in the CC) and "true" triple-scatter (TS; a single PG interacting three times in the CC) PG events needed for CC image reconstruction [9,11-13]. This poor performance is due to: 1) high detector dead time caused by the large signal environment encountered during proton RT, 2) interactions of secondary particles other than PGs [14], 3) "mis-ordered" DS and TS events whose individual interactions in the CC are read out and recorded in the wrong order, 4) the detection of "false" events, which are DS or TS events that are due to more than one PG interacting simultaneously in the CC [14-16], and 5) "double-to-triple" (D-to-T) events, which occur when a true DS and a single-scatter from a separate PG are recorded together as a TS event.
Several studies [16-18] have shown that mis-ordered, false, and D-to-T events do not contribute to the image signal and act only to increase noise and reduce the achievable contrast of the image. Methods to determine correct event ordering [4,18,19] based on classical Compton kinematics have been studied; however, no efficient method has been developed to identify the correct interaction order of DS or TS events in which the initial PG energy is not known (or assumed) a priori. Recently, studies have shown that Neural Networks (NN) can be employed to determine the interaction order of the individual PG event interactions [20], the event type (true/false) [10], or both simultaneously [21] with accuracies greater than 80%. These reports have shown that NN processing can improve the quality of CC images of PGs emitted during proton beam irradiation. Additionally, recent studies have shown how CC imaging can be improved by designing new data acquisition and readout electronics [17].
In this paper, we study the clinical viability of a prototype CC based PG imaging system to acquire 3D PG images during the delivery of proton RT pencil beams at clinical dose rates. The PG imaging system consists of the prototype PJ3 CC (H3D, Inc., Ann Arbor, MI), a Neural Network based data processing module, and back-projection based image reconstruction software. We study the number of true and false DS and TS events and D-to-T events recorded by the CC (as predicted by the NN) and determine the CC's PG detection efficiency (PGs/proton) during the delivery of a 150 MeV clinical proton pencil beam at the lowest, middle, and highest clinical dose rates. We then determine the ability of images reconstructed from the data collected at each dose rate to localize the delivered proton beam range. Finally, we assess the clinical viability of this system by determining the lowest number of PGs needed to produce an image that can localize the end of the proton beam range with a precision of ≤ 3 mm, and determine the minimum PG detection efficiency needed to reach this level of precision for a single proton pencil beam delivered during a standard fraction treatment.

Results
PG measurement and image reconstruction. Table 1 shows a breakdown of the PG events measured by the PJ3 CC during the irradiation of a high-density polyethylene phantom with 1.82 × 10^10 protons from a 150 MeV clinical proton beam, as described in the Methods section. As the proton beam dose rate increases, the raw detection rate (PG events/protons delivered) of DS and TS events recorded by the PJ3 decreases from 7.67 × 10^-4 events/proton at 20 kMU/min, to 2.7 × 10^-4 events/proton at 100 kMU/min, to 2.16 × 10^-4 events/proton at 180 kMU/min. When this raw data is processed by the NN, the detection rate of "usable events" for reconstruction (True DS + True TS + DS events recovered from D-to-T events), as identified by the NN, drops to 2.03 × 10^-4 events/proton, 5.48 × 10^-5 events/proton, and 4.28 × 10^-5 events/proton for the 20 kMU/min, 100 kMU/min, and 180 kMU/min dose rates, respectively. Figure 1 shows the PG image reconstructions (details given in the Methods section) from the raw CC data and the NN processed CC data collected during the 20 kMU/min proton beam delivery. These PG images were reconstructed using both DS and TS events with a calculated initial energy ranging from 0.6 MeV to 4.5 MeV. Each image in Figure 1 was reconstructed using 150,000 PG events, with the raw CC data image using all recorded data and the NN processed data image using only "usable" true DS and TS events. No image of the proton beam is visible in the image reconstructed with the raw CC data. However, the path of the proton beam can be identified and localized in the image reconstructed from the NN processed data.
This indicates that by removing the false DS and TS events, extracting the True DS events from the D-to-T events, and ordering the individual interactions within the DS and TS events to match the order predicted by the NN, the quality of the measured data is improved to the level that PG images of the proton beam path can be reconstructed using a relatively straightforward and fast backprojection algorithm (described in the Methods section).

As shown in Figure 2, the proton beam path can also be easily seen in PG images from NN processed data measured at the 100 kMU/min and 180 kMU/min dose rates (reconstructed with the same number and energy range of PGs as used for the images in Figure 1). From the 2D images of the PG emission, it can be seen that the background noise in the images increases with increasing dose rate. This is further illustrated by the more than 2× decrease in the contrast-to-noise ratio (CNR) values (defined in the Image Assessment section of Methods) of the 1D profiles. For the 20 kMU/min, 100 kMU/min, and 180 kMU/min dose rates, the CNR values are 151.53, 91.04, and 70.87, respectively, when using 150,000 PG events for image reconstruction.
To study the correlation between the end of the PG image range and the end of the proton Bragg peak, one-dimensional (1D) profiles, as shown in Figure 2c, were extracted from the NN processed data images for each dose rate and from the proton beam dose distribution (Figure 1b). For all three studied dose rates, the PG profile starts at a minimum at the left edge of the image and increases to 60% of its maximum value at the front edge of the phantom (0 cm depth). Additionally, each PG profile increases to a maximum value at ~7.55 cm depth and falls off to 60% of that value distal to the maximum (PG60) at the same depth (15.25 cm) at which the Bragg peak falls off to 60% distal to its maximum (BP60).
Assessing PG profile uncertainty.
To study uncertainty in the depth of the PG60, we performed reconstructions using event numbers ranging from the number that would be recorded during the delivery of all pencil beams contained in the deepest energy layer of a high dose, hypo-fractionated treatment field down to the number that would be recorded for a single pencil beam within the deepest energy layer of a standard fractionation treatment. For all pencil beams contained within the deepest energy layer of a hypo-fractionated treatment [9,12], we estimate the total number of protons delivered to be ~1 × 10^9. For a single pencil beam within this layer for hypo-fractionated or standard fractionation treatments, we estimate the number of protons [22] would be ~1 × 10^8 or 5 × 10^7, respectively. Using the calculated usable (NN processed) PG event detection rate for 180 kMU/min of 4.28 × 10^-5 events/proton, the calculated numbers of PG events that would have been recorded by the CC for the delivery of 5 × 10^7 protons up to 1 × 10^9 protons are listed in Table 2. Five image reconstructions were performed using the number of PG events (taken from independent datasets as described in the Methods section) for each delivered proton number listed in Table 2. When the number of PG events used for reconstruction is reduced to 42,800 (1 × 10^9 protons), the CNR decreases only slightly from the CNR (70.87) of the image in Fig. 2. However, as the number of PG events decreases further, the CNR decreases sharply. In fact, the CNR for PG images from a single pencil beam delivering 5 × 10^7 protons (2,140 events) is 3.5× lower than that for the delivery of 1 × 10^9 protons.
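The event numbers used for these reconstructions follow from multiplying the number of delivered protons by the usable detection rate. The short sketch below reproduces that arithmetic for the 180 kMU/min rate quoted above; the function and constant names are illustrative only.

```python
# Usable (NN processed) detection rate at 180 kMU/min, as quoted in the text.
USABLE_EVENTS_PER_PROTON = 4.28e-5


def expected_pg_events(protons_delivered, events_per_proton=USABLE_EVENTS_PER_PROTON):
    """Number of usable PG events expected for a given number of protons."""
    return round(protons_delivered * events_per_proton)


for protons in (1e9, 1e8, 5e7):
    events = expected_pg_events(protons)
    print(f"{protons:.1e} protons -> {events:,} usable PG events")
```

With this rate, the rounded products match the 42,800, 4,280, and 2,140 events quoted for 1 × 10^9, 1 × 10^8, and 5 × 10^7 protons.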
1D PG profiles were extracted from each image and plotted along with the "average" PG profile. Figure 3 shows the five individual profiles and the average profile for the images reconstructed for 42,800 PG events recorded during the delivery of 1 × 10^9 protons. Noticeable differences are visible in the individual profiles, especially in the region beyond the Bragg peak. However, all profiles converge to the same value at the depth of the PG60, which corresponds to the BP60 depth. As can be seen in Figure 3b, the PG60 (60% of maximum value) only intersects with the uncertainty band for one depth (0.2 cm pixel width) in the image. The (±2σ) uncertainty band (vertical gray bars in Fig. 3c) represents the range of PG values at each depth that would encompass 95% of the PG profiles one would obtain from reconstructions made with a given PG event number. Therefore, we interpret this to mean that the uncertainty in the depth at which the PG60 would occur (ΔPG60) is 0.2 cm (or one pixel) for images reconstructed with 42,800 PG events. Alternatively, we could say the precision of the PG60 for determining the BP60, and thus the BP range, is 0.2 cm for reconstructions made with 42,800 PG events (~1 × 10^9 protons delivered).

[Table 2 column headings: Protons, PG events, CNR, PG60 (cm), ΔPG60 (cm)]

Figure 3c-e shows the five individual and average 1D profiles from the images reconstructed for PGs measured during the delivery of 5 × 10^8, 2.5 × 10^8, 1 × 10^8, and 5 × 10^7 protons, or 21,250, 10,625, 4,280, and 2,140 PG events, respectively. As the number of measured PGs decreases for the delivery of fewer protons, the noise in the individual profiles increases. This results in an increasing size of the uncertainty band about the average profile. In fact, as the number of delivered protons (and thus the number of measured PG events) decreases from 1 × 10^9 to 5 × 10^7, ΔPG60 (represented by the width of the blue brackets), as shown in Figure 4, increases from 0.2 cm to 6.4 cm. This means that if PG60 is used as a method of determining in vivo proton beam range, the uncertainty in the determined range would be greater than 6 cm for a single pencil beam delivered during a standard fraction treatment with the current CC prototype.

Discussion
Assessment of clinical viability.
The number of PG events recorded by the CC is highly dependent on the proton beam dose rate, with the measured PG events dropping by 3.5× when the dose rate is increased from the lowest to the highest clinical value. In addition, the measured data is dominated by false DS and TS events, which only produce noise in the image [14,17]. This can be seen in the 3× to 5× reduction in the number of events in the measured data file after NN processing to remove the false events and recover the DS events from D-to-T events. We were not able to produce adequate images of PG emission from the raw data measured with our prototype CC, even at the lowest clinical dose rate. However, use of the NN to process the data and remove false PG events allowed an image to be reconstructed from which the path of the proton beam could be easily identified. PG images of the proton pencil beam could even be reconstructed for measurements made at the highest clinical dose rate (for the 150 MeV proton beam), which has so far not been possible with our prototype CC and, to our knowledge, has not been reported in the literature for other prototype CC designs. These results underscore the importance of measured data quality, show the large impact that false PG events can have on CC images, and provide further evidence of the power of NN processing for improving CC imaging within the proton RT environment.
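The two ratios quoted in this paragraph can be checked directly from the detection rates reported in the Results. The sketch below does so; the dictionary and function names are illustrative only.

```python
# Detection rates (events/proton) quoted in the Results, keyed by dose rate in kMU/min.
RAW_RATE = {20: 7.67e-4, 100: 2.7e-4, 180: 2.16e-4}
USABLE_RATE = {20: 2.03e-4, 100: 5.48e-5, 180: 4.28e-5}


def raw_rate_drop(low=20, high=180):
    """Factor by which the raw detection rate falls between two dose rates."""
    return RAW_RATE[low] / RAW_RATE[high]


def nn_reduction_factors():
    """Raw-to-usable event ratio at each dose rate after NN processing."""
    return {rate: RAW_RATE[rate] / USABLE_RATE[rate] for rate in RAW_RATE}


print(f"raw rate drop, 20 -> 180 kMU/min: {raw_rate_drop():.2f}x")
for rate, factor in nn_reduction_factors().items():
    print(f"{rate} kMU/min: raw/usable = {factor:.2f}x")
```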
We saw that the PG images were able to highlight the path of the proton beam through the phantom, although the image did not exactly reproduce the proton depth dose profile. However, PG images are not expected to exactly reproduce the proton BP, since the proton-nuclear interactions that produce the PGs are fundamentally different from the mostly electromagnetic proton interactions that lead to dose deposition. By comparing the 1D depth profiles from the PG images to the depth dose profiles of the proton BP, we do notice that the two profiles intersect at the depth distal to the maximum at which their intensity falls to 60% of the maximum value. This correlation between the BP60 and PG60 was found to occur across all dose rates studied, making it a possible marker for determining the proton beam range.

To assess the clinical viability of the prototype CC imaging system, we produced images using the number of PG events that we expect to measure (based on the determined PG detection rate of the CC) for the delivery of proton pencil beams during clinical proton radiotherapy. We saw that as the number of protons delivered decreased, the variability of the position of the PG60 increased, which greatly reduces the precision (increases the uncertainty) of any determination of in vivo proton beam range based on this value.
To make PG based proton beam range verification a useful clinical tool, we ideally need to be able to detect shifts in the beam range of 3 mm or less. Based on this study, that would mean producing PG images with ΔPG60 < 3 mm, which, based on Figure 4, would require our current PJ3 CC to record at least 25,000 "usable" PG events (after NN processing). Therefore, for a single pencil beam delivered for a standard fractionation treatment (~5 × 10^7 protons), this would equate to a detection rate of 5 × 10^-4 usable (NN processed) PG events/proton. If detection of shifts of 2 mm is desired, then a minimum detection rate of up to 1 × 10^-3 usable PG events/proton would be required. For our current CC, this would require an increase in the current detection rate of up to a factor of 25×. Additionally, if experimental ultra-high dose rate (FLASH) radiotherapy techniques are translated into routine practice, it is possible that CCs with even higher detection rates would be needed.
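The required detection rates above follow from dividing the number of usable events needed by the number of protons in a single pencil beam, and the required improvement follows from comparing that rate with the current usable rate at 180 kMU/min (4.28 × 10^-5 events/proton). The sketch below works through this arithmetic; the function names are illustrative only.

```python
CURRENT_USABLE_RATE = 4.28e-5   # usable events/proton at 180 kMU/min (from the Results)
PROTONS_PER_PENCIL_BEAM = 5e7   # single pencil beam, standard fractionation


def required_detection_rate(events_needed, protons_delivered):
    """Usable events/proton needed to collect events_needed from one delivery."""
    return events_needed / protons_delivered


def improvement_factor(required_rate, current_rate=CURRENT_USABLE_RATE):
    """How many times higher the detection rate must be than the current one."""
    return required_rate / current_rate


rate_3mm = required_detection_rate(25_000, PROTONS_PER_PENCIL_BEAM)
print(f"3 mm precision: {rate_3mm:.1e} events/proton, "
      f"{improvement_factor(rate_3mm):.1f}x the current rate")
print(f"2 mm precision: 1.0e-03 events/proton, "
      f"{improvement_factor(1e-3):.1f}x the current rate")
```

Under these numbers, the 3 mm target corresponds to roughly a 12× increase and the 2 mm target to roughly a 23× increase, which is consistent with the "up to 25×" figure given in the text.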
These results illustrate the difficulties and limitations that have been encountered in the development of CCs for PG imaging during proton radiotherapy. However, the CC prototypes that have thus far been tested have, for the most part, been built with data acquisition and readout electronics that were designed for measuring low energy (< 1 MeV) gamma rays in low signal intensity and low background environments. It is therefore not surprising that they have found limited success for PG imaging in the high intensity, high background environments encountered with proton RT. Recent studies [17] have shown that detection rates during clinical proton beam delivery can be improved by a factor of ~20× and the percentage of false DS and TS events in the data can be reduced by ~4× by redesigning the CC data acquisition and readout electronics for the high PG signal and background environments. This means that more PG events are recorded, and that a higher percentage of those events will be "usable" for image reconstruction. Additionally, this study and other recently published studies [10,21] have shown that the use of NNs to identify false PG events, recover DS events from D-to-T events, and determine the correct order of the measured DS and TS events can greatly improve the viability of CC imaging. Thus, we believe that further development of NNs and other artificial intelligence based methods could help to push CC based PG imaging toward clinical viability by increasing the quality of the PG data and reducing the number of PGs needed to produce the high quality images required for proton range verification.
In conclusion, this study provides an initial evaluation of a prototype CC, showing what PG detection rates would be needed by this imaging system to make it useful for clinical proton beam range verification. We believe that with further development of NN data processing techniques, improved CC hardware and electronics, and improved image reconstruction methods, CC based PG imaging could become a clinically viable method for in vivo proton range verification.

Methods
The experimental workflow for this study is shown in Figure 5. The measured CC data was either 1) passed directly (raw data) to the reconstruction software, or 2) passed to the NN for event type and interaction order processing (NN processed data) prior to image reconstruction. Comparisons of the raw data images and NN processed data images were then performed.
Experimental Measurements. For this study, PG data was measured using the prototype PJ3 CC (H3D, Inc., Ann Arbor, MI) during the delivery of a 150 MeV proton pencil beam to a 15 cm × 30 cm × 35 cm high-density polyethylene (HDPE; ρ = 0.97 g/cm^3) target. The data was measured for dose rates of 20,000 Monitor Units/min (20 kMU/min; 1.22 × 10^9 protons/s, minimum clinical dose rate at 150 MeV), 100 kMU/min (6.08 × 10^9 protons/s), and 180 kMU/min (1.09 × 10^10 protons/s; maximum clinical dose rate at 150 MeV), which covers the full range of clinical dose rates for a 150 MeV proton beam delivered from the Varian ProBeam treatment delivery system (Varian Medical Systems, Palo Alto, CA) located at the Maryland Proton Treatment Center (MPTC) in Baltimore, MD. For all irradiations, 5 kMU were delivered, equating to 1.82 × 10^10 protons and delivery times of 15 seconds, 3 seconds, and 1.67 seconds at dose rates of 20 kMU/min, 100 kMU/min, and 180 kMU/min, respectively. As shown in Figure 5, the PJ3 CC (design details in Polf et al. (2021) [17]) was mounted beneath the patient positioning couch, with the HDPE phantom placed on the couch directly above the PJ3. The beam was delivered to the center of the HDPE phantom, located 15 cm above the top of the couch, corresponding to 35 cm from the top of the PJ3. The patient couch was positioned so that the beam path was aligned with the center of the PJ3, and so that the treatment isocenter was located at a depth of 15 cm in the phantom.
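The delivery times and total proton number quoted above can be cross-checked from the dose rates and proton rates. The following sketch performs that check; the names are illustrative, and the small differences between dose rates reflect rounding of the quoted proton rates.

```python
# Values quoted in the text: dose rate (kMU/min) -> proton rate (protons/s).
PROTON_RATE_PER_S = {20: 1.22e9, 100: 6.08e9, 180: 1.09e10}
DELIVERED_KMU = 5.0


def delivery_time_s(delivered_kmu, dose_rate_kmu_per_min):
    """Beam-on time in seconds for a given delivered dose and dose rate."""
    return delivered_kmu * 60.0 / dose_rate_kmu_per_min


def protons_delivered(dose_rate_kmu_per_min, delivered_kmu=DELIVERED_KMU):
    """Total protons = proton rate x beam-on time."""
    time_s = delivery_time_s(delivered_kmu, dose_rate_kmu_per_min)
    return PROTON_RATE_PER_S[dose_rate_kmu_per_min] * time_s


for rate in (20, 100, 180):
    print(f"{rate} kMU/min: {delivery_time_s(DELIVERED_KMU, rate):.2f} s, "
          f"{protons_delivered(rate):.2e} protons")
```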
Neural Network Data Processing. A fully connected NN was constructed with Keras using TensorFlow 2.1.0 [23]. The NN was trained and validated using MCDE [15] simulated datasets, which contained 2.1 × 10^6 PG events consisting of True DS and TS (correctly ordered and mis-ordered) events, false DS and TS events, and D-to-T events. The measured CC data was passed to the trained NN, which then predicted the type (true/false/D-to-T) of each DS and TS event. False DS and TS events were removed from the data. Next, the random single scatter contained within each D-to-T event, identified by the NN, was removed to leave the true DS. Finally, the NN predicted the correct order of each interaction within the true DS and TS events. The NN identified true DS and TS events (including the D-to-T events converted to DS events) were written to the final "NN processed" dataset, with the interactions for each event written in the order predicted by the NN.

Figure 5: Schematic of (a) clinical proton beam irradiations for CC measurement of PG emission, (b) NN processing of the raw CC data to identify event ordering for true events, false DS/TS and D-to-T events, followed by (c) KWBP reconstruction of the raw CC data and the NN processed data. In the HDPE phantom in (a), the blue arrow represents the proton beam path and the blue circle represents the isocenter of the treatment delivery system.
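The post-prediction steps described in this section (discard false events, strip the random single scatter from D-to-T events, and reorder the remaining interactions) can be sketched as follows. This is a minimal illustration that takes the NN outputs as given; the event layout, the prediction dictionary keys, and the function name are all hypothetical and do not represent the actual data format or code used in this study.

```python
def build_nn_processed_dataset(events, predictions):
    """Apply NN predictions to raw events.

    events: list of events, each a list of interactions (energy_mev, x, y, z)
    predictions: one dict per event with hypothetical keys
        'type'         -> 'true', 'false', or 'd_to_t'
        'random_index' -> index of the unrelated single scatter (d_to_t only)
        'order'        -> predicted order, as indices into the kept interactions
    """
    processed = []
    for interactions, pred in zip(events, predictions):
        if pred["type"] == "false":
            continue  # false DS/TS events are removed entirely
        kept = list(interactions)
        if pred["type"] == "d_to_t":
            del kept[pred["random_index"]]  # leaves the true double scatter
        processed.append([kept[i] for i in pred["order"]])
    return processed


# Illustrative events with made-up energies and positions.
events = [
    [(1.0, 0, 0, 0), (2.0, 1, 0, 0)],                   # DS, recorded mis-ordered
    [(0.5, 0, 0, 0), (0.7, 1, 1, 0)],                   # false DS
    [(1.2, 0, 0, 0), (0.3, 5, 5, 0), (2.1, 1, 0, 1)],   # D-to-T
]
predictions = [
    {"type": "true", "order": [1, 0]},
    {"type": "false"},
    {"type": "d_to_t", "random_index": 1, "order": [0, 1]},
]
print(build_nn_processed_dataset(events, predictions))
```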
Image Reconstruction. Image reconstruction of the PG data was performed using the Kernel Weighted Backprojection (KWBP) algorithm described by Panthi et al. [14]. For this study, a full 3D image was reconstructed with KWBP using a 50 cm × 6 cm × 50 cm imaging space. This was processed into 20 separate two-dimensional slices (3 mm thick) in the XZ-plane, with each slice image having 256 × 256 pixels. The KWBP reconstructions were performed using an NVIDIA P4000 GPU, with reconstruction times ranging from ~10 to 250 seconds depending on the number of events used for the reconstruction.
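The slice thickness and in-plane pixel size follow from the imaging space dimensions and the number of bins along each axis. The brief sketch below shows that arithmetic; the function name is illustrative only.

```python
def bin_size_mm(extent_cm, n_bins):
    """Size of one bin (mm) when extent_cm is divided into n_bins equal bins."""
    return extent_cm * 10.0 / n_bins


slice_thickness = bin_size_mm(6.0, 20)    # 6 cm split into 20 slices
pixel_size = bin_size_mm(50.0, 256)       # 50 cm split into 256 pixels
print(f"slice thickness: {slice_thickness:.2f} mm")
print(f"in-plane pixel size: {pixel_size:.2f} mm")
```

The in-plane value is slightly under 2 mm, which corresponds to the nominal 2 mm pixel size used for the profile analysis.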
Image assessment and range estimation. A 1D profile along the beam central axis (z = 0 cm), representing the integral of three rows of pixels centered on z = 0 cm, was extracted from the 2D slice images. The profiles were normalized to their maximum values, and the depth of the maximum value (Dmax) and the distal depth (beyond the maximum) at which the profiles fall off to 60% of the maximum value (PG60) were determined. The resolution of the profiles is limited to the 2D image pixel size (2 mm), and the PG60 values were determined by linear interpolation between the center positions of the pixels before and after the PG profile falls below 60% of its peak value.
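The PG60 determination described above can be sketched as a search distal to the profile maximum followed by linear interpolation between the two neighboring pixel centers. The following is a minimal illustration under that reading; the function name and the sample profile are hypothetical.

```python
def distal_falloff_depth(depths_cm, profile, fraction=0.6):
    """Depth beyond the maximum at which the profile falls to fraction * max.

    Linearly interpolates between the pixel centers on either side of the
    crossing. Returns None if the profile never drops below the threshold.
    """
    peak = max(profile)
    i_peak = profile.index(peak)
    threshold = fraction * peak
    for i in range(i_peak + 1, len(profile)):
        if profile[i] < threshold:
            d0, d1 = depths_cm[i - 1], depths_cm[i]
            p0, p1 = profile[i - 1], profile[i]
            # p0 >= threshold > p1, so the denominator is strictly positive.
            return d0 + (p0 - threshold) * (d1 - d0) / (p0 - p1)
    return None


# Made-up profile sampled every 0.2 cm.
depths = [0.0, 0.2, 0.4, 0.6, 0.8]
values = [0.5, 1.0, 0.8, 0.4, 0.1]
print(distal_falloff_depth(depths, values))
```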
Improvements to the PG images due to NN processing were quantified using the contrast-to-noise ratio (CNR) values of the images. This is defined as $\mathrm{CNR} = |S_{\mathrm{peak}} - S_{\mathrm{distal}}| / \sigma_{\mathrm{distal}}$, where $S_{\mathrm{peak}}$ is the average image "signal" in the peak intensity region of the individual profiles, ranging from depths of 2.6 cm to 8.5 cm, and $S_{\mathrm{distal}}$ is the average image "noise" in the individual reconstruction profiles over depths of 22 cm to 28 cm, which are well beyond the depth of the proton BP. Finally, $\sigma_{\mathrm{distal}}$ is the standard deviation of the image noise values.
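A minimal sketch of this CNR calculation is given below. It assumes the sample standard deviation for the distal noise (the text does not state whether the sample or population form was used), and the function name and test profile are hypothetical.

```python
import statistics


def contrast_to_noise(depths_cm, profile,
                      peak_window=(2.6, 8.5), distal_window=(22.0, 28.0)):
    """CNR = |S_peak - S_distal| / sigma_distal for a 1D depth profile.

    Assumption: sigma_distal is the sample standard deviation of the
    profile values inside the distal window.
    """
    peak_vals = [v for d, v in zip(depths_cm, profile)
                 if peak_window[0] <= d <= peak_window[1]]
    distal_vals = [v for d, v in zip(depths_cm, profile)
                   if distal_window[0] <= d <= distal_window[1]]
    s_peak = statistics.mean(peak_vals)
    s_distal = statistics.mean(distal_vals)
    sigma_distal = statistics.stdev(distal_vals)
    return abs(s_peak - s_distal) / sigma_distal


# Made-up profile with custom windows, chosen so the result is easy to verify.
depths = [0, 1, 2, 3, 4, 5]
values = [10, 10, 0, 1, 3, 2]
print(contrast_to_noise(depths, values, peak_window=(0, 1), distal_window=(3, 5)))
```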
To study the uncertainty in the position of the PG60 as a function of the number of protons delivered, as shown in Figure 6, we created five independent PG datasets, each containing 50,000 PG events (an event is only included in one data file) randomly selected from the NN processed PG data file measured at 180 kMU/min. We then produced five separate images for each number of delivered protons using these datasets, using only the corresponding number of detected PG events (Table 2). 1D profiles were extracted from each PG image (using the process described above) and an "average" PG profile (red line in Fig. 6) was created as the average value of the five individual profiles as a function of depth.

Figure 6: A schematic of the PG60 uncertainty analysis. Five independent datasets containing 50,000 PG events were created from the NN processed dataset measured at 180 kMU/min. Then five independent PG images were created using the number (N) of PG events recorded for the desired number of protons delivered. 1D PG profiles were extracted from each image (black lines) and the "average" of the five profiles was created (red line). The gray band represents the 2σ uncertainty in the average PG value.

The uncertainty in the value of the PG profiles at all depths was then defined as twice the standard deviation (2σ) of the PG values at each depth for the five individual PG profiles. Figure 6 shows the 2σ "uncertainty band" (gray band) around the average PG profile. The process shown in Figure 6 was repeated for all delivered proton numbers listed in Table 2.
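The uncertainty band construction can be sketched as a per-depth mean and a ±2σ envelope across the five profiles. The second function below shows one possible way of turning that band into a depth uncertainty for the PG60, by measuring the span of distal depths at which the 60% level lies inside the band. That second step is an assumption about how the band is read, and all names and sample values are hypothetical.

```python
import statistics


def uncertainty_band(profiles):
    """Return (mean, lower, upper) lists: per-depth mean and mean -/+ 2 sigma."""
    mean, lower, upper = [], [], []
    for values in zip(*profiles):
        m = statistics.mean(values)
        two_sigma = 2.0 * statistics.stdev(values)
        mean.append(m)
        lower.append(m - two_sigma)
        upper.append(m + two_sigma)
    return mean, lower, upper


def pg60_depth_uncertainty(depths_cm, profiles, level=0.6):
    """Assumed reading: span of distal depths where the level crosses the band."""
    mean, lower, upper = uncertainty_band(profiles)
    i_peak = mean.index(max(mean))
    threshold = level * mean[i_peak]
    hits = [d for d, lo, hi in zip(depths_cm[i_peak + 1:],
                                   lower[i_peak + 1:], upper[i_peak + 1:])
            if lo <= threshold <= hi]
    if not hits:
        return 0.0
    pixel_width = depths_cm[1] - depths_cm[0]
    return (max(hits) - min(hits)) + pixel_width


# Five made-up profiles that differ only at the 0.6 cm depth.
depths = [0.0, 0.2, 0.4, 0.6, 0.8]
profiles = [[0.2, 1.0, 0.9, x, 0.1] for x in (0.55, 0.6, 0.65, 0.6, 0.6)]
print(pg60_depth_uncertainty(depths, profiles))
```

In this toy case the 60% level intersects the band at a single pixel, so the returned width equals one pixel spacing.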