The GISMO 2-millimeter Deep Field in GOODS-N

We present deep continuum observations using the GISMO camera at a wavelength of 2 mm centered on the Hubble Deep Field (HDF) in the GOODS-N field. These are the first deep field observations ever obtained at this wavelength. The 1 sigma sensitivity in the innermost approx. 4 arcminutes of the 7 utes map is approx. 135 uJy/beam, a factor of three higher in flux/beam sensitivity than the deepest available SCUBA 850 um observations, and almost a factor of four higher in flux / beam sensitivity than the combined MAMBO/AzTEC 1.2 mm observations of this region. Our source extraction algorithm identifies 12 sources directly, and another 3 through correlation with known sources at 1.2 mm and 850 um. Five of the directly detected GISMO sources have counterparts in the MAMBO/AzTEC catalog, and four of those also have SCUBA counterparts. HDF850.1, one of the first blank-field detected submillimeter galaxies, is now detected at 2 mm. The median redshift of all sources with counterparts of known redshifts is med(z) = 2.91 +/- 0.94. Statistically, the detections are most likely real for 5 of the seven 2 mm sources without shorter wavelength counterparts, while the probability for none of them being real is negligible.


INTRODUCTION
Submillimeter and millimeter observations have revealed the existence of a population of previously unknown high-redshift dust-enshrouded starburst galaxies. Virtually all their stellar UV and optical radiation is absorbed and reradiated by the dust at infrared (IR) wavelengths. They are among the most luminous galaxies in the universe, and their relative contribution to the galaxy number counts and co-moving luminosity density increases with redshift (e.g. Sargent et al. (2012)).
The most massive galaxies are predicted to be at the center of galaxy clusters that reside in the most massive dark matter halos. Surveys that map their distribution with redshift will therefore reveal the epochs of cluster formation in the early universe. For example, follow up observations of two submillimeter galaxies at optical and near-IR wavelengths have shown that they are members of protoclusters that formed at z ≈ 5 (AzTEC-3: Capak et al. (2011), HDF850.1: Walter et al. (2012)). A survey of dusty starbursts is also essential for determining the obscured cosmic star formation rate at high redshift, and for understanding the formation and evolution of dust in these objects.
The advantage of using (sub)millimeter wavelength observations to search for these objects stems from the fact that starburst galaxies have typical dust temperatures of 35 K, so that their IR spectrum peaks at ∼ 90 µm. Submm-mm observations therefore trace the Rayleigh-Jeans part of their spectrum, and benefit from the fact that the decrease in flux from high redshift objects is largely offset by the negative K-correction. Figure 1 depicts the flux of a typical dusty submillimeter galaxy versus redshift (solid lines). This starburst galaxy is characterized by an IR luminosity of 10 12 L ⊙ and a dust mass of 10 8 M ⊙ . The dust was assumed to have a κ(λ) ∝ λ −β mass absorption coefficient with a spectral index of β = 1.5, and a temperature of 35 K. An interesting effect at high redshifts is the fact that dust heating by the cosmic microwave background (CMB) becomes comparable to the heating by ambient starlight. When accounting for both sources of heating, the actual dust temperature can be expressed as T d = (T 4+β 0 + T 4+β CMB ) 1/(4+β) , where T CMB = 2.73(1 + z) is the CMB temperature at redshift z, and T 0 is the dust -Redshift-dependent flux density, measured against the CMB, of a galaxy with fixed dust mass M d = 10 8 M ⊙ , dust emissivity index of β=1.5, i.e. with a FIR luminosity of L ∼ 10 12 L ⊙ , shown for a variety of wavelengths, based on the multi-temperature empirical dust models of Kovács et al. 2010). To radiate the same luminosity against an increasingly warmer CMB in earlier epochs, the cold dust temperature (T 0 =35 K) must rise as T (the effective dust emissivity, β eff , is defined in Kovács et al. 2010). For comparison, the dashed lines show the same if the CMB heating is ignored. Note, that the observed flux density at 2 mm wavelengths increases monotonically and steeply as a function of redshift for z >1. temperature when heated by starlight alone. GISMO (the the Goddard IRAM 2 Millimeter Observer) observations are a differential measurement of the flux from a galaxy against the CMB. The observed galaxy spectrum in such measurement is thus given by: (1) which cannot be characterized by a single blackbody with a simple λ −β emissivity law. The dotted lines in the figure show the fluxes that would be measured if the CMB radiation were not present.
The figure shows that the 2 mm fluxes tend to be lower than those at the shorter wavelengths. However, the rising 2 mm flux with redshift provides the strongest bias towards the detection of high redshift galaxies. Furthermore, the atmospheric transmission is higher, and the atmospheric background noise is lower at 2 mm than at shorter wavelengths.
We have developed the GISMO instrument that utilizes a near background-limited detector to fully exploit the advantages of the 2 millimeter window. Here we report the first deep survey conducted with GISMO centered on the Hubble Deep Field North (HDF-N). The Hubble Deep Field North (HDF-N) is one of the best studied regions in the sky. Its sky coverage is one HST WFPC2 pointing, i.e. ∼ 2.5 ′ × 2.5 ′ , which is less than the 2 ′ × 4 ′ instantaneous field of view of the GISMO array. The HDF is located in the greater GOODS-N region, which has also been studied in exquisite detail over the last decade at many wavelengths, X-rays: Brandt (2008), UV: Teplitz et al. (2006), Optical: Giavalisco et al. (2004), optical spectroscopy: Cowie et al. (2004), Near-Infrared: Yan et al. (2006), Mid-Infrared: Rodighiero et al. (2006), Far-Infrared: Borys et al. (2003), Frayer et al. (2006), Perera et al. (2008), Radio: Morrison et al. (2010). In the (sub-)millimeter regime, the HDF and GOODS-N have been studied by the (sub-)millimeter cameras SCUBA, MAMBO, and AzTEC in the past (Hughes et al. (1998), Pope et al. (2005); Greve et al. (2008), Penner et al. (2011). The currently available data of the full HDF at 1 mm reach 1 − σ sensitivities of 0.5 mJy/beam (MAMBO observations combined with AzTEC observations, (Penner et al. 2011), the SCUBA "super"-map (Pope et al. 2005) reaches a peak depth of 0.4 mJy beam −1 at 850 µm, however the sensitivity varies significantly over the observed area in the field.
The paper is organized as follows: We first describe the instrument and its characteristics in §2. In §3 we describe observations and the data reduction. In §4 we describe the source extraction and its results, and present simulations used to characterize the data and to evaluate the completeness and reliability of the extracted sources. §5 presents the 2 mm number counts and the analysis of the properties of select individual sources.

THE GISMO 2 MM CAMERA
Continuum observations in the 2 mm atmospheric window have not been astronomically explored from the ground to the same degree as has been done at shorter wavelengths (1 mm or less), except for Sunyaev-Zel'dovich (S-Z) observations with dedicated 6 to 10 m class telescopes (Swetz et al. (2011), Carlstrom et al. (2011), Dobbs et al. (2006). The reason for this is predominantly of technical nature, in particular the very demanding requirements on the noise performance of a background-limited camera operating in this low opacity atmospheric window. In order to provide backgroundlimited observations in the 2 mm window at a good mountain site such as the the IRAM 30 m telescope on Pico Veleta, the required sensitivity, expressed in Noise Equivalent Power (NEP), for the detectors is ∼ 5×10 −17 W √ s (Staguhn et al. 2006), a requirement that is met by our "high" temperature (T c = 450 mK) Transition Edge Sensor (TES) detectors. Consequently we have proposed and built a 2 millimeter wavelength bolometer camera, the Goddard-IRAM Superconducting 2 Millimeter Observer (GISMO, Staguhn et al. (2008)) for astronomical observations at the IRAM 30 m telescope on Pico Veleta, Spain (Baars et al. 1987). GISMO uses a compact optical design ) and uses an 8×16 array of close-packed, high sensitivity TES bolometers with a pixel size of 2 × 2 mm 2 , which was built in the Detector Development Laboratory at NASA/GSFC . The array architecture is based on the Backshort Under Grid (BUG) design ). GISMO's bandpass is centered on 150 GHz and has a fractional bandwidth of 20%. The superconducting bolometers are read out by SQUID time domain multiplexers from NIST/Boulder (Irwin et al. 2002). This design is scalable to kilopixel size arrays for future ground-based, suborbital and spacebased X-ray and far-infrared through millimeter cameras (e.g. Staguhn et al. (2012)).

OBSERVATIONS & REDUCTION
The GISMO Deep Field (GDF) observations of the HDF-N were obtained between April 13 and 18, 2011, and on April 11, 12 and 23, 2012. The total integration time was t ∼ 39 hours, however 2/3 of those observations were obtained with GISMO's lower sensitivity during the Spring 2011 run (see section 3.4). The FWHM of GISMO's beam is 17.5".

Data Reduction
The data were reduced, using CRUSH 15 (Kovács 2008), which is the standard reduction software for the GISMO camera. CRUSH is open-source and available in the public domain. The data reduction tool of CRUSH consists of a highly configurable pipeline, which uses a series of statistical estimators in an iterated scheme to separate the astronomical signals from the bright and variable atmospheric background and various correlated instrumental noise signals. It determines proper noise weights for each sample in the time series, removes glitches, identifies bad pixels and other unusable data, and determines the relevant relative gains. It also applies appropriate filters for 1/f -type noise, and other nonwhite detector noise profiles. For the detection of point sources, the resulting "deep"-mode maps are spatially filtered above 50 ′′ FWHM to remove spatially-variant atmospheric residuals. The fluxes in each 10-minute scan are corrected for the line-of-sight atmospheric opacities, based on the IRAM radiometer measurements. Pointsource fluxes are also corrected scan-wise for the fluxfiltering effect of each and every pipeline step, and for the large-scale structure filtering of the final map. As a result, comparison to point-like calibrator sources (e.g. planets and quasars) is straightforward, even if different reduction options are used for these and the science targets.

Calibration
Mars, Uranus, and Neptune were observed for primary flux calibration. Of those, measurements of Mars cover the widest range of weather conditions. Using the atmospheric transmission model of the Caltech Submillimeter Observatory 16 and the IRAM 30 m Telescope 225 GHz radiometer readings, we obtain excellent calibration in effectively all weather conditions: a 7% rms blind calibration up to τ 225GHz ∼ 1 is obtained. Note that any model uncertainties due to the different elevation of Mauna Kea, the site of the CSO, and Pico Veleta, will be very small and therefore irrelevant for the accuracy of derived calibration factors. Figure 2 shows the histogram of the 2 mm line-of-sight opacities for all data.

Pointing & Astrometric Accuracy
During the GISMO observing runs in 2011 and 2012, we obtained a large number of pointing measurements over the entire sky, from which we derived appropriate pointing models according to Greve et al. (1996)  respectively). Additionally, we frequently checked pointing on nearby quasars during GDF observations. Triggered by a reduction flag, CRUSH will automatically incorporate the measured differential offsets with respect to the pointing model, to further improve pointing accuracy, and to remove most systematic pointing errors in the pointing model, or due to structural deformations of the telescope. The resulting residual pointing errors are expected to be independent and random between independent pointing sessions. Thus, a representative lower bound to the final astrometric accuracy is given by the instantaneous pointing rms (<4.2") divided by the square root of the number of independent pointing sessions spanning the observations. In our case, approximately 30 independent pointing sessions bracket the GDF observations. Therefore, the astrometric accuracy of our map (notwithstanding the inherent positional uncertainties of any detections) could be as low as 0.8" rms, or somewhat higher in the presence of systematics errors, which are not eliminated by the use of nearby pointing measurements.

Instrument Performance
The noise equivalent flux density (NEFD) of measurements during the 2011 run was typically 15-17 mJy √ s, under most weather conditions. The obtained sensitivity at that time was mainly limited by a neutral density filter with 40% transmission. This filter was needed, since there was a significant amount of THz light scattered into the GISMO beam by the low pass filters, which were positioned very close to the entrance window of the dewar. In early 2012 we mounted a 77 K baffle dewar in front of the GISMO optical entrance window, which reduces the stray light significantly and eliminates the need for a neutral density filter in the instrument (Sharp et al. 2012). As a result of this, the NEFD obtained during the 2012 observing run was typically 10 mJy √ s. To estimate the noise we randomly multiply each of the individual 10 minute scans by +1 or -1, a method known as "jackknifing". This eliminates any stationary noise (including sources, and foreground) but retains random noise, including that from the atmosphere. The histogram of the S/N for the jackknifed, beam-smoothed, and filtered GISMO map is shown in Figure 3. The distribution is well fit by a Gaussian with σ = 1.00. The S/N histogram for the regular (not jackknifed) smoothed and filtered map (also Fig. 3) shows distinct excess deviations from the Gaussian distribution on both extremes of the distribution. When we subtract the 12 detected sources (see section 4.3) from the image, the S/N histogram does approach the expected noise distribution as shown in Figure 4. The subtraction of sources (both real and false) causes this histogram to be truncated at the 2.99σ detection limit. Below this limit both the positive and negative sides of the histogram are more closely Gaussian because the effective smoothed and filtered PSF used for the subtraction has both positive and negative features (main beam and surrounding filter bowl). There is, however, a slight (∼ 5%) symmetric excess remaining in the postextraction residuals, relative to the jackknifed map. This excess would be consistent with the presence of confusion noise in our map. We cannot, however, quantify the corresponding level of 2-mm confusion noise with any precision because the the noise estimation errors are relatively large given the low number of beams in the field.

Extraction -Method and reliability
The Gaussian nature of noise in our map (section 3.5) is a notable feature of the GISMO data and allows us to provide the strongest possible statistical characterization of our source candidates relying exclusively on formal Gaussian statistics. Because we do not detect any deviation from Gaussian noise, nor see any signs of nonradiometric down-integration in the jackknifed maps, we need not worry about potential troublesome statistical biases that could otherwise result from non-Gaussian, or correlated, noise features.
We extract sources from the beam-smoothed filtered map, which were produced by CRUSH. Beamsmoothing is mathematically equivalent to maximumlikelihood PSF amplitude fitting at every map position (Kovács et al. 2006b). I.e., the PSF-smoothed map value and its noise directly provide the amplitude and uncertainty of a fitted PSF at each map position. The effective filtered PSF for GISMO maps reduced in 'deep' mode in CRUSH is accurately modeled as a combination of a 17.5" FWHM Gaussian main beam combined with a negative 50" FWHM Gaussian bowl such that the combined PSF yield zero integral (signifying that we have no DC sensitivity due to the sky-noise removal, and other filtering during the reduction). The 50" FWHM effective PSF bowl is the direct result of explicit large-scale-structure (LSS) filtering during the reduction with a 50" FWHM Gaussian profile. We checked the Gaussian main beam assumption on quasars (detected at high signal-to-noise >1000), and confirmed that >98% of the integrated flux inside a R=50" circular aperture is recovered in the 17.5" FWHM beam-smoothed peak during night-time observations (i.e. for all of our data). We also checked that the filtered PSF accurately recovers the quasar fluxes, when these were LSS filtered the same as our deep field map, and found no further degradation of photometric accuracy associated with the LSS filtering. Therefore, we are very confident that we have sufficient understanding of the effective PSF in our map, and that the systematic errors of the extracted points source fluxes are kept below 2%.
The source extraction code we used is part of the CRUSH software package, and is the same source extraction tool that was used and described in Weiß et al. (2009). It implements an iterated false-detection rate algorithm. Apart from peak position and flux, the algorithm calculates an estimated confidence and an expected cumulative false detection rate for each extracted source. We caution that the confidence levels and falsedetection rates are guiding values only, which represent our best statistical estimate without prior knowledge of the 2-mm source counts. A more accurate characterization of confidence levels and/or cumulative falsedetection rates would require accurate prior information of the true counts of the 2-mm source population.

Overview of the CRUSH source extraction tool.
Here, we offer a concise summary of the approach implemented by CRUSH 'detect' tool, which we used for the source extraction.
The expected false detection rate, i.e. the expected number of pure noise peaks mistakenly identified as a source, is given by N f (Σ) = N Q(Σ), where N is the number of independent Gaussian variables in the map, and Q(Σ) = 1 − P (Σ) is the complement cumulative Gaussian probability, i.e. the probability of measuring a deviation larger than a chosen significance, Σ. A smoothed and filtered map with extraction area A contains independent variables in terms of the FWHM widths of the Gaussian smoothing (∆ smooth =∆ beam ) and the applied large-scale filtering of the map (∆ filter =50"). The right-hand-side term in the formula accounts for the lost degrees of freedom due to the explicit spatial filtering of our map. The approximately 4.5 parameters per smoothing beam were determined empirically based on the occurrence of significant noise peaks in simulated noise maps. The formula was verified to yield close to the expected number of false detections in simulated noise maps with varying areas and filtering properties, and with N f targeted between 0 to 1000. Thus, the above expression will accurately predict the actual false detection rate, as a function of detection threshold, as long as the map noise is known precisely. For our map, with ∼321 GISMO beams, a 2.99 σ cut yields N f (2.99) ≈ 2 expected false detections.
Due to the presence of many resolved but undetected sources in the map (asymmetric confusion), our noise estimates are bound to be slightly overestimated (even with the median-noise based estimate used). To our best knowledge, all statistical estimates of map noise, which are based on the observed map itself, will result in overstated noise estimates in the presence of asymmetric confusion (resolved sources below detection). Neither the jackknifed noise, nor radiometric noise, can help offer better estimates, as the extraction noise should include the effect of symmetric confusion (unresolved faint sources) beyond what these can offer. As a result of an inevitably biased noise estimation process, the corresponding false detection rate estimates are slightly above actual, and represent a useful conservative upper bound. This is confirmed by the simulations, presented in section 4.5, which found that if the 2 mm source counts were those of, e.g., Béthermin et al. (2011) or Lapi et al. (2011, then the actual false detection rate would be 1.34 or 0.55, respectively, vs. the expected 2. However, as we stated earlier, we cannot unbias our noise estimates, or quantify the true false detection rate, without prior knowledge of the true 2-mm source counts, which are not well-constrained at present. Instead, our estimates offer strong upper bounds for the unknown actual falsedetection rates. Each source identified above the significance cut is removed from the map with the smoothed and filtered point-spread function before the extraction proceeds. Subtraction with the filtered PSF allows the detection of further nearby peaks, which may have been previously suppressed by the negative filter bowl surrounding the previous detections. The circular area (r 2 =∆ 2 beam + ∆ 2 smooth ) containing the main beam of the detected source is flagged after the extraction, since it no longer contains meaningful information after the removal of the source from within. To ensure that our catalog is based on the most accurate measure of the map noise and zero levels, CRUSH estimates the zero level using the mode of the map flux distribution, and estimates the noise from the median observed deviation median(x 2 ) ≈ 0.454937σ 2 ). Both measures are relatively robust and reasonably unbiased by the presence of relatively bright sources, or localized features, in the map.
For each source candidate, CRUSH estimates a detection confidence based on the expected false detection rate N f . According to Poisson statistics, the detection confidence C of a single peak is the probability that no such peak occurs randomly, i.e. P 0 (N f ) = e −N f . This is then further refined to include information from other sources already detected in the map. Thus, if n true sources with apparent significance above Σ are known a priori to exist in the map, than any given peak at significance Σ may be one of n sources, or one of the N f expected false detections, hence the probability of false detection for each of n + N f peaks is reduced by a factor of N f /(n + N f ). (In other words, we should expect only N f false detections (noise peaks) for every n actual sources detectable above a given threshold.) CRUSH uses the number of sources N (> Σ) that were already extracted above significance Σ minus the expected false detection rate N f (Σ) as a selfconsistent proxy for n, which is a reasonable assumption when prior knowledge of the actual underlying counts is not readily available (as in our case). As such, the individual confidence levels of consecutive detections are estimated as 4.2. Deboosting Deboosting is a statistical correction to the observed flux densities, when source counts fall steeply with increasing brightness (e.g. Crawford et al. (2010) and references therein). Thus, in a statistical ensemble of sources, the same observed flux arises more often from one of many fainter sources than from the few brighter ones, relative to the measured value. We assume a measurement with Gaussian noise (validated by the closely Gaussian jackknife noise distribution) and a 2 mm source count model scaled from observationally constrained 850µm counts (e.g. Coppin et al. (2006), Weiß et al. (2009)) assuming T d /(1+z) 10 K (Kovács et al. 2006a) and dust emissivity index (β) of 1.5 (Kovács et al. 2010). We also deboosted our data using the physical numbercount models of Lapi et al. (2011) and Béthermin et al. (2011), see section 5.1.
For deboosting we followed a Bayesian recipe, such as described in Coppin et al. (2005Coppin et al. ( , 2006: expressing the probability of intrinsic source flux S i in terms of the observed flux S o and its measurement uncertainty σ. However, we made some important modifications to the recipe to account for the possibility that the observed flux arises from multiple overlapping galaxies, and we account for confusion. Accordingly, we replace the single isolated source assumption p(S i ) ∝ dN dS (S i ) of Coppin et al. (2005Coppin et al. ( , 2006, with the compound probability that one or more (up to m) resolved sources in the beam contribute to an aggregated intrinsic flux S i : (5) Inside the integrals is the product of the individual component probability densities π(S k ), which correspond to S i arising from a specific combination of (S 1 ... S m ) individual components. The delta function ensures that the component fluxes considered add up to the total intrinsic flux S i when integrated. Each nested integral for S(k) is performed up to the previous flux S k−1 , indicating that each successive component S k is no brighter than the previous one S k−1 , and ensuring that each particular combination of fluxes is counted one time only. Once the fluxes in the outer integrals sum up to S i , the remaining inner integrals can be skipped (in numerical implementations) corresponding to fewer than m actual contributors (or to keep to a more formal notation we can add the delta function at zero, i.e. δ(S k ), to the definition of π(S k ) below to achieve exactly without omitting the any inner integrals). When overlaps are ignored (m=1), Eq. 5 reduces to p(S i ) = π(S i ) naturally (no integration required).
The differential source counts dN dS (S k ) determine the probability that there is at least one intrinsic source with brightness S k in the beam, resolved or unresolved. The distinction between resolved and unresolved sources is important: unresolved sources cause a symmetric widening of the map noise distribution (confusion noise) compared to the experimental noise (e.g. radiometric down-integration as measured by a jackknife); resolved sources, on the other hand, detected or not, will manifest as an excess of flux on the positive side of the observed flux (or S/N) distribution. Deboosting naturally needs to consider resolved sources only.
If one distributes N sources randomly in some large area A (for simplicity's sake let's consider the same area to which the counts are normalized, whether deg 2 or sr) , then the chance that none of these sources fall into a given beam (our detection beam) is: Therefore, at a given flux density S k the probability density of a source with brightness S k being resolved (i.e. unblended with brighter ones) is in terms of the integrated number counts N (>S), and the corresponding differential counts dN dS . Here, π + (S k ) measures the probability density that at least one resolved source with flux S k falls inside the detection beam, and does not exclude the possibility of further fainter components within the same beam (hence the plus sign as the subscript). Using π + (S k ), however, we can easily express the probability density π(S k ) for exactly one component with S k in a given beam, by simply subtracting the integrated probability that there is at least one other fainter object in that same beam with the first one: For the highest order m under consideration, we may truncate by setting π(S m ) ≈ π + (S m ). The approximation is valid as long as m is chosen to be large enough such that resolved overlaps with further components (the right-hand integral term) are negligible.
For the particular case of the GISMO 2 mm sources, we considered up to 3 overlapping components (m=3) contributing to the observed fluxes. We verified that this was sufficient, as we noticed no measurable degree of incremental change in the deboosted values (and profiles) between m=2 and m=3. We chose m=3 to be on the safe side. At the same time allowing for at least 2 overlapping components instead of just a single isolated source (m=1) did have a significant impact on the deboosting results, justifying our modified approach.
Since our source extraction algorithm determines the map zero level as the mode (not the mean) of the distribution, the extracted source fluxes are easily measured against the unresolved background. And, because our deboosting method is based on resolved sources only, it also means that no additional zero-level adjustment is necessary. As a result, the distribution naturally does not extend to negative fluxes, as is demonstrated by the posterior probability distributions of the extracted GISMO sources shown in the appendix. Figure 5 shows the beam-smoothed signal-to-noise map of the 2 mm GDF. Figure 6 shows the noise map, demonstrating that the innermost 4 ′ of the GDF observations have an rms of between 130 and 140 µJy. The source extraction was performed out to twice this level, i.e. up to ∼260µJy rms (Fig. 6). The area for the source extraction is ∼31 square arcminutes (∼321 GISMO beams).

Extraction -Results
The source extraction algorithm finds 12 positive sources and 3 negative "sources". The number of significant negatives is consistent with the expected false detection rate of N f = 2 for a 2.99 σ detection threshold in the extraction area (see Section 4.1). The measured 2 mm fluxes range from 400 to 870 µJy. Table 1 shows the measured fluxes of the 12 positive sources with the achieved signal-to-noise, the estimated detection confidence levels, and the expected cumulative false detection rate N f . The fluxes presented in the table show the measured fluxes, while Table 2 shows the de-boosted values, using a scaled version of the broken power-law SHADES galaxy number counts (Coppin et al. 2006), the Lapi et al. (2011) number counts, as well the values corresponding to the Béthermin et al. (2011) counts.
In order to identify (sub)millimeter counterparts of our sources we cross correlated our data with the 1.16 mm combined MAMBO/AzTEC source catalog containing the 1.1 mm the AzTEC data from Perera et al. (2008), and the MAMBO source catalog from Greve et al. (2008), as well as the SCUBA  (Table 1) are marked with circles (sized as the 17.5 ′′ FWHM of the diffraction limited GISMO beam) and squares (for sources with 1.2mm and/or 850 µm counterparts).
850µm source catalog of GOODS-N (Borys et al. (2003), as well as Borys et al. (2004), Pope et al. (2005), and Pope et al. (2006)). Table 3 shows the measured and deboosted 1.2 mm and 850 µm fluxes of the sources with counterparts, if available. The equation we used for the maximum allowable separation between the GISMO/MAMBO/AzTEC/SCUBA sources in order to be considered a counterpart is given by: where r max is the search radius where the counterpart must fall with > 98% confidence, σ p is the 1 sigma catalog position error combined with the rms astrometric accuracy of our map (<0.8") in quadrature, σ beam is the 1 sigma beam size ∼ F W HM/2.35 , and SNR is the observed signal-to-noise ratio of the detection. When searching for a counterpart in another low S/N dataset of (sub-)mm data set, a second identical SNR-dependent term needs to be added to the search-radius expression above, reflecting the inherent positional uncertainty of the other known mm-wave detection. In section 4.5, we demonstrate the applicability of equation 9 for our dataset.
Combining the GISMO, MAMBO/AzTEC/SCUBA observations (Fig. 7), we identify three additional sources with detection confidence level of > 80%. These are tabulated in Table 4 and 5. The median and mean redshifts of all sources with counterparts and known redshifts arez = 2.91 ± 0.94 andz = 3.3, respectively.

GDF sources without a counterpart
Seven of the detected 2 mm sources have no counterpart in either the MAMBO/AzTEC or the SCUBA data (GDF-2000.2, GDF-2000.4, GDF-2000.5, GDF-2000.7, GDF-2000.9, GDF-2000.10, and GDF-2000. Considering their observed signal to noise ratio and the three negative detections in the map, plus the statistical expectation of 2 false detections as derived in section 4.1, the Note. -N f is the expected cumulative false detection rate as defined in Section 4.1. The maps are beam-smoothed (by 17.5" FWHM) to an effective 24.7 ′′ FWHM image resolution for point source extraction  Note. -MAMBO/AzTEC 1.2 mm data are from Greve et al. (2008) and Perera et al. (2008), SCUBA 850µm data from Borys et al. (2003), Pope et al. (2005), and Greve et al. (2008). z is the measured redshift. The deboosted flux values S ′ are based on the SHADES counts, using the same equations used for calculating the 2 mm data counts in order to be consistent with the deboosting of the 2 mm fluxes shown in Table 2. * Both, GDF-2000.3 andGDF-2000.8, are associated with AzGN07, which implies that this 1.2 mm source is a blend of two sources.
GDF 2000.1 0.70 51.2 ± 2.0 8.53 ± 0.07 13.52 ± 0.06 2.83±0.21 3.08 ± 0.21 1.28 GDF 2000.3 1.08 40.8 ± 0.7 8.59 ± 0.14 13.10 ± 0.03 2.68 ± 0.10 2.91 ± 0.10 1.08 Note. -All quantities were fitted using the measured redshifts (Table 2), temperaturedistribution model (dM d /dT ∝ T −7.2 ) with β = 1.5 and assuming a 2 kpc emission diameter. The dust masses assume κ(850µm)=0.15 m 2 kg −1 . Uncertainties are 1 σ total errors of the fits to data, which do not include the uncertainties in the redshift values. The following quantities are shown in the table: |χ 2 | residual scatter around the fit, Tc temperature of the dominant cold component, M d dust mass, L integrated IR luminosity, q L radio-(F)IR correlation constant as defined in Kovács et al. (2010), q IR radio-(F)IR correlation constant as defined in Ivison et al. (2010), and τ peak optical depth around the IR emission peak. most likely scenario is that 5 of these are real detections. Figure 8 shows the redshift-dependent GDF equivalent map sensitivities at 850 µm and 1.2 mm for the detection of a galaxy with the SED shown in Fig. 1. The figure demonstrates that the equivalent 1.2 mm source sensitivity requirement to match the GDF source sensitivity varies by about a factor of two with redshift, whereas the required depth of the 850 µm SCUBA map varies significantly. In order to achieve the same GDF source sensitivity the 1 σ map noise rms at 1.2 mm would need to be ∼ 0.7 mJy/beam if the galaxy were at a redshift of 2, while it would require ∼0.4 mJy/beam if the same source were at an extremely high redshift. At 850 µm the matching map flux sensitivity requirement ranges from about 2 mJy/beam for a source at a redshift z ∼ 2 to ∼ 0.5 mJy/beam for a source at extremely high redshifts. Table 7 shows the actual 850 µm and 1.2 mm map sensitivities at the seven positions of GDF sources without counterparts. The table shows that the depth of the 1.2 mm map is quite homogeneous for all sources without counterparts with an rms of about 0.56 mJy for each of those. This means that for our template SED, the source sensitivity of the GDF map exceeds that of the 1.2 mm data for redshifts of about z = 6 and greater. The situation for the SCUBA data is different, since the sensitivity over that map varies very significantly as is demonstrated by the range as shown in the same table. Only at the position of GDF-2000.7 the sensitivity of the SCUBA map is below an rms of 1 mJy/beam (Table  7). For the other sources the 850 µm map sensitivities are between 1 and 2 mJy/beam rms. A comparison of these numbers with the equivalent 850 µm GDF point source sensitivities shown in Fig. 8 shows that the 2 mm point source sensitivity of all GDF sources without counterparts, with the exception of GDF-2000.7, exceeds the point source sensitivity of the SCUBA map for redshifts z > 6. Taken together, the 850 µm and 1.2 mm data on non-detected GDF sources are entirely consistent with sources at z > 6, assuming the template from Fig. 1 applies. However, another aspect when considering that a detected 2 mm source has no counterpart in another catalog is that of completeness or the probability of a low S/N source to be extracted. For the 850 µm SCUBA map, e.g., only the completeness of sources with S/N 5 is > 90% (Coppin et al. 2006), i.e. the S/N dependent probabilities are similar to those derived in the completeness analysis for the GDF data presented in section 4.5. The situation is similar for the 1.2 mm data. As a result, the S/N limitations of the 850 µm SCUBA and 1.2 mm MAMBO/AzTEC data sets combined with the low number of 2 mm sources prevents us from deriving significant redshift constraints on the GISMO sources without counterparts. Larger deep 2-mm surveys will be needed to photometrically study what fraction of the 2-mm population is at very high-redshift (z > 6), one of the main science goals of GISMO.

Simulations
Additional characterizations of the extracted sources were obtained through simulations, generally following the analysis of Weiß et al. (2009). These simulations demonstrate also the validity of the equations used for source extraction presented in in sections 4.1 and 4.3. The simulations started with the construction of 100 variations of jackknifed noise maps that were generated from the original data. These maps provide  accurate representations of the noise of the observations. We next constructed 3000 versions of the point source distribution across the full area of the map (not just the low noise, high coverage regions) based on the Béthermin et al. (2011) model and the smoothed and filtered GISMO beam. The simulations included sources down to 0.01 mJy in order to provide an appropriate confusion background. Source brightnesses were chosen at random from the cumulative source counts, N (> S), producing a sample size appropriate for 3000 images. These sources were then distributed randomly across the full set of 3000 noise-free images. Therefore, the peak brightness and the total number of sources in each simulated image are subject to variation due to Poisson statistics. There were an average of 2264 sources in each simulated image. The jackknife noise maps were added to the simulations, reusing each of the noise maps 30 times. The 3000 simulated images were then run through the source detection procedure, using the same settings as were applied to the actual data. These procedures were repeated using another set of 3000 simulated point source maps based on the Lapi et al. (2011) models, which predict a higher source density with a mean of 3780 sources (S > 0.01 mJy) per simulated image.
The positions of the extracted sources were matched , vs the prediction from Eq. 9 as a function of S/N (solid curve). To convert the simulated input fluxes to S/N, we estimated an average depth of 0.15 mJy/beam in our map, and we used σp = 0 since there were no intrinsic pointing errors in the simulated data. The dotted lines indicate the range of fluxes extracted from our map, i.e. the range for which search radii are calculated. We note that at low S/N the simulations tend to falsely identify a nearer chance peak (one of the many faint sources filling the map, or a noise peak) as the counterpart to the input. As such, the simulations tend to underestimate the true position errors at low S/N, explaining the deviation from the curve below detection level. Also, the asymptotic behavior at high S/N (>10) is not well modeled by Eq. 9, however for the range of fluxes considered, the calculated search radii are in remarkable agreement with simulations, thus justifying our reliance on them. with those of the simulated input sources for each of the 3000 simulations based on the two different models. For a given extracted source, the matching input simulated source was chosen as the brightest source within a 5 pixel = 15 ′′ radius and with a brightness > 0.1 mJy. In most cases there is only one source within 15 ′′ , but infrequently a faint input source happens to lie closer to the extracted source position than a brighter input source that is the true origin of the extracted source. Figure 9 compares the brightnesses of the extracted sources with the associated input source brightnesses. Results are binned in 0.1 mJy intervals, and error bars indicate the standard deviation of the sources averaged in each bin. For either set of simulations, we find that source boosting becomes significant at S 1 mJy. Consistent with the fluxes we measure for the extracted sources from the GDF, typical fluxes extracted from the simulations are greater than ∼400 uJy/beam, as we have used the same detection criterion (at 2.99-σ) on our simulated data as on our observed map, and because our map has an average 1-σ depth of around 135-140 mJy/beam. Figure 10 shows the completeness of the extracted sources, or the fraction of the input sources that are extracted. Whereas essentially all sources with S > 1 mJy are recovered by the source extraction, the completeness drops to ∼ 50% at S ∼ 0.4 mJy for both models of the source counts. Figure 11 shows the expected trend of increasing positional errors with decreasing source brightness. The positions of fainter sources are more strongly affected by noise and source confusion, consistent with Eq. 9. In Figure 12 we compare the search radius given by Eq. 9 with the results from the simulated positional accuracy. The figure demonstrates that over the relevant range of fluxes, the calculated search radii are in remarkable agreement with simulations, thus justifying our reliance on them. An assessment of the reliability of the source extractions is shown in Figure 13, where we plot the fraction of sources at a given extracted brightness that can be associated with corresponding input source in the simulations. For either model, the reliability of the extracted sources begins to drop at S < 1.2 mJy. The drop is somewhat greater for simulations using the Béthermin et al. (2011) source counts, which has an overall lower density of sources. As expected, the reliability drops more rapidly near the detection threshold (∼400 mJy/beam, given the average 1-σ depth around 135-140 mJy/beam over the full map). Note, that because the reliability depends strongly on the shape of the unknown actual number counts, especially near the chosen detection threshold, the confidence levels shown in Table 1, and calculated using Eq. 3, serve as our guiding estimates of the source reliabilities in the absence of prior knowledge of the actual source counts. Figure 14 shows the same information in a slightly different way. Histograms of the extracted source brightnesses are compared to the histograms of the input simulation source brightnesses (when they exist). The ratio of the two histograms yields the reliability that was plotted in Figure 13  the same mean number of sources extracted. Figure 17 depicts the observed cumulative number counts, N (> S), as a function of the deboosted 2 mm flux density, S, compared with the predicted galaxy counts in Béthermin et al. (2011) (the solid line is a power-law interpolation between their 2.1 and 1.3 mm models), and the model from Lapi et al. (2011) (dashed line). These counts are not binned due to the small number of sources. Instead, as in e.g. Borys et al. (2003), we plot the number of sources at each deboosted flux density, divided by the effective area for the detection of sources of a given flux density. Following Coppin et al. (2006), the effective area is calculated as the product of the maximum area of field and the fit to the completeness function ( Fig.   Fig. 17.-The number counts N (> S) as a function of the deboosted 2mm fluxes (S). The black symbols show the deboosting using the Lapi et al. (2011) model extrapolated down to 0.01 mJy. The blue and red show the deboosted number counts when the Béthermin et al. (2011) and the SHADES (Coppin et al. 2006) models are used to calculate the deboosting. The solid line is an interpolation of the 1.38 and 2.1 mm model number counts of Béthermin et al. (2011), and the dashed line is the model from Lapi et al. (2011). The right-hand axis labels the counts in terms of number of sources per beam for convenience in assessing confusion. The counts were corrected for the signalto-noise dependent effective area for source extraction in the map, as well as for the expected number of false detections from Table  1. 10), although in our case the functional form is simpler and involves only one free parameter. The effective area ranges from 0.21 to 0.90 times the maximum area.

2 mm number counts
Instead of plotting the cumulative number counts as a simple stair-stepped line, we include the uncertainties for each of the deboosted flux densities. Figure 17 also demonstrates that the cumulative number counts are independent of the three deboosting models were used, considering the uncertainties in the deboosted flux densities. We note that the counts are somewhat steeper than the models.
On a cautionary note, our number counts represent measurements from a small patch of the sky, therefore allowing for quite some uncertainty in terms of cosmic variance, which in the context of AzTEC 1.1 mm number counts is being discussed in Scott et al. (2012).

Source properties and associations
The GISMO 2-mm datum typically adds an important long-wavelength constraint to the FIR Spectral-Energy Distribution (SED) of a distant (z >2) infrared galaxy, provided it is measured with sufficient precision. When it is the only datum on the Rayleigh-Jeans side of the thermal greybody spectrum, it provides a critical constraint on the infrared luminosity of the galaxy and the cold star-forming dust mass. And, in combination with other Rayleigh-Jeans data (such as 850 µm or 1.2 mm), the 2-mm point can furthermore constrain the temperature and the dust emissivity index β, which is a diagnostic of the physical geometry of the dust grains (see e.g. Yang et al. 2007).
Because the large statistical uncertainties associated with the underlying source fluxes in case of the low signal-to-noise (SNR<5) detections, where statistical flux boosting is significant (section 4.2), we cannot provide accurate 2-mm photometry individually for most of the detected sources, except for GDF-2000.1 and GDF-2000.3. However, we can use the GISMO data collectively to determine useful constraints for the population as a whole.
To fit the radio-to-FIR SEDs of our sources, we relied on the analytic temperature-distribution models of Kovács et al. (2010), which assumes a power law distribution (dM d /dT ∝ T −γ ) of dust components above a dominant cold-temperature component at T c . For the fitting we have assumed a characteristic emission diameter of 2 kpc, typical to SMGs, and a dust emissivity index of β=1.5, typical for starbursts (Kovács et al. 2006(Kovács et al. , 2010. We use all available continuum data at wavelengths longer than 15 µm (rest frame 3 µm), a regime dominated by thermal dust (millimeter wavelengths to FIR) and synchrotron radiation (at the radio wavelengths). We assume a 10% calibration uncertainty for all measured bands, added in quadrature to the reported detection uncertainties.
The collective fit to all GISMO sources with sufficient photometry (sources 1, 3, 6, 8, 9, 11, 13 and 14) yields γ=7.24±0.23, in excellent agreement with local starburst galaxies (see Kovács et al. 2010). Therefore, for the two most significant individual sources, which we discuss in the following, we fix γ=7.2, and fit only the coldcomponent temperature, dust mass and the radio-FIR correlation constant q L (Kovács et al. 2010) or q IR . For the synchrotron spectral index (S ν ∝ ν −α ) we assume α=0.75.  -2000.1 andGDF-2000.3 are the two most significant GDF detections with counterparts. Our SED fits for these sources follow the method described in Kovács et al. (2010). The plots show IRAC data (at λ < 8 µm), but those were not used for the fit. A summary of the fitted parameters is given in Table 6.

GDF
The detection of GDF-2000.1, has a signal to noise ratio σ > 5 and a statistical detection confidence level of 100%. It is associated with AzGN03 (also dubbed: GN1200.02, GN850.10, MM J123633+621407, SMM J123633.8+621408), and GN10 in Pope et al. (2005) and Pope et al. (2006). This is the source discussed in, e.g., Dannerbauer et al. (2008), and in Daddi et al. (2009) who determine its redshift as z = 4.04 through measurements of CO(4-3), the redshift we assume for this source. We note that Wirth et al. (2004) report a possible counterpart at z = 1.34476, flagged as "very secure z", with > 99% confidence, centered ∼ 3 ′′ north of the nominal center of GDF2000.1 coordinates. Fig. 18 shows the SED of GDF-2000.1, using all available flux information of this source at other wavelengths. GDF-2000.3, is detected with a signal to noise ratio of σ > 4. With 99% the confidence level for detection is very high. GDF-2000.3 has the SCUBA counterpart GN850.39 and MAMBO/AzTEC counterpart AzGN07. The (sub)millimeter counterparts are described in Pope et al. (2006) and Greve et al. (2008). Two sources with known redshifts are positionally consistent with our measured position: GOODS J123711.98+621325.7 with z = 1.992, and SMM J123711.9+621331, with z = 1.990. Since there are no other IR sources similarly close we consider those plausible counterparts and use the redshift for our SED fit (Fig. 19).
GDF  (see text and Table 6), but does not include the 3.6 -8 µm IRAC data as constraints. 10 13 L sun , and SF R > 1000M sun /yr. The estimated optical depths (∼ 1) at the peak of the emission are typical to SMGs. Both q values are significantly higher than the radio-FIR correlation for SMGs (Kovács et al. (2006a) and Kovács et al. (2010) both found < q L >∼ 2.13 for SMGs, with a scatter of 0.12 dex only). Thus, the excess far-infrared emission might need to be explained by the presence of an additional significant heating source besides stars, possibly a powerful AGN. This is unlike the bulk of the known SMG population, where AGNs, although often present, are not significant contributors (<20%) to the infrared emission. GDF-2000.6 GDF2000.6 is HDF850.1 (z = 5.2), the most prominent among the first ever identified SCUBA Deep Field sources (Hughes et al. 1998). The observed GISMO position and the deboosted 2 mm flux are consistent with the data published in Walter et al. (2012).

Other GDF sources with counterparts
We identify AzGN07 as the counterpart for GDF-2000.8, AzGN08 as the counterpart for GDF-2000.11, and GN850.21 and GN1200.29 as counterparts for the low S/N source GDF-2000.13. We associate CXO J123627.53+621218.0 (with a photometric redshift z = 4.69) and AzGN10 as counterparts of GDF-2000.14, since the positions of these two sources are essentially identical, and in very good agreement with the GISMO detection. Furthermore, the high photometric redshift value for the CXO source plus the observed 1 mm flux makes a 2 mm detection at our sensitivity level very likely. Finally, we identify GN850.28 as counterpart to GDF-2000.15.

SUMMARY AND CONCLUSION
We have obtained a 7 ′ diameter 2 mm continuum deep field map centered on the HDF. The rms in the inner part of the map is ∼ 135µJy/beam. The noise in the un-smoothed data very closely follows a Gaussian distribution, indicating its random nature and validating probabilistic source extraction statistics.
We detect 12 sources plus 3 false, negative "detections". A statistical analysis of the data predicts 2 false detections. 5 of the detected 12 sources have known (sub)millimeter counterparts, including HDF850.1, the first submillimeter galaxy detected by SCUBA. Three more low signal to noise sources have been identified through cross correlation with existing (sub)mm data. The mean redshift of all 7 of the counterparts with known redshifts isz = 3.3, the median redshift of those sources, which at this low number of sources is probably a better estimate, isz = 2.91 ± 0.94.
Of the remaining 7 detected sources which have no (sub)millimeter counterpart, statistically we expect 5 to be real.
The jackknife test of the smoothed data shows an almost perfect Gaussian distribution for the S/N histogram. The S/N histogram of the normally processed, smoothed data shows a clear excess beyond a Gaussian distribution, which mostly can be contributed to 12 astronomical sources in the field. A small symmetric excess remains after the resolved sources are subtracted from the image. This most likely indicates the presence of confusion noise in our data. Figure 20 shows the posterior probability distributions for all GISMO sources, indicating the Bayesian probability densities that the detected source flux arises from an intrinsic source flux of S i . The probabilities account for up to 3 overlapping resolved sources contributing to the observed flux, and take confusion at the faint end into account. Since our deboosting method is based on resolved sources only, no additional zero-level adjustment is necessary. As a result, the distribution naturally does not extend to negative fluxes, as is evident in the figure. The distributions are shown for the number count models of Béthermin et al. (2011) in red/solid, the extended Lapi et al. (2011)   -Posterior probability distributions for all GISMO sources, indicating the Bayesian probability densities that the detected source flux arises from an intrinsic source flux of S i . The probabilities account for up to 3 overlapping resolved sources contributing to the observed flux, and take confusion at the faint end into account. The distributions are shown for the number count models of Béthermin et al. (2011) in red/solid, the extended Lapi et al. (2011) counts in blue/dashed, and the scaled broken powerlaw SHADES counts in cyan/dashed-dotted. Note, that the jagged curves resulting from the Béthermin et al. (2011) counts are not a property of our algorithm, but are inherent to the input counts model.