Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling

dc.contributor.author: Saha, Sourajit
dc.contributor.author: Gokhale, Tejas
dc.date.accessioned: 2024-05-06T15:06:05Z
dc.date.available: 2024-05-06T15:06:05Z
dc.date.issued: 2024-04-10
dc.description.abstract: Downsampling operators break the shift invariance of convolutional neural networks (CNNs) and this affects the robustness of features learned by CNNs when dealing with even small pixel-level shift. Through a large-scale correlation analysis framework, we study shift invariance of CNNs by inspecting existing downsampling operators in terms of their maximum-sampling bias (MSB), and find that MSB is negatively correlated with shift invariance. Based on this crucial insight, we propose a learnable pooling operator called Translation Invariant Polyphase Sampling (TIPS) and two regularizations on the intermediate feature maps of TIPS to reduce MSB and learn translation-invariant representations. TIPS can be integrated into any CNN and can be trained end-to-end with marginal computational overhead. Our experiments demonstrate that TIPS results in consistent performance gains in terms of accuracy, shift consistency, and shift fidelity on multiple benchmarks for image classification and semantic segmentation compared to previous methods and also leads to improvements in adversarial and distributional robustness. TIPS results in the lowest MSB compared to all previous methods, thus explaining our strong empirical results.
dc.description.uri: http://arxiv.org/abs/2404.07410
dc.format.extent: 20 pages
dc.genre: journal articles
dc.genre: preprints
dc.identifier: doi:10.13016/m2matm-lbrk
dc.identifier.uri: https://doi.org/10.48550/arXiv.2404.07410
dc.identifier.uri: http://hdl.handle.net/11603/33635
dc.language.iso: en_US
dc.relation.isAvailableAt: The University of Maryland, Baltimore County (UMBC)
dc.relation.ispartof: UMBC Faculty Collection
dc.relation.ispartof: UMBC Computer Science and Electrical Engineering Department
dc.relation.ispartof: UMBC Student Collection
dc.rights: CC BY-NC-ND 4.0 DEED Attribution-NonCommercial-NoDerivs 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by-nc-nd/4.0/
dc.subject: Computer Science - Computer Vision and Pattern Recognition
dc.subject: Computer Science - Machine Learning
dc.title: Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling
dc.type: Text
dcterms.creator: https://orcid.org/0000-0003-1357-7813
dcterms.creator: https://orcid.org/0000-0002-5593-2804

Files

Original bundle

Name: 2404.07410.pdf
Size: 7.69 MB
Format: Adobe Portable Document Format