The recovery of a real signal from its auto-correlation is a widespread problem in computational imaging, equivalent to retrieving the phase associated with a given Fourier modulus. Image deconvolution, on the other hand, is a fundamental step when we aim to increase the resolution of blurred signals. These problems are addressed separately in a large number of experimental settings, ranging from adaptive-optics astronomy to optical microscopy. Here, instead, we tackle both at the same time, performing auto-correlation inversion while deconvolving the current object estimate. To this end, we propose a method based on I-divergence optimization, turning our formalism into an iterative scheme inspired by Bayesian approaches. We demonstrate the method by recovering sharp signals from blurred auto-correlations, regardless of whether the blurring acts in the auto-correlation, object, or Fourier domain.

Few-shot learning for fine-grained image classification has gained recent attention in computer vision. Among few-shot learning approaches, metric-based methods are state-of-the-art on many tasks owing to their simplicity and effectiveness. Most metric-based methods assume a single similarity measure and thus obtain a single feature space. However, if samples can simultaneously be well classified under two distinct similarity measures, the samples within a class can distribute more compactly in a smaller feature space, producing more discriminative feature maps. Motivated by this, we propose a Bi-Similarity Network (BSNet) that consists of a single embedding module and a bi-similarity module with two similarity measures. After the support images and the query images pass through the convolution-based embedding module, the bi-similarity module learns feature maps according to two similarity measures of diverse characteristics.
In this way, the model learns more discriminative and less similarity-biased features from few shots of fine-grained images, so that its generalization ability can be significantly improved. Through extensive experiments obtained by slightly modifying established metric/similarity-based networks, we show that the proposed approach yields a substantial improvement on several fine-grained image benchmark datasets. Code is available at https://github.com/PRIS-CV/BSNet.

Image fusion plays a critical role in a variety of vision and learning applications. Current fusion approaches are designed to characterize source images for one particular type of fusion task and are limited in wider scenarios. Moreover, simple fusion strategies (e.g., weighted averaging, choose-max) cannot handle challenging fusion tasks, and undesirable artifacts easily emerge in their fused results. In this paper, we propose a generic image fusion method with a bilevel optimization paradigm, targeting multi-modality image fusion tasks. Alternating optimization is conducted on components decoupled from the source images. Via adaptive integration weight maps, we obtain a flexible fusion strategy across multi-modality images. We successfully apply the method to three types of image fusion tasks: infrared and visible; computed tomography and magnetic resonance imaging; and magnetic resonance imaging and single-photon emission computed tomography. Results highlight the performance and versatility of our approach from both quantitative and qualitative perspectives.

Intra/inter switching-based error-resilient video coding effectively enhances the robustness of video streaming over error-prone networks, but it has high computational complexity due to the detailed end-to-end distortion prediction and the brute-force search used for rate-distortion optimization.
In this article, a Low-Complexity Mode-Switching-based Error Resilient Encoding (LC-MSERE) method is proposed to reduce encoder complexity through a deep learning approach. By designing and training multi-scale information-fusion-based convolutional neural networks (CNNs), intra- and inter-mode coding unit (CU) partitions can be predicted rapidly and accurately, instead of relying on brute-force search and a large number of end-to-end distortion estimations. For intra CU partition prediction, we propose a spatial multi-scale information fusion CNN (SMIF-Intra), in which a shortcut convolution architecture learns the multi-scale and multi-grained image information correlated with the CU partition. For inter CU partition prediction, we propose a spatial-temporal multi-scale information fusion CNN (STMIF-Inter), in which a two-stream convolution architecture learns the spatial-temporal image texture and the distortion propagation across frames. Given information from the image, together with coding and transmission parameters, the networks accurately predict CU partitions for both intra and inter coding tree units (CTUs). Experiments show that our approach significantly reduces computation time for error-resilient video encoding with an acceptable quality decrease.

Crowd counting is challenging for deep networks for several reasons. For instance, the networks cannot efficiently analyze the perspective information of arbitrary scenes, and they are naturally ill-suited to handling scale variations. In this work, we present a simple yet efficient multi-column network that integrates perspective analysis with the counting network. The proposed method explicitly extracts perspective information and drives the counting network to analyze the scenes.
More concretely, we infer perspective information from the estimated density maps and quantize the perspective space into several separate scenes. We then embed the perspective analysis into the multi-column framework with a recurrent connection, so that the proposed network matches various scales with different receptive fields efficiently. In addition, we share the parameters of the branches with different receptive fields; this strategy drives the convolutional kernels to be sensitive to instances of various scales.
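The I-divergence optimization mentioned in the first abstract is the functional minimized by the classical Richardson-Lucy multiplicative update, which is the Bayesian-inspired iteration that scheme builds on. Below is a minimal NumPy sketch of that classical update for plain deconvolution; it is not the authors' joint auto-correlation-inversion algorithm, and the function and parameter names are illustrative.

```python
import numpy as np

def richardson_lucy(blurred, psf, n_iter=100, eps=1e-12):
    """Richardson-Lucy deconvolution: a multiplicative update that
    decreases the I-divergence between the measured data and the
    re-blurred estimate, preserving non-negativity at every step."""
    # Flat, positive initial guess with the same total flux as the data.
    estimate = np.full_like(blurred, blurred.mean())
    psf_flip = psf[::-1]  # adjoint of convolution with the PSF
    for _ in range(n_iter):
        reblurred = np.convolve(estimate, psf, mode="same")
        ratio = blurred / (reblurred + eps)   # data / model
        estimate = estimate * np.convolve(ratio, psf_flip, mode="same")
    return estimate
```

On noiseless data the iteration concentrates the flat initial guess back toward the sharp source, e.g. recovering an isolated spike from its Gaussian-blurred measurement.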