The Visual Computer
Featured scientific publications
* Data are for reference only
Motion normalization method based on an inverted pendulum model for clustering
The Visual Computer - Volume 34 - Pages 29-40 - 2016
In many creative industries, such as animation, film, and games, artists often create their works by retrieving a particular motion from motion-capture data and reusing it. A large database of human motion is difficult to use unless the motion data are organized by motion type. Although many methods for clustering motion-capture data have been proposed, variations in the motion data complicate clustering by making one type of motion numerically similar to other types. To improve motion clustering performance, we present a novel physically based motion normalization method that reduces ambiguous elements of motions, so that motions with different semantics can be differentiated. The normalized motion data generated by our method can be used as input to existing clustering algorithms and improve their results.
Color image encryption algorithm based on Fisher-Yates scrambling and DNA subsequence operation
The Visual Computer - Volume 39 - Pages 43-58 - 2021
In this paper, a color image encryption algorithm based on Fisher-Yates scrambling and DNA subsequence operations (elongation, truncation, deletion, and insertion) is proposed. First, the three-dimensional color image is transformed into a two-dimensional gray image, and the chaotic sequence generated by the Chen system, together with the Fisher-Yates scrambling method, is used to scramble the plaintext images of the R, G, and B channels. Second, the three channel images of the scrambled plaintext image are transformed into three DNA sequence matrices using the DNA coding rules, and these matrices are then manipulated using DNA subsequence operations and DNA addition, subtraction, and XOR operations to destroy the scrambled plaintext information. Finally, the color encrypted image is obtained using the DNA decoding rule. Experimental results and security analysis demonstrate that our encryption algorithm performs well and may resist various typical attacks.
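The scrambling stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract draws its key stream from the Chen system, whereas this sketch substitutes a logistic map as a simpler stand-in chaotic source, and the function names (`logistic_sequence`, `fisher_yates_permutation`, `scramble`, `unscramble`) and the parameters `x0` and `r` are hypothetical. The DNA coding stages are not covered.

```python
def logistic_sequence(x0, n, r=3.99):
    """Return n values in (0, 1) from the logistic map.

    Assumption: this is a stand-in for the Chen system named in the abstract.
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def fisher_yates_permutation(n, key_stream):
    """Build a permutation of range(n) with Fisher-Yates.

    Swap indices come from key_stream (floats in [0, 1)) instead of a
    random number generator, so the same key reproduces the same permutation.
    """
    if len(key_stream) < n - 1:
        raise ValueError("key_stream needs at least n - 1 values")
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        # Map the next key value to an index in [0, i].
        j = int(key_stream[n - 1 - i] * (i + 1)) % (i + 1)
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def scramble(pixels, perm):
    """Reorder a flat pixel list: output[k] = pixels[perm[k]]."""
    return [pixels[p] for p in perm]


def unscramble(scrambled, perm):
    """Invert scramble() given the same permutation."""
    restored = [None] * len(perm)
    for k, p in enumerate(perm):
        restored[p] = scrambled[k]
    return restored


if __name__ == "__main__":
    channel = list(range(16))                  # one flattened 4x4 channel
    key = logistic_sequence(0.3141, len(channel) - 1)
    perm = fisher_yates_permutation(len(channel), key)
    cipher = scramble(channel, perm)
    assert unscramble(cipher, perm) == channel
```

Because the permutation is fully determined by the key stream, the receiver can regenerate it from the same initial value and undo the scrambling.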
CTUNet: automatic pancreas segmentation using a channel-wise transformer and 3D U-Net
The Visual Computer - Volume 39, Issue 11 - Pages 5229-5243 - 2023
Diabetes, pancreatic cancer, and pancreatitis are all diseases of the pancreas that seriously threaten people’s lives. The pancreas has a special anatomical structure: its size, shape, and position are variable, and it is highly similar to the surrounding deep abdominal tissues, so accurate segmentation remains one of the most challenging tasks in medical image segmentation. We propose a new network, CTUNet, that combines a Transformer with 3D U-Net and can achieve high-precision automatic segmentation of the pancreas. We deploy the Transformer on skip connections to coordinate global explicit features and guide the network learning. In view of pancreas reciprocity and shape variability, we design a Pancreas Attention module and add it to each encoder to further enhance the ability to extract context information and learn distinct features. In addition, in the decoder, we use a novel Feature Concatenation module with an attention mechanism to further promote the fusion of features at different levels and alleviate the loss of feature information caused by down-sampling. We train and test our model on the NIH dataset and evaluate it with the Dice Similarity Coefficient, Jaccard Index, Precision, and Recall. Experimental results show that our proposed model outperforms most existing pancreas segmentation methods.
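The four evaluation measures named in this abstract all follow from the counts of true positives, false positives, and false negatives. The sketch below uses their standard definitions on flattened binary masks; the function name `segmentation_metrics` and the convention of returning 1.0 when a denominator is zero are assumptions for illustration, not details taken from the paper.

```python
def segmentation_metrics(pred, truth):
    """Compute Dice, Jaccard, precision, and recall for binary masks.

    pred and truth are flat sequences of 0/1 labels of equal length
    (for example, a flattened 3D volume).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")

    tp = sum(1 for p, t in zip(pred, truth) if p and t)       # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)   # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)   # false negatives

    def ratio(num, den):
        # Assumed convention: an empty denominator counts as a perfect score.
        return num / den if den else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "jaccard": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1]
    reference = [1, 0, 0, 1, 1]
    print(segmentation_metrics(predicted, reference))
```

Dice and Jaccard are monotonically related (Dice = 2J / (1 + J)), so they rank methods identically; reporting both is mainly a matter of convention.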
Three-stage generative network for single-view point cloud completion
The Visual Computer - Volume 38 - Pages 4373-4382 - 2021
3D shape completion from a single-view scan is an important task for follow-up applications such as recognition and segmentation, but it is challenging due to the critical sparsity and structural incompleteness of single-view point clouds. In this paper, a three-stage generative network (TSGN) is proposed for single-view point cloud completion, which generates fine-grained dense point clouds step by step and effectively overcomes a ubiquitous problem: the imbalance between general and individual characteristics. In the first stage, an encoder-decoder network consumes a partial point cloud and generates a rough, sparse point cloud that infers the complete geometric shape. Then, a bi-channel residual network is designed to refine the preliminary result with the assistance of the original partial input. In the last stage, a local-based folding network is introduced to extract local context information from the revised result and build a dense point cloud with finer-grained details. Experiments on the ShapeNet and KITTI datasets validate the effectiveness of TSGN. The results on ShapeNet demonstrate competitive performance on both Chamfer distance (CD) and Earth Mover's distance (EMD).
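For readers unfamiliar with CD, the sketch below computes one common variant of the Chamfer distance between two point clouds: the mean squared nearest-neighbor distance from each cloud to the other, summed over both directions. Published definitions differ (squared or unsquared distances, sums or means), and the abstract does not say which variant the authors use, so this choice and the name `chamfer_distance` are assumptions. The brute-force search is only suitable for small clouds; EMD is not covered.

```python
def chamfer_distance(cloud_a, cloud_b):
    """Symmetric Chamfer distance between two non-empty point clouds.

    Each cloud is a sequence of equal-dimension coordinate tuples.
    Variant (an assumption): mean squared nearest-neighbor distance,
    summed over both directions.
    """
    if not cloud_a or not cloud_b:
        raise ValueError("point clouds must be non-empty")

    def squared_distance(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(source, target):
        # Average, over source points, of the squared distance
        # to the closest target point (O(len(source) * len(target))).
        total = sum(min(squared_distance(p, q) for q in target) for p in source)
        return total / len(source)

    return one_way(cloud_a, cloud_b) + one_way(cloud_b, cloud_a)


if __name__ == "__main__":
    completed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
    print(chamfer_distance(completed, reference))
```

Unlike EMD, this measure does not require a one-to-one matching between points, which is why it is cheaper to compute but less sensitive to uneven point density.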
High-to-low-level feature matching and complementary information fusion for reference-based image super-resolution
The Visual Computer - Volume 40 - Pages 99-108 - 2023
The aim of reference-based image super-resolution (RefSR) is to reconstruct a high-resolution (HR) image when a reference (Ref) image with content similar to that of the low-resolution (LR) input is given. In this task, the quality of existing approaches degrades severely when there are several objects that look similar but have different contents. Moreover, not all similar information in the reference image is useful for the input image. We therefore propose a high-to-low-level feature matching and complementary information fusion (HMCF) network for RefSR. The matching strategy proceeds from high-level to low-level features so that similar objects with different contents can be distinguished according to high-level semantics. The complementary information fusion module uses channel and spatial attention to select complementary information for the LR image and maintains pixel consistency between the input and Ref images. Extensive experiments demonstrate that the proposed HMCF achieves state-of-the-art (SOTA) performance on the RefSR benchmarks and produces high visual quality.
3D object retrieval via range image queries in a bag-of-visual-words context
The Visual Computer - Volume 29, Issue 12 - Pages 1351-1361 - 2013
A method for 3D object retrieval based on range image queries that represent partial views of real 3D objects is presented. The complete 3D models of the database are described by a set of panoramic views, and a Bag-of-Visual-Words model is built using SIFT features extracted from them. To address the problem of partial matching, we suggest a histogram computation scheme on the panoramic views that represents local information by taking spatial context into account. Furthermore, a number of optimization techniques are applied throughout the process to enhance retrieval performance. The superior performance of the method is shown by evaluating it against state-of-the-art methods on standard datasets.
Weakly supervised graph learning for action recognition in untrimmed video
The Visual Computer - Volume 39 - Pages 5469-5483 - 2022
Action recognition in real-world scenarios is a challenging task that involves action localization and classification in untrimmed video. Because untrimmed video in real scenarios lacks fine annotation, existing supervised learning methods have limited effectiveness and robustness. Moreover, state-of-the-art methods treat each action proposal individually, ignoring the semantic relationships between different proposals that arise from the continuity of video. To address these issues, we propose a weakly supervised approach that explores proposal relations using Graph Convolutional Networks (GCNs). Specifically, the method introduces action similarity edges and temporal similarity edges to represent the contextual semantic relationships between different proposals when constructing the graph, and the similarity of action features is used to weakly supervise the spatial semantic relationship between labeled and unlabeled samples to achieve effective recognition of actions in the video. We validate the effectiveness of the proposed method on public benchmarks for untrimmed video (THUMOS14 and ActivityNet). The experimental results demonstrate that the proposed method achieves state-of-the-art results with better robustness and generalization performance.
Detail-driven digital hologram generation
The Visual Computer - - 2009
Digital holography is a technology with the potential to provide realistic 3D images. However, generating digital holograms is computationally demanding, so performance is a major concern. We propose a new method that reduces spatial resolution in order to accelerate hologram generation. It employs propagation between parallel planes for efficient evaluation of optical field values and a computer graphics approach for approximating visibility. Our results show that the proposed reduction has only a minimal impact on visual quality, while the formal computational complexity confirms the performance improvement.
Low-dose CT lung images denoising based on multiscale parallel convolution neural network
The Visual Computer - Volume 37 - Pages 2419-2431 - 2020
The continuous development and wide application of CT in medical practice have raised public concern over the associated radiation dose to the patient. However, reducing the radiation dose may increase noise and artifacts, which may adversely interfere with the judgment and belief of radiologists. Therefore, we propose a low-dose CT denoising model based on a multiscale parallel convolutional neural network to improve the visual effect. Residual learning is used to reduce the difficulty of network learning, and batch normalization is adopted to address the performance degradation caused by increasing the number of network layers. Specifically, we introduce dilated convolution to expand the receptive field by inserting zero weights into the standard convolution kernel, without adding parameters. Furthermore, the multiscale parallel method is used to extract multiscale detail features from lung images. Extensive experiments demonstrate that, compared with traditional methods such as the Wiener filter and NLM and with CNN-based models such as SCNN and DnCNN, our proposed model (CT-ReCNN) not only reduces the noise level of LDCT lung images but also retains more exact information.
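The remark about dilated convolution can be made concrete with a small sketch. Spacing the taps of a k x k kernel `rate` positions apart, with zeros in between, yields an effective kernel of side k + (k - 1)(rate - 1) while the number of learnable weights stays at k * k. The function name `dilate_kernel` and the square-kernel restriction are assumptions for illustration; this is not code from the paper.

```python
def dilate_kernel(kernel, rate):
    """Insert (rate - 1) zeros between the taps of a square 2D kernel.

    kernel is a list of k rows with k weights each; rate is the dilation
    rate (1 returns a kernel equal to the input).
    """
    if rate < 1:
        raise ValueError("rate must be at least 1")
    k = len(kernel)
    if any(len(row) != k for row in kernel):
        raise ValueError("kernel must be square")

    size = k + (k - 1) * (rate - 1)          # effective receptive field side
    dilated = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            # Original weights land on a grid with spacing `rate`.
            dilated[i * rate][j * rate] = kernel[i][j]
    return dilated


if __name__ == "__main__":
    weights = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]
    for row in dilate_kernel(weights, 2):
        print(row)
```

With a 3 x 3 kernel and rate 2, the effective kernel covers a 5 x 5 window but still carries only nine weights, which is the sense in which the receptive field grows without extra parameters.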
Total: 2,897