VNU Journal of Science: Computer Science and Communication Engineering

Featured scientific publications

* Data are for reference only

VLSP 2021-ViMRC Challenge: Vietnamese Machine Reading Comprehension
Nguyen Luu Thuy Ngan, Huynh Van Tin, Nguyen Van Kiet, Nguyen Thanh Luan, Luu Thanh Son
Machine reading comprehension (MRC), the task of finding answers to human questions in textual data, is an emerging research trend in natural language understanding. Although many datasets and methods have been developed for English and Chinese, such resources are lacking for Vietnamese, and many limitations of Vietnamese MRC remain to be addressed. Existing Vietnamese MRC datasets concentrate solely on answerable questions. In reality, however, a question can be unanswerable when the correct answer is not stated in the given text. To address this weakness, we provide the research community with a benchmark dataset named UIT-ViQuAD 2.0 for evaluating MRC and question answering systems for Vietnamese. We use UIT-ViQuAD 2.0 as the benchmark dataset for the shared task on Vietnamese machine reading comprehension (VLSP2021-MRC) at the Eighth Workshop on Vietnamese Language and Speech Processing (VLSP 2021). The task attracted 77 participant teams from 34 universities and other organizations. Each participant was provided with training data comprising 28,457 annotated question-answer pairs and returned results on a public test set of 3,821 questions and a private test set of 3,712 questions. In this article, we present the organization of the shared task, an overview of the methods employed by participants, and the results. The highest performances in this competition are 77.24% in F1-score and 67.43% in EM on the private test set. The systems proposed by the top 3 teams use XLM-RoBERTa, a powerful pre-trained language model based on the transformer architecture that has achieved state-of-the-art results on many natural language processing tasks. We believe that releasing the UIT-ViQuAD 2.0 dataset will motivate more researchers to improve Vietnamese machine reading comprehension.
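The abstract reports EM and F1-score without defining them. As a rough illustration, the sketch below computes both for a single question in the usual extractive-QA style: EM checks for an identical answer string, and F1 measures token overlap. The whitespace tokenization, the lowercasing, and the convention of treating an unanswerable question as an empty answer are assumptions made here for illustration; the shared task's official scoring script may normalize answers differently.

```python
from collections import Counter


def exact_match(prediction, gold):
    """Return 1.0 if the lowercased, whitespace-split answers are identical."""
    return float(prediction.lower().split() == gold.lower().split())


def token_f1(prediction, gold):
    """Token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Assumed convention: an unanswerable question has an empty answer.
    # Both empty scores 1.0; exactly one empty scores 0.0.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

A system-level score would then be the mean of these per-question values over the test set.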
Towards Model-Checking Probabilistic Timed Automata against Probabilistic Duration Properties
Miaomiao Zhang, Hung Van Dang
In this paper, we consider a subclass of Probabilistic Duration Calculus formulas called Simple Probabilistic Duration Calculus (SPDC) as a language for specifying dependability requirements for real-time systems, and address two problems: deciding whether a probabilistic timed automaton satisfies an SPDC formula, and deciding whether there exists a strategy of a probabilistic timed automaton that satisfies an SPDC formula. We prove that both problems are decidable for a class of SPDC called probabilistic linear duration invariants, and provide model checking algorithms for solving these problems.
A 8x1 Sprout-Shaped Antenna Array with Low Sidelobe Level of -25 dB
Nguyen Minh Tran, Tang The Toan, Truong Vu Bang Giang
This paper proposes an 8 x 1 sprout-shaped antenna array with low sidelobe level (SLL) for outdoor point-to-point applications. The array has dimensions of 165 mm x 195 mm x 1.575 mm and is designed on Rogers RT/Duroid 5870™ with a thickness of 1.575 mm and a permittivity of 2.33. To achieve low SLL, Chebyshev distribution weights corresponding to an SLL preset at -30 dB have been applied to the design of the array feed. Unequal T-junction dividers have been used to ensure that the output powers are proportional to the Chebyshev amplitude distribution. A reflector has been added to the back of the antenna to improve the directivity. The simulated results show that the proposed array can work at 4.95 GHz with a bandwidth of 185 MHz. Moreover, it can provide a gain of up to 12.9 dBi with the SLL suppressed to -25 dB. A prototype has also been fabricated and measured, and good agreement between simulation and measurement has been obtained. These results indicate that the array is a good candidate for point-to-point communications.
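To make the Chebyshev amplitude distribution concrete, the sketch below computes textbook Dolph-Chebyshev weights for an 8-element uniformly spaced array with a -30 dB sidelobe preset, then checks the resulting ideal array factor. It is a generic synthesis under idealized assumptions (isotropic elements, no mutual coupling), not the paper's feed network design; the function names are hypothetical.

```python
import math


def chebyshev_poly(order, x):
    """Evaluate the Chebyshev polynomial T_order(x) for any real x."""
    if abs(x) <= 1.0:
        return math.cos(order * math.acos(x))
    value = math.cosh(order * math.acosh(abs(x)))
    # T_n(-x) = (-1)^n * T_n(x)
    return value if x > 0 or order % 2 == 0 else -value


def dolph_chebyshev_weights(num_elements, sll_db):
    """Amplitude weights (peak-normalized) with equal sidelobes at -sll_db."""
    order = num_elements - 1
    ratio = 10.0 ** (sll_db / 20.0)  # main-lobe to sidelobe voltage ratio
    x0 = math.cosh(math.acosh(ratio) / order)
    weights = []
    for m in range(num_elements):
        pos = m - order / 2.0  # element index centered on the array middle
        # Inverse DFT of the sampled Chebyshev pattern; the imaginary
        # parts cancel, so only the cosine terms are summed.
        total = sum(
            chebyshev_poly(order, x0 * math.cos(math.pi * k / num_elements))
            * math.cos(2.0 * math.pi * k * pos / num_elements)
            for k in range(num_elements)
        )
        weights.append(total / num_elements)
    peak = max(weights)
    return [w / peak for w in weights]


def peak_sidelobe_db(weights, samples=20000):
    """Highest sidelobe of the ideal array factor, in dB below the main lobe."""
    center = (len(weights) - 1) / 2.0

    def af(psi):
        return abs(sum(w * math.cos((m - center) * psi)
                       for m, w in enumerate(weights)))

    values = [af(math.pi * i / samples) for i in range(samples + 1)]
    i = 0
    while i < samples and values[i + 1] < values[i]:  # walk down the main lobe
        i += 1
    return 20.0 * math.log10(max(values[i:]) / values[0])


weights = dolph_chebyshev_weights(8, 30.0)
```

The weights taper symmetrically from the center elements to the edges. In a real feed, divider tolerances and coupling typically raise the achieved sidelobes above the preset, which is consistent with designing for -30 dB and reporting -25 dB.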
Hyper-volume Evolutionary Algorithm
Khoi Nguyen Le
We propose a multi-objective evolutionary algorithm (MOEA) named the Hyper-volume Evolutionary Algorithm (HVEA). The algorithm is characterised by three components. First, individual fitness evaluation depends on the current Pareto front, specifically on the ratio of the individual's dominated hyper-volume to the hyper-volume of the current Pareto front, which indicates how close the individual is to the current Pareto front. Second, a ranking strategy classifies individuals based on their fitness instead of Pareto dominance, so individuals within the same rank are not guaranteed to be mutually non-dominated. Third, a crowding assignment mechanism adapts according to the individual's neighbouring area, controlled by the neighbouring area radius parameter, and the archive of non-dominated solutions. We perform extensive experiments on the multiple 0/1 knapsack problem using different greedy repair methods to compare the performance of HVEA with other MOEAs including NSGA2, SEAMO2, SPEA2, IBEA and MOEA/D. This paper shows that by tuning the neighbouring area radius parameter, the performance of the proposed HVEA can be pushed towards better convergence, diversity or coverage, which could be beneficial to different types of problems.
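The hyper-volume ratio used for fitness can be illustrated in two dimensions. The sketch below assumes two maximization objectives (as in the knapsack setting) and a reference point at the origin. The function names and the exact form of the ratio are illustrative assumptions, not HVEA's published definition.

```python
def front_hypervolume(front, ref=(0.0, 0.0)):
    """Area dominated by a set of 2-D maximization points, relative to ref."""
    area, best_f2 = 0.0, ref[1]
    # Sweep from the largest first objective to the smallest; each point
    # adds only the strip above the best second objective seen so far.
    for f1, f2 in sorted(front, reverse=True):
        if f2 > best_f2:
            area += (f1 - ref[0]) * (f2 - best_f2)
            best_f2 = f2
    return area


def hv_fitness(individual, front, ref=(0.0, 0.0)):
    """Ratio of the individual's own dominated area to the front's area."""
    own = (individual[0] - ref[0]) * (individual[1] - ref[1])
    return own / front_hypervolume(front, ref)


front = [(3.0, 1.0), (2.0, 2.0), (1.0, 3.0)]
on_front = hv_fitness((2.0, 2.0), front)      # a member of the front
far_behind = hv_fitness((1.0, 1.0), front)    # a dominated individual
```

Under these assumptions, an individual close to the front receives a larger ratio than one far behind it, which is the behaviour the abstract describes.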
Coreference Resolution in Vietnamese Electronic Medical Records
Hung D. Nguyen
Electronic medical records (EMR) have emerged as an important source of data for research in medicine and information technology, as they contain much valuable human medical knowledge in healthcare and patient treatment. This paper tackles the problem of coreference resolution in Vietnamese EMRs. Unlike in English ones, in Vietnamese clinical texts, verbs are often used to describe disease symptoms. We therefore first define rules to annotate verbs as mentions and allow coreference between verbs and other noun or adjective mentions. Then we propose a support vector machine classifier on a bag-of-words vector representation of mentions that takes into account the special characteristics of the Vietnamese language to resolve their coreference. The achieved F1 score on our dataset of real Vietnamese EMRs provided by a hospital in Ho Chi Minh City is 91.4%. To the best of our knowledge, this is the first research work on coreference resolution in Vietnamese clinical texts. Keywords: Clinical text, support vector machine, bag-of-words vector, lexical similarity, unrestricted coreference
Robustify Hand Tracking by Fusing Generative and Discriminative Methods
With the development of virtual reality (VR) technology and its applications in many fields, creating simulated hands in the virtual environment is an effective way to replace the controller and to enhance user experience in interactive processes. The hand tracking problem is therefore gaining a lot of research attention, making an important contribution to recognizing hand postures and tracking hand motions for VR input or human-machine interaction applications. To create a markerless real-time hand tracking system suitable for natural human-machine interaction, we propose a new method that combines generative and discriminative methods to solve the hand tracking problem using a single RGBD camera. Our system removes the requirement that the user wear a colored wrist band and makes hand localization more robust even in difficult tracking scenarios. Keywords: Hand tracking, generative method, discriminative method, human performance capture
A Comprehensive Study of Adaptive LNA Nonlinearity Compensation Methods in Direct RF Sampling Receivers
Hai Nam Le, Vu Ngoc Anh
This paper studies the effects of the nonlinear distortion of the Low Noise Amplifier (LNA) in the multichannel direct-RF sampling receiver (DRF). The main focus of our work is to study and compare the effectiveness of different adaptive compensation algorithms, including the inverse-based and subtract-based Least Mean Square (LMS) algorithms with fixed and variable step sizes. The models for the compensation circuits have been analytically derived. As the major improvements, we evaluate the effectiveness of the compensation circuits under ADC quantization noise and calculate the bit-error-rates (BER) in dynamic signal-to-noise ratio (SNR) scenarios. We propose the use of variable step-size LMS (VLMS) to shorten the convergence time and to improve the compensation effect in general. To evaluate and compare the compensation methods, a complex Matlab model of the Ultra High Frequency (UHF) DRF with 4-QPSK channels was implemented. The simulation results show that all compensation methods significantly improve the receiver performance compared with the non-compensated results: the convergence time of the VLMS algorithm does not exceed 5 × 10^4 samples, the adjacent channel power ratios (ACPR) are reduced by more than 30 dBc, and the BERs decrease by 2–3 orders of magnitude. The simulation results also indicate that the subtraction method in general performs better than the inversion method.
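The idea of a variable step-size LMS update can be shown on a toy problem. The sketch below estimates a single third-order coefficient of a memoryless nonlinearity, given the undistorted input as a reference. That reference is a strong simplification, since a receiver would not have direct access to it. The step-size rule (scale the previous step and add a term proportional to the squared error, then clip) is one common variable step-size scheme chosen here as an assumption; the paper's VLMS rule, signal model, and parameter values are not given in the abstract, and all names and constants below are hypothetical.

```python
import math


def vss_lms_cubic(x, d, mu_init=1.0, mu_min=0.01, mu_max=1.0,
                  alpha=0.97, gamma=0.5):
    """Estimate w in d = x + w * x**3 with a variable step-size LMS update."""
    w, mu = 0.0, mu_init
    for x_n, d_n in zip(x, d):
        basis = x_n ** 3
        # Residual left after removing the currently estimated distortion.
        error = d_n - (x_n + w * basis)
        # Standard LMS coefficient update along the cubic basis term.
        w += mu * error * basis
        # Step size grows with the squared error and decays otherwise,
        # clipped to [mu_min, mu_max] to keep the update stable.
        mu = min(mu_max, max(mu_min, alpha * mu + gamma * error ** 2))
    return w


a3 = -0.1  # hypothetical third-order coefficient of the amplifier model
x = [math.sin(0.1 * n) for n in range(5000)]   # test tone with |x| <= 1
d = [v + a3 * v ** 3 for v in x]               # distorted amplifier output
w_hat = vss_lms_cubic(x, d)
```

Once `w_hat` is available, a subtract-style compensator in this toy setting would remove `w_hat * x**3` from the distorted signal. The large initial step gives fast early convergence, and the decaying step reduces steady-state jitter, which is the usual motivation for variable step sizes.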
PHY-MAC Cross-Layer Cooperative Protocol Supporting Physical-Layer Network Coding
Cooperative communication is known as an effective solution for dealing with channel fading and improving network performance. Further, by combining the cooperative relaying technique with physical-layer network coding (PNC), cooperative networks can gain additional improvements in throughput and network resource utilization. To leverage these benefits, in this paper we propose a PHY-MAC cross-layer cooperative protocol that supports PNC for multi-rate cooperative wireless networks with bidirectional traffic. The design objective of the proposed protocol is to increase the transmission reliability, throughput, and energy efficiency, and to reduce the transmission delay. Simulation results show that the proposed protocol outperforms the previous cooperative protocol as well as the traditional protocol in terms of network performance.
Abbreviation Detection in Vietnamese Clinical Texts
Chau Vo, Tru Cao, Bao Ho
Abbreviations are widely used in clinical notes because the notes are often written under high pressure, with little writing time and a need to simplify medical records. These abbreviations limit the clarity and understanding of the records and greatly affect all computer-based data processing tasks. In this paper, we propose a solution to the abbreviation identification task on clinical notes in a practical context where only a few clinical notes have been labeled while many more remain to be labeled. Our solution is a semi-supervised learning approach that uses level-wise feature engineering to construct an abbreviation identifier, using a small set of labeled clinical texts and exploiting a larger set of unlabeled clinical texts. A semi-supervised learning algorithm, Semi-RF, and its advanced adaptive version, Weighted Semi-RF, are proposed in the self-training framework using random forest models and Tri-training. Weighted Semi-RF differs from Semi-RF in that it is equipped with a new weighting scheme via adaptation on the current labeled data set. The proposed semi-supervised learning algorithms are practical, with parameter-free settings, for building an effective abbreviation identifier that identifies abbreviations automatically in clinical texts. Their effectiveness is confirmed by better Precision and F-measure values in various experiments on real Vietnamese clinical notes. Compared with existing solutions, our solution is novel for automatic abbreviation identification in clinical notes. Its results can lay the basis for determining the full form of each correctly identified abbreviation and thereby enhance the readability of the records. Keywords: Electronic medical record, Clinical note, Abbreviation identification, Semi-supervised learning, Self-training, Random forest.
Total: 133