ACM Transactions on Intelligent Systems and Technology
2157-6912
2157-6904
United States
Publisher: Association for Computing Machinery (ACM), ASSOC COMPUTING MACHINERY
Featured articles
LIBSVM is a library for Support Vector Machines (SVMs). We have been actively developing this package since the year 2000. The goal is to help users easily apply SVM to their applications. LIBSVM has gained wide popularity in machine learning and many other areas. In this article, we present all implementation details of LIBSVM. Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
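As a rough illustration of the soft-margin optimization problem that LIBSVM solves, the sketch below trains a linear SVM with a Pegasos-style sub-gradient method on toy data. This is not LIBSVM's own algorithm (LIBSVM uses an SMO-type decomposition solver on the dual problem), and the data and parameters here are invented for illustration:

```python
import random

def train_linear_svm(data, labels, lam=0.01, epochs=200, seed=0):
    """Pegasos-style sub-gradient descent for a linear soft-margin SVM.
    A simplified sketch; LIBSVM itself uses an SMO-type dual solver."""
    rng = random.Random(seed)
    dim = len(data[0])
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(len(data)), len(data)):
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            x, y = data[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Sub-gradient of hinge loss plus L2 regularizer
            if margin < 1:
                w = [(1 - eta * lam) * wj + eta * y * xj
                     for wj, xj in zip(w, x)]
            else:
                w = [(1 - eta * lam) * wj for wj in w]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Toy linearly separable data: label is the sign of the first coordinate
X = [[1.0, 0.5], [2.0, -0.3], [1.5, 1.0],
     [-1.0, 0.2], [-2.0, -0.5], [-1.5, 0.8]]
y = [1, 1, 1, -1, -1, -1]
w = train_linear_svm(X, y)
print([predict(w, x) for x in X])
```

On well-separated data like this, the learned direction recovers the separating hyperplane; a kernelized or large-scale problem is where a dedicated solver such as LIBSVM's becomes necessary.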
Advances in location-acquisition and mobile computing techniques have generated massive spatial trajectory data, representing the mobility of a diversity of moving objects such as people, vehicles, and animals. Many techniques have been proposed over the past decade for processing, managing, and mining trajectory data, fostering a broad range of applications. In this article, we conduct a systematic survey of the major research in the field.
Cities are composed of complex systems with physical, cyber, and social components. Current works on extracting and understanding city events mainly rely on technology-enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services, such as traffic, public transport, water supply, weather, sewage, and public safety, as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance-level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over 4 months from the San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social streams for extracting city events.
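The automatic training-data creation described above can be illustrated with a toy sketch: a gazetteer of location and event terms assigns BIO labels to tokens, yielding distantly supervised annotations for a sequence labeler. The term lists below are invented for illustration, not taken from the paper:

```python
# Hypothetical instance-level domain knowledge (gazetteers)
LOCATIONS = {"bay", "bridge", "mission"}          # example location terms
EVENT_TERMS = {"accident", "delay", "flooding"}   # example event terms

def bio_annotate(tokens):
    """Tag each token with a BIO label using simple gazetteer lookup."""
    labels = []
    for tok in tokens:
        low = tok.lower()
        if low in LOCATIONS:
            labels.append("B-LOC")
        elif low in EVENT_TERMS:
            labels.append("B-EVENT")
        else:
            labels.append("O")
    return labels

tweet = "Accident on the bridge causing delay".split()
print(bio_annotate(tweet))
# → ['B-EVENT', 'O', 'O', 'B-LOC', 'O', 'B-EVENT']
```

Annotations produced this way can then serve as training data for a statistical sequence labeling model, replacing manual labeling.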
Linear discriminant analysis (LDA) is a popular technique to learn the most discriminative features for multi-class classification. The vast majority of existing LDA algorithms are prone to being dominated by a class with very large deviation from the others, i.e., an edge class, which occurs frequently in multi-class classification. First, the existence of edge classes often makes the total mean biased in the calculation of the between-class scatter matrix. Second, the ℓ2-norm-based between-class distance criterion magnifies the extremely large distances corresponding to edge classes. In this regard, a novel self-weighted robust LDA with an ℓ2,1-norm-based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification, especially with edge classes. SWRLDA can automatically avoid the optimal mean calculation and simultaneously learn adaptive weights for each class pair without setting any additional parameter. An efficient re-weighted algorithm is exploited to derive the global optimum of the challenging ℓ2,1-norm maximization problem. The proposed SWRLDA is easy to implement and converges fast in practice. Extensive experiments demonstrate that SWRLDA performs favorably against the compared methods on both synthetic and real-world datasets while presenting superior computational efficiency.
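For context, classical two-class Fisher LDA (not SWRLDA itself) can be sketched in pure Python: the projection direction w = Sw^-1 (m1 - m2) combines the within-class scatter Sw with the class means, which is exactly the mean-and-scatter machinery the abstract refers to. The 2-D data below are illustrative:

```python
def mean(X):
    """Component-wise mean of a list of 2-D points."""
    n = len(X)
    return [sum(x[j] for x in X) / n for j in range(len(X[0]))]

def scatter(X, m):
    """2x2 scatter matrix of X around mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for x in X:
        d = [x[0] - m[0], x[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(X1, X2):
    """w = Sw^-1 (m1 - m2), with Sw the pooled within-class scatter."""
    m1, m2 = mean(X1), mean(X2)
    s1, s2 = scatter(X1, m1), scatter(X2, m2)
    Sw = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = Sw[0][0] * Sw[1][1] - Sw[0][1] * Sw[1][0]
    inv = [[Sw[1][1] / det, -Sw[0][1] / det],
           [-Sw[1][0] / det, Sw[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

X1 = [[2.0, 2.1], [2.5, 2.4], [3.0, 2.9]]   # class 1 (illustrative)
X2 = [[0.0, 0.2], [0.5, 0.4], [1.0, 0.8]]   # class 2 (illustrative)
w = fisher_direction(X1, X2)
# Projected onto w, the two classes separate cleanly
p1 = [w[0] * x[0] + w[1] * x[1] for x in X1]
p2 = [w[0] * x[0] + w[1] * x[1] for x in X2]
print(min(p1) > max(p2))
```

SWRLDA's contribution is to replace the total-mean ℓ2 criterion in this construction with a self-weighted ℓ2,1 pairwise criterion so that no single edge class dominates the objective.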
The discovery of Markov blankets (MB) for feature selection has attracted much attention in recent years, since the MB of the class attribute is the optimal feature subset for feature selection. However, almost all existing MB discovery algorithms focus on either improving computational efficiency or boosting learning accuracy, instead of both. In this article, we propose a novel MB discovery algorithm for balancing efficiency and accuracy, called BAlanced Markov Blanket (BAMB) discovery. To achieve this goal, given a class attribute of interest, BAMB finds candidate PC (parents and children) and spouses and removes false positives from the candidate MB set in one go. Specifically, once a feature is successfully added to the current PC set, BAMB finds the spouses with regard to this feature, then uses the updated PC and spouse sets to remove false positives from the current MB set. This keeps the PC and spouse sets of the target as small as possible and thus achieves a trade-off between computational efficiency and learning accuracy. In the experiments, we first compare BAMB with 8 state-of-the-art MB discovery algorithms on 7 benchmark Bayesian networks; then we use 10 real-world datasets and compare BAMB with 12 feature selection algorithms, including 8 state-of-the-art MB discovery algorithms and 4 other well-established feature selection methods. On prediction accuracy, BAMB outperforms the 12 compared feature selection algorithms. On computational efficiency, BAMB is close to the IAMB algorithm and much faster than the remaining seven MB discovery algorithms.
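MB discovery algorithms such as BAMB and IAMB rest on conditional independence tests. The sketch below is not BAMB itself; it only illustrates the underlying test by estimating empirical (conditional) mutual information on a toy chain A → B → T, where B screens T off from A and therefore forms T's Markov blanket in this tiny network:

```python
import math
import random
from collections import Counter, defaultdict

def mutual_info(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log(c * n / (cx[x] * cy[y]))
               for (x, y), c in cxy.items())

def cond_mutual_info(xs, ys, zs):
    """I(X;Y|Z) = sum_z p(z) * I(X;Y | Z=z), estimated from samples."""
    groups = defaultdict(list)
    for x, y, z in zip(xs, ys, zs):
        groups[z].append((x, y))
    n = len(xs)
    return sum(len(p) / n * mutual_info([a for a, _ in p], [b for _, b in p])
               for p in groups.values())

# Toy chain A -> B -> T with 10% noise at each step:
# A and T are marginally dependent but (nearly) independent given B.
rng = random.Random(0)
A = [rng.randint(0, 1) for _ in range(5000)]
B = [a if rng.random() < 0.9 else 1 - a for a in A]
T = [b if rng.random() < 0.9 else 1 - b for b in B]
print(mutual_info(A, T))          # clearly positive
print(cond_mutual_info(A, T, B))  # close to zero
```

A forward-backward MB algorithm repeatedly runs tests like these: features dependent on the target enter the candidate set, and features rendered independent by the current candidates are removed as false positives.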
Personalized medication recommendation aims to suggest a set of medications based on the clinical conditions of a patient. Not only should the patient's diagnosis, procedure, and medication history be considered, but drug-drug interactions (DDIs) must also be taken into account to prevent adverse drug reactions. Although recent studies on medication recommendation have considered DDIs and patient history, personalized disease progression and prescription have not been explicitly modeled. In this work, we propose FastRx, a Fastformer-based medication recommendation model that captures longitudinality in patient history, combined with Graph Convolutional Networks (GCNs) to handle DDIs and co-prescribed medications in Electronic Health Records (EHRs). Our extensive experiments on the MIMIC-III dataset demonstrate the superior performance of the proposed FastRx over existing state-of-the-art models for medication recommendation. The source code and data used in the experiments are available at
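The DDI constraint can be illustrated with a minimal post-processing sketch. This is a common baseline idea, not FastRx's actual mechanism: ranked drug recommendations are kept greedily unless they interact with an already chosen drug. The interaction pair below is a standard textbook example used purely for illustration:

```python
# Hypothetical DDI knowledge: unordered pairs of interacting drugs
DDI = {("warfarin", "aspirin")}

def filter_ddis(ranked_meds, ddi_pairs):
    """Greedily keep recommended drugs that do not interact
    with any drug already selected."""
    chosen = []
    for m in ranked_meds:
        if all((m, c) not in ddi_pairs and (c, m) not in ddi_pairs
               for c in chosen):
            chosen.append(m)
    return chosen

print(filter_ddis(["warfarin", "aspirin", "metformin"], DDI))
# → ['warfarin', 'metformin']
```

Models like the one described above instead learn to avoid such combinations during ranking, e.g., by propagating interaction structure through a GCN over the DDI graph, rather than filtering after the fact.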