Latent Mixture of Discriminative Experts (LMDE)
Source
Evernote/Papers/Latent Mixture of Discriminative Experts.md
Summary
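This paper proposes the Latent Mixture of Discriminative Experts (LMDE), a model that automatically learns the temporal relationships among different modalities. A separate expert is trained for each modality, which improves prediction performance even when training data is limited. The model is validated on listener backchannel prediction (head nods and similar feedback), and the results confirm that combining five types of multimodal features (lexical, syntactic, part-of-speech, visual, and prosodic) is important. The paper also introduces a new evaluation metric, User-adaptive Prediction Accuracy, which accounts for individual differences in listener responses, and addresses model interpretability with a sparse feature ranking algorithm based on regularization.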
Key Points
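- LMDE model: automatically learns the temporal relationships among modalities
- Data efficiency: training each modality's expert independently improves performance even with little data
- Application: predicting listener backchannels (head nods)
- Multimodal integration: shows the importance of combining five feature types (lexical, syntactic, part-of-speech, visual, prosodic)
- New evaluation metric: User-adaptive Prediction Accuracy, which reflects individual differences in listener responses
- Model interpretation: a sparse feature ranking algorithm based on regularization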
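This note does not define User-adaptive Prediction Accuracy. The sketch below is one plausible reading: if a listener actually produced n backchannels, take the model's n most confident predictions and measure what fraction of them match. The function name, the per-candidate `scores`/`labels` representation, and the zero-backchannel convention are all assumptions made for illustration, not the paper's definition.

```python
from typing import Sequence


def user_adaptive_accuracy(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Hypothetical sketch of a user-adaptive accuracy.

    scores: model confidence for each candidate backchannel opportunity.
    labels: 1 if this particular listener actually gave a backchannel
            at that opportunity, else 0.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # n adapts to the listener: talkative listeners get a larger budget.
    n = sum(labels)
    if n == 0:
        # Assumed convention: no backchannels means nothing to match.
        return 0.0

    # Indices of the n most confident predictions.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_n = ranked[:n]

    # Fraction of those predictions that coincide with real backchannels.
    hits = sum(labels[i] for i in top_n)
    return hits / n


# Same model output, two listeners with different behavior.
scores = [0.9, 0.2, 0.7, 0.4, 0.1]
listener_a = [1, 0, 0, 1, 0]
listener_b = [1, 0, 1, 0, 0]
print(user_adaptive_accuracy(scores, listener_a))
print(user_adaptive_accuracy(scores, listener_b))
```

Under this reading, the number of predictions being scored is set by each listener's own behavior rather than by a single global threshold, which is how the metric would reflect individual differences.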
Related
- Language-Independent Discriminative Parsing of Temporal Expressions
- Social Event Classification via Boosted Multimodal Supervised Latent Dirichlet Allocation
- Nonlinear Latent Factorization by Embedding Multiple User Interests
- Coordinated Multi-Device Presentations: Ambient-Audio Identification
- Speaker Adaptation of Context Dependent Deep Neural Networks
- Weakly Supervised Learning of Object Segmentations from Web-Scale Video
- Target Language Adaptation of Discriminative Transfer Parsers
- Stock Selection Model Based on Machine Learning with Wisdom of Experts and Crowds
- Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features
- Efficient Estimation of Word Representations in Vector Space
- Improved Domain Adaptation for Statistical Machine Translation
- Feature Ensemble Plus Sample Selection: Domain Adaptation for Sentiment Classification
- Active Learning through Adaptive Heterogeneous Ensembling (AHE)
- Enlisting the Ghost: Modeling Empty Categories for Machine Translation
- Efficient Closed-Form Solution to Generalized Boundary Detection
- Smooth Nonnegative Matrix Factorization for Unsupervised Audiovisual Document Structuring
- Accurate and Compact Large Vocabulary Speech Recognition on Mobile Devices
- A Hamming Embedding Kernel with Informative Bag-of-Visual Words for Video Semantic Indexing
- An Unsupervised Feature Selection Framework for Social Media Data
- Supporting Flexible, Efficient, and User-Interpretable Retrieval of Similar Time Series
- Speech and Natural Language: Where Are We Now And Where Are We Headed
- Fast, Accurate Detection of 100,000 Object Classes on a Single Machine (Technical Supplement)
- Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging
- Moment-Based Spectral Analysis of Large-Scale Networks Using Local Structural Information
- Language Model Verbalization for Automatic Speech Recognition
- Automatic Annotation of Web Database Search Results
- Efficient Inference and Structured Learning for Semantic Role Labeling
- Fast Near-Duplicate Image Detection Using Uniform Randomized Trees
- Measuring the Visual Complexities of Web Pages