
We Build Robust AI Products

Work with us

RaonData wants to work with great companies that aim to change the world through AI technology. If you would like to collaborate, contact us anytime.

Our Featured Projects

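Building on AI technology developed through its own research and development, RaonData has carried out projects in a range of domains, including entertainment, education, and healthcare.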

Amazing Publications

D-Vlog: Multimodal Vlog Dataset for Depression Detection

Detecting depression based on non-verbal behaviors has received great attention. However, most prior work has focused on detecting depressed individuals in laboratory settings, which is difficult to generalize in practice. In addition, little attention has been paid to analyzing the non-verbal behaviors of depressed individuals in the wild. Therefore, in this paper, we present a multimodal depression dataset, D-Vlog, which consists of 961 vlogs (i.e., around 160 hours) collected from YouTube and can be used to develop depression detection models based on the non-verbal behavior of individuals in real-world scenarios. We develop a multimodal deep learning model that uses acoustic and visual features extracted from the collected data to detect depression. Our proposed model employs the cross-attention mechanism to effectively capture the relationship across acoustic and visual features, and generates useful multimodal representations for depression detection. The extensive experimental results demonstrate that the proposed model significantly outperforms other baseline models. We believe our dataset and the proposed model are useful for analyzing and detecting depressed individuals based on non-verbal behavior.
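As a rough illustration of the cross-attention idea mentioned in the abstract above, the following minimal sketch lets each acoustic time step attend over the visual time steps using scaled dot-product attention. It is not the published model: the learned projection matrices are omitted, and the function names and toy feature values are assumptions made only for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries come from one modality,
    keys and values from the other. No learned projections (assumption)."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors, one output dimension at a time.
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs

# Toy features (hypothetical): 3 acoustic time steps, 2 visual time steps.
acoustic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visual = [[0.5, 0.5], [1.0, 0.0]]

# Each acoustic step becomes a mixture of visual features.
attended = cross_attention(acoustic, visual, visual)
print(attended)
```

Each output row is a weighted average of the visual vectors, so a row tells how strongly a given acoustic step "looks at" each visual step. A real model would typically also attend in the opposite direction and learn the projections.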

Optical coherence tomography-based deep-learning model for detecting central serous chorioretinopathy

Central serous chorioretinopathy (CSC) is a common condition characterized by serous detachment of the neurosensory retina at the posterior pole. We built a deep learning model to diagnose CSC and to distinguish chronic from acute CSC using spectral domain optical coherence tomography (SD-OCT) images. Data from SD-OCT images of patients with CSC and a control group were analyzed with a convolutional neural network. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUROC) were used to evaluate the model. For CSC diagnosis, our model showed an accuracy, sensitivity, and specificity of 93.8%, 90.0%, and 99.1%, respectively; AUROC was 98.9% (95% CI, 0.983–0.995); and its diagnostic performance was comparable with that of VGG-16, ResNet-50, and five ophthalmologists. For distinguishing chronic from acute cases, the accuracy, sensitivity, and specificity were 97.6%, 100.0%, and 92.6%, respectively; AUROC was 99.4% (95% CI, 0.985–1.000); performance was better than that of VGG-16 and ResNet-50, and was as good as that of the ophthalmologists. Our model performed well when diagnosing CSC and yielded highly accurate results when distinguishing between acute and chronic cases. Thus, automated deep learning algorithms could play a role independent of human experts in the diagnosis of CSC.
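For readers unfamiliar with the evaluation metrics named in the abstract above, here is a minimal sketch of how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and accuracy from
    true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased cases detected
        "specificity": tn / (tn + fp),  # share of healthy cases cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```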

Predicting Emotional Intensity in Political Debates via Non-verbal Signals

Non-verbal expressions of politicians are important in elections. In particular, the emotional intensity a politician reveals in a debate can be strongly linked to voters’ evaluation. This paper proposes a multimodal deep-learning model for predicting the perceived emotional intensity of a candidate, which uses voice, face, and gesture to capture comprehensive information about the emotional intensity revealed in a debate. We collect a dataset of political debate videos from the 2020 Democratic presidential primaries in the USA, and train the proposed model with randomly sampled clips from the debate videos. By applying the proposed model to 23 candidates in 11 debate videos, we show that the standard deviation of the perceived emotional intensity is positively correlated with the changes in candidates’ favorability in public polls.
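The final analysis step described above (relating the spread of a candidate's emotional intensity to a change in favorability) can be sketched as a standard deviation per candidate followed by a Pearson correlation. All candidate labels and numbers below are made up for illustration, and the exact statistical procedure used in the paper is not stated here, so treat this as one plausible reading.

```python
import math
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Hypothetical per-clip intensity predictions for three candidates.
intensity = {
    "A": [0.2, 0.8, 0.5],
    "B": [0.5, 0.5, 0.6],
    "C": [0.1, 0.9, 0.9],
}
# Hypothetical change in poll favorability (percentage points).
favorability_change = {"A": 1.0, "B": -0.5, "C": 2.0}

names = sorted(intensity)
spread = [statistics.stdev(intensity[n]) for n in names]
change = [favorability_change[n] for n in names]

r = pearson(spread, change)
print(f"correlation between intensity spread and favorability change: {r:.3f}")
```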

Distinguishing retinal angiomatous proliferation from polypoidal choroidal vasculopathy with a deep neural network based on optical coherence tomography

This cross-sectional study aimed to build a deep learning model for detecting neovascular age-related macular degeneration (AMD) and to distinguish retinal angiomatous proliferation (RAP) from polypoidal choroidal vasculopathy (PCV) using a convolutional neural network (CNN). Patients from a single tertiary center were enrolled from January 2014 to January 2020. Spectral-domain optical coherence tomography (SD-OCT) images of patients with RAP or PCV and a control group were analyzed with a deep CNN. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUROC) were used to evaluate the model’s ability to distinguish RAP from PCV. The performances of the new model, VGG-16, ResNet-50, Inception-V3, and eight ophthalmologists were compared. A total of 3951 SD-OCT images from 314 participants (229 AMD, 85 normal controls) were analyzed. In distinguishing the PCV and RAP cases, the proposed model showed an accuracy, sensitivity, and specificity of 89.1%, 89.4%, and 88.8%, respectively, with an AUROC of 95.3% (95% CI 0.727–0.852). The proposed model showed better diagnostic performance than VGG-16, ResNet-50, and Inception-V3 and comparable performance with the eight ophthalmologists. The novel model performed well when distinguishing between PCV and RAP. Thus, automated deep learning systems may support ophthalmologists in distinguishing RAP from PCV.

VCTUBE: A Library for Automatic Speech Data Annotation

We introduce an open-source Python library, VCTUBE, which can automatically generate <audio, text> pairs of speech data from a given YouTube URL. We believe VCTUBE is useful for easily collecting, processing, and annotating speech data for developing speech synthesis systems.

“Why tag me?”: Detecting motivations of comment tagging in Instagram

Tagging a friend in a comment is one of the main mechanisms for driving user interaction in social media. This paper investigates the current practice of user tagging in Instagram by collecting large-scale data that includes 9K uploaded posts and 4M associated comments shared by 3M users. Our analysis reveals that 54.8% of the comments contain user tagging, meaning that user tagging is widely used in Instagram. By analyzing the comment texts, we observe that comments with user tagging tend to have more social and fewer negative words than those without user tagging, suggesting that user tagging is often used for friendly conversations. Based on the lessons learned, we propose a learning-based model to classify the motivation of user tagging into one of the following categories: information-, relationship-, and discussion-oriented motivation. The proposed model achieves a high F1-score of 83.72% in identifying the motivations for user tagging, which can provide considerable insights into user responses. We then apply our classification model to the user tagging comments in our dataset, and find that 44.08%, 47.74%, and 8.18% of comments are information-, relationship-, and discussion-oriented comments, respectively, which reveals that user tagging is frequently used to socialize with friends.

Classifying central serous chorioretinopathy subtypes with a deep neural network using optical coherence tomography images: a cross-sectional study

Central serous chorioretinopathy (CSC) is the fourth most common retinopathy and can reduce quality of life. CSC is assessed using optical coherence tomography (OCT), but deep learning systems have not been used to classify CSC subtypes. This study aimed to build a deep learning model to distinguish CSC subtypes using a convolutional neural network (CNN). We enrolled 435 patients with CSC from a single tertiary center between January 2015 and January 2020. Data from spectral domain OCT (SD-OCT) images of the patients were analyzed using a deep CNN. Five-fold cross-validation was employed to evaluate the model’s ability to discriminate acute, non-resolving, inactive, and chronic atrophic CSC. We compared the performances of the proposed model, ResNet-50, Inception-V3, and eight ophthalmologists. Overall, 3209 SD-OCT images were included. The proposed model showed an average cross-validation accuracy of 70.0% (95% confidence interval [CI], 0.676–0.718) and the highest test accuracy was 73.5%. Additional evaluation in an independent set of 104 patients demonstrated the reliable performance of the proposed model (accuracy: 76.8%). Our model could classify CSC subtypes with high accuracy. Thus, automated deep learning systems could be useful in the classification and management of CSC.
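The five-fold cross-validation mentioned above can be sketched as follows: the data is cut into five folds, and each fold serves once as the test set while the remaining four are used for training. This is a generic illustration with a hypothetical helper name; whether the study split by patient or by image is not stated here, and the sketch simply splits sample indices.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of k contiguous folds.
    Fold sizes differ by at most one when n_samples is not divisible by k."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Toy example with 12 samples.
folds = list(k_fold_indices(n_samples=12, k=5))
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: test={test}")
```

The reported cross-validation accuracy would then be the mean of the five per-fold test accuracies.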

Assessing central serous chorioretinopathy with deep learning and multiple optical coherence tomography images

Central serous chorioretinopathy (CSC) is one of the most common macular diseases and can reduce patients' quality of life. This study aimed to build a deep learning-based classification model that uses multiple spectral domain optical coherence tomography (SD-OCT) images together to diagnose CSC. Our proposed system contains two modules: a single-image prediction (SIP) module and a final decision (FD) classifier. A total of 7425 SD-OCT images from 297 participants (109 acute CSC, 106 chronic CSC, 82 normal) were included. In the five-fold cross-validation test, our model showed an average accuracy of 94.2%. Compared to other end-to-end models, for example, a 3D convolutional neural network (CNN) model and a CNN-long short-term memory (CNN-LSTM) model, the proposed system showed more than 10% higher accuracy. In the experiments comparing the proposed model and ophthalmologists, our model showed higher accuracy than experts in distinguishing between acute, chronic, and normal cases. Our results show that an automated deep learning-based model could play a supplementary role alongside ophthalmologists in the diagnosis and management of CSC. In particular, the proposed model seems clinically applicable because it can classify CSCs using multiple OCT images simultaneously.
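To make the two-module design above concrete, here is a minimal sketch in which per-image class probabilities (standing in for the SIP outputs) are combined into one patient-level decision. The abstract describes the FD module as a classifier; this sketch swaps in plain probability averaging as a stand-in, and the function name and probability values are assumptions for illustration.

```python
def final_decision(per_image_probs, labels):
    """Average per-image class probabilities and return the top label.
    Simple averaging is a stand-in for a learned final-decision classifier."""
    n_classes = len(labels)
    mean_probs = [
        sum(probs[c] for probs in per_image_probs) / len(per_image_probs)
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: mean_probs[c])
    return labels[best], mean_probs

labels = ["acute CSC", "chronic CSC", "normal"]

# Hypothetical single-image predictions for three scans of one patient.
sip_outputs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.3, 0.6, 0.1],
]

decision, mean_probs = final_decision(sip_outputs, labels)
print(decision, mean_probs)
```

The point of the second stage is that a single ambiguous scan (here the third one) does not decide the outcome on its own.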

Classifying neovascular age-related macular degeneration with a deep convolutional neural network based on optical coherence tomography images

Neovascular age-related macular degeneration (nAMD) is among the main causes of visual impairment worldwide. We built a deep learning model to distinguish the subtypes of nAMD using spectral domain optical coherence tomography (SD-OCT) images. Data from SD-OCT images of patients with nAMD (polypoidal choroidal vasculopathy, retinal angiomatous proliferation, and typical nAMD) and healthy controls were analyzed using a convolutional neural network (CNN). The model was trained and validated on 4749 SD-OCT images from 347 patients and 50 healthy controls. To adopt an accurate and robust image classification architecture, we evaluated three well-known CNN structures (VGG-16, VGG-19, and ResNet) and two customized classification layers (fully connected layer with dropout vs. global average pooling). The model with the highest classification accuracy on the test set was used. Transfer learning and data augmentation were applied to improve the robustness and accuracy of the model. Our proposed model showed an accuracy of 87.4% on the test data (920 images), scoring higher than ten ophthalmologists on the same data. Additionally, Grad-CAM images confirmed which regions our model judged to be important in classification, showing that its judgment criteria are similar to those of ophthalmologists. Thus, we believe that our model can be used as an auxiliary tool in clinical practice.

Meet Our Team

Seong Choi

CEO

Jeewoo Yoon

CTO

Kiwoong Kim

Lead Engineer

Yongmo Ku

AI Engineer

Junseo Ko

AI Engineer

Deoghyeon Ga

AI Engineer

Taihu Li

AI Engineer

Daehwan Park

AI Engineer

Yejin Kim

Administrative Manager

Jinyoung Han

Technical Advisor

Trusted by Awesome Clients

Companies interested in a great collaboration with RaonData are welcome to contact us.

Launch Your Project with Us
