IIPP - mentor

National Chung-Hsing University

Intelligent System

Hung-Hsu Tsai

https://sites.google.com/view/thhweb

Research Field

Smart Computing (Information)

Introduction

Hung-Hsu Tsai received the BS and MS degrees in applied mathematics from National Chung Hsing University, Taichung, Taiwan, in 1986 and 1988, respectively, and the PhD degree in computer science and information engineering from National Chung Cheng University, Chiayi, Taiwan, in 1999. He has been the Vice Dean of the College of Science since Feb. 2024 and is a professor at the Graduate Institute of Data Science and Information Computing / Department of Applied Mathematics, National Chung Hsing University, Taichung, Taiwan. He served at the Department of Information Management, National Formosa University, Huwei, Yunlin, Taiwan, from Aug. 2000 to Jan. 2021. He has also worked in industry at SYSTEX Corporation and in academia at Nanhua University, Chiayi, Taiwan. He is an honorary member of the Phi Tau Phi Scholastic Honor Society and was included in the 9th edition of Who's Who in Science and Engineering, published in 2006. He serves as a technical reviewer for various scientific journals and numerous international conferences. His research interests include artificial intelligence, machine learning, deep learning, multimedia security, multimedia watermarking, intelligent filter design, content-based multimedia retrieval, data mining, e-Learning interaction systems, artificial internet of things, cloud computing, and big data analysis.

Our lab mainly focuses on applying deep learning and machine learning theories and technologies to smart healthcare, sustainable environment, precision education, multimedia fraud detection, and other current topics. We also provide a webpage that showcases and shares our research results. The four main research directions are as follows:

Theoretical Studies: Artificial intelligence, deep learning, machine learning, neural networks, computer vision, multimodal networks, domain generalization, multi-scale feature fusion, anomaly detection and localization, fuzzy logic, evolutionary algorithms, and reinforcement learning.

Technical Development: Mathematics-based deep learning modeling techniques and AI big data analysis.

Application Fields: Smart healthcare, intelligent anomaly detection, sustainable environment, anti-multimedia fraud, and image recognition.

Implementation Techniques: Cloud services, Python, R, Docker, Visual Studio.NET, Mobile Web Programming, Web-Database Programming, Matlab.

Research Topics

Research Projects and Achievements

Ongoing Projects:

Application of Large Language Models: Developing a generative-AI-based document generation system that produces slope management documents for soil and water conservation, reducing manual writing time, improving processing efficiency, and making new employee training more effective.

Prediction of Stent Placement for Atherosclerosis in Head and Neck Cancer: Using neck ultrasound data to build a model that predicts the atherosclerosis status of patients with head and neck cancer and provides stent placement warnings.

Important Achievements:

Smart Healthcare:

Cough Detection for COVID-19: Starting from cough recordings, sound processing techniques such as STFT, DCT, and Mel filter banks convert the one-dimensional audio into two-dimensional time-frequency representations; a CNN trained with a contrastive loss then models them to predict COVID-19 patients.
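
The STFT step of this conversion can be illustrated with a minimal sketch. It is not the lab's implementation: the function name `stft_magnitude`, the frame length, the hop size, and the Hann window are assumptions chosen for illustration, and the Mel filter bank and DCT stages are omitted. It only shows how a one-dimensional signal becomes a two-dimensional (time frame by frequency bin) array.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a list of frames, each holding frame_len // 2 + 1 magnitudes.

    Hypothetical helper for illustration; frame_len and hop are assumed values.
    """
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    spectrogram = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of one windowed frame; a real pipeline would use an FFT.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(abs(acc))
        spectrogram.append(row)
    return spectrogram


# Toy input: a sinusoid completing exactly 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(tone)
print(len(spec), "frames x", len(spec[0]), "frequency bins")
```

Each row of `spec` is one time frame, so stacking the rows gives the image-like input that a CNN can consume.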

COVID-19 Detection from CXR Images: Chest X-ray images are enhanced with CLAHE (contrast-limited adaptive histogram equalization), and an attention branch is designed to strengthen the convolutional features, addressing the limited receptive field of conventional CNNs when detecting COVID-19 patients.

Multimodal System for COVID-19 Detection: Combining chest X-ray images and cough recordings through a cross-attention fusion mechanism that integrates features from the two modalities to detect COVID-19 patients.
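
As a rough illustration of cross-attention fusion in general, the sketch below lets feature vectors from one modality attend over feature vectors from the other using scaled dot-product attention. The source does not specify the lab's architecture, so everything here is an assumption: the function name `cross_attention`, the absence of learned projection matrices, and the toy feature values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, context):
    """Each query vector attends over all context vectors.

    Hypothetical simplification: keys and values are both the raw context
    vectors, with no learned projections.
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        # Scaled dot-product similarity between the query and each context row.
        scores = [sum(qi * ci for qi, ci in zip(q, row)) / math.sqrt(dim)
                  for row in context]
        weights = softmax(scores)
        # Weighted average of the context rows.
        fused.append([sum(w * row[j] for w, row in zip(weights, context))
                      for j in range(dim)])
    return fused


# Toy features (assumed values): 2 image tokens attend over 3 audio tokens.
image_tokens = [[1.0, 0.0, 0.5], [0.2, 0.9, 0.1]]
audio_tokens = [[0.3, 0.3, 0.3], [0.9, 0.1, 0.0], [0.0, 1.0, 0.2]]
fused = cross_attention(image_tokens, audio_tokens)
print(fused)
```

Each fused row is a weighted average of the audio features, with weights determined by similarity to the corresponding image feature; a trained model would add learned query, key, and value projections.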

Brain and Liver Tumor Image Localization: Medical image anomaly detection often uses image reconstruction methods to screen for abnormal parts of organs (tumors). The current challenge is that such models also reconstruct the abnormal parts of the original image well, which makes detection less effective. We use a wavelet transform to decompose the image into sub-bands and build a VAE model that excludes the LL band. Three features ({LH, HL, HH}, a Gaussian Fourier feature, and the VAE latent feature) are designed so that the reconstructed image keeps high quality while the model's ability to reconstruct outlier image information is reduced. These two properties enable precise and effective localization of abnormal parts of organs.
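
To make the sub-band terminology concrete, the sketch below performs a single-level 2D Haar decomposition and keeps only the detail bands {LH, HL, HH}, discarding LL. The choice of the Haar wavelet and the helper names `haar_dwt2` and `detail_bands` are assumptions for illustration; the source does not state which wavelet the lab uses.

```python
def haar_dwt2(image):
    """Single-level orthonormal 2D Haar transform.

    `image` is a list of rows with even height and width.
    Returns (LL, LH, HL, HH). Here LH means low-pass along rows and
    high-pass along columns; naming conventions vary between libraries.
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            row_ll.append((a + b + d + e) / 2)  # local average (approximation)
            row_lh.append((a + b - d - e) / 2)  # change between the two rows
            row_hl.append((a - b + d - e) / 2)  # change between the two columns
            row_hh.append((a - b - d + e) / 2)  # diagonal change
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def detail_bands(image):
    """Return only the LH, HL, and HH bands, dropping the LL approximation."""
    _, lh, hl, hh = haar_dwt2(image)
    return lh, hl, hh


# Toy 4x4 image (assumed values) with a bright 2x2 patch in one corner.
toy = [[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 9, 9],
       [1, 1, 9, 9]]
lh, hl, hh = detail_bands(toy)
print(lh, hl, hh)
```

The LL band carries the coarse appearance of the image, while the detail bands carry edges and texture, which is why a model can be built on the detail bands alone.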

Anomaly Detection of Lung Auscultation Sounds: Based on a lung stethoscope dataset, applying sound processing techniques such as STFT, DCT, and Mel filter bank to convert one-dimensional audio into two-dimensional time-frequency graphs, then using a CNN-based encoder to extract features, and using contrast center loss to strengthen the encoder's feature clustering ability, allowing the classifier to more accurately identify abnormal lung sounds.

Image Segmentation:

Abnormal Monitoring of Soil and Water Conservation Facilities: We constructed a semantic segmentation framework, based on domain generalization, for cracks on revetment surfaces. The framework trains deep learning models on aerial images of revetments together with public crack datasets. It addresses four issues: the lack of crack images of revetment surfaces, the difficulty of obtaining the distance between the drone and the revetment, background objects (such as gaps between wave blocks) that are easily misjudged as cracks, and the class imbalance caused by the small proportion of crack areas. As a result, the framework can effectively segment crack areas in images.

Audio and Video Generation:

Image-to-Image Transformation: A popular generative modeling technique that takes an image as input and transforms it into another image, such as converting a cat into a dog or a tumor-free organ into a tumor-bearing organ. The technique is often used in anomaly detection, where abnormal data are scarce and their true distribution is therefore difficult to estimate; in that case, normal images are fed into the generative model to produce a large number of abnormal images. We develop unsupervised object deformation methods to address the tendency of image-to-image transformation to overfit: the model memorizes the shape of the source images too strongly, which makes it difficult to determine whether unseen input images are source or target images. In the main design, the generator predicts pixel movements that rearrange the pixels of the source image, and patch-wise constraints maintain semantic consistency between the deformed image and the final output image, training the generator to search for shapes.

Generated Voice Recognition: Fake voice generation technology brings entertainment and convenience but also makes it difficult to judge whether a voice is authentic, raising security and ethical issues. To prevent the negative impact of fake voice generation on society, detection methods for fake voices are urgently needed. Traditional methods verify only a single language, whereas real applications are often multilingual, and this remains the biggest challenge. Training models directly on multiple languages also runs into the difficulty of integrating datasets from different languages. Our lab proposes a multilingual model for identifying fake voices that covers Chinese, English, and Japanese, expanding the model's language coverage. It uses domain generalization to integrate all language datasets effectively, together with domain alignment methods that let the model retain the features of each domain while finding features common across domains. This improves the model's adaptability and accuracy in multilingual settings, so that it effectively captures key features of voice signals and shows significant performance in both known and unknown attack scenarios.

Species Recognition:

Cicada Sound Classification: Based on a handheld recorder dataset, applying sound processing techniques such as STFT to convert one-dimensional

Honor

Outstanding Research Award, National Formosa University (2019.01.01~2019.12.31)

Educational Background

Ph.D., Computer Science and Information Engineering, National Chung Cheng University, Taiwan, 1999
M.S., Applied Mathematics, National Chung Hsing University, Taiwan, 1988
B.S., Applied Mathematics, National Chung Hsing University, Taiwan, 1986