Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

Voice Keyword Recognition Based on Spiking Convolutional Neural Network for Human-Machine Interface

Jinhai Hu, Wang Ling Goh, Zhongyi Zhang, Yuan Gao

IEEE International Conference on Intelligent Autonomous Systems (ICoIAS), 2020

Abstract In this paper, a spiking convolutional neural network (SCNN) model for voice keyword recognition is presented. The model consists of an input pre-processing layer, a spiking neural network (SNN) layer with a built-in filter bank, and convolutional neural network (CNN) layers. A 16-channel infinite impulse response (IIR) filter bank with an energy detector extracts power from the voice signal band and converts it to spikes via the SNN layer. The spiking rate in a defined time window is used as the input to the following CNN layers for classification. The network is trained using a voice digit dataset, while the weights of the convolutional layers are adjusted through training on the spike-integration results obtained from the spiking layer. This model has been implemented for voice keyword recognition and achieved 96.0% accuracy. The combination of SNN and CNN reduces the overall number of layers and neurons in the system without compromising classification accuracy. It is suitable for low-power hardware implementation in edge devices for human-machine interface (HMI) applications.
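As a rough illustration of the rate-coding idea described in this abstract, the sketch below turns the energy samples of one filter-bank channel into spikes and counts the spikes per time window. The integrate-and-fire neuron, its threshold, and the function names are assumptions made for illustration; the paper's actual neuron model is not given here.

```python
def integrate_and_fire(energies, threshold=1.0):
    """Convert non-negative energy samples into a 0/1 spike train.

    Hypothetical neuron model: the membrane potential accumulates energy and
    emits one spike (then subtracts the threshold) whenever it reaches it.
    """
    potential = 0.0
    spikes = []
    for e in energies:
        potential += e
        if potential >= threshold:
            spikes.append(1)
            potential -= threshold
        else:
            spikes.append(0)
    return spikes


def spike_rates(spikes, window):
    """Count spikes in consecutive non-overlapping windows of `window` samples."""
    return [sum(spikes[i:i + window]) for i in range(0, len(spikes), window)]


# One channel: a quiet stretch followed by a louder one.
channel_energy = [0.25] * 8 + [0.5] * 8
train = integrate_and_fire(channel_energy)
rates = spike_rates(train, window=8)
```

Under these assumptions, a louder segment yields a higher spike count per window, and the per-channel counts would form the feature map passed to the CNN layers.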

Link to Paper

Dynamically-biased Fixed-point LSTM for Time Series Processing in AIoT Edge Device

Jinhai Hu, Wang Ling Goh, Yuan Gao

IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021

Abstract In this paper, a Dynamically-Biased Long Short-Term Memory (DB-LSTM) neural network architecture is proposed for artificial intelligence internet of things (AIoT) applications. Unlike the conventional LSTM, which uses a static bias, DB-LSTM adjusts the cell bias dynamically based on the previous status. Hence, a DB-LSTM cell contains information about both the previous output and the current cell state. With more information, the DB-LSTM is able to achieve faster training convergence and better accuracy. Furthermore, weight quantization is performed to reduce the weights to either 1-bit or 2-bit, so that the algorithm can be implemented in portable edge devices. With the same 100-epoch training setup, more than 70% loss reduction is achieved for 32-bit floating-point, 1-bit, and 2-bit weights. The loss degradation due to weight quantization is also negligible. The performance of the proposed model is also validated with the classical air passenger forecasting problem. With 2-bit weights, a loss of 0.075 and an accuracy of 94.96% relative to the ground truth are achieved, which is comparable to full-precision 32-bit weights.
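To make the low-bit weight idea concrete, here is a minimal sketch of two common quantizers: sign-based 1-bit quantization with a shared scale, and symmetric uniform quantization to a small signed range. Both schemes and the function names are assumptions chosen for illustration; the abstract does not state which quantizer the paper uses.

```python
def quantize_1bit(weights):
    """Binarize weights to {-s, +s}, where s is the mean absolute value."""
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights]


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization to signed integer levels in [-qmax, qmax].

    Assumed scheme: with bits=2, qmax is 1, so the levels are {-1, 0, +1}
    times the step size.
    """
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    step = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(w / step))) * step for w in weights]


weights = [0.5, -0.25, 0.75, -1.0]
binary = quantize_1bit(weights)
ternary = quantize_uniform(weights, bits=2)
```

In practice such quantizers are usually applied during or after training, and the loss is compared against the 32-bit baseline, as the abstract describes.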

Link to Paper

Time-Series Analysis on Edge-AI Hardware for Healthcare Monitoring

Jinhai Hu

Progress Report for PhD Confirmation Exercise, 2022

Abstract This project presents a novel time-domain ECG signal analysis model using a dynamically-biased Long Short-Term Memory (DB-LSTM) neural network, capable of performing both ECG forecasting and classification with high accuracy and efficiency. The model achieved over 98% accuracy and a normalized mean square error below 10⁻³ for forecasting, while classification reached over 97% accuracy with fewer training parameters and fast convergence. Designed for real-time, low-power applications, the network uses low-bit quantization (INT4/INT3) with minimal loss in classification accuracy and no degradation in forecasting performance. The model's robustness was validated through simulations on multiple ECG datasets. Future work includes deploying the algorithm on FPGA and CMOS hardware for continuous cardiac monitoring, with plans to develop a flexible AI platform for neural network simulation and implement on-chip online training for healthcare applications.

Link to Paper

Classification of ECG Anomaly with Dynamically-biased LSTM for Continuous Cardiac Monitoring

Jinhai Hu, Wang Ling Goh, Yuan Gao

IEEE International Symposium on Circuits and Systems (ISCAS), 2023

Abstract This paper presents an electrocardiogram (ECG) signal classification model based on a dynamically-biased Long Short-Term Memory (DB-LSTM) network. Compared to conventional LSTM networks, DB-LSTM introduces a set of parameters C which save the previous time-step cell gate states of the unit cell. Hence, more feature information is preserved and a smaller network is required for the classification task. Comprehensive simulations using MIT-BIH ECG datasets show that this model can perform ECG feature classification with a shorter time window and faster training convergence, while achieving comparable training and classification accuracy with much lower weight resolution. Compared to other state-of-the-art ECG analysis algorithms, this model requires only 4 layers, and it achieved 96.74% accuracy when weights are truncated from FP32 to INT4, with only 2.4% accuracy degradation. Implemented on a Xilinx Artix-7 FPGA, the proposed design is estimated to consume only 40 μW of dynamic power, making it a promising candidate for resource-constrained edge devices.

Link to Paper

Energy Efficient Software-hardware Co-design of Quantized Recurrent Convolutional Neural Network for Continuous Cardiac Monitoring

Jinhai Hu, Cong Sheng Leow, Wang Ling Goh, Yuan Gao

IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2023

Abstract This paper presents an electrocardiogram (ECG) signal classification model based on a Recurrent Convolutional Neural Network (RCNN). With recurrent connections and data buffers, a single convolutional layer is reused to implement the function of multiple layers. Using a 5-layer CNN as an example, this approach reduces the number of parameters by more than 50% while achieving the same feature extraction size. Furthermore, a quantized RCNN (QRCNN) is proposed, in which the input signal, interlayer output, and kernel weights are quantized to unsigned INT8, INT4, and signed INT4, respectively. For hardware implementation, pipelining and data reuse within the 1-D convolution kernel can potentially reduce latency. The QRCNN model achieved 98.08% validation accuracy on MIT-BIH datasets with only 1% degradation due to quantization. The estimated dynamic power consumption of the QRCNN is less than 60% of that of a conventional quantized CNN when implemented on a Xilinx Artix-7 FPGA, showing its potential for resource-constrained edge devices.
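The layer-reuse idea in this abstract can be sketched as one 1-D convolution kernel applied repeatedly to its own output. This is a simplified illustration under assumed details (a single channel, "valid" padding, ReLU activation, every pass sharing the same kernel); the paper's actual architecture and buffering scheme are not described here.

```python
def conv1d_relu(signal, kernel):
    """Valid 1-D cross-correlation of `signal` with `kernel`, followed by ReLU."""
    k = len(kernel)
    return [
        max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
        for i in range(len(signal) - k + 1)
    ]


def recurrent_conv(signal, kernel, passes):
    """Feed the output of one convolutional layer back into itself `passes` times."""
    out = signal
    for _ in range(passes):
        out = conv1d_relu(out, kernel)
    return out


kernel = [0.5, 0.5]
features = recurrent_conv([1.0] * 10, kernel, passes=3)

# Parameter count: one shared kernel versus a separate kernel per pass.
shared_params = len(kernel)
stacked_params = 3 * len(kernel)
```

In this toy setting the shared kernel stores a third of the weights that three independent layers would need, which is the kind of saving the recurrent connection is meant to provide.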

Link to Paper

Supervised Contrastive Pretrained ResNet with MixUp to Enhance Respiratory Sound Classification on Imbalanced and Limited Dataset

Jinhai Hu*, Cong Sheng Leow*, Shuailin Tao, Wang Ling Goh, Yuan Gao

IEEE Biomedical Circuits and Systems Conference (BioCAS), 2023

Abstract This paper proposes a strategy of combining multiple techniques to classify paediatric respiratory sound (PRS) from the open-source SJTU Paediatric Respiratory Sound Database. Inspired by recent successes in image classification, this work focuses on improving audio classification with limited and imbalanced datasets through Residual Networks (ResNet). These techniques include augmentations applied to audio features, supervised contrastive (SupCon) pretraining, and MixUp. These three techniques helped reduce overfitting due to the imbalanced dataset. To further enhance accuracy, pre-processing and training hyperparameters were optimized through Bayesian Optimization. The proposed strategy achieved over 95% training accuracy for the four tasks (11, 12, 21, and 22) in the IEEE BioCAS 2023 grand challenge. Through this strategy, the four tasks achieved calculated scores of 0.769, 0.632, 0.662, and 0.512, respectively, on the test dataset. The total score is 0.729, including 0.1 obtained from the runtime bonus.
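For readers unfamiliar with MixUp, the sketch below shows the standard formulation: two training examples and their one-hot labels are blended with a weight drawn from a Beta(alpha, alpha) distribution. The function name, the value of alpha, and the toy feature vectors are assumptions for illustration, not the settings used in the paper.

```python
import random


def mixup(x_a, y_a, x_b, y_b, alpha=0.2, rng=random):
    """Blend two (features, one-hot label) pairs with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


random.seed(0)
# Two toy feature vectors from different classes (hypothetical values).
mixed_x, mixed_y, lam = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0])
```

Because the mixed label is a soft label, the training loss must accept non-integer targets; the blended examples act as a regularizer when one class has few samples.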

Link to Paper

Squeeze-Excite Fusion based Multimodal Neural Network for Sleep Stage Classification with Flexible EEG/ECG Signal Acquisition Circuit

Shuailin Tao, Jinhai Hu, Wang Ling Goh, Yuan Gao

IEEE International Symposium on Circuits and Systems (ISCAS), 2024

Abstract This paper presents a multimodal fusion strategy for sleep stage classification using polysomnography (PSG) with electroencephalogram (EEG) and electrocardiogram (ECG) data. The Squeeze-Excite (SE) Fusion mechanism is implemented to enhance the collaborative impact of EEG and ECG signals on neural network classification. To address the challenge of imbalance in the dataset, a balanced sampler is used. Improved feature extraction is achieved through linear-frequency cepstral coefficients (LFCC) applied to the EEG signal. A recurrent convolutional neural network (RCNN) reduces model parameters and optimizes the architecture, while quantizing the network weights down to INT4 ensures hardware compatibility, especially for edge devices. With these methods, the optimized approach achieves a validation accuracy of 77.6% with a compact 23.5 KB weight memory size on the MIT-BIH dataset, covering six distinct classification categories.

Link to Paper

Supervised Contrastive Learning Framework and Hardware Implementation of Learned ResNet for Real-time Respiratory Sound Classification

Jinhai Hu*, Cong Sheng Leow*, Shuailin Tao, Wang Ling Goh, Yuan Gao

IEEE Transactions on Biomedical Circuits and Systems (TBioCAS), 2024

Abstract This paper presents a supervised contrastive learning (SCL) framework for respiratory sound classification and the hardware implementation of a learned ResNet on a field-programmable gate array (FPGA) for real-time monitoring. At the algorithmic level, multiple techniques such as feature augmentation and MixUp are combined holistically to mitigate the impact of data scarcity and imbalanced classes in the training dataset. Bayesian optimization further enhances the classification accuracy through parameter tuning in pre-processing and SCL. The proposed framework achieves a total score of 0.8725 (including runtime score) on a ResNet-18 model in both event and record multi-class classification tasks using the SJTU Paediatric Respiratory Sound Database (SPRSound). In addition, algorithm-hardware co-optimizations, including Quantization-Aware Training (QAT), merging of network layers, and optimization of memory size and number of parallel threads, are performed for hardware implementation on FPGA. This approach reduces model size by 40% and computation latency by 70%. The learned ResNet is implemented on a Xilinx Zynq ZCU102 FPGA with 16 ms latency and less than 2% inference score degradation compared to the software model.

Link to Paper

Late Breaking Results: Circuit-Algorithm Co-design for Learnable Audio Analog Front-End

Jinhai Hu, Zhongyi Zhang, Cong Sheng Leow, Wang Ling Goh, Yuan Gao

ACM/IEEE Design Automation Conference (DAC), 2024

(Acceptance rate = 21%)

Abstract This paper presents a circuit-algorithm co-design framework for a learnable audio analog front-end (AFE), which includes an analog filterbank for feature extraction and a classifier based on a Depthwise Separable Convolutional Neural Network (DSCNN). Instead of the traditional approach of designing the analog filterbank and digital classifier separately, a learnable filterbank is proposed, and its source-follower bandpass filter (SF-BPF) parameters are optimized together with the neural network classifier in a signal-to-noise ratio (SNR)-aware training process. A new system criterion function (LBPF) is proposed to include classification loss and filter performance in the training process. The optimized audio AFE achieves 10.6% and 11.7% reductions in BPF power and chip area, respectively. Meanwhile, this approach achieved 88.6% to 94.5% accuracy on a 10-keyword classification task across a wide range of input signal SNR from 5 dB to 20 dB, with only 16k trainable parameters.
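One ingredient of SNR-aware training is the ability to corrupt a clean input at a chosen SNR. The sketch below shows the usual way to do this with white Gaussian noise; the function names and the sine-wave test signal are assumptions for illustration, and the paper's actual noise model and training loop are not described here.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=random):
    """Add white Gaussian noise scaled so the expected SNR equals `snr_db`."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    p_signal = sum(c * c for c in clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy))
    return 10.0 * math.log10(p_signal / p_noise)


random.seed(1)
clean = [math.sin(2.0 * math.pi * i / 50.0) for i in range(5000)]
# Sweep the SNR range quoted in the abstract.
noisy_versions = {snr: add_noise_at_snr(clean, snr) for snr in (5.0, 10.0, 15.0, 20.0)}
```

Training on inputs drawn from several SNR levels, as in the sweep above, is one plausible way to obtain a classifier that stays accurate across the 5 dB to 20 dB range.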

Link to Paper

Convolutional Auto-encoder for Variable Length Respiratory Sound Compression and Reconstruction

Shuailin Tao, Jinhai Hu, Wang Ling Goh, Yuan Gao

IEEE Biomedical Circuits and Systems Conference (BioCAS), 2024

Abstract This paper presents a respiratory sound compression and reconstruction method based on a convolutional auto-encoder. By utilizing convolutional and transpose convolutional layers, this model can process variable-length sound waveforms, an important feature for data transmission from edge-based medical devices to a cloud server, and reconstruct the signal with high fidelity. This work shows that utilizing a non-variational latent space in respiratory sound compression yields a smaller reconstruction error than other state-of-the-art solutions. Additionally, this work proposes a new composite loss function to guide the network training. Tested on the BioCAS 2024 Grand Challenge dataset, this method achieves a Percent Root Mean Square Difference (PRD) of 0.2230, a Correlation Coefficient (CC) of 0.972, and a Signal-to-Noise Ratio Loss (SNRL) of −0.7129 dB with an average compression rate of 222.
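Two of the reconstruction metrics named in this abstract can be sketched with their textbook definitions. The exact formulas used by the Grand Challenge (for example, whether the mean is removed or the PRD is scaled to a percentage) are not given here, so the versions below are assumptions; SNRL is omitted for the same reason.

```python
import math


def prd(original, reconstructed):
    """Root-mean-square difference relative to the original signal energy.

    Returned as a fraction; multiply by 100 for a percentage.
    """
    num = sum((o - r) ** 2 for o, r in zip(original, reconstructed))
    den = sum(o * o for o in original)
    return math.sqrt(num / den)


def correlation_coefficient(original, reconstructed):
    """Pearson correlation between the original and reconstructed signals."""
    n = len(original)
    mean_o = sum(original) / n
    mean_r = sum(reconstructed) / n
    cov = sum((o - mean_o) * (r - mean_r) for o, r in zip(original, reconstructed))
    var_o = sum((o - mean_o) ** 2 for o in original)
    var_r = sum((r - mean_r) ** 2 for r in reconstructed)
    return cov / math.sqrt(var_o * var_r)


x = [1.0, 2.0, 3.0, 4.0]
x_hat = [1.1, 1.9, 3.2, 3.9]   # hypothetical reconstruction
scores = (prd(x, x_hat), correlation_coefficient(x, x_hat))
```

A lower PRD and a CC closer to 1 both indicate a more faithful reconstruction.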

Link to Paper

LSTM-based ECG Signal Classification with Multi-level One-hot Encoding for Wearable Applications

Jinhai Hu, Wang Ling Goh, Yuan Gao

IEEE Biomedical Circuits and Systems Conference (BioCAS), 2024

Abstract This paper presents an electrocardiogram (ECG) signal classification method using a one-hot coding scheme and a Long Short-Term Memory (LSTM) neural network. Instead of the conventional analog-to-digital converter (ADC) with two’s complement binary output, a one-hot encoding scheme is adopted in this design to convert the analog signal to a 1D vector, which is then processed by an LSTM network for classification. Our study shows that one-hot encoding can effectively represent the features of the ECG signal with low data bit-width and sampling rate. The proposed method not only simplifies the ADC design, but also improves the classification accuracy. Simulation results show that the proposed design requires only a 5-bit width and a 50 Hz sampling rate to achieve 96.9% validation accuracy on a 5-class classification task using the MIT-BIH ECG dataset.
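The basic one-hot idea can be sketched as follows: each analog sample is mapped to one of 2^5 = 32 amplitude levels, and the level is represented by a vector with a single 1. The input range, the uniform level spacing, and the function name are assumptions for illustration; the "multi-level" variant named in the paper's title is not detailed in the abstract and is not reproduced here.

```python
def one_hot_encode(sample, bits=5, v_min=-1.0, v_max=1.0):
    """Quantize one analog sample to 2**bits levels and return a one-hot vector."""
    levels = 2 ** bits
    clipped = min(max(sample, v_min), v_max)
    index = int((clipped - v_min) / (v_max - v_min) * levels)
    index = min(index, levels - 1)   # v_max itself maps to the top level
    vector = [0] * levels
    vector[index] = 1
    return vector


# A short hypothetical ECG snippet, normalized to [-1, 1].
snippet = [0.0, 0.1, 0.9, -0.3]
encoded = [one_hot_encode(s) for s in snippet]
```

Each sample becomes a 32-element vector, so a window of samples forms a sequence of one-hot vectors that an LSTM can consume one time step at a time.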

Link to Paper

talks

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.