Wilkerson Jamison posted an update 6 months ago
Parkinson’s Disease (PD) is a common neurodegenerative disease that affects millions of people around the world. In clinical practice, freezing of gait (FoG) is a typical symptom used to assess the condition of PD patients. Currently, the assessment of FoG is usually performed through live observation or video analysis by doctors. In aging societies, such a manual-inspection-based approach may place a serious burden on healthcare systems. In this study, we propose a purely video-based method to automatically detect the shuffling step, the type of FoG that is hardest to distinguish. First, RGB silhouettes containing only the legs and feet are fed into a feature extraction module to obtain multi-level features, with 3D convolutions aggregating both temporal and spatial information. The multi-level features are then combined by a feature fusion module: skip connections preserve high-resolution information, and period-wise horizontal pyramid pooling fuses global context with local features. To validate the efficacy of our method, a dataset containing 268 normal gait samples and 362 shuffling step samples is built, on which our method achieves an average detection accuracy of 90.8%. Besides shuffling step detection, we demonstrate that our method can also assess the severity of gait abnormality. Our proposal facilitates more frequent assessment of FoG with less manpower and lower cost, leading to more accurate monitoring of patients’ condition.

In recent studies, collaborative intelligence (CI) has emerged as a promising framework for the deployment of Artificial Intelligence (AI)-based services on mobile/edge devices. In CI, the AI model (a deep neural network) is split between the edge and the cloud, and intermediate features are sent from the edge sub-model to the cloud sub-model. In this article, we study bit allocation for feature coding in multi-stream CI systems. We model task distortion as a function of rate using convex surfaces similar to those found in distortion-rate theory. Using such models, we are able to provide closed-form bit allocation solutions for single-task systems and scalarized multi-task systems. Moreover, we provide an analytical characterization of the full Pareto set for 2-stream k-task systems, and bounds on the Pareto set for 3-stream 2-task systems. The analytical results are examined on a variety of DNN models from the literature to demonstrate their wide applicability.

In this study, we propose a novel RGB-T tracking framework that jointly models appearance and motion cues. First, to obtain a robust appearance model, we develop a novel late fusion method to infer the fusion weight maps of the RGB and thermal (T) modalities. The fusion weights are determined using offline-trained global and local multimodal fusion networks and are then used to linearly combine the response maps of the RGB and T modalities. Second, when the appearance cue is unreliable, we take motion cues, i.e., target and camera motion, into account to keep the tracker robust. We further propose a tracker switcher to switch flexibly between the appearance and motion trackers. Extensive results on three recent RGB-T tracking datasets show that the proposed tracker performs significantly better than other state-of-the-art algorithms.
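For the RGB-T tracking framework above, the late-fusion step amounts to a weighted linear combination of the two modalities’ response maps. Below is a minimal Python sketch of that combination; the per-pixel weight maps are simply passed in as arguments (placeholders for the offline-trained global/local fusion networks, which are not reproduced here), and all names are illustrative rather than taken from the paper.

```python
import numpy as np

def fuse_response_maps(resp_rgb, resp_t, w_rgb, w_t, eps=1e-6):
    """Linearly combine RGB and thermal response maps with per-pixel weights.

    resp_rgb, resp_t : HxW response (correlation) maps from the two modalities.
    w_rgb, w_t       : HxW fusion weight maps; in the paper these come from
                       offline-trained fusion networks, here they are inputs.
    """
    w_sum = w_rgb + w_t + eps                      # normalize weights per pixel
    return (w_rgb * resp_rgb + w_t * resp_t) / w_sum

# Toy usage: the target would be located at the peak of the fused map.
h, w = 17, 17
fused = fuse_response_maps(np.random.rand(h, w), np.random.rand(h, w),
                           np.full((h, w), 0.7), np.full((h, w), 0.3))
peak = np.unravel_index(np.argmax(fused), fused.shape)
```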
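And for the multi-stream bit-allocation study above, the flavour of a closed-form solution can be shown with a small sketch. It assumes an exponential distortion-rate model D_i(R_i) = a_i·2^(−2·R_i) per stream and a scalarized objective Σ_i w_i·D_i(R_i) under a total-rate constraint; this is the textbook reverse-water-filling setting, not necessarily the exact surfaces or expressions derived in the paper, and a_i, w_i are made-up numbers.

```python
import numpy as np

def allocate_bits(a, w, r_total):
    """Closed-form allocation for n streams under a total-rate constraint.

    Assumes D_i(R_i) = a_i * 2**(-2 * R_i) and the objective sum_i w_i * D_i.
    Equating marginal distortions (the Lagrangian condition) gives
        R_i = R_total / n + 0.5 * log2(w_i * a_i / gm),
    where gm is the geometric mean of the products w_i * a_i.
    """
    wa = np.asarray(w, float) * np.asarray(a, float)
    gm = np.exp(np.mean(np.log(wa)))               # geometric mean of w_i * a_i
    return r_total / len(wa) + 0.5 * np.log2(wa / gm)

# Example: three feature streams sharing a budget of 12 bits.
rates = allocate_bits(a=[4.0, 1.0, 0.25], w=[1.0, 2.0, 1.0], r_total=12.0)
print(rates, rates.sum())                          # rates sum to the budget
```

Note that this simple form can return negative rates when the budget is very small; a practical solver would clamp such streams to zero and re-solve over the rest.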
We propose a neural network model to estimate the current frame from two reference frames, using an affine transformation and adaptive spatially-varying filters. The estimated affine transformation allows shorter filters to be used compared with existing approaches to deep frame prediction. The predicted frame is used as a reference for coding the current frame, and since the proposed model is available at both the encoder and the decoder, there is no need to code or transmit motion information for the predicted frame. By making use of dilated convolutions and reduced filter length, our model is significantly smaller, yet more accurate, than any of the neural networks in prior works on this topic. Two versions of the proposed model, one for uni-directional and one for bi-directional prediction, are trained using a combination of a Discrete Cosine Transform (DCT)-based l1-loss with various transform sizes, a multi-scale Mean Squared Error (MSE) loss, and an object context reconstruction loss. The trained models are integrated with the HEVC video coding pipeline. The experiments show that the proposed models achieve about 7.3%, 5.4%, and 4.2% bit savings for the luminance component on average in the Low delay P, Low delay, and Random access configurations, respectively.

Passive cavitation mapping (PCM), which generates images from bubble acoustic emission signals, has been increasingly used for monitoring and guiding focused ultrasound surgery (FUS). PCM can be used as an adjunct to magnetic resonance imaging to provide crucial information on the safety and efficacy of FUS. The most widely used algorithm for PCM is delay-and-sum (DAS), and one of its major limitations is suboptimal computational efficiency. Although frequency-domain DAS can partially resolve this issue, such an algorithm is not suitable for imaging the evolution of bubble activity in real time or for cases in which cavitation events occur asynchronously. This study investigates a transient angular spectrum (AS) approach for PCM. The working principle of this approach is to back-propagate the received signals to the domain of interest and reconstruct the spatiotemporal wavefield encoded with the bubble location and collapse time. The transient AS approach is validated using an in silico model and water-bath experiments. It is found that the transient AS approach yields results similar to DAS while being an order of magnitude faster. These results suggest that the transient AS approach is promising for fast and accurate PCM.

Training convolutional neural networks (CNNs) for segmentation of the pulmonary airway, arteries, and veins is challenging due to sparse supervisory signals caused by the severe class imbalance between tubular targets and background. We present a CNN-based method for accurate airway and artery-vein segmentation in non-contrast computed tomography that offers superior sensitivity to tenuous peripheral bronchioles, arterioles, and venules. The method first uses a feature recalibration module to make the best use of the features learned by the network: spatial information is integrated to retain the relative priority of activated regions, which benefits the subsequent channel-wise recalibration. Then, an attention distillation module is introduced to reinforce representation learning of tubular objects, with fine-grained details in high-resolution attention maps recursively passed down from each layer to the previous one to enrich context. An anatomical prior consisting of a lung context map and a distance transform map is designed and incorporated to improve artery-vein differentiation.
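For the airway/artery-vein segmentation method just above, the attention distillation idea can be sketched as an auxiliary loss that encourages each decoder stage to mimic the attention map of the next, higher-resolution stage. The PyTorch sketch below assumes 3D feature maps and a squared-activation attention map; the exact attention definition, distillation direction, and weighting used in the paper may differ.

```python
import torch
import torch.nn.functional as F

def attention_map(feat):
    """Collapse (N, C, D, H, W) features into a spatial attention map by
    summing squared activations over channels and normalizing to [0, 1]."""
    att = feat.pow(2).sum(dim=1, keepdim=True)
    return att / (att.amax(dim=(2, 3, 4), keepdim=True) + 1e-6)

def attention_distillation_loss(features):
    """Auxiliary loss passing fine-grained attention down the decoder.

    `features` is ordered from low to high resolution; each stage is pushed
    toward the (detached) attention map of the next, finer stage.
    """
    loss = 0.0
    for coarse, fine in zip(features[:-1], features[1:]):
        a_coarse = attention_map(coarse)
        a_fine = attention_map(fine).detach()      # teacher map, no gradient
        a_fine = F.interpolate(a_fine, size=a_coarse.shape[2:],
                               mode='trilinear', align_corners=False)
        loss = loss + F.mse_loss(a_coarse, a_fine)
    return loss / (len(features) - 1)

# Toy usage with three decoder stages at increasing resolution.
feats = [torch.randn(1, 8, 16, 16, 16),
         torch.randn(1, 8, 32, 32, 32),
         torch.randn(1, 8, 64, 64, 64)]
ad_loss = attention_distillation_loss(feats)
```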
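Going back to the deep frame-prediction model earlier in the post, its DCT-based l1 loss can be approximated as an l1 distance between block-DCT coefficients of the predicted and original frames. The sketch below uses a single fixed 8x8 block size for simplicity, whereas the paper combines several transform sizes; the multi-scale MSE and object context reconstruction terms are omitted.

```python
import numpy as np
from scipy.fftpack import dct

def dct2(block):
    """2-D type-II DCT with orthonormal scaling."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def dct_l1_loss(pred, target, block_size=8):
    """Mean l1 distance between block-DCT coefficients of two frames."""
    h, w = target.shape
    total, n_blocks = 0.0, 0
    for y in range(0, h - block_size + 1, block_size):
        for x in range(0, w - block_size + 1, block_size):
            p = dct2(pred[y:y + block_size, x:x + block_size])
            t = dct2(target[y:y + block_size, x:x + block_size])
            total += np.abs(p - t).sum()
            n_blocks += 1
    return total / max(n_blocks, 1)

# Toy usage on random 64x64 luminance frames.
rng = np.random.default_rng(0)
frame, pred = rng.random((64, 64)), rng.random((64, 64))
print(dct_l1_loss(pred, frame))
```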