This scientific talk discusses methods for optimizing deep learning architectures on embedded systems. It highlights key challenges, such as limited processing power, memory constraints, and real-time performance requirements. Model compression techniques, including quantization, pruning, knowledge distillation, and weight sharing, are explored as ways to reduce memory usage and computational complexity. Hardware-software co-design is emphasized, leveraging specialized accelerators such as NPUs, GPUs, and FPGAs to improve efficiency. Finally, software optimization techniques, demonstrated through a radar-based hand gesture recognition project, show how deep learning can be deployed effectively on edge devices while balancing accuracy, performance, and resource constraints.
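To make two of the compression techniques mentioned above concrete, here is a minimal NumPy sketch of 8-bit affine (asymmetric) post-training quantization and magnitude-based weight pruning. This is an illustrative assumption about how such methods are typically implemented, not code from the talk; the function names (`quantize_int8`, `magnitude_prune`) and the per-tensor granularity are choices made for this example.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Affine 8-bit quantization of a weight tensor (per-tensor scale).

    Maps float weights to uint8 codes so that
        w ≈ scale * (q - zero_point),
    cutting storage to a quarter of float32.
    """
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 if w_max > w_min else 1.0
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    """Recover approximate float weights from the quantized codes."""
    return scale * (q.astype(np.float32) - zero_point)

def magnitude_prune(w: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest |w|.

    The surviving weights are unchanged; sparse storage or structured
    sparsity would then be needed to realize speedups on hardware.
    """
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)
```

In practice one would calibrate quantization ranges on activation statistics and fine-tune after pruning to recover accuracy; frameworks targeting edge deployment bundle these steps, but the arithmetic above is the core idea.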