In occluded images, pedestrian targets are often partially or completely blocked by other objects, leaving their appearance features incomplete, their edges blurred, and sometimes causing them to be confused with the background or the occluder. Detecting occluded pedestrians therefore requires an algorithm that can accurately recognize and localize targets even when features are missing. To address this challenge, this paper proposes an improved YOLOv10 method with enhanced multi-scale perception, built around an Efficient Multi-directional Self-Attention (EMSA) mechanism. First, the MSDA attention mechanism is fused into the C2f module of YOLOv10, strengthening the model's ability to capture features at multiple scales and improving the detection of occluded targets of different sizes; by adaptively weighting the features of different channels, it increases the model's attention to the features of occluded targets. Second, a new loss function, Focaler-IoU, is introduced based on a dynamic focusing mechanism; it dynamically adjusts the focus of the loss, improving detection across object scales while accelerating the convergence of the bounding-box regression loss. A small-object detection head is then added to strengthen feature extraction for small occluded targets. Finally, ablation experiments are conducted on the public CityPersons dataset. The results show that the model fused with the MSDA attention mechanism reaches a mean average precision (mAP@0.5) of 62.3%, a 2.2% improvement over the official YOLOv10n. The experimental results demonstrate that the EMSA attention effectively improves the detection of occluded pedestrians and meets the detection requirements of occlusion scenarios in applications such as autonomous driving and surveillance.
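The abstract does not include implementation detail, so the following is a minimal PyTorch sketch of the kind of multi-scale dilated attention fusion described above: channel groups attend over a k x k neighbourhood at different dilation rates, and the result is fused back into a C2f-style branch through a 1x1 projection and a residual connection. All class and parameter names here are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class DilatedWindowAttention(nn.Module):
    """Each pixel attends over its k x k dilated neighbourhood (single head)."""
    def __init__(self, dim, kernel=3, dilation=1):
        super().__init__()
        self.k2 = kernel * kernel
        self.scale = dim ** -0.5
        # padding keeps the spatial size unchanged for odd kernels
        self.unfold = nn.Unfold(kernel, dilation=dilation,
                                padding=dilation * (kernel - 1) // 2)

    def forward(self, q, k, v):                              # each: (B, C, H, W)
        B, C, H, W = q.shape
        q = q.reshape(B, C, 1, H * W)                        # one query per pixel
        k = self.unfold(k).reshape(B, C, self.k2, H * W)     # k2 neighbours
        v = self.unfold(v).reshape(B, C, self.k2, H * W)
        attn = (q * k * self.scale).sum(1, keepdim=True).softmax(2)
        return (attn * v).sum(2).reshape(B, C, H, W)

class MSDA(nn.Module):
    """Multi-scale dilated attention: channel groups use different dilations."""
    def __init__(self, dim, dilations=(1, 2, 3)):
        super().__init__()
        assert dim % len(dilations) == 0
        self.qkv = nn.Conv2d(dim, dim * 3, 1)
        self.blocks = nn.ModuleList(
            DilatedWindowAttention(dim // len(dilations), dilation=d)
            for d in dilations)
        self.proj = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=1)
        n = len(self.blocks)
        out = torch.cat(
            [blk(qi, ki, vi) for blk, qi, ki, vi
             in zip(self.blocks, q.chunk(n, 1), k.chunk(n, 1), v.chunk(n, 1))],
            dim=1)
        return x + self.proj(out)        # residual fusion into the C2f branch
```

In a C2f fusion, a module like this would typically wrap or follow the bottleneck blocks inside C2f; the residual connection preserves the original feature path while the dilated head groups supply the multi-scale context that helps recover partially occluded targets.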
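For the regression loss, Focaler-IoU reconstructs the IoU loss with a linear interval mapping so that training focuses on a chosen range of regression samples, which is consistent with the "dynamic focusing" described above. A minimal sketch, assuming the formulation of the Focaler-IoU paper; the thresholds d and u are tunable hyperparameters and the default values below are assumptions:

```python
import torch

def focaler_iou_loss(iou: torch.Tensor, d: float = 0.0, u: float = 0.95):
    """L_Focaler-IoU = 1 - IoU_focaler, where IoU is linearly remapped
    from [d, u] onto [0, 1] and clamped outside that interval."""
    iou_focaler = ((iou - d) / (u - d)).clamp(0.0, 1.0)
    return 1.0 - iou_focaler

def focaler_ciou_loss(ciou_loss: torch.Tensor, iou: torch.Tensor,
                      d: float = 0.0, u: float = 0.95):
    """Combination with an existing IoU-family loss (CIoU here), following
    the pattern L_Focaler-X = L_X + IoU - IoU_focaler."""
    iou_focaler = ((iou - d) / (u - d)).clamp(0.0, 1.0)
    return ciou_loss + iou - iou_focaler
```

Choosing d and u shifts the gradient weight toward easy or hard samples: lowering u emphasizes low-IoU (hard, often occluded) boxes, which is presumably why the loss helps convergence on occluded pedestrians.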