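To meet the needs of coupled Earth system model applications, this paper proposes a two-dimensional spline interpolation algorithm and implements it as an interpolation module encapsulated in the Earth System Modeling Framework (ESMF). The algorithm is based on the classical spline algorithm and is modified to suit the characteristics of Earth system models: two successive one-dimensional interpolations extend it to two dimensions, extrapolation is introduced for the polar regions, and the generation of interpolation weights is separated from the computation of interpolation results. Experimental results show that the algorithm yields high-accuracy interpolation results, and that the modular design lets users invoke the interpolation algorithm through a unified interface to carry out the interpolation.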
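The two ideas named in this abstract, building a 2D interpolation from two 1D passes and separating weight generation from the interpolation itself, can be illustrated with a minimal pure-Python sketch. It assumes a natural cubic spline and a rectilinear latitude/longitude grid. All function names are hypothetical and are not ESMF interfaces, and the clamped end-segment extrapolation is only a placeholder, not the paper's polar treatment.

```python
from bisect import bisect_right

def natural_spline_second_derivs(x, y):
    """Second derivatives of the natural cubic spline through (x, y)."""
    n = len(x)
    m = [0.0] * n                      # natural end conditions: m[0] = m[-1] = 0
    c_prime, d_prime = [0.0] * n, [0.0] * n
    for i in range(1, n - 1):          # forward sweep of the Thomas algorithm
        h0, h1 = x[i] - x[i - 1], x[i + 1] - x[i]
        rhs = 6.0 * ((y[i + 1] - y[i]) / h1 - (y[i] - y[i - 1]) / h0)
        denom = 2.0 * (h0 + h1) - h0 * c_prime[i - 1]
        c_prime[i] = h1 / denom
        d_prime[i] = (rhs - h0 * d_prime[i - 1]) / denom
    for i in range(n - 2, 0, -1):      # back substitution
        m[i] = d_prime[i] - c_prime[i] * m[i + 1]
    return m

def spline_eval(x, y, m, xt):
    """Evaluate the spline at xt; targets outside [x[0], x[-1]] reuse the end segment."""
    i = min(max(bisect_right(x, xt) - 1, 0), len(x) - 2)
    h = x[i + 1] - x[i]
    a, b = x[i + 1] - xt, xt - x[i]
    return ((m[i] * a**3 + m[i + 1] * b**3) / (6.0 * h)
            + (y[i] / h - m[i] * h / 6.0) * a
            + (y[i + 1] / h - m[i + 1] * h / 6.0) * b)

def spline_weights(x_src, x_dst):
    """Weight generation: W[r][j] is the contribution of source node j to target r.

    Spline interpolation is linear in the data, so interpolating each unit
    vector yields one column of the weight matrix. This depends only on the
    grids, so it can be computed once and reused for every field.
    """
    n = len(x_src)
    weights = [[0.0] * n for _ in x_dst]
    for j in range(n):
        unit = [1.0 if k == j else 0.0 for k in range(n)]
        m = natural_spline_second_derivs(x_src, unit)
        for r, xt in enumerate(x_dst):
            weights[r][j] = spline_eval(x_src, unit, m, xt)
    return weights

def apply_weights_2d(w_lat, w_lon, field):
    """Result computation: two 1D passes over field[lat_index][lon_index]."""
    # Pass 1: interpolate every source row along longitude.
    tmp = [[sum(w * v for w, v in zip(w_row, f_row)) for w_row in w_lon]
           for f_row in field]
    # Pass 2: interpolate the intermediate columns along latitude.
    return [[sum(w_row[i] * tmp[i][c] for i in range(len(tmp)))
             for c in range(len(w_lon))]
            for w_row in w_lat]

# Example: generate weights once, then apply them to a field.
src_lat, src_lon = [0.0, 1.0, 2.5, 4.0], [0.0, 2.0, 3.0, 5.0, 6.0]
dst_lat, dst_lon = [0.5, 2.5, 3.7], [1.0, 4.4]
w_lat = spline_weights(src_lat, dst_lat)
w_lon = spline_weights(src_lon, dst_lon)
field = [[2.0 * a - 3.0 * b + a * b + 1.0 for b in src_lon] for a in src_lat]
result = apply_weights_2d(w_lat, w_lon, field)
```

Because the weights depend only on the source and target grids, the same `w_lat` and `w_lon` can be applied to each new field at every coupling step without solving the spline system again.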
The coupler is fundamental for a coupled model to realize complex interactions among its component models. This paper focuses on the coupling process of the Wave-Circulation (W-C) coupled model, which consists of MASNUM (Key Laboratory of Marine Science and Numerical Modeling wave model) and POM (Princeton Ocean Model). The current coupling module of this model exchanges data through I/O files, which is inefficient and has become a performance bottleneck, especially when the coupled model uses a large number of processes. To improve the performance of the W-C model, a flexible coupling module based on the Model Coupling Toolkit (MCT) is designed and implemented to replace the I/O file coupling module. Our empirical studies demonstrate that the online coupling module outperforms the I/O file coupling module and dramatically improves the parallel performance of the coupled model: when the number of processes increases to 96, the whole EXP-C run takes only 695.8 seconds, which is 58.8% of the execution time of EXP-F. Based on our experiments under 2D Parallel Decomposition (2DPD), we suggest automatically assigning parallel decomposition strategies to the component models in order to achieve high parallel efficiency.
This paper focuses on optimizing the cache performance of sparse matrix-matrix multiplication (SpGEMM). It classifies cache misses into two categories: one is caused by the irregular distribution pattern of the multiplier matrix, and the other is caused by the multiplicand. The paper puts forward an optimization method for each. The first, a hash-based method, effectively removes cache misses of the first category and improves performance by a factor of 6 in the best cases on an Intel 8-core CPU. For cache misses of the second category, the paper proposes a new cache replacement algorithm that achieves a much higher cache hit rate than other algorithms based on historical knowledge and is applicable to CELL and GPU. To further verify the effectiveness of these methods, we implement the algorithm on a GPU, where the performance scales perfectly with the size of on-chip storage.
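For readers unfamiliar with SpGEMM, the following sketch shows a generic row-wise (Gustavson-style) multiplication of two CSR matrices in which a hash map accumulates each output row. It is not the paper's hash-based method, whose details the abstract does not give. It only illustrates where the irregular accesses come from: the rows of the second operand that get read are determined by the column indices of the first. The function name and the CSR argument layout are assumptions made for illustration.

```python
def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Multiply two CSR matrices, C = A * B, returning C in CSR form.

    Each matrix is given as (row pointers, column indices, values).
    """
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(len(a_ptr) - 1):
        acc = {}  # hash accumulator for row i of C: column index -> partial sum
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a = a_idx[p], a_val[p]
            # Row k of B is selected by A's sparsity pattern, so successive
            # iterations may touch rows of B that are far apart in memory.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                acc[j] = acc.get(j, 0.0) + a * b_val[q]
        for j in sorted(acc):          # emit the row with sorted column indices
            c_idx.append(j)
            c_val.append(acc[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val

# A = [[1, 0, 2],      B = [[0, 4],
#      [0, 3, 0]]           [5, 0],
#                           [6, 7]]
A = ([0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0])
B = ([0, 1, 2, 4], [1, 0, 0, 1], [4.0, 5.0, 6.0, 7.0])
C = spgemm_csr(*A, *B)
```

The hash accumulator keeps the working set for one output row small, while the reads of `b_idx` and `b_val` remain data-dependent. That kind of access pattern is what a cache-oriented optimization of SpGEMM would need to address.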