Klotski v2: Improved DNN Model Orchestration Framework for Dataflow Architecture Accelerators

Abstract

Dataflow architecture accelerators are a new kind of scalable DNN accelerator. In these accelerators, the availability of its input operands alone determines when an instruction begins execution. DNN model orchestration determines how to partition, schedule, and map the computation onto the underlying hardware. In this article, we propose the Klotski v2 framework to solve DNN model orchestration for dataflow architecture accelerators. First, a Bayesian optimization-based, entropy-directed partition algorithm is proposed to transform a DNN model into μ ops. Second, a unified formal formulation for μ ops scheduling and mapping is presented. Third, a two-stage methodology is proposed to decouple scheduling from mapping. Fourth, a Hilbert curve-based mapping heuristic is proposed to enhance problem-solving efficiency, improving the trade-off between solution quality and algorithm runtime. Extensive results show that Klotski v2 achieves an average of 21.57% higher execution performance than previous methodologies. With the Hilbert curve-based mapping heuristic, we improve the algorithm efficiency by an average of 63.50% across different DNN workloads.
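To give a rough sense of why a Hilbert curve is attractive for mapping, the sketch below places a linear sequence of operations onto a square grid so that consecutive operations land on neighboring cells. It is an illustration only, not the Klotski v2 heuristic: the abstract does not describe the algorithm's details, and the function names (`hilbert_d2xy`, `map_ops_to_grid`), the square power-of-two grid, and the one-op-per-cell assumption are all hypothetical. The index-to-coordinate conversion is the standard textbook construction.

```python
from typing import Dict, List, Tuple


def hilbert_d2xy(n: int, d: int) -> Tuple[int, int]:
    """Convert index d along a Hilbert curve into (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate/flip the quadrant so sub-curves connect end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def map_ops_to_grid(ops: List[str], n: int) -> Dict[str, Tuple[int, int]]:
    """Hypothetical helper: place the i-th op (in schedule order) on the
    i-th cell of the Hilbert curve over an n x n grid of processing elements.
    """
    if len(ops) > n * n:
        raise ValueError("more ops than grid cells")
    return {op: hilbert_d2xy(n, i) for i, op in enumerate(ops)}


if __name__ == "__main__":
    # Illustrative op names only; a real schedule would come from a scheduler.
    placement = map_ops_to_grid([f"op{i}" for i in range(8)], n=4)
    for op, cell in placement.items():
        print(op, cell)
```

Because consecutive curve indices always map to adjacent cells, operations that are close in the linear order stay physically close on the grid, which is the locality property such a heuristic would rely on.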

Publication
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD)


Youwei Zhuo
Assistant Professor of Computer Science