Scalable Frame to Block Based Automatic Converter for Efficient Embedded Vision Processing

Computer Vision and Pattern Recognition Workshops (2013)

Abstract
A typical digital signal processor (DSP) uses hierarchical memory to handle the trade-off between cost and speed. It has a fast on-chip memory whose data-access rate is similar to the DSP's processing rate, but this memory is not large enough to hold the entire image data. Image buffers typically reside in the larger external memory, such as DDR, whose data-access rate is ~4-6X slower than the processor rate. Cache or direct memory access (DMA) mechanisms use the internal memory to compensate for the slow access rate of external memory. Optimizing an embedded processing application for such hierarchical memory systems requires block-based algorithm design. This is usually accomplished by manually redesigning the code, an effort that requires several man-months and DSP expertise. In this paper, we automate this process and demonstrate a performance improvement of ~2-4X over conventional frame-level processing. We believe that the proposed solution is novel in the sense that it is fully automated and scalable to any memory size and speed. We use a compiler-assisted parser to extract the relevant function parameters and use them to re-target the code to be block-based and to handle memory management automatically. This is an offline code generation process with self-verification. We have implemented and tested the parser for Texas Instruments (TI) C6000 DSPs, but the method is generic and works with any processor core.
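The difference between frame-level and block-based processing described above can be illustrated with a minimal C sketch. It is not the paper's generated code: the kernel (a pointwise threshold), the block dimensions BLOCK_W and BLOCK_H, and all function names are hypothetical, and memcpy stands in for the DMA transfers a real DSP would use. The final comparison only loosely mirrors the idea of self-verification by checking the block-based output against the frame-level output.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed block size; a real converter would derive it from the
   available on-chip memory. */
#define BLOCK_W 16
#define BLOCK_H 8

/* Hypothetical kernel: pointwise threshold. It needs no neighbouring
   pixels, so blocks do not have to overlap. */
void threshold_kernel(const uint8_t *src, uint8_t *dst,
                      int width, int height, int stride, uint8_t t)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++)
            dst[y * stride + x] = (src[y * stride + x] > t) ? 255 : 0;
}

/* Frame-level processing: the kernel reads and writes the whole frame
   directly in (slow) external memory. */
void process_frame(const uint8_t *src, uint8_t *dst,
                   int width, int height, uint8_t t)
{
    threshold_kernel(src, dst, width, height, width, t);
}

/* Block-based processing: each block is copied into small buffers that
   stand in for on-chip memory, processed there, and copied back. */
void process_blocks(const uint8_t *src, uint8_t *dst,
                    int width, int height, uint8_t t)
{
    static uint8_t in_blk[BLOCK_W * BLOCK_H];   /* stand-in for on-chip input  */
    static uint8_t out_blk[BLOCK_W * BLOCK_H];  /* stand-in for on-chip output */

    assert(width > 0 && height > 0);

    for (int by = 0; by < height; by += BLOCK_H) {
        /* Clip the last row of blocks to the frame height. */
        int bh = (height - by < BLOCK_H) ? height - by : BLOCK_H;
        for (int bx = 0; bx < width; bx += BLOCK_W) {
            /* Clip the last column of blocks to the frame width. */
            int bw = (width - bx < BLOCK_W) ? width - bx : BLOCK_W;

            /* "DMA in": external frame -> on-chip block, row by row. */
            for (int r = 0; r < bh; r++)
                memcpy(&in_blk[r * BLOCK_W],
                       &src[(by + r) * width + bx], (size_t)bw);

            threshold_kernel(in_blk, out_blk, bw, bh, BLOCK_W, t);

            /* "DMA out": on-chip block -> external frame, row by row. */
            for (int r = 0; r < bh; r++)
                memcpy(&dst[(by + r) * width + bx],
                       &out_blk[r * BLOCK_W], (size_t)bw);
        }
    }
}

/* Simple check: block-based output must equal frame-level output. */
int outputs_match(const uint8_t *a, const uint8_t *b, int n)
{
    return memcmp(a, b, (size_t)n) == 0;
}
```

To try it, fill a width x height buffer, call process_frame and process_blocks with the same threshold, and pass both outputs to outputs_match. On a host PC the two versions run at similar speed; the gain reported in the abstract comes from the kernel operating on fast on-chip memory while transfers to and from external memory are handled separately.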
Keywords
vision processing, fast on-chip memory, external memory, direct memory access, scalable frame, hierarchical memory system, hierarchical memory, memory management, larger external memory, internal memory, data access rate, memory size, automatic converter, optimization, digital signal processing, kernel, algorithm design and analysis, digital signal processor, data transfer, dsp, computer vision