Publications

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting, International Conference on Machine Learning (ICML).

Building Efficient Deep Neural Networks with Unitary Group Convolutions, Computer Vision and Pattern Recognition (CVPR).

The Celerity Open-Source 511-Core RISC-V Tiered Accelerator Fabric: Fast Architectures and Design Methodologies for Fast Chips, IEEE Micro (Vol. 38, Issue 2).

Serving DNNs in Real Time at Datacenter Scale with Project Brainwave, IEEE Micro (Vol. 38, Issue 2).

Rosetta: A Realistic High-Level Synthesis Benchmark Suite for Software Programmable FPGAs, Int’l Symp. on Field-Programmable Gate Arrays (FPGA).

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration, Computer Vision and Pattern Recognition Workshops (CVPRW).

Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs, Int’l Symp. on Field-Programmable Gate Arrays (FPGA).

A Parallel Bandit-Based Approach for Autotuning FPGA Compilation, Int’l Symp. on Field-Programmable Gate Arrays (FPGA).

Improving High-Level Synthesis with Decoupled Data Structure Optimization, Design Automation Conference (DAC).

ElasticFlow: A Complexity-Effective Approach for Pipelining Irregular Loop Nests, Int’l Conf. on Computer Aided Design (ICCAD).

Area-Efficient Pipelining for FPGA-Targeted High-Level Synthesis, Design Automation Conference (DAC).
