- Hardware and Software Optimizations for Accelerating Deep Neural Networks: Survey of Current Trends, Challenges, and the Road Ahead
- Hardware Beyond Backpropagation: a Photonic Co-Processor for Direct Feedback Alignment
- Compiling Spiking Neural Networks to Mitigate Neuromorphic Hardware Constraints
- MacLeR: Machine Learning-based Run-Time Hardware Trojan Detection in Resource-Constrained IoT Edge Devices
- Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration
- The Hardware Lottery
- Nengo and low-power AI hardware for robust, embedded neurorobotics
- RANC: Reconfigurable Architecture for Neuromorphic Computing
- EagerPy: Writing Code That Works Natively with PyTorch, TensorFlow, JAX, and NumPy
- Qibo: a framework for quantum simulation with hardware acceleration
- Standing on the Shoulders of Giants: Hardware and Neural Architecture Co-Search with Hot Start
- Cortex: A Compiler for Recursive Deep Learning Models
- Static Neural Compiler Optimization via Deep Reinforcement Learning
- MLIR: A Compiler Infrastructure for the End of Moore's Law
- TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir
- Low Latency CMOS Hardware Acceleration for Fully Connected Layers in Deep Neural Networks
- The Deep Learning Compiler: A Comprehensive Survey
- Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training ⭐
- Survey of Machine Learning Accelerators ⭐
- MobileDets: Searching for Object Detection Architectures for Mobile Accelerators
- Marvel: A Data-centric Compiler for DNN Operators on Spatial Accelerators
- A Survey on Impact of Transient Faults on BNN Inference Accelerators
- DNN+NeuroSim V2.0: An End-to-End Benchmarking Framework for Compute-in-Memory Accelerators for On-chip Training
- Optimizing Memory-Access Patterns for Deep Learning Accelerators
- Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks
- Hardware Implementation of Deep Network Accelerators Towards Healthcare and Biomedical Applications
- Hardware Acceleration of Sparse and Irregular Tensor Computations of ML Models: A Survey and Insights
- hxtorch: PyTorch for BrainScaleS-2 -- Perceptrons on Analog Neuromorphic Hardware
- Compiling Spiking Neural Networks to Neuromorphic Hardware
- Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach Using MAESTRO
- Exposing Hardware Building Blocks to Machine Learning Frameworks
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey
- Just another quantum assembly language (Jaqal)
- VirtualFlow: Decoupling Deep Learning Model Execution from Underlying Hardware
2020
- Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator
- Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach Using MAESTRO
- Spiking Neural Networks Hardware Implementations and Challenges: a Survey
- Lupulus: A Flexible Hardware Accelerator for Neural Networks
- Benchmarking Deep Spiking Neural Networks on Neuromorphic Hardware
- Real-Time Apple Detection System Using Embedded Systems With Hardware Accelerators: An Edge AI Application
- Exposing Hardware Building Blocks to Machine Learning Frameworks
- DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architectures
- Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware
- MTJ-Based Hardware Synapse Design for Quantized Deep Neural Networks
- Integrating Hardware Diversity with Neural Architecture Search for Efficient Convolutional Neural Networks
- Performance and Comparisons of STDP based and Non-STDP based Memristive Neural Networks on Hardware
- UWB-GCN: Hardware Acceleration of Graph-Convolution-Network through Runtime Workload Rebalancing
- Benchmarking Contemporary Deep Learning Hardware and Frameworks: A Survey of Qualitative Metrics ⭐
- Efficient Hardware Implementation of Incremental Learning and Inference on Chip
- DRCAS: Deep Restoration Network for Hardware Based Compressive Acquisition Scheme
- LSTM-Sharp: An Adaptable, Energy-Efficient Hardware Accelerator for Long Short-Term Memory
- A Stealthy Hardware Trojan Exploiting the Architectural Vulnerability of Deep Learning Architectures: Input Interception Attack (IIA)
- On Neural Architecture Search for Resource-Constrained Hardware Platforms
- K-TanH: Hardware Efficient Activations For Deep Learning
- xYOLO: A Model For Real-Time Object Detection In Humanoid Soccer On Low-End Hardware
- Impact of Inference Accelerators on hardware selection
- Understanding Reuse, Performance, and Hardware Cost of DNN Dataflows: A Data-Centric Approach
- Design Space Exploration of Hardware Spiking Neurons for Embedded Artificial Intelligence
- Towards hardware acceleration for parton densities estimation
- Spiking Neural Network on Neuromorphic Hardware for Energy-Efficient Unidimensional SLAM
- Boosting Throughput and Efficiency of Hardware Spiking Neural Accelerators using Time Compression Supporting Multiple Spike Codes
- Mapping Spiking Neural Networks to Neuromorphic Hardware
- Applying Quantum Hardware to non-Scientific Problems: Grover's Algorithm and Rule-based Algorithmic Music Composition
- Deep Neural Network Approximation for Custom Hardware: Where We've Been, Where We're Going
- On-chip learning in a conventional silicon MOSFET based Analog Hardware Neural Network
- Fast Training of Sparse Graph Neural Networks on Dense Hardware
- Hardware Aware Neural Network Architectures using FbNet