Real-time moving object detection algorithm on high-resolution videos using GPUs P Kumar, A Singhal, S Mehta, A Mittal Journal of Real-Time Image Processing 11 (1), 93-109, 2013 | 117 | 2013 |
Tile Size Selection Revisited S Mehta, G Beeraka, PC Yew Transactions on Architecture and Code Optimization, 2014 | 48 | 2014 |
Revisiting Loop Fusion in the Polyhedral Framework S Mehta, PH Lin, PC Yew ACM SIGPLAN PPoPP, 2014 | 39 | 2014 |
Multi-Stage Coordinated Prefetching for Present-day Processors S Mehta, Z Fang, A Zhai, PC Yew International Conference on Supercomputing (ICS), 2014 | 34 | 2014 |
Measuring Micro-architectural Details of Multi- and Many-core Memory Systems Through Micro-benchmarking Z Fang, S Mehta Transactions on Architecture and Code Optimization, 2015 | 29 | 2015 |
Improving compiler scalability: optimizing large programs at small price S Mehta, PC Yew ACM SIGPLAN PLDI, 143-152, 2015 | 28 | 2015 |
TurboTiling: Leveraging Prefetching to Boost Performance of Tiled Codes S Mehta, PC Yew International Conference on Supercomputing (ICS), 2016 | 21 | 2016 |
Parallel Implementation of Video Surveillance Algorithms on GPU Architecture using CUDA S Mehta, A Mishra, A Singhal, P Kumar, A Mittal, K Palaniappan Advanced Computing and Communications, 2009 | 14 | 2009 |
Variable-sized blocks for locality-aware SpMV N Namashivayam, S Mehta, PC Yew 2021 IEEE/ACM International Symposium on Code Generation and Optimization …, 2021 | 11 | 2021 |
A high-performance parallel implementation of sum of absolute differences algorithm for motion estimation using CUDA S Mehta, A Misra, A Singhal, P Kumar, A Mittal HiPC Conf 2 (4), 6, 2010 | 10 | 2010 |
WearCore: A core for wearable workloads S Mehta, J Torrellas Parallel Architecture and Compilation Techniques (PACT), 2016 International …, 2016 | 6 | 2016 |
Variable liberalization S Mehta, PC Yew ACM Transactions on Architecture and Code Optimization (TACO) 13 (3), 1-25, 2016 | 6 | 2016 |
Performance analysis and optimization with Little’s law S Mehta 2022 IEEE International Symposium on Performance Analysis of Systems and …, 2022 | 3 | 2022 |
High-bandwidth prefetcher for high-bandwidth memory S Mehta, JR Kohn, DJ Ernst, HL Poxon, L DeRose US Patent 9,946,654, 2018 | 3 | 2018 |
Scalable Compiler Optimizations for Improving the Memory System Performance in Multi- and Many-core Processors S Mehta University of Minnesota, 2014 | 3 | 2014 |
Memory allocation system for multi-tier memory HL Poxon, W Homer, DW Oehmke, L DeRose, CD Andreasen, S Mehta US Patent 10,185,659, 2019 | 2 | 2019 |
Method and system for hardware-assisted pre-execution S Mehta US Patent 11,687,344, 2023 | 1 | 2023 |
An application-oriented approach to designing hybrid CPU architectures A Yue, S Mehta 2023 IEEE International Symposium on Performance Analysis of Systems and …, 2023 | 1 | 2023 |
Speculative register reclamation S Mehta 2023 IEEE International Symposium on High-Performance Computer Architecture …, 2023 | 1 | 2023 |
Systems and methods for increased bandwidth utilization regarding irregular memory accesses using software pre-execution S Mehta, GW Elsesser, TD Greyzck US Patent 11,403,082, 2022 | 1 | 2022 |