Active Awards (sorted by PI)
Yakov Nekritch
Amount: $594,059
Sponsor: National Science Foundation
Awarded: October 2022
Soner Onder
Amount: $331,704
Sponsor: National Science Foundation
Awarded: June 2022
Zhenlin Wang
Amount: $29,923
Sponsor: Clemson University / National Science Foundation
Awarded: February 2024
Select Past Awards
PI: Soner Onder
Sponsor: NSF (Florida State University)
Amount Funded: $18,382
Date Awarded: February 2022
PI: Junqiao Qiu
Sponsor: NSF
Amount Funded: $190,397
Date Awarded: May 2021
PI: Soner Onder
Sponsor: NSF
Amount Funded: $149,996
Date Awarded: May 2021
PI: Soner Onder, SAS, CS
Sponsor: National Science Foundation
Award: $246,329 | 4 Years
Awarded: September 2019
Abstract: Enabling better-performing systems benefits applications ranging from those running
on mobile devices to large data applications running in data centers. The efficiency
of most applications is still primarily determined by single-thread performance. Instruction-level
parallelism (ILP) speeds up programs by executing a program's instructions in parallel,
with 'superscalar' processors achieving the highest performance. At the same time, energy
efficiency is a key criterion to keep in mind as such speedup happens, and these two
are conflicting criteria in system design. This project develops a Statically Controlled
Asynchronous Lane Execution (SCALE) approach that has the potential to meet or exceed
the performance of a traditional superscalar processor while approaching the energy
efficiency of a very long instruction word (VLIW) processor. As implied by its name,
the SCALE approach has the ability to scale to different types and levels of parallelism.
The toolset and designs developed in this project will be available as open-source
and will also have an impact on both education and research. The SCALE architectural
and compiler techniques will be included in undergraduate and graduate curricula.
The SCALE approach supports separate asynchronous execution lanes where dependencies between instructions in different lanes are statically identified by the compiler to provide inter-lane synchronization. Providing distinct lanes of instructions allows the compiler to generate code for different modes of execution to adapt to the type of parallelism that is available at each point within an application. These execution modes include explicit packaging of parallel instructions, parallel and pipelined execution of loop iterations, single program multiple data (SPMD) execution, and independent multi-threading.
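The central idea above, that cross-lane dependencies are identified at compile time and turned into explicit inter-lane synchronization, can be illustrated with a minimal sketch. This is not the project's actual toolset or ISA; the instruction format, the WAIT/SIGNAL tokens, and the round-robin executor are all illustrative assumptions.

```python
# Toy sketch of statically controlled lane execution (illustrative only).
# Instructions are (dest_register, operation, source_registers) tuples.
# A "compiler" pass splits work into lanes and statically inserts
# WAIT/SIGNAL tokens wherever a value crosses a lane boundary.

def schedule(lanes):
    """Insert static sync tokens for every cross-lane register read."""
    producer = {}                          # register -> producing lane
    for lane_id, instrs in enumerate(lanes):
        for dest, _, _ in instrs:
            producer[dest] = lane_id
    synced = []
    for lane_id, instrs in enumerate(lanes):
        out = []
        for dest, op, srcs in instrs:
            for s in srcs:
                if producer.get(s, lane_id) != lane_id:
                    out.append(("WAIT", s))    # placed at compile time
            out.append((dest, op, srcs))
            out.append(("SIGNAL", dest))       # announce value to other lanes
        synced.append(out)
    return synced

def run(lanes):
    """Asynchronous lanes, stepped round-robin; WAIT stalls one lane only."""
    signaled, regs, trace = set(), {}, []
    pcs = [0] * len(lanes)
    while any(pc < len(l) for pc, l in zip(pcs, lanes)):
        progressed = False
        for i, lane in enumerate(lanes):
            if pcs[i] >= len(lane):
                continue
            instr = lane[pcs[i]]
            if instr[0] == "WAIT" and instr[1] not in signaled:
                continue                       # this lane stalls; others proceed
            if instr[0] == "SIGNAL":
                signaled.add(instr[1])
            elif instr[0] != "WAIT":
                dest, op, srcs = instr
                regs[dest] = op(*(regs[r] for r in srcs))
                trace.append((i, dest))
            pcs[i] += 1
            progressed = True
        assert progressed, "deadlock: missing static sync"
    return regs, trace
```

In this sketch, lane 1 consuming a register produced by lane 0 causes the scheduler to emit a WAIT in lane 1, so only the dependent lane stalls while the others continue, which is the property the abstract attributes to separate asynchronous lanes.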
PI: Soner Onder, SAS, CS
Sponsor: National Science Foundation
Award: $230,744 | 3 Years
Awarded: August 2019
Abstract: Instruction-level parallelism (ILP) in computing allows different machine-level instructions
within an application to execute in parallel within a micro-processor. Exploitation
of ILP has provided significant performance benefits in computing, but there has been
little improvement in ILP in recent years. This project proposes a new approach called
"eager execution" that could significantly increase ILP. The success of many applications
depends on how efficiently they can be executed. The proposed eager execution technique
will benefit applications ranging from those running on mobile devices to large data
applications running in the ever-growing number of data centers. Enabling better systems
at all scales will further support the ubiquitous computing that continues to pervade
everyday life.
The project's approach includes the following advantages: (1) immediately-dependent consumer instructions can be more quickly delivered to functional units for execution; (2) instructions whose source register values have not changed since their last execution can be detected and the redundant computation avoided; (3) the dependency between a producer/consumer pair of instructions can sometimes be collapsed so they can be simultaneously dispatched for execution; (4) consumer instructions from multiple paths may be speculatively executed and their results can be naturally retained in the paradigm to avoid re-execution after a branch misprediction; and (5) critical instructions can be eagerly executed to improve performance, including loads that prefetch cache lines and pre-computation of branch results to avoid branch misprediction delays.
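Advantage (2) above, detecting that an instruction's source register values are unchanged since its last execution and skipping the redundant computation, can be sketched as a small reuse buffer. This is a minimal illustration, not the project's mechanism; the class name, the per-instruction-id keying, and the hit/execution counters are all assumptions made for the example.

```python
# Toy sketch of instruction reuse (illustrative only): remember the
# last source values and result per static instruction; if the sources
# match on the next dynamic instance, return the cached result instead
# of re-executing.

class ReuseBuffer:
    def __init__(self):
        self.last = {}          # instruction id -> (source values, result)
        self.hits = 0           # redundant computations avoided
        self.executions = 0     # actual executions performed

    def execute(self, iid, op, src_vals):
        key = tuple(src_vals)
        cached = self.last.get(iid)
        if cached and cached[0] == key:
            self.hits += 1      # sources unchanged: reuse prior result
            return cached[1]
        self.executions += 1
        result = op(*src_vals)
        self.last[iid] = (key, result)
        return result
```

For example, executing the same add twice with unchanged source values performs one real execution and one reuse hit; changing a source value forces a real re-execution.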
PI: Jianhui Yue, SAS, CS
Sponsor: National Science Foundation
Award: $192,716 | 3 Years
Awarded: July 2017
Abstract: Emerging nonvolatile memory (NVM) technologies, such as PCM, STT-RAM, and memristors,
provide not only byte-addressability, low-latency reads and writes comparable to DRAM,
but also persistent writes and potentially large storage capacity like an SSD. These
advantages make NVM likely to be next-generation fast persistent storage for massive
data, referred to as in-memory storage. Yet, NVM-based storage has two challenges:
(1) Memory cells have limited write endurance (i.e., the total number of program/erase
cycles per cell); (2) NVM has to remain in a consistent state in the event of a system
crash or power loss. The goal of this project is to develop an efficient in-memory
storage framework that addresses these two challenges. This project will take a holistic
approach, spanning from low-level architecture design to high-level OS management,
to optimize the reliability, performance, and manageability of in-memory storage.
The technical approach will involve understanding the implications and impact of the
write endurance issue when cutting-edge NVM is adopted into storage systems. The improved
understanding will motivate and aid the design of cost-effective methods to improve
the lifetime of in-memory storage and to achieve efficient and reliable consistency
maintenance.
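The write-endurance challenge above is commonly addressed with wear leveling, which spreads program/erase cycles across physical cells. The following is a minimal sketch of that general idea, not this project's method; the block-granularity mapping, the swap-with-coldest policy, and the threshold parameter are illustrative assumptions.

```python
# Toy wear-leveling sketch (illustrative only): track writes per physical
# block and, when a block's write count crosses a threshold, remap its
# logical block to the least-written physical block so no single cell
# exhausts its limited program/erase endurance first.

class WearLeveler:
    def __init__(self, nblocks, threshold):
        self.map = list(range(nblocks))   # logical block -> physical block
        self.writes = [0] * nblocks       # write count per physical block
        self.threshold = threshold

    def write(self, logical):
        phys = self.map[logical]
        self.writes[phys] += 1
        if self.writes[phys] >= self.threshold:
            # migrate the hot logical block to the coldest physical block
            cold = self.writes.index(min(self.writes))
            other = self.map.index(cold)
            self.map[logical], self.map[other] = cold, phys
        return self.map[logical]          # physical block now backing it
```

Repeatedly writing one logical block therefore rotates it across physical blocks once the threshold is reached, bounding the wear on any single block.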
Publications:
Pai Chen, Jianhui Yue, Xiaofei Liao, Hai Jin. “Optimizing DRAM Cache by a Trade-off
between Hit Rate and Hit Latency,” IEEE Transactions on Emerging Topics in Computing, 2018.
Chenlei Tang, Jiguang Wan, Yifeng Zhu, Zhiyuan Liu, Peng Xu, Fei Wu and Changsheng
Xie. “RAFS: A RAID-Aware File System to Reduce Parity Update Overhead for SSD RAID,” Design Automation Test In Europe Conference (DATE) 2019, 2019.