
Understanding PRAM and VLSI Models in Parallel Computing
Explore the concept of PRAM (Parallel Random Access Machines) and VLSI models used in parallel computing. PRAM models allow for arbitrary but finite processors to access shared memory synchronously, while VLSI models have processors executing instructions concurrently. Learn about different memory access techniques and variations of the PRAM model such as EREW and CREW.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
PRAM Parallel Random Access Machines Arbitrary but finite number of processors can access any value in an arbitrarily large shared memory in a single time step. Processors execute different instruction stream but work synchronously.
Model: The machine size n can be arbitrarily large Machine is synchronous at instruction level. Each processor is executing its own series of instructions and entire machine operates at a basic time step / cycle. Within each cycle, each processor executes exactly one operation (or idle). Instructions can fetch operands from memory, perform ALU operation on data, store result back in memory.
All processors implicitly synchronize on each cycle and synchronization overhead is zero. Communication is done through reading and writing shared variables. Memory access through UMA, NUMA, CREW, EREW, or CRCW with defined conflict policy.
PRAM model can be applied to SIMD class of machines if all processors execute identical instructions on same cycle or MIMD class of machines if processors are executing different instructions. Load imbalance is the only overhead in PRAM model.
P1 Tightly synchronized P2 Processing elements operates on a synchronized read-memory, compute and write-memory cycle Shared Memory Pn
Four memory update operations are possible: Exclusive read (ER) At most one processor is allowed to read from any memory location in each cycle, Exclusive write (EW) At most one processor is allowed to write into a memory location at a time. Concurrent read (CR) Multiple processors are allowed to read the same information from same memory cell in the same cycle. Concurrent write (CW) Allows simultaneous writes to the same memory location. Here policy must be defined to avoid write conflicts.
Four most important variations of PRAM Model EREW PRAM Model Exclusive Read Exclusive Write Any memory location can be accessed once in any one step. Blocking other processors from accessing the same memory location simultaneously. CREW PRAM Model - Concurrent Read Exclusive Write Any memory location may be read any number of times during a single step. But write takes place once in a step.
ERCW PRAM Model - Exclusive Read Concurrent Write to the same memory location CRCW PRAM Model Concurrent Read Concurrent Write to the same memory location EREW and CRCW are the most popular models among algorithm designers. Q. Which model is faster?
VLSI Complexity Model Parallel computers largely dependent on use of VLSI chips to fabricate major components like processor arrays, memory arrays, large scale switching networks. Due to rapid advancements in VLSI technology, computer architect trying to implement parallel algorithms directly in hardware.
The AT2 model A is chip area T is latency for completing given computation
Model represented in three ways: Memory bound on chip area Many computations are memory bound because they need to process large data sets, memory requirement is fulfilled through dense area A (fig a). I/O bound on volume AT Data flow through the chip for a period of time T, volume AT (fig a). Bisection communication bound AT Bisection area represents maximum amount of information exchange between two half of the chip circuit during time period T. Cross section AT limits communication bandwidth of a computation (fig b).
Time Time T T A A Chip Area Chip Area A b) Communication bound on the bisection A T a) Memory bound on chip area A and I/O bound on chip represented by the volume AT