
Advanced Computer Architecture and Algorithms in Computer Science and Statistics
Explore advanced computer architecture concepts and parallel algorithms in computer science and statistics, featuring topics like DNA analysis, shared memory multiprocessors, lockless algorithms, and binary search trees. Delve into the intricate world of algorithms and their significance in modern computing.
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science CS4021/4521 Advanced Computer Architecture II Dr Jeremy Jones web: https://www.scss.tcd.ie/Jeremy.Jones/CS4021/CS4021.htm some changes from previous years
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science parallel algorithms on shared memory multiprocessors maximise performance DNA analysis (bioinformatics, string searching) multiple threads, SIMD instructions (SSEx, AVXx, ...) cost of sharing data, memory bandwidth, the problem with locks, ... lock implementations CAS based lockless algorithms (for lists, trees, hash tables...) lockless algorithms using hardware transactional memory
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science Remember algorithms underpin most of what Computer Scientists and Engineers do "algorithms are the poetry of the 21st century"
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science human genome approx. 3.2 billion bases (length n) ACGTACGTGGTAACCCGGTTA..... DNA sequencing techniques generate millions of short reads of 100 or so bases (length m) CGTGGTAA... have to find the reads in the genome (maybe with inexact matching) using the Burroughs Wheeler Transform (BWT) can find each read in O(m) rather than O(n) can parallelise BWT generation and searching for the reads
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science 5 Consider a Binary Search Tree 20 22 add (50) add(45) remove(45) NO children remove(25) ONE child remove(20) TWO children 10 30 25 40 find node (20) find smallest key in its right sub tree (22) overwrite key 20 with 22 remove old node 22 (will have zero or one child) 22 50 45
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science Concurrent add operations 5 concurrently add(27) and add(50) 20 OK if adding to different leaf nodes 10 30 concurrently add(23) and add(24) 25 40 problem as adding to same leaf node 22 27 50 result depends on how the steps of the operations are interleaved could work correctly, BUT... if there's a conflict ONLY one node may be added 24 23 concurrent removes also possible if there is no conflict
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science Concurrent Updates binary search tree normally protected by a single lock so concurrent updates by multiple threads are NOT possible plenty of scope for concurrent updates provided they do NOT conflict with one other (with a large tree conflicts will be rare) hence lockless algorithms
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science lock based algorithms are pessimistic assumes something will always go wrong lockless algorithms are optimistic assumes nothing will go wrong, deal with conflicts when detected exploits parallelism higher throughput / performance if done correctly lockless algorithms implemented using CAS instructions of hardware transactional memory (Intel TSX instruction set)
School of Computer Science and Statistics Faculty of Engineering, Mathematics and Science LOOKING FORWARDTO SEEING YOUIN SEPTEMBER