Introduction to Grid and Cloud Computing

Distributed computing involves multiple autonomous computers communicating through a network to achieve computational tasks efficiently. Explore the evolution of distributed computing, technologies, clusters, grid computing, and cloud computing with insights from renowned authors. Dive into distributed systems, their communication, and coordination across different locations, offering a scalable approach to computing tasks.

  • Grid Computing
  • Cloud Computing
  • Distributed Systems
  • Network Technologies
  • Scalable Computing

Presentation Transcript


  1. GRID AND CLOUD COMPUTING: Introduction to Grid and Cloud Computing. Courtesy: Dr Gnanasekaran Thangavel, http://web.uettaxila.edu.pk/CMS/SPR2022/teGNCCms/

  2. UNIT I: INTRODUCTION. Evolution of distributed computing: scalable computing over the Internet; technologies for network-based systems; clusters of cooperative computers; grid computing infrastructures; cloud computing. 3/12/2025

  3. Text Book. Authors: Ian Foster, Carl Kesselman. Publisher: Elsevier, 2 Dec 2003, 748 pages.

  4. Text Book. Author: Barrie Sosinsky. Publisher: John Wiley & Sons, 10 Dec 2010, 473 pages.

  5. Reference Book. Authors: Judith Hurwitz, Robin Bloor, Marcia Kaufman, Fern Halper. Publisher: John Wiley & Sons, 2010, 339 pages.

  6. Distributed Computing. Definition: A distributed system consists of multiple autonomous computers that communicate through a computer network. Distributed computing uses a network of many computers, each accomplishing a portion of an overall task, to reach a computational result much more quickly than a single computer could. It is a type of computing that involves multiple computers, remote from each other, each having a role in a computation or information-processing problem.
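
The idea of each computer accomplishing a portion of an overall task can be sketched in a few lines of Python. This is only an illustration: the worker threads stand in for separate networked machines, and the function names (`split`, `partial_sum_of_squares`, `distributed_sum_of_squares`) are hypothetical, not taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """The portion of the task handled by one (simulated) node."""
    return sum(x * x for x in chunk)


def distributed_sum_of_squares(data, workers=4):
    """Farm the chunks out to workers, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, split(data, workers)))
    return sum(partials)
```

In a real distributed system the chunks would travel over the network, so the speed-up depends on how communication cost compares with computation cost.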

  7. Introduction. A distributed system is one in which hardware or software components located at networked computers communicate and coordinate their actions only by message passing. In the term distributed computing, the word distributed means spread out across space. Thus, distributed computing is an activity performed on a spatially distributed system. These networked computers may be in the same room, on the same campus, in the same country, or on different continents.
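
Coordination "only by message passing" can be illustrated with a minimal sketch in which two components share no state and interact solely through message queues. The queues stand in for network links, and the message fields (`type`, `id`, `value`) are assumptions made for this example.

```python
import queue
import threading


def worker_node(inbox, outbox):
    """Answer each request message until a 'stop' message arrives."""
    while True:
        msg = inbox.get()
        if msg["type"] == "stop":
            break
        outbox.put({"type": "result", "id": msg["id"], "value": msg["value"] * 2})


def run_exchange(values):
    """Send one request per value and collect the replies, ordered by id."""
    to_worker, to_client = queue.Queue(), queue.Queue()
    node = threading.Thread(target=worker_node, args=(to_worker, to_client))
    node.start()
    for i, v in enumerate(values):
        to_worker.put({"type": "request", "id": i, "value": v})
    replies = [to_client.get() for _ in values]
    to_worker.put({"type": "stop"})
    node.join()
    return [r["value"] for r in sorted(replies, key=lambda r: r["id"])]
```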

  8. Introduction. [Figure: agents linked by cooperation and distribution across the Internet; other labels: Subscription, Job Request, Large-scale Application, Resource Management.]

  9. Motivation. Inherently distributed applications; performance/cost; resource sharing; flexibility and extensibility; availability and fault tolerance; scalability. Network connectivity is increasing. A combination of cheap processors is often more cost-effective than one expensive fast system. Potential increase in reliability.

  10. History, 1975-1985. Parallel computing was favored in the early years, primarily vector-based at first; thread-based parallelism was introduced gradually. The first distributed computing programs were a pair of programs called Creeper and Reaper, written in the 1970s. Ethernet was also invented in the 1970s. ARPANET e-mail, invented in the early 1970s, is probably the earliest example of a large-scale distributed application.

  11. History, 1985-1995. Massively parallel architectures began to rise, and the Message Passing Interface (MPI) and other libraries were developed. Bandwidth was a big problem. The first Internet-based distributed computing project was started in 1988 by the DEC Systems Research Center. Distributed.net, founded in 1997, is considered the first project to use the Internet to distribute data for calculation and collect the results.

  12. History, 1995-Today. Cluster/grid architectures became increasingly dominant. Special node machines were avoided in favor of commercial off-the-shelf (COTS) technologies. Web-wide cluster software emerged; Google takes this to the extreme (thousands of nodes per cluster). SETI@Home started in May 1999 to analyze the radio signals collected by the Arecibo radio telescope in Puerto Rico.

  13. Goal. Making resources accessible: data sharing and device sharing. Distribution transparency: access, location, migration, relocation, replication, concurrency, failure. Communication: make human-to-human communication easier, e.g., electronic mail. Flexibility: spread the workload over the available machines in the most cost-effective way. To coordinate the use of shared resources. To solve large computational problems.

  14. Characteristics. Resource sharing; openness; concurrency; scalability; fault tolerance; transparency.

  15. Distributed Computing Architecture. Client-server; 3-tier architecture; N-tier architecture; loose coupling or tight coupling; peer-to-peer; space-based.

  16. Application of Distributed Systems. Examples of commercial applications: database management systems; distributed computing using mobile agents; local intranet; Internet (World Wide Web); Java Remote Method Invocation (RMI).

  17. Distributed Computing Using Mobile Agents. Mobile agents can wander around a network, using free resources for their own computations.

  18. Local Intranet. A local intranet is a portion of the Internet that is separately administered and supports internal sharing of resources (file/storage systems and printers) using Internet protocols.

  19. Internet. The Internet is a global system of interconnected computer networks that use the standardized Internet Protocol Suite (TCP/IP).

  20. Java RMI. Embedded in the Java language: an object-oriented variant of remote procedure call (RPC). Adds naming compared with RPC. Restricted to Java environments. [Figure: RMI architecture.]
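
The two ideas on this slide, remote invocation on objects and lookup by name, can be sketched as follows. This is not the Java RMI API: it is an in-process Python illustration in which `Registry` and `Stub` are hypothetical classes and JSON strings stand in for the messages that would cross the network.

```python
import json


class Registry:
    """Maps names to server-side objects (the 'naming' that plain RPC lacks)."""

    def __init__(self):
        self._objects = {}

    def bind(self, name, obj):
        self._objects[name] = obj

    def lookup(self, name):
        """Return a client-side stub for the object bound to `name`."""
        return Stub(self, name)

    def dispatch(self, request_text):
        """Server side: unmarshal the request, invoke the method, marshal the reply."""
        req = json.loads(request_text)
        target = self._objects[req["name"]]
        result = getattr(target, req["method"])(*req["args"])
        return json.dumps({"result": result})


class Stub:
    """Client-side proxy: turns a method call into a request message."""

    def __init__(self, registry, name):
        self._registry = registry
        self._name = name

    def __getattr__(self, method):
        def call(*args):
            request = json.dumps(
                {"name": self._name, "method": method, "args": list(args)}
            )
            reply = self._registry.dispatch(request)
            return json.loads(reply)["result"]

        return call
```

A caller would bind an object under a name, obtain a stub with `lookup`, and then invoke methods on the stub as if the object were local.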

  21. Advantages. Economics: computers harnessed together give a better price/performance ratio than mainframes. Speed: a distributed system may have more total computing power than a mainframe. Inherent distribution of applications: some applications are inherently distributed, e.g., an ATM banking application. Reliability: if one machine crashes, the system as a whole can still survive, provided there are multiple server machines and multiple storage devices (redundancy). Extensibility and incremental growth: processing power and functionality can be scaled up gradually by adding more resources (both hardware and software), without disrupting the rest of the system.
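
The reliability argument can be made concrete with a small calculation. The sketch below assumes that machines fail independently and that the service survives as long as at least one replica is up; both are simplifying assumptions, and the function name is hypothetical.

```python
def availability(p_up, replicas):
    """Probability that at least one of `replicas` independent machines is up.

    p_up is the probability that a single machine is up at a given moment.
    """
    if not 0.0 <= p_up <= 1.0:
        raise ValueError("p_up must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - p_up) ** replicas
```

Under these assumptions, two machines that are each up 90% of the time give a combined availability of about 99%, and adding a third raises it to about 99.9%.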

  22. Disadvantages. Complexity: designers often lack experience in designing and implementing a distributed system, e.g., which platform (hardware and OS) or which language to use. Network problems: if the network underlying a distributed system saturates or goes down, the distributed system is effectively disabled, negating most of its advantages. Security: security is a major hazard, since easy access to data means easy access to secret data as well.

  23. Issues and Challenges. Heterogeneity of components: variety and differences in computer hardware, networks, operating systems, programming languages, and implementations by different developers. All differences in representation must be dealt with in order to exchange messages. Example: the calls for exchanging messages in UNIX differ from those in Windows. Openness: the system can be extended and re-implemented in various ways. This cannot be achieved unless the specification and documentation are made available to software developers. The greatest challenge for designers is to tackle the complexity of a distributed system designed by different people.
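
One common way to deal with differences in representation is to agree on a fixed wire format. The sketch below uses Python's standard `struct` module with the `!` prefix (network byte order, standard sizes) so that hosts with different native byte orders decode the same values. The message layout, a job id followed by a load figure, is an assumption made for this example.

```python
import struct

# '!' selects network (big-endian) byte order with standard sizes:
# I = 4-byte unsigned integer, d = 8-byte IEEE 754 double.
WIRE_FORMAT = "!Id"


def encode(job_id, load):
    """Pack a (job_id, load) message into 12 bytes, independent of the host CPU."""
    return struct.pack(WIRE_FORMAT, job_id, load)


def decode(payload):
    """Unpack a message produced by encode() on any host."""
    return struct.unpack(WIRE_FORMAT, payload)
```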

  24. Issues and Challenges (cont.). Transparency: the aim is to make certain aspects of distribution invisible to application programmers so that they can focus on the design of their particular application. They need not be concerned with where a resource is located or how it operates, whether it is replicated or migrated. Failures can be presented to application programmers in the form of exceptions that must be handled.
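
Presenting failures as exceptions, while hiding which replica actually served a request, might look like the following sketch. `RemoteFailure` and `read_with_failover` are hypothetical names, and each replica is modeled as a plain callable rather than a real network endpoint.

```python
class RemoteFailure(Exception):
    """Raised when a remote replica cannot serve a request."""


def read_with_failover(replicas, key):
    """Try each replica in turn; the caller never learns which one answered.

    Only when every replica fails is the failure surfaced as an exception.
    """
    for replica in replicas:
        try:
            return replica(key)
        except RemoteFailure:
            continue  # mask this failure and try the next replica
    raise RemoteFailure(f"all {len(replicas)} replicas failed for {key!r}")
```

The application code handles a single exception type and stays unaware of replication, which is the kind of replication and failure transparency the slide describes.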

  25. Issues and Challenges (cont.). Transparency: this concept can be summarized as shown in this figure. [Figure not included in the transcript.]

  26. Issues and Challenges (cont.). Security: security for information resources in a distributed system has three components: (a) confidentiality: protection against disclosure to unauthorized individuals; (b) integrity: protection against alteration or corruption; (c) availability: protection against interference with the means to access the resources. The challenge is to send sensitive information over the Internet securely and to identify a remote user or other agent correctly.
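
The integrity component can be illustrated with a message authentication code. The sketch below uses Python's standard `hmac` and `hashlib` modules; it assumes the two parties already share a secret key, and it addresses integrity only, not confidentiality or availability.

```python
import hashlib
import hmac


def sign(key, message):
    """Compute an HMAC-SHA256 tag over the message using the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Return True only if the tag matches, i.e. the message was not altered."""
    return hmac.compare_digest(sign(key, message), tag)
```

The receiver recomputes the tag and rejects any message whose tag does not match; `hmac.compare_digest` is used so that the comparison does not leak timing information.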

  27. Issues and Challenges (cont.). Scalability: distributed computing operates at many different scales, ranging from a small intranet to the Internet. A system is scalable if it remains effective when there is a significant increase in the number of resources and users. The challenges are: (a) controlling the cost of physical resources; (b) controlling the performance loss; (c) preventing software resources from running out; (d) avoiding performance bottlenecks.

  28. Issues and Challenges (cont.). Failure handling: failures in a distributed system are partial; some components fail while others continue to function. That is why handling failures is difficult: (a) detecting failures: some failures cannot be detected but may only be suspected, and the challenge is to manage in their presence; (b) masking failures: hiding a failure is not guaranteed to work in the worst case. Concurrency: where applications or services process requests concurrently, operations may conflict with one another and produce inconsistent results. Each resource must be designed to be safe in a concurrent environment.
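
A resource that is "safe in a concurrent environment" can be sketched with a lock that serializes conflicting updates. The class and helper names below are hypothetical, and threads on one machine stand in for concurrent clients of a shared resource.

```python
import threading


class SafeCounter:
    """A shared resource whose updates are serialized by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write happens atomically with respect to other threads.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def hammer(counter, threads=8, per_thread=1000):
    """Increment the counter from several threads at once and return the total."""

    def work():
        for _ in range(per_thread):
            counter.increment()

    pool = [threading.Thread(target=work) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value
```

Without the lock, two threads could read the same old value and one update would be lost, which is the kind of inconsistent result the slide warns about.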

  29. Conclusion. Distributed computing is an efficient way to optimize the use of computing resources. Distributed computing is everywhere: intranets, the Internet, and mobile and ubiquitous computing (laptops, PDAs, pagers, smart watches, hi-fi systems). It deals with hardware and software systems that contain more than one processing or storage element and run concurrently. The main motivating factor is resource sharing, such as files, printers, web pages, or database records. Grid computing and cloud computing are forms of distributed computing.

  30. Grid Computing. Grid computing is a form of distributed computing whereby a "super and virtual computer" is composed of a cluster of networked, loosely coupled computers acting in concert to perform very large tasks. Grid computing (Foster and Kesselman, 1999) is a growing technology that facilitates the execution of large-scale, resource-intensive applications on geographically distributed computing resources. It facilitates flexible, secure, coordinated, large-scale resource sharing among dynamic collections of individuals, institutions, and resources. It enables communities ("virtual organizations") to share geographically distributed resources as they pursue common goals.

  31. Criteria for a Grid. A grid coordinates resources that are not subject to centralized control; uses standard, open, general-purpose protocols and interfaces; and delivers nontrivial qualities of service. Benefits: exploit underutilized resources; resource load balancing; virtualize resources across an enterprise; data grids and compute grids; enable collaboration for virtual organizations.

  32. Grid Applications. Data- and computationally intensive applications: this technology has been applied to computationally intensive scientific, mathematical, and academic problems such as drug discovery, economic forecasting, and seismic analysis, and to back-office data processing in support of e-commerce. A chemist may use hundreds of processors to screen thousands of compounds per hour. Teams of engineers worldwide pool resources to analyze terabytes of structural data. Meteorologists seek to visualize and analyze petabytes of climate data with enormous computational demands. Resource sharing: computers, storage, sensors, networks, ... Sharing is always conditional: issues of trust, policy, negotiation, payment, ... Coordinated problem solving: distributed data analysis, computation, collaboration, ...

  33. Grid Topologies. Intragrid: local grid within an organization; trust based on personal contracts. Extragrid: resources of a consortium of organizations connected through a (virtual) private network; trust based on business-to-business contracts. Intergrid: global sharing of resources through the Internet; trust based on certification.

  34. Computational Grid. "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." (The Grid: Blueprint for a New Computing Infrastructure, Kesselman & Foster.) Example: Science Grid (US Department of Energy).

  35. Data Grid. A data grid is a grid computing system that deals with data: the controlled sharing and management of large amounts of distributed data. The data grid is the storage component of a grid environment. Scientific and engineering applications require access to large amounts of data, and this data is often widely distributed. A data grid provides seamless access to the local or remote data required to complete compute-intensive calculations. Examples: Biomedical Informatics Research Network (BIRN), the Southern California Earthquake Center (SCEC).

  36. Methods of Grid Computing. Distributed supercomputing; high-throughput computing; on-demand computing; data-intensive computing; collaborative computing; logistical networking.

  37. Distributed Supercomputing. Combines multiple high-capacity resources on a computational grid into a single, virtual distributed supercomputer. Tackles problems that cannot be solved on a single system.

  38. High-Throughput Computing. Uses the grid to schedule large numbers of loosely coupled or independent tasks, with the goal of putting unused processor cycles to work. On-Demand Computing. Uses grid capabilities to meet short-term requirements for resources that are not locally accessible. Models real-time computing demands.
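
Scheduling many independent tasks onto idle processors can be sketched with a simple greedy rule: always give the next task to the node with the least work so far. This is one possible heuristic chosen for illustration, not the scheduler of any particular grid system; task costs and node names are made up for the example.

```python
import heapq


def schedule(task_costs, node_names):
    """Assign independent tasks to nodes, largest task first, least-loaded node first.

    task_costs maps task name -> estimated cost; returns node name -> list of tasks.
    """
    heap = [(0, name) for name in node_names]  # (current load, node)
    heapq.heapify(heap)
    plan = {name: [] for name in node_names}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)  # node with the least work so far
        plan[name].append(task)
        heapq.heappush(heap, (load + cost, name))
    return plan
```

Because the tasks do not depend on one another, no coordination is needed between nodes once the plan is made, which is what makes this style of workload a good fit for loosely coupled resources.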

  39. Collaborative Computing. Concerned primarily with enabling and enhancing human-to-human interactions. Applications are often structured in terms of a virtual shared space. Data-Intensive Computing. The focus is on synthesizing new information from data maintained in geographically distributed repositories, digital libraries, and databases. Particularly useful for distributed data mining.

  40. Logistical Networking. Logistical networks focus on exposing storage resources inside networks by optimizing the global scheduling of data transport and data storage. This contrasts with traditional networking, which does not explicitly model storage resources in the network. Provides high-level services for grid applications. Called "logistical" because of the analogy with systems of warehouses, depots, and distribution channels.

  41. P2P Computing vs Grid Computing. The two differ in their target communities. A grid system deals with a more complex, more powerful, more diverse, and more highly interconnected set of resources than P2P. P2P uses heterogeneous end-user devices for resource sharing to fulfill application requirements. In P2P applications, business logic and data are distributed among end-user nodes.

  42. A Typical View of a Grid Environment. [Figure: User, Grid application, Resource Broker, Grid Information Service, and Grid Resources, connected by numbered flows labeled details of grid resources, computational jobs, processed jobs, and computation result.] (1) The Grid Information Service collects the details of the available grid resources and passes the information to the resource broker. (2) A user sends a computation- or data-intensive application to global grids in order to speed up its execution. (3) A resource broker distributes the jobs in an application to the grid resources, based on the user's QoS requirements and the details of available grid resources, for further execution. (4) Grid resources (clusters, PCs, supercomputers, databases, instruments, etc.) in the global grid execute the user's jobs.
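
The interaction between the information service, the broker, and the resources can be sketched as below. All class names, fields, and the placement rule (cheapest resource with a free CPU that fits the user's per-job budget) are assumptions made for this illustration; real brokers weigh many more QoS attributes.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    cost_per_job: float


class GridInformationService:
    """Step 1: collects details of the available grid resources."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def available(self):
        return [r for r in self._resources if r.free_cpus > 0]


class ResourceBroker:
    """Step 3: distributes jobs using QoS requirements and resource details."""

    def __init__(self, info_service):
        self._info = info_service

    def submit(self, jobs, max_cost_per_job):
        """Step 2 hands the jobs over; returns job -> resource name (or None)."""
        placement = {}
        for job in jobs:
            candidates = [
                r for r in self._info.available()
                if r.cost_per_job <= max_cost_per_job
            ]
            if not candidates:
                placement[job] = None  # no resource satisfies the QoS budget
                continue
            chosen = min(candidates, key=lambda r: r.cost_per_job)
            chosen.free_cpus -= 1  # step 4: the resource takes the job
            placement[job] = chosen.name
        return placement
```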

  43. Grid Middleware. Grids are typically managed by gridware, a special type of middleware that enables sharing and manages grid components based on user requirements and resource attributes (e.g., capacity, performance). It is software that connects other software components or applications to provide the following functions: run applications on suitable available resources (brokering, scheduling); provide uniform, high-level access to resources; address inter-domain issues of security, policy, etc. (federated identities); provide application-level status monitoring and control.

  44. Middleware. Globus (University of Chicago). Condor (University of Wisconsin): high-throughput computing. Legion (University of Virginia): virtual workspaces, collaborative computing. IBP, Internet Backplane Protocol (University of Tennessee): logistical networking. NetSolve: solving scientific problems in heterogeneous environments; high-throughput and data-intensive computing.

  45. Two Key Grid Computing Groups. The Globus Alliance (www.globus.org): composed of people from Argonne National Labs, the University of Chicago, the University of Southern California Information Sciences Institute, the University of Edinburgh, and others. OGSA/I standards were initially proposed by the Globus group. The Global Grid Forum (www.ggf.org): heavy involvement of academic groups and industry (e.g., IBM Grid Computing, HP, United Devices, Oracle, UK e-Science Programme, US DOE, US NSF, Indiana University, and many others). Process: meets three times annually; solicits involvement from industry, research groups, and academics.

  46. Some of the Major Grid Projects.
     Name | URL/Sponsor | Focus
     EuroGrid, Grid Interoperability (GRIP) | eurogrid.org; European Union | Create tech for remote access to supercomputer resources & simulation codes; in GRIP, integrate with Globus Toolkit
     Fusion Collaboratory | fusiongrid.org; DOE Off. Science | Create a national computational collaboratory for fusion research
     Globus Project | globus.org; DARPA, DOE, NSF, NASA, Msoft | Research on Grid technologies; development and support of Globus Toolkit; application and deployment
     GridLab | gridlab.org; European Union | Grid technologies and applications
     GridPP | gridpp.ac.uk; U.K. eScience | Create & apply an operational grid within the U.K. for particle physics research
     Grid Research Integration Dev. & Support Center | grids-center.org; NSF | Integration, deployment, support of the NSF Middleware Infrastructure for research & education

  47. Cloud Computing. Cloud computing refers to applications and services that run on a distributed network using virtualized resources and that are accessed by common Internet protocols and networking standards. It is distinguished by the notion that resources are virtual and limitless and that details of the physical systems on which software runs are abstracted from the user.

  48. Cloud Computing. Cloud computing takes the technology, services, and applications that are similar to those on the Internet and turns them into a self-service utility. The use of the word cloud makes reference to two essential concepts. Abstraction: cloud computing abstracts the details of system implementation from users and developers. Applications run on physical systems that aren't specified, data is stored in locations that are unknown, administration of systems is outsourced to others, and access by users is ubiquitous. Virtualization: cloud computing virtualizes systems by pooling and sharing resources. Systems and storage can be provisioned as needed from a centralized infrastructure, costs are assessed on a metered basis, multi-tenancy is enabled, and resources are scalable with agility.
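
Metered, multi-tenant cost assessment can be illustrated with a short sketch: each tenant is charged only for the resources it actually consumed. The rate table, resource names, and tenant names are hypothetical and do not reflect any provider's pricing.

```python
from collections import defaultdict

# Hypothetical unit prices, for illustration only.
RATES = {"vm_hour": 0.05, "gb_month": 0.02}


def bill(usage_records):
    """Total the metered charges per tenant.

    usage_records is an iterable of (tenant, resource, quantity) tuples.
    """
    totals = defaultdict(float)
    for tenant, resource, quantity in usage_records:
        totals[tenant] += RATES[resource] * quantity
    return {tenant: round(total, 2) for tenant, total in totals.items()}
```

Several tenants share the same pooled infrastructure, yet each bill depends only on that tenant's own usage records.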

  49. Cloud Computing. Cloud computing is an abstraction based on the notion of pooling physical resources and presenting them as a virtual resource. It is a new model for provisioning resources, for staging applications, and for platform-independent user access to services. To help clarify how cloud computing has changed the nature of commercial system deployment, consider these three examples. Google: In the last decade, Google has built a worldwide network of datacenters to service its search engine. In doing so, Google has captured a substantial portion of the world's advertising revenue. That revenue has enabled Google to offer free software to users based on that infrastructure and has changed the market for user-facing software. This is the classic Software as a Service case. Azure Platform: By contrast, Microsoft is creating the Azure Platform. It enables .NET Framework applications to run over the Internet as an alternate platform for Microsoft developer software running on desktops. Amazon Web Services: One of the most successful cloud-based businesses is Amazon Web Services, which is an Infrastructure as a Service offering that lets you rent virtual computers on Amazon's own infrastructure.

  50. Thank You. Questions and comments?
