Database Schema Fragmentation Optimization

p resented n.w
1 / 18
Embed
Share

Explore the concepts of horizontal and vertical database schema fragmentation, discussing their advantages, drawbacks, and optimization strategies. Learn how to derive horizontal fragments, ensure correctness, and address various drawbacks in database design.

  • Database
  • Schema
  • Fragmentation
  • Optimization
  • Horizontal

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. PRESENTED BY MANOJ KUDUDULA PRESENTATION ID: 10 University of Houston Clear Lake

  2. 1. DERIVED HORIZONTAL FRAGMENTATION 2. DRAWBACKS 2.1 Fragmentation used in more applications 2.2 Fragmentation using better join characteristics 3. CORRECTNESS 3.1 Completeness 3.2 Reconstruction 3.3 Disjointness 4. Vertical Fragmentation

  3. 2. Derived Horizontal Fragmentation A derived horizontal fragmentation is defined on a member relation of a link according to a selection operation specified on its owner It is important to remember two points First, the link between the owner and the member relations is defined as an equi-join Second, an equi-join can be implemented by means of semi-join

  4. 2. DHF (contd.) Accordingly, given a link L where owner(L) = S and member(L) = R, the derived horizontal fragments of R are defined as: Where w is the maximum number of fragments that will be defined on R, and where Fi is the formula according to S which the primary horizontal fragment Si is defined

  5. 2. DHF(contd.) To carry out a derived horizontal fragmentation, three inputs are needed: The set of partitions of the owner relation (PAY1, PAY2) The member relation The set of semi join predicates between the owner and member (EMP.TITLE=PAY.TITLE)

  6. 3. Drawbacks In a database schema if there are two link into a relation R, there could be more than one possible derived horizontal fragmentation of R The choice of candidate fragmentation is based on two criteria The fragmentation used in more applications The fragmentation with better join characteristics

  7. 3.1 The fragmentation used in more Applications It is quite straight forward if we take into consideration the frequency with which application access some data The access of the heavy users can minimize the total impact on system performance

  8. 3.2 The Fragmentation with better join characteristics Consider the last example, the effect of this fragmentation is that the join of the EMP and PAY relations to answer the query is assisted By performing it on smaller relations By potentially performing joins in parallel

  9. 3.2 The Fragmentation with better join characteristics (contd) The first point is obvious, the fragments of EMP are smaller than EMP itself Therefore, it will be faster to join any fragment of PAY with any fragment of EMP than to work with the relations themselves The second point is however, more important and is at the heart of distributed databases If, besides executing a number of queries at different sites, we can parallelize execution of one join query, the response time or throughput of the system can be expected to be improved.

  10. 3.2 The Fragmentation with better join characteristics (contd) Lets now consider ASG, assume that there are two applications The first application finds the names of engineers who work at certain places, it turns on all three sites and accesses the information about the engineer who work on local projects with higher probability than those of projects at other locations At each administrative sites where employee records are managed, users would like to access the responsibilities on the projects that these employee work on and learn how they will work on those projects

  11. 4. Checking of Correctness We should now check the fragmentation algorithms discussed so far with respect to three correctness criteria Completeness Reconstruction Disjointness

  12. 4.1 Completeness The completeness of a primary horizontal fragmentation is based on the selection predicate used As long as the selection predicates are complete, the resulting fragmentation is guaranteed to be complete as well

  13. 4.1 Completeness (contd..) The fragmentation is somewhat more difficult to define completeness of a derived horizontal For example, there should be no ASG tuple which has a project number that is not also contained in PROJ, this rule is know as referential integrity

  14. 4.2 Reconstruction Reconstruction of a global relation from its fragments is performed by the union operator in both the primary and the derived horizontal fragmentation Thus for a relation R with fragmentation

  15. 4.3 Disjointness It is easier to establish Disjointness of fragmentation for primary than for derived horizontal fragmentation In PHF Disjointness is guaranteed as long as the minterm predicates determining the fragmentation are mutually exclusive

  16. Vertical fragmentation vertical fragmentation of a relation R produces fragments R1,R2, ...,Rr , each of which contains a subset of R s attributes as well as the primary key of R. The objective of vertical fragmentation is to partition a relation into a set of smaller relations so that many of the user applications will run on only one fragment

  17. Vertical fragmentation has been investigated within the context of centralized database systems as well as distributed ones. Its motivation within the centralized context is as a design tool, which allows the user queries to deal with smaller relations, thus causing a smaller number of page accesses.

  18. Thank you

Related


More Related Content