Reconstructing Weighted Graphs and Adaptive Algorithms in Computer Science

nader h bshouty and hanna mazzawi n.w
1 / 45
Embed
Share

Explore the research on reconstructing weighted graphs, additive queries, and adaptive vs. non-adaptive algorithms by Nader H. Bshouty and Hanna Mazzawi. Learn about the applications in genome sequencing, reaction graph reconstruction, and information theoretic lower bound in computer science.

  • Graphs
  • Algorithms
  • Computer Science
  • Adaptive
  • Applications

Uploaded on | 18 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Nader H. Bshouty and Hanna Mazzawi Department of Computer Science Technion Israel Institute of Technology

  2. Reconstructing Weighted Graphs AQ(S) Answer

  3. Additive Queries w w 5 7 w w w 4 w 6 9 8 S w w 2 3 1 w = + ( ) AQ S w w 1 2 What is the sum of weight of edges in the subgraph induces by S?

  4. Adaptive VS Non-adaptive Non-adaptivealgorithms: Ask all the queries in advance. Adaptivealgorithms: Can take into account the outcome of previous queries.

  5. Applications Reconstructing Reaction Graph Chem Chem Chem Chem One reaction Chem Chem Chem Chem Chem Chem Chem Chem

  6. Example Genome sequencing: Short reads are assembled to contigs. Contiguous fragments that cover the genome. Query multiplex PCR method. Returns the number of adjacent contigs in the original genome. Contig 1

  7. Information Theoretic Lower Bound The information theoretic lower bound for the query complexity of reconstructing a weighted graph with m edges and n vertices is log m log n m

  8. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 V. Grebinski and G. Kucherov [Discrete Applied Mathematics] Optimal algorithm for reconstructing a hamiltonian cycle

  9. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 V. Grebinski and G. Kucherov [Algorithmica] 1. Tight upper bound for constant degree unweighted graphs. 2. Optimal polynomial time algorithm for unweighted graphs with (n2) edges.

  10. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 V. Grebinski [COCOON] Tight upper bound for d-degenerate unweighted graphs.

  11. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 D. Angluin and J. Chen [COLT] Adaptive polynomial time O(m log n) algorithm for the unweighted problem (using weaker queries).

  12. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 M. Bouvel, V. Grebinski and G. Kucherov [WG] Survey for known results and various applications for the problem.

  13. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 L. Reyzin and N. Srivastava [ALT] Simple O(m log n) adaptive polynomial time algorithm for unweighted graphs.

  14. Known Results Brief Survey 2004 1997 1998 2005 2007 2008 S. Choi and J. H. Kim [STOC] 1. Tight upper bound for reconstructing any unweighted graph. 2. Tight upper bound for weighted graphs with many edges (m > (log n)c), such that the weights are between n-aand nb.

  15. Our Result We close the gap left by S. Chioand J. H. Kim: We prove the existence of an optimal non-adaptive algorithm for reconstructing graphs with any number of edges, such that the weights on the edges are real numbers between n-aand nb, for any positive constants a and b.

  16. Algebraic View of the Problem G 9v v 10 v 0 0 0 0 w1 11 5v 6v 0 0 0 0 w1 7v v = 0 0 0 0 w2 A 12 S 3v 4v G 0 0 0 0 w2 w 1v 2 8v 1 w 2v 0 0 0 0 0 T x A x 1 v S ( ) S ( 0 , 0 , 1 , 1 , 1 , 1 ) 0 = i G = AQ = x ,..., x i 0 otherwise 2

  17. Algebraic View of the Problem [V. Grebinski and G. Kucherov] For any symmetric matrix A we have: ( ) ( ) T + + T T x Ax y Ay x y A x y T T = + + T 1 1 1 1 x Ay x Ax y Ay 1 1 1 1 2 2 y 2 y = = , . x x x y y x where Therefore, our problem is equivalent to reconstructing A using the assignment oracle f: ( ) , y x f 1 1 = T , x A y G n , } 1 , 0 { . x y where

  18. Unweight Problem A query (x,y) distinguish between two distinct graphs G1 and G2if y A x G 1 T T x A y G 2 Our goal is to find a set of k queries, such that, for every two distinct graphs there is at least one query that distinguish between them. The problem of reconstructing unweighted graph with m edges is easy when we are allowed to ask O(m log n) queries.

  19. Answers Vector ( ). T T T v = , ,..., x A y x A y x A y Denote by 1 1 2 2 G G G k G k G v G v 2 1

  20. Unweighted Problem Since ( ) T T T y 0 x A y x A x A A y G G G G 1 2 1 2 Our goal is to show a set of queries such that for every B in B we have one query such that xT 0 By where B = = G : and , distinct are graphs B B A A G G 1 2 G 1 2

  21. Unweight Problem Suppose we have two different graphs G1and G2. A random query distinguish between them with probability greater or equal to a . ( Pr 1 A x G ) 1 T 0 A y G 4 2 If we randomly choose k = O(m log n) queries. The probability that for every query we have the same answer on G1and G2is bounded by ( G i A x i : Pr 1 k ) 3 ( ) T = 0 A y G i 4 2

  22. Unweight Problem The number of matrices in B is at most 2 n 4 4 m B 3 m For k = O(m log n) Using union bound, the probability that there exists a matrix B in B that was not eliminated is k 3 ( ) T = B Pr : 0 1 B i iBy x i 4 This implies the result.

  23. Reducing Query Complexity The goal now is to reduce the complexity: log m log n ( log ) O m n O m To do this, we divide the set B into two sets: Matrices B in B that has few non-zero entries. Matrices B in B with many non-zero entries.

  24. Cases All = B Matrices B B Matrices B Matrices B B B = = B B with few with many 1 2 non - zero non - zero entires entires

  25. Graphical Meaning Matrices B Matrices B B B = = B B with few with many 1 2 non - zero non - zero entires entires ( ) G ( ) E E G 1 2 E(G1) E(G2)

  26. First Case log m log n = In this case, we must find queries to eliminate every B in B1. k O m If we randomly choose k queries, the probability that there exists a matrix in B1 that we haven t eliminate is less than 1. This is because the size of B1is small! Therefore, this case can be easily handled.

  27. E(G1) E(G2) Second Case We use the following [Littlewood - Offord]: a n R Let be vector. Then for a randomly chosen (0,1)- vector x we have 1 = aT Pr 0 x ( ) wt a

  28. Idea Suppose y is given in advance, and we randomly choose x. Then, 1 ( ) = xT Pr 0 . By ( ) wt By

  29. Idea (Cont.) Now, for k given vectors y1, y2, , ykif we randomly choose x1, x2, , xkthen 1 ( ) ( ) T = Pr : 0 i x By i i t i i where tiis the number of non-zero entries in Byi.

  30. Choosing yis To make sure that the ti s are large enough, we use specific set of vectors P with small weight! We choose yi s such that for each vector p in P we have that at least k/4 of the values T T T , ,..., p y p y p y 1 2 k are non-zeros.

  31. Choosing yis We show that every matrix B in B2must have many row that are from P. = T T iy ix ix = B

  32. Choosing yis This way we are able to bound (from below) the sum of the ti s. Bounding the sum of the ti s, together with the fact that each tiis bounded by m, we are able to bound the product of them.

  33. Choosing xs We already showed that for a randomly chosen xi s 1 T = Pr i : 0 x By i i t i i We showed a set of vectors yi s that for every B the sum of the ti s is large and therefore, their product is also large. This leads to bounding the probability 0 : Pr = i i By x i 1 1 B i T | 2 | t 2 i

  34. Weighted Case w w 5 7 w w w 4 6 9 w 8 w w 2 3 1 w

  35. Weighted Case The same technique shows that we can reconstruct a hidden graph with integer weights that are bounded by ncfor any constant c.

  36. What About Real Number Weights? s1 = n-a s2= nb 0 m

  37. Cases All = B Matrices B B Matrices B Matrices B B B = = B B with few with many 1 2 heavy heavy entires entires

  38. Answers Vector ( ). T T T v = , ,..., x A y x A y x A y Denote by 1 1 2 2 G G G k G k

  39. Answers Vector Not only y G 1 T T x A x A y G 2 But, x | T T | x A y A y m G G 1 2 G v 1 m G v 2

  40. Answers Vector m 2 v G 1 m G v 2

  41. Real Number Weights Suppose we have a graph G with real number weights. ' G G w w 5' w 7' w 5 7 w 4' w w 6' w w 4 w 9' w 8' w 6 9 8 w w 2' w 3' w 2 3 1 w 1' w

  42. Discritization wibecomes w i. w' i Therefore, | | ' i w w i w i 2 We have m edges. Therefore for every query we have T T | | x A y x A y m ' G G 2

  43. Answers Vector m 2 G v ' m G v m 2

  44. Conclusions We prove the existence of an optimal non-adaptive algorithm for reconstructing graphs with any number of edges, such that the weights on the edges are real numbers between n-aand nb, for any constants a and b.

  45. Open Problems & New Results An explicit construction for an algorithm in the non-adaptive case is still open. Related new results: Optimal adaptive constructive algorithm. Optimally Reconstructing Weighted Graphs using Queries . SODA 2010. Reconstructing graphs with unbounded weights. Appears in On Parity Check (0,1)-Matrix over Zp in ECCC. Extension of the results to constant dimension hypergraphs.

More Related Content