Practical Query Pricing Strategies with QueryMarket Insights

t oward p ractical q uery p ricing w ith q uery n.w
1 / 27
Embed
Share

Explore the world of data pricing through innovative QueryMarket strategies. Learn how sellers price queries and buyers purchase database queries, with a focus on efficient computation and revenue sharing among sellers. Discover the formal pricing framework that supports diverse SQL queries and marketplace functionalities.

  • Data pricing
  • QueryMarket
  • SQL queries
  • Pricing framework
  • Revenue sharing

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. TOWARD PRACTICAL QUERY PRICING WITH QUERYMARKET Paraschos Koutris Prasang Upadhyaya Magdalena Balazinska Bill Howe Dan Suciu University of Washington SIGMOD 2013

  2. MOTIVATION Data is increasingly sold and bought on the web Websites that sell data: Xignite (financial) Gnip (social) Data marketplace services: Windows Azure Marketplace Infochimps Factual DataMarket 2

  3. A PRICING SCENARIO (1) English-German dictionary T PRICING SCHEMES english german thanks Danke Sell the whole table T for a fixed price Q: translate only the word thanks The user pays for redundant information car Auto day Tag road Strasse Price per output tuple Q: Does the word thanks translate to Auto ? An empty result still carries information Road Weg 3

  4. A PRICING SCENARIO (2) English-German dictionary T Word Frequency Stats UF word frequency rock 0.025 pop 0.030 database 0.001 genre music music science .. rank 20 10 1453 english german thanks Danke car Auto day Tag road Strasse Q1: Return all translations to German of top 10 words in the genre music Road Weg Q2: Return all translations to German of top 20 words in the genre music Current systems do not sell queries that combine datasets Queries issued by a user may have overlapping content 4

  5. HOW TO PRICE DATA English-German dictionary T english german p( T.english= thanks )=$0.1 thanks Danke p( T.german= Auto )=$0.5 car Auto p( T.english= car )=$0.1 p( T.english= day )=$0.1 day Tag road Strasse p( T.english= road )=$0.15 road Weg p( T.english= cat )=$0.05 Price points selection queries on single table exhaust the possible values (ColA) of some attribute A may select on values not in the active domain 5

  6. QUERYMARKET: CONTRIBUTIONS A formal pricing framework where: sellers specify a set of price points as selection queries buyers can purchase any query on the database the system automatically computes the price of the query Support efficient computation of prices for a large class of SQL queries Support the necessary functionality for a marketplace: Pricing queries with overlapping information content Database updates Revenue sharing among different sellers? 6

  7. OUTLINE 1. The Pricing Framework 2. Computing the Price 3. Query History 4. Revenue Sharing 7

  8. THE PRICING FRAMEWORK [Koutris et al., PODS 2012] The seller defines price points (view-price pairs): S = { (V1,p1), (V2,p2), } A buyer can buy any query Q The system will compute priceDS(Q) Buyer Q(D) ? Seller priceDS(Q) Pricing System + Database D Price points 8

  9. PROPERTIESOF PRICES We say that the views V1, , Vk determine Q if one can compute Q(D) from V1(D), , Vk(D) without access to D Arbitrage-free: Given D, priceD(Q) is arbitrage-free if for all views V1, , Vk that determine Q: priceD(Q) priceD(V1) + + priceD(Vk) Discount-free: priceD(Q) must not offer additional discounts except for the explicit price points defined by the seller 9

  10. THE PRICING FORMULA Arbitrage-Price: The price of the cheapest set of views from price points S that determine the query Q unique + arbitrage-free + discount-free + agrees with price points ColA = { a1, a2, a3 } ColB = { b } A a1 a2 B b b Table S Table R A Q(y) = R(x),S(x,y) a1 price = $1 price = $2 price = $3 { [R.A=a1], [S.B=b] } determines Q cost = 1 + 3 = 4 { [R.A=a1], [S.A=a1] } also determines Q cost = 1 + 2 = 3 (cheapest possible) 10 10

  11. OUTLINE 1. The Pricing Framework 2. Computing the Price 3. Query History 4. Revenue Sharing 11

  12. COMPUTING THE PRICE The problem of computing the arbitrage price even for SELECT-PROJECT-JOIN queries is coNP-complete For some queries, the price can be computed fast: Selections, joins w/o projection We describe pricing as an Integer Linear Program (ILP) and then use fast ILP solvers (e.g. GLPK, CPLEX) Classes of queries supported: Selections/Projections/Joins Unions User-Defined Functions (UDF) Bundles of queries 12 12

  13. ILP CONSTRUCTION (1) ColA = { a1, a2, a3 } ColB = { b } Table S Table R A B A a1 b a1 a2 b price = $1 price = $2 price = $3 Price the queryQ(x,y) = R(x), S(x,y) Introduce a {0/1} variable x[attribute,value] for each price point: x[R.A, a2], x[S.A, a1], x[S.B, b], 13 13

  14. ILP CONSTRUCTION (2) ColA = { a1, a2, a3 } ColB = { b } Table S Table R A B A a1 b a1 Q(x,y) = R(x), S(x,y) a2 b Minimize (independent of the query): price = x[R.A,a1] + x[R.A,a2] + x[R.A,a3] +2x[S.A,a1] + 2x[S.A,a2] + 2x[S.A,a3] +3x[S.B,b] Constraints: (a1,b) in Q: x[R.A,a1] 1 (a2,b) not in Q: x[R.A,a2] 1 (a3,b) not in Q: x[R.A,a3] + x[S.A,a3] + x[S.B,b] 1 x[S.A,a1] + x[S.B,b] 1 14 14

  15. ILP CONSTRUCTION (3) ColA = { a1, a2, a3} ColB = { b} Table S Table R A B A a1 b a1 a2 b a2 New variable for each tuple in Qfull Projection: Q(y) = R(x), S(x,y) Constraints: (a1,b) in Qfull: x[R.A,a1] z1 (a2,b) in Qfull: x[R.A,a2] z2 x[S.A,a2] + x[S.B,b] z2 (b) in Q : z1 + z2 1 x[S.A,a1] + x[S.B,b] z1 15 15

  16. QUERYMARKET SYSTEM Runs on top of any SQL database Information stored in the database: Price points are stored in the database in price tables Keeping track of price tables with an index table The dataset: English-german translation: Ten,gr(w, w ) English-french translation : Ten,fr(w, w ) UDF to find hashtags : IsHashtag(w) Word frequency stats : WF(w, genre, frequency, rank) 16

  17. PRICE COMPUTATION (1) Small dataset where columns have size ~ 102 10 ILP solving time ILP construction time 8 Time in seconds 6 4 2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Query 3-way join 2-way joins w/o projections 2-way joins with projections selections 17

  18. PRICE COMPUTATION (2) Larger dataset where columns have size ~ 103 80 ILP solving time 70 ILP construction time 60 Time in seconds 50 40 30 20 10 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Query 3-way join 2-way joins w/o projections 2-way joins with projections selections 18

  19. OUTLINE 1. The Pricing Framework 2. Computing the Price 3. Query History 4. Revenue Sharing 19

  20. QUERY HISTORY A user asks a sequence of queries over time of varying information overlap Q = Q1, Q2, , Qk Experiment with 30 selection/join queries Oblivious pricing: each query priced independently 18 High Overlap 16 14 12 price in dollars Bundle pricing: each query Qi priced p(Q1, ,Qi)- p(Q1, ,Qi-1) 10 8 6 4 2 0 1 2 3 4 5 6 7 8 9 101112131415161718192021222324252627282930 query View pricing: when a query is purchased, the purchased views are free for later queries 20

  21. QUERY HISTORY (2) 25 Moderate Overlap 20 price in dollars Oblivious pricing 15 View pricing 10 Bundle pricing 5 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 query Weak Overlap 16 14 12 price in dollars 10 Oblivious pricing 8 Bundle pricing 6 View pricing 4 2 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 query 21

  22. VIEW PRICING View Pricing is our proposed strategy: Computationally efficient Low storage overhead Close to optimal (bundle) price View Pricing can be used for dynamic databases: if view V is purchased at some point and then updated, the user pays only an update price 22

  23. OUTLINE 1. The Pricing Framework 2. Computing the Price 3. Query History 4. Revenue Sharing 23

  24. REVENUE SHARING How is the revenue shared between sellers if several datasets contribute to the answer? What if the cheapest set of views to determine a query is not unique ? Example: Q( sigmod13 ) = isHashtag( sigmod13 ), isNoun( sigmod13 ) Seller 1 prices $1 per entry for isHashtag, so does seller 2 If both isHashtag, isNoun are false and each costs $1, purchasing either of the entries answers Q 24

  25. REVENUE SHARING: SOLUTION For a seller s, share(s, Q) is the maximum revenue of s over all minimum-cost set of price points that determine Q share(s, Q) can be computed in our framework Solution: split price(Q) among sellers proportionally to their shares Example: Both shares are $1 The revenue of each seller will be $0.5, since their shares are equal 25

  26. CONCLUSIONS QueryMarket: the first system that supports pricing a large class of SQL queries within a formal framework We presented solutions to address the requirements of a real-world marketplace Future work includes: Scaling the price computation (bucketization) Full SQL Support (aggregates, negation) Query answering under limited budget 26

  27. Thank you ! 27

Related


More Related Content