Improved Bounds for Sampling Solutions of Random CNF Formulas
"This study delves into sampling solutions of random CNF formulas, exploring satisfiability, phase transition, algorithms, and hardness. It discusses decision-making, search strategies, and sampling techniques in statistical physics. The research is a joint work by Kuan Yang, Kun He, and Kewen Wu, presented at the NII Shonan Meeting."
Presentation Transcript
Improved Bounds for Sampling Solutions of Random CNF Formulas
Kuan Yang (John Hopcroft Center for Computer Science, Shanghai Jiao Tong University)
Joint work with Kun He (Renmin University) and Kewen Wu (UC Berkeley)
2023/09/05, NII Shonan Meeting
Random CNF Formulas
CNF formula: an AND of ORs, e.g. Φ = (x1 ∨ ¬x2 ∨ x3) ∧ (x2 ∨ x3 ∨ ¬x4) ∧ (¬x1 ∨ x4 ∨ x2)
- Clause: each OR, e.g. (x1 ∨ ¬x2 ∨ x3); Literal: a variable or its negation, e.g. ¬x2
- Variables: x1, x2, x3, x4
- Satisfiable: there exists an assignment to the variables that satisfies all the clauses
k-CNF formula: each clause has exactly k literals
Random k-CNF formula Φ(k, n, m): a formula on n variables and m clauses, each clause has exactly k literals; each literal is chosen independently and uniformly from {x1, …, xn, ¬x1, …, ¬xn}
Density: α = m/n
Model: fix k and α, then let n → ∞ (and m = αn)
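As a concrete reference point, here is a minimal sketch of this model in Python, assuming 1-indexed variables and DIMACS-style signed-integer literals (the function name random_kcnf is mine, not from the talk):

```python
import random

def random_kcnf(k, n, alpha, seed=None):
    """Sample Phi(k, n, m) with m = round(alpha * n) clauses: each clause is a
    list of k literals, each literal an independent uniform choice of a signed
    variable +i / -i (so repeated variables inside a clause are allowed)."""
    rng = random.Random(seed)
    m = round(alpha * n)
    return [[rng.choice((1, -1)) * rng.randint(1, n) for _ in range(k)]
            for _ in range(m)]

# Example: a random 5-CNF on 1000 variables at density alpha = 4.
phi = random_kcnf(k=5, n=1000, alpha=4.0, seed=0)
```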
Questions for Random CNF Formulas (density α = m/n)
Decision: when is Φ(k, n, m) satisfiable with high probability? Average-case analysis, phase transition.
Satisfiability theorem [KKK 98, F 99, AM 02, AP 03, COP 16, DSS 22]: for sufficiently large k, there exists a threshold α* = 2^k ln 2 − (1 + ln 2)/2 + o_k(1) such that lim_{n→∞} Pr[Φ is satisfiable] = 1 if α < α*, and = 0 if α > α*.
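In display form, the reconstructed threshold statement reads (my rendering of the slide's formula):

```latex
\alpha^\star \;=\; 2^k \ln 2 \;-\; \frac{1+\ln 2}{2} \;+\; o_k(1),
\qquad
\lim_{n\to\infty} \Pr\bigl[\Phi(k,n,\alpha n)\ \text{is satisfiable}\bigr]
  \;=\;
  \begin{cases}
    1, & \alpha < \alpha^\star,\\
    0, & \alpha > \alpha^\star.
  \end{cases}
```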
Questions for Random CNF Formulas (density α = m/n)
Decision: when is Φ(k, n, m) satisfiable with high probability? α ≲ 2^k ln 2
Search: beyond decision, can we find a solution efficiently? Algorithmic phase transition.
- Algorithms: a poly-time algorithm exists when α ≲ 2^k ln k / k [CO 10]
- Hardness: evidence that the problem is hard when α ≳ 2^k ln k / k [ACO 08, H 16, COHH 17, BH 22]
Questions for Random CNF Formulas (density α = m/n)
Decision: when is Φ(k, n, m) satisfiable with high probability? α ≲ 2^k ln 2
Search: beyond decision, can we find a solution efficiently? α ≲ 2^k ln k / k
Sampling: beyond search, can we efficiently sample a uniformly random solution from the whole solution space? (Partition function in statistical physics.)
- [MS 07]: poly(n)-time algorithm for α ≲ 2 ln k / k
- [GGGY 21]: n^{poly(k)}-time algorithm for α ≲ 2^{k/300}
Questions for Random CNF Formulas (density α = m/n)
Decision: when is Φ(k, n, m) satisfiable with high probability? α ≲ 2^k ln 2
Search: beyond decision, can we find a solution efficiently? α ≲ 2^k ln k / k
Sampling: beyond search, can we efficiently sample a uniformly random solution from the whole solution space? α ≲ 2^{k/300} and n^{poly(k)}-time (our focus)
Questions for Random CNF Formulas (density α = m/n)
Decision: when is Φ(k, n, m) satisfiable with high probability? α ≲ 2^k ln 2
Search: beyond decision, can we find a solution efficiently? α ≲ 2^k ln k / k
Sampling: beyond search, can we efficiently sample a uniformly random solution? Previously α ≲ 2^{k/300} and n^{poly(k)}-time.
Our result: α ≲ 2^{k/3} and n^{1.0001}-time
Phase Transition of the Solution Space
Decision: when is Φ(k, n, m) satisfiable with high probability? α ≲ 2^k ln 2 (tight threshold)
Search: beyond decision, can we find a solution efficiently? α ≲ 2^k ln k / k (conjectured threshold: ≈ 2^k (ln k + ln ln k) / k)
Sampling: beyond search, can we efficiently sample a uniformly random solution? α ≲ 2^{k/3} (true threshold unknown)
Random vs Standard k-CNF Formulas
Each clause has k variables; the average variable degree in Φ(k, n, m) is d = km/n = kα.
If a k-CNF formula has maximum variable degree d, then:
- Satisfiability threshold: d ≲ 2^k / k [Lovász local lemma, EL 75]
- Search threshold: d ≲ 2^k / k [algorithmic local lemma, MT 10]
- Sampling bound:
  - Algorithm: d ≲ 2^{k/5} [He-Wang-Yin 22, 09:45-10:25 this morning]
  - Hardness: d ≳ 2^{k/2} [BGGGS 19]
Random vs Standard k-CNF Formulas

                      Maximum degree d       Average degree d = kα
Satisfiability        d ≲ 2^k / k            α ≲ 2^k ln 2
Search                d ≲ 2^k / k            α ≲ 2^k ln k / k
Sampling algorithm    d ≲ 2^{k/5}            α ≲ 2^{k/300}
Sampling hardness     d ≳ 2^{k/2}

Intuitively: random k-CNF is slightly better (easier, but not much easier) than standard bounded-degree k-CNF for satisfiability and search.
Conjecture: random k-CNF is also slightly better than standard k-CNF for sampling.
Random vs Standard k-CNF Formulas

                      Maximum degree d       Average degree d = kα
Satisfiability        d ≲ 2^k / k            α ≲ 2^k ln 2
Search                d ≲ 2^k / k            α ≲ 2^k ln k / k
Sampling algorithm    d ≲ 2^{k/5}            α ≲ 2^{k/3} (this work)
Sampling hardness     d ≳ 2^{k/2}            ?

Intuitively: random k-CNF is slightly better (easier, but not much easier) than standard bounded-degree k-CNF for satisfiability and search.
Conjecture: random k-CNF is also slightly better than standard k-CNF for sampling.
- Disadvantage: maximum degree Θ(log n / log log n), and a constant fraction of variables have large degree.
- Advantage: structural properties of random instances.
Recursive Sampler [AJ 21, HWY 22]
Assume we want to sample X(v1) from μ_{v1}, the marginal distribution of v1 in a uniformly random solution.
The marginals μ_v(·) have lower bounds (local uniformity).
μ_{v1}(·) = μ_{v2}(0) · μ_{v1}(· | X(v2) = 0) + μ_{v2}(1) · μ_{v1}(· | X(v2) = 1), and more generally
μ_{v1}(· | X(v2), …, X(vi)) = Σ_{c ∈ {0,1}} μ_{v_{i+1}}(c | X(v2), …, X(vi)) · μ_{v1}(· | X(v2), …, X(vi), X(v_{i+1}) = c).
We first sample X(v2), X(v3), …, where X(vi) is sampled from μ_{vi}(· | X(v2), …, X(v_{i−1})).
Once a clause is satisfied, it vanishes from the formula.
[Figure: a small CNF on variables v1, …, v5 with its clauses; values False (0) / True (1).]
Recursive Sampler [AJ 21, HWY 22] (continued)
Same setup as above: sample X(v2), X(v3), … one by one from the conditional marginals; once a clause is satisfied, it vanishes.
Suppose X(v2) satisfies the dashed clauses in the figure. Then, conditioned on X(v2), v1 and v4 are independent!
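A tiny brute-force check of this decomposition (the example formula and helper names are hypothetical stand-ins; the real sampler never enumerates the solution space):

```python
from itertools import product

# Toy 3-CNF over x1..x4 in the signed-literal convention used earlier;
# the specific clauses are illustrative only.
phi = [[1, -2, 3], [2, 3, -4], [-1, 4, 2]]
n = 4

def satisfies(assign, formula):
    # assign[i-1] is the 0/1 value of x_i
    return all(any(assign[abs(l) - 1] == (l > 0) for l in clause) for clause in formula)

solutions = [a for a in product((0, 1), repeat=n) if satisfies(a, phi)]

def marginal(var, value, cond=None):
    """Pr[x_var = value | pinned values in cond] over a uniform random solution."""
    cond = cond or {}
    pool = [s for s in solutions if all(s[v - 1] == b for v, b in cond.items())]
    return sum(s[var - 1] == value for s in pool) / len(pool)

# mu_{v1}(1) = sum_c mu_{v2}(c) * mu_{v1}(1 | X(v2) = c)
lhs = marginal(1, 1)
rhs = sum(marginal(2, c) * marginal(1, 1, {2: c}) for c in (0, 1))
assert abs(lhs - rhs) < 1e-12
```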
Solving Random Instances
Main difficulty: high-degree variables have no local uniformity.
- Partition the variables into a Good part and a Bad part; only Good variables are guaranteed to have local uniformity.
- For Bad variables, use rejection sampling directly.
- For Good variables, low degree is necessary but not sufficient.
How much local uniformity do we need? There is a tension:
- strong local uniformity requires fixing only a few variables, but
- efficiency of rejection sampling requires that few clauses remain unsatisfied, so we have to fix many variables.
Solving Random Instances
If a CNF formula satisfies
- each variable has degree at most d ≲ 2^{k/3}, and
- each clause has at least 0.9k distinct variables,
then it is satisfiable and has local uniformity: for a uniformly random solution X and each value c ∈ {0, 1}, Pr[X(v) = c] ≥ 1/2 − 2^{−0.6k} (a good marginal approximation).
In Φ = Φ(k, n, m), we say a variable is Bad if
- it has degree larger than d, or
- it is in a clause that contains more than 0.1k Bad variables.
The remaining variables are Good: local uniformity is guaranteed. (Recall the density α = m/n ≲ 2^{k/3}.)
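A sketch of this classification (the function name is mine; I apply the degree rule and then one pass of the clause rule, whereas the slide's definition may need to be iterated to a fixed point):

```python
from collections import Counter

def classify_good_bad(formula, n, k, d_max):
    """Variables are Bad if their degree exceeds d_max, or if they appear in a
    clause containing more than 0.1*k Bad variables; the rest are Good."""
    degree = Counter(abs(l) for clause in formula for l in clause)
    bad = {v for v in range(1, n + 1) if degree[v] > d_max}
    for clause in formula:                      # one pass of the clause rule
        vars_in_clause = {abs(l) for l in clause}
        if len(vars_in_clause & bad) > 0.1 * k:
            bad |= vars_in_clause
    return set(range(1, n + 1)) - bad, bad      # (Good, Bad)
```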
Proof Overview
- Classify the variables into Good and Bad.
- For Good variables, sample values one by one from the marginal distribution over a uniformly random solution: the marginal sampler [AJ 21, HWY 22].
- Conditioned on the assignment of the Good variables, sample values for the Bad variables: standard rejection sampling.
- Efficiency: locally sparse properties of random k-CNF formulas.
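A minimal sketch of the rejection-sampling step for the Bad part, assuming the Good variables are already assigned (helper names are mine; efficiency relies on the analysis showing the simplified formula splits into small, satisfiable pieces):

```python
import random

def simplify(formula, assignment):
    """Plug a partial assignment {var: 0/1} into the formula: drop clauses it
    satisfies and delete falsified literals from the remaining clauses."""
    remaining = []
    for clause in formula:
        kept, satisfied = [], False
        for l in clause:
            if abs(l) in assignment:
                if assignment[abs(l)] == (l > 0):
                    satisfied = True
                    break
            else:
                kept.append(l)
        if not satisfied:
            remaining.append(kept)
    return remaining

def rejection_sample(remaining, variables, rng=None):
    """Uniform solution over `variables` of the (small) remaining formula."""
    rng = rng or random.Random()
    while True:
        guess = {v: rng.randint(0, 1) for v in variables}
        if all(any(guess[abs(l)] == (l > 0) for l in clause) for clause in remaining):
            return guess
```

In the actual algorithm this loop would be run separately on each small connected component of the simplified formula, which is what keeps the expected number of trials under control.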
Marginal Sampler
Assume we want to sample X(v1) from μ_{v1} for a Good variable v1; μ_{v1} is the marginal distribution of v1 in a uniformly random solution.
By local uniformity, we first output
- 0 with probability 1/2 − 2^{−0.6k},
- 1 with probability 1/2 − 2^{−0.6k},
- ⊥ with probability 2 · 2^{−0.6k}.
If X(v1) = ⊥, we resample it from the distribution ν_{v1}, where ν_{v1}(c) = (μ_{v1}(c) − 1/2 + 2^{−0.6k}) / (2 · 2^{−0.6k}).
Bernoulli factory [NP 05, H 16, DHKN 17]: ν_{v1} can be efficiently sampled given samples from μ_{v1}.
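The two-stage draw in code (a sketch; lazy_marginal_draw and resampling_distribution are hypothetical names, and the true marginal passed to the latter appears only to check the identity, since the algorithm accesses it through a Bernoulli factory instead):

```python
import random

def slack(k):
    # local-uniformity slack theta = 2^{-0.6 k} from the slide
    return 2 ** (-0.6 * k)

def lazy_marginal_draw(k, rng):
    """Stage 1: use only the lower bound 1/2 - 2^{-0.6k} on each marginal.
    Returns 0 or 1, or None (standing for ⊥) with the leftover 2 * 2^{-0.6k}."""
    theta = slack(k)
    u = rng.random()
    if u < 0.5 - theta:
        return 0
    if u < 1.0 - 2 * theta:
        return 1
    return None  # ⊥: must be resampled from nu_{v} below

def resampling_distribution(mu_v1, k):
    """Stage 2: nu_v(c) = (mu_v(c) - 1/2 + 2^{-0.6k}) / (2 * 2^{-0.6k}).
    Mixing stages 1 and 2 recovers mu_v exactly."""
    theta = slack(k)
    nu1 = (mu_v1 - 0.5 + theta) / (2 * theta)
    return {0: 1.0 - nu1, 1: nu1}

# Sanity check of the mixture identity for an arbitrary admissible marginal:
k, mu_v1 = 10, 0.503
nu = resampling_distribution(mu_v1, k)
assert abs((0.5 - slack(k)) + 2 * slack(k) * nu[1] - mu_v1) < 1e-12
```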
Marginal Sampler (continued)
Assume we want to sample X(v1) from μ_{v1} for a Good variable v1.
By local uniformity, we sample X(v1) ∈ {0, 1, ⊥}:
- if X(v1) ∈ {0, 1}, simply return it;
- otherwise, resample it using a Bernoulli factory on ν_{v1}.
Recursive sampling / lazy sampling: assume the values X(v2), X(v3), … are sampled correctly conditioned on X(v1) = ⊥. Then we can easily sample from μ_{v1}(· | X(v2), X(v3), …), which equals ν_{v1} when averaged over the random X(v2), X(v3), ….
To obtain X(v2) conditioned on X(v1) = ⊥, we execute the marginal sampler on v2 with v1 fixed as ⊥.
Efficiency
To obtain a sample X(v1):
- first use local uniformity to get an approximate sample; this already stops w.h.p. (strong local uniformity);
- if it does not stop, sample X(v2) conditioned on X(v1) = ⊥, then sample X(v3) conditioned on X(v1) = ⊥ and X(v2), and so on (whenever local uniformity holds);
- finally, use the Bernoulli factory on μ_{v1}(· | X(v2), X(v3), …), where the required samples are provided by rejection sampling conditioned on X(v2), X(v3), ….
The rejection sampling is efficient because the remaining formula is not too large: many variables have been fixed before reaching this step. On the other hand, the local uniformity cannot be too strong, otherwise too few variables get fixed. Balancing the two gives efficiency.
Beyond Bounded-degree Results
How to bound the size of the remaining formula? All related proofs analyzed the probability of non-vanishing clauses.
Bounded-degree formulas may be locally dense, so {2,3}-trees were used to capture independence; but this is not a typical random instance!
Random formulas are typically sparse:
- average degree at most 1.001 within O(log n)-size components,
- no clusters of Bad variables,
- few duplicated variables in the resampling components.
So instead we analyze the probability of fixing too many variables.
Summary of Efficiency Analysis
The rejection sampling is efficient if
- the remaining formula is small,
- Bad variables are few,
- Good variables leave many possible solutions
(these follow from local sparsity).
The recursion has small depth:
- most variables encountered in the recursion are Good,
- the local-uniformity sampling stops w.h.p.
(these follow from local uniformity).
Conclusion and Conjectures
[Figure: the density axis α = m/n, marked with α = 0 (trivial formula), 2^{k/300} (previous efficient sampling), 2^{k/3} (efficient sampling, this work), 2^{k/2} (efficient sampling?), 2^k ln k / k (efficient search), and 2^k ln 2 (satisfiability threshold).]