In this page we propose two approaches for designing algorithms for generating instances of the CNF SAT (mainly, 3SAT) problem. We provide three algorithms and some sets of instances generated by these algorithms.
Index:
This page is created by T.Uchida, M.Motoki, and O.Watanabe, Watanabe research group of Dept. of Math. and Computing Sciences, Tokyo Inst. of Technology. For question and/or comment of this page, please contact mmotoki@is.titech.ac.jp.
The SAT problem, the problem of deciding whether a given Boolean formula is satisfiable, is one of the well-known NP-complete problems. While it is believed that the problem is not solvable in polynomial-time in the worst case, many heuristic algorithms have been proposed for the problem and some of them seem to be quite good "on average". (Most of SAT solvers aim to solve the search version of SAT, i.e., finding a satisfying assignment of a given Boolean formula, instead of the original decision problem.)
For testing the performance of these SAT solvers, it would be useful if we have some algorithm generating SAT instances, which we call a SAT instance generator. Here we provide algorithms for generating instances of the CNF SAT (mainly, 3SAT) problem and some sets of instances generated by these algorithms.
One common feature of our algorithms is that they generate positive SAT instances with a unique solution, i.e., only one satisfying assignment. (The algorithms also output this sat. assignment with the instance.)
It is not so difficult to generate positive SAT instances under a certain distribution. But it seems hard to generate negative SAT instances. (For example, it is impossible to generate all negative instances with nonzero probability unless NP = co-NP.) On the other hand, you can easily convert any positive SAT instance with a unique solution to a negative instance by adding one (or several if you like) clause to kill this solution. Thus, our SAT instance generators can be also used to generate reasonable set of negative SAT instances. (Though still weak, we may be able to use this idea to generate nontrivial MAX-SAT instances.)
Here we propose two approaches for designing SAT instance generators.
The first one is based on the naive way to generate positive SAT instances randomly. We propose one modification so that generated instances have a unique solution with high probability.
The second one is based on the factorization problem. The factorization problem has been studied quite well and still it has been believed hard even on average. Theoretically, we know that the factorization problem is reducible to the SAT problem. Here we propose to design an efficient reduction and use it for generating hard SAT instances, and we indeed give two reduction algorithms, i.e., SAT generators. Since a solution of the generated SAT instance corresponds to that of the original factorization problem instance, we can easily control the number of solutions.
For the detail of these approaches and related projects, see papers and pointers in References and Links.
We assume that all SAT instances are of conjunctive normal form formula, i.e., the conjunction of clauses, where each clause is a disjunction of several literals. We use a basic and compact ASCII encoding format for specifying instances.
Suppose a formula given as
(x7 + ~x1 + ~x6) (x6 + ~x7 + ~x4) (~x3 + ~x8 + x6) (x8 + ~x6 + ~x1) (~x10 + x2 + x8) (x5 + ~x6 + ~x4) (x1 + ~x7 + ~x6) (~x9 + ~x4 + x7) (x4 + ~x5 + ~x10)
In our encoding format, this formula is stated as follows.
# example of encoding 10 9 3 7 -1 -6 6 -7 -4 -3 -8 6 8 -6 -1 -10 2 8 5 -6 -4 1 -7 -6 -9 -4 7 4 -5 -10
A line that begins "#" is comment line. The first non-comment line gives (i) the number of variables, (ii) the number of clauses, and (iii) the maximum size of clauses. From the second line, clauses are specified by one clause per line. We assume that variables are numbered sequentially starting from 1. A negative value means that the corresponding variable appears negatively in the clause.
References
Links
Now please go to each approach to get algorithms and sample data!