SAT Instance Generation Page

In this page we propose two approaches for designing algorithms for generating instances of the CNF SAT (mainly, 3SAT) problem. We provide three algorithms and some sets of instances generated by these algorithms.

Index:

  1. Introduction
  2. Encoding Format
  3. References and Links
  1. Approach 1: Random Instance Generation
  2. Approach 2: Hard Instance Generation Based on Factoring

This page is created by T.Uchida, M.Motoki, and O.Watanabe, Watanabe research group of Dept. of Math. and Computing Sciences, Tokyo Inst. of Technology. For question and/or comment of this page, please contact mmotoki@is.titech.ac.jp.


1. Introduction

The SAT problem, the problem of deciding whether a given Boolean formula is satisfiable, is one of the well-known NP-complete problems. While it is believed that the problem is not solvable in polynomial-time in the worst case, many heuristic algorithms have been proposed for the problem and some of them seem to be quite good "on average". (Most of SAT solvers aim to solve the search version of SAT, i.e., finding a satisfying assignment of a given Boolean formula, instead of the original decision problem.)

For testing the performance of these SAT solvers, it would be useful if we have some algorithm generating SAT instances, which we call a SAT instance generator. Here we provide algorithms for generating instances of the CNF SAT (mainly, 3SAT) problem and some sets of instances generated by these algorithms.

One common feature of our algorithms is that they generate positive SAT instances with a unique solution, i.e., only one satisfying assignment. (The algorithms also output this sat. assignment with the instance.)

It is not so difficult to generate positive SAT instances under a certain distribution. But it seems hard to generate negative SAT instances. (For example, it is impossible to generate all negative instances with nonzero probability unless NP = co-NP.) On the other hand, you can easily convert any positive SAT instance with a unique solution to a negative instance by adding one (or several if you like) clause to kill this solution. Thus, our SAT instance generators can be also used to generate reasonable set of negative SAT instances. (Though still weak, we may be able to use this idea to generate nontrivial MAX-SAT instances.)

Here we propose two approaches for designing SAT instance generators.

The first one is based on the naive way to generate positive SAT instances randomly. We propose one modification so that generated instances have a unique solution with high probability.

The second one is based on the factorization problem. The factorization problem has been studied quite well and still it has been believed hard even on average. Theoretically, we know that the factorization problem is reducible to the SAT problem. Here we propose to design an efficient reduction and use it for generating hard SAT instances, and we indeed give two reduction algorithms, i.e., SAT generators. Since a solution of the generated SAT instance corresponds to that of the original factorization problem instance, we can easily control the number of solutions.

For the detail of these approaches and related projects, see papers and pointers in References and Links.


2. Encoding Format

We assume that all SAT instances are of conjunctive normal form formula, i.e., the conjunction of clauses, where each clause is a disjunction of several literals. We use a basic and compact ASCII encoding format for specifying instances.

Suppose a formula given as

(x7 + ~x1 + ~x6) (x6 + ~x7 + ~x4) (~x3 + ~x8 + x6) (x8 + ~x6 + ~x1) (~x10 + x2 + x8) (x5 + ~x6 + ~x4) (x1 + ~x7 + ~x6) (~x9 + ~x4 + x7) (x4 + ~x5 + ~x10)

In our encoding format, this formula is stated as follows.

# example of encoding
10 9 3
7 -1 -6
6 -7 -4
-3 -8 6
8 -6 -1
-10 2 8
5 -6 -4
1 -7 -6
-9 -4 7
4 -5 -10

A line that begins "#" is comment line. The first non-comment line gives (i) the number of variables, (ii) the number of clauses, and (iii) the maximum size of clauses. From the second line, clauses are specified by one clause per line. We assume that variables are numbered sequentially starting from 1. A negative value means that the corresponding variable appears negatively in the clause.


3. References and Links

References

  1. M. Motoki, On the Maximum Satisfiability of Random 3-CNF Formulae, Research Report C-141, Dept. of Math. and Computing Sciences, Tokyo Inst. of Tech. (2000).
  2. M. Motoki and R. Uehara, Unique solution instance generation for 3SAT, Research Report C-129 Dept. of Math. and Computing Sciences, Tokyo Inst. of Tech. (1999); also appeared in SAT2000.
  3. S. Horie and O. Watanabe, Hard instance generation for SAT, Technical Report TR97-0007, Dept. of Computer Science, Tokyo Inst. of Tech. (1997) (Available from CS Dept. TR Archive; the extended abstract appeared in Proc. ISAAC'97, Lecture Notes in Computer Science 1350.)
  4. O. Watanabe Test instance generation for promised NP search problems (gzipped ps file), presented at the 9th IEEE Structure in Complexity Theory Conference (1994).

Links

  1. SATLIB (by H. Hoos and T. Stuetzle)
  2. SAT hard instance generator and solution algorithms (by V.Z. Nuri)
  3. Test Instance Generation (by O. Watanabe; KAKEN report, in Japanese)

Now please go to each approach to get algorithms and sample data!

  1. Approach 1: Random Instance Generation
  2. Approach 2: Hard Instance Generation Based on Factoring

Mitsuo Motoki mmotoki@is.titech.ac.jp