This page is created by T.Uchida and O.Watanabe. For question and/or comment of this page, please contact uchida@is.titech.ac.jp.
The factorization problem is a problem to compute a sequence of prime factors of a given integer. In particular, we are interested in the factorization problem instances x that are products of two big prime numbers p and q. (Below we assume that a given 2l- bit number x is a product of two l bit prime numbers, and the factorization problem (FACT in short) is to compute these two prime numbers from x.)
The FACT has been studied by many researchers, and it has been believed to be hard even on average. Thus, one can expect to generate hard SAT instances by transforming from FACT instances, and here we take this approach.
For a given FACT instance x, we generate SAT instance (i.e., a Boolean formula F) as follows:
It is also easy to see that the number n of variables of F is the same as the number of gates in C. Thus, for obtaining a small formula, we need to construct a small test circuit. Here we propose two algorithms to construct test circuits:
The former algorithm gives us smaller instances for l < 40, which corresponds to formulas with 4,000 variables, and the latter algorithm is better for l > 40.
For further explanation, see our paper in References and Links.
A Variation
Through our experiments, we started thinking that the factorization problem may not be essential for generating reasonably hard SAT instances, and that some other arithmetic problems may work as well. Thus, for one variation of our approach, we considered the following problem. (The algorithm we provide here is called GenCSAdd.)
Clearly, this problem is easy to solve. But it has a similar structure as the ordinary multiplication, and thus one can expect that it yields hard instances as well (at least for some type of heuristics). It turned out that the problem is as strong as those obtained from FACT. Furthermore, since the size of formula obtained from l bit number is linear in l, we can generate relatively small instances, or we can use larger l (which is harder) compared with FACT.
Four algorithms GenFact, GenFact3, GenFactC, and GenCSAdd3 can be obtained by downloading the following files and compling them.
The program (e.g., GenFact) is called as
GenFact [p] [q] ([L])
where [p] and [q]
are two prime numbers of the same bit length l,
and [L] (optional) is
the number of input gates of the generated test circuit.
(Thus,
the default value of [L] is 2l,
and L must be larger than 2l.)
Then the program creates a directory GenFact.p.q that contains the following files:
L.
Some examples are given below.
(l = bit size of the original problem,
n = num. of variables,
m = num. of clauses,
ci = num. of size i clauses.
GenFact(*) (resp., GenCSAdd(*)) means a modified version.)
| algorithm | l | n | m | c1, c2, c3, c4 |
|---|---|---|---|---|
| GenFact | 5 | 75 | 335 | c1, c2, c3, c4 |
| 6 | 108 | 504 | c1, c2, c3, c4 | |
| 7 | 147 | 706 | c1, c2, c3, c4 | |
| 8 | 192 | 944 | c1, c2, c3, c4 | |
| l | 3*l*l | l(17l-18) | 2l, 2l(l+1), l(7l-6), 8l(l-2) | |
| GenFact(*) | 5 | 81 | 361 | 0, 0, c3, c4 |
| 6 | 114 | 534 | 0, 0, c3, c4 | |
| 7 | 153 | 736 | 0, 0, c3, c4 | |
| 8 | 198 | 978 | 0, 0, c3, c4 | |
| l | 3*l*l+d | l(17l-18)+6d+x | 0, 0, l(9l-4)+6d+y, 8l(l-2) | |
| GenCSAdd | 6 | 92 | 477 | 0, 0, c3, c4 |
| 7 | 104 | 549 | 0, 0, c3, c4 | |
| 8 | 116 | 621 | 0, 0, c3, c4 | |
| l | 2(k+1)l+k(k-1) | 2(7k+1)l+k(7k-17)/2 | 2l+k, k(k+3), 6kl+k(5k-9)/2, 8l+k-9 | |
| GenCSAdd(*) | 6 | 98 | 511 | 0, 0, c3, c4 |
| 7 | 110 | 587 | 0, 0, c3, c4 | |
| 8 | 122 | 663 | 0, 0, c3, c4 |
In what follows we give some set of instances generated by using our generators GenFact and GenCSAdd. (All files are at GenFact: F.tar.gz, and GenCSAdd: A.tar.gz.)