zaxioms 4 days ago:

I'm surprised to hear this. Modern SAT solvers can easily handle many problems with hundreds of thousands of variables and clauses. Of course, there are adversarial problems where CDCL solvers fail, but I would be fascinated if you can find industrial (e.g. human-written for a specific purpose) formulas with "dozens of variables" that a solver can't solve fairly quickly.
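To make the "variables and clauses" framing concrete, here is a minimal sketch of handing a CNF to an off-the-shelf CDCL solver. It assumes the python-sat package (`pip install python-sat`); the instance below is tiny, but the same API is routinely used on formulas with hundreds of thousands of clauses.

    # Minimal sketch, assuming the python-sat package is installed.
    from pysat.solvers import Glucose3

    # Clauses as DIMACS-style integer literals:
    # (x1 or not x2), (x2 or x3), (not x1 or not x3)
    clauses = [[1, -2], [2, 3], [-1, -3]]

    with Glucose3(bootstrap_with=clauses) as solver:
        if solver.solve():
            print("SAT, model:", solver.get_model())
        else:
            print("UNSAT")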
LegionMammal978 4 days ago:

One thing that I spent a particularly long time trying to get working was learning near-minimum-size exact languages from positive and negative samples. DFAMiner [0] has a relatively concise formulation for this in terms of variables and clauses, though I have no way to know if some other reformulation would be better suited for SAT solvers (it uses CaDiCaL by default). It usually starts taking a few seconds around the ~150-variable mark, and hits the absolute limit of practicality by 350–800 variables; the number of clauses is only an order of magnitude higher. Perhaps something about the many dependencies in a DFA graph puts this problem near the worst case.

The annoying thing is, there do seem to be heuristics people have written for this stuff (e.g., in FlexFringe [1]), but they're all geared toward probabilistic automata for anomaly detection and similar fuzzy ML stuff, and I could never figure out how to get them to work for ordinary automata.

In any case, I eventually figured out that I could get a rough lower bound on the minimum solution size by constructing a graph of indistinguishable strings, generating a bunch of random maximal independent sets, and taking the best of those. That gave me an easy way to filter out the totally hopeless instances, which turned out to be most of them.

[0] https://github.com/liyong31/DFAMiner

[1] https://github.com/tudelft-cda-lab/FlexFringe
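The lower-bound heuristic described above can be sketched roughly as follows. This is an illustrative sketch under my own assumptions (two sample strings count as "indistinguishable" if no suffix observed in the samples gives them different labels); none of the names come from DFAMiner.

    # Rough sketch of the lower-bound heuristic: build a graph whose edges connect
    # indistinguishable sample strings, then take the best of several random
    # maximal independent sets. An independent set is a set of pairwise-
    # distinguishable strings, so its size lower-bounds the number of states of
    # any DFA consistent with the samples. All names here are illustrative.
    import random
    from itertools import combinations

    def label(s, positive, negative):
        if s in positive:
            return True
        if s in negative:
            return False
        return None  # not observed in the samples

    def distinguishable(a, b, positive, negative, suffixes):
        for suf in suffixes:
            la = label(a + suf, positive, negative)
            lb = label(b + suf, positive, negative)
            if la is not None and lb is not None and la != lb:
                return True
        return False

    def state_lower_bound(positive, negative, trials=100):
        samples = sorted(positive | negative)
        suffixes = {s[i:] for s in samples for i in range(len(s) + 1)}
        adj = {s: set() for s in samples}
        for a, b in combinations(samples, 2):
            if not distinguishable(a, b, positive, negative, suffixes):
                adj[a].add(b)
                adj[b].add(a)
        best = 0
        for _ in range(trials):
            order = random.sample(samples, len(samples))  # shuffled copy
            chosen, blocked = [], set()
            for s in order:  # greedy maximal independent set
                if s not in blocked:
                    chosen.append(s)
                    blocked.update(adj[s])
            best = max(best, len(chosen))
        return best  # lower bound on the number of DFA states

The greedy pass over a random order only gives a maximal (not maximum) independent set, so taking the best over several trials is just a cheap way to tighten the bound.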
cerved 4 days ago:

I've worked on a model with thousands of variables and hundreds of thousands of parameters, with a hundred constraints. There are pitfalls you need to avoid, like reification, but it's definitely doable.

Of course, NP-hard problems blow up exponentially as they grow, but that doesn't change if you use another exact solving technique. Local search is very useful for scaling, but at the cost of proven optimality.
j2kun 5 days ago:

I think this hits the nail on the head: performance is the obstacle, and you can't get good performance without some modeling expertise, which most people don't have.
js8 5 days ago:

I wish I knew better how to use them for these coding problems, because I agree with the GP that they're underutilized.

But I think that if you have a constraint problem that has an efficient algorithm yet chokes a general constraint solver, that should be treated as a bug in the solver. It means the solver is using bad heuristics somewhere.
LegionMammal978 5 days ago:

I'm pretty sure that due to Rice's theorem, etc., any finite set of heuristics will always miss some constraint problems that have an efficient solution. There's very rarely a silver bullet when it comes to generic algorithms.
sigbottle 4 days ago:

I think they're saying that the counterexamples are so pathological that, in most cases, anything doing automatic generation of constraints (for example, a DSL backed by a solver) should be covered well enough by the solver's heuristics. It might even be the case that certain pretty powerful DSLs just never generate "bad structures".

I don't know, I've not done research on circuits, but this kind of analysis shows up all the time in adjacent fields.
LegionMammal978 4 days ago:

Idk, I also thought so once upon a time: "Everyone knows that you can usually do much better than the worst case in NP-hard problems!" But at least for the non-toy problems I've tried using SAT/ILP solvers for, the heuristics don't improve on the exponential worst case much at all. It's seemed like NP-hardness really does live up to the all-or-nothing stereotype for some problems.

Your best bet using them is when you have a large collection of smaller unstructured problems, most of which align with the heuristics.
sigbottle 4 days ago:

> Your best bet using them is when you have a large collection of smaller unstructured problems, most of which align with the heuristics.

Agreed. An algorithm at our company turns a directed graph problem that would seem crazy to most people into roughly m - n (m edges, n nodes) relatively small SAT checks. Stuffing all the constraints into one ILP solver would be super inefficient (and honestly not even well-defined). Instead, by defining the problem statement properly and carving out the right invariants, you can decompose the problem into smaller NP-complete subproblems. Definitely a balancing act of design.
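The decomposition itself is obviously problem-specific, but the pattern of running many small SAT checks instead of one monolithic encoding can look as simple as the sketch below (assuming python-sat; `build_check_cnf` is a hypothetical, problem-specific function, not anything from a real library).

    # Hedged sketch: many small independent SAT checks instead of one big model.
    # `build_check_cnf` is a hypothetical function producing a small clause list
    # (DIMACS-style integer literals) for each check.
    from pysat.solvers import Glucose3

    def run_checks(checks, build_check_cnf):
        results = []
        for check in checks:
            clauses = build_check_cnf(check)
            with Glucose3(bootstrap_with=clauses) as solver:
                results.append((check, solver.solve()))  # True = SAT, False = UNSAT
        return results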
drob518 4 days ago:

For some problems, there is not much you can do. But for many, it works.
zaxioms 4 days ago:

Rice's theorem is about decidability, not difficulty. But you are right that, assuming P != NP, there is no algorithm for efficient SAT (and other constraint) solving.
drob518 5 days ago:

Well, they aren't magic. You have to use them correctly and apply them to problems that match how they work. Proving that something is unsat is intractable in the worst case (co-NP-complete for propositional SAT). These solvers don't change that.
LegionMammal978 5 days ago:

Of course they aren't magic, but people keep talking about them as if they're perfectly robust and ready-to-use for any problem within their domain. In reality, unless you have lots of experience in how to "use them correctly" (which is not something I think can be taught by rote), you'd be better off restricting their use to precisely the OR/verification problems they're already popular for.
drob518 5 days ago:

Hence my statement about education. All tools must be used correctly, in their proper domain; that is true. Don't try to drive screws with a hammer. But I'm curious which problems you tried them on and found them wanting, and what your alternative was?

I actually find that custom solutions work better for simple problems and that solvers do a lot better when the problem complexity grows. You're better off solving the Zebra puzzle and its ilk with brute-force code, not a solver, for instance.
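For the Zebra-puzzle point, here is a toy illustration of what "brute force" means: a made-up 3-house variant, not the classic 5-house puzzle, just to show the shape of the enumeration.

    # Toy 3-house Zebra-style puzzle, brute-forced with itertools.permutations.
    # The constraints here are made up for illustration.
    from itertools import permutations

    colors = ["red", "green", "blue"]
    pets = ["zebra", "dog", "cat"]

    for color in permutations(colors):   # color[i] = color of house i
        for pet in permutations(pets):   # pet[i]   = pet in house i
            if pet.index("cat") != 0:                            # cat lives in the first house
                continue
            if color.index("green") != pet.index("zebra"):       # zebra lives in the green house
                continue
            if abs(color.index("red") - pet.index("dog")) != 1:  # dog lives next to the red house
                continue
            print("colors:", color, "pets:", pet)

The classic five-house version is the same idea with five permuted attributes, though you would prune between the nested loops to keep the enumeration fast.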
drob518 5 days ago:

SAT solvers are used daily to generate solutions for problems that have literally millions of variables, so what you said is just wrong on its face. Yes, some talented people can write custom code that solves specific problems faster than a general-purpose solver, particularly for easy special cases of the general problem, but most of the time that results in the programmer recreating the guts of a solver customized to a specific problem. There's a corollary of Greenspun's Tenth Rule here: every sufficiently complicated program also contains an ad hoc, informally specified, bug-ridden, slow implementation of half of a SAT or SMT solver.
sigbottle 5 days ago:

I mean, right tool for the right job. Plenty of our problems (our work involves lots of arbitrarily hard graph algorithms) are 90% a very clever reduction with nice structure. Then the final 10% is either NP-hard, or we want to add some DSL flexibility, which introduces halting-problem issues. Once you lower the problem far enough, that's where the SMT solvers come in.
cerved 4 days ago:

Define large. We've written a model that solves real business issues in 8K lines of MiniZinc, and it wasn't slow.

The conventional wisdom is that the larger you make an NP-hard problem, the slower it's going to get, regardless of algorithm.