命题知识库演化中的新知识特征

2024-04-15王以松刘蕻张颖张明义李丹宁杨佳佳

贵州大学学报（自然科学版） 2024年1期

关键词：子句知识库命题

王以松刘蕻张颖张明义李丹宁杨佳佳

文章編号 1000-5269(2024)01-0001-19

DOI：10.15958/j.cnki.gdxbzrb.2024.01.01

Whats New in the Evolution of Propositional

Knowledge Bases?

收稿日期：@2023-11-14

基金项目：@国家自然科学基金资助项目（62376066， 61976065， 61370161）

作者简介：@王以松(1975—)，男，教授，博士，博导，研究方向：人工智能、知识表示与推理，E-mail:yswang@gzu.edu.cn.

*通讯作者：@王以松,E-mail:yswang@gzu.edu.cn.

WANG Yisong*1,2, LIU Hong3, ZHANG Ying4, ZHANG Mingyi4,

LI Danning4, YANG Jiajia1

(1.College of Computer Science and Technology, Guizhou University, Guiyang 550025, China; 2.Institute for Artificial Intelligence, Guizhou University, Guiyang 550025, China; 3.School of Intelligence and

Computing, Tianjin University, Tianjin 300072, China; 4.Guizhou Academy of Science, Guiyang 550001, China)

Abstract:

The notion of logical difference plays an important role in characterizing substantial difference among logic-based knowledge bases that suffer from dynamic changes continuously. This notion is also closely related to forgetting, which has been extensively explored in various logics. This paper presents three notions of difference over a relevant signature for propositional theories—logical difference, clausal difference and prime difference, to capture the difference of logical consequence, clause consequence and prime clause consequence respectively. Besides, their properties and computational complexities are also investigated. It is shown that various deciding problems involving logical difference are one level higher in the polynomial hierarchy than their corresponding satisfiability problem, except for 2-CNF theories for which these deciding problems are tractable. Extensive experimental results on random 3-CNF, 2-CNF and Horn theories reveal some interesting phenomena on clausal difference and prime difference. In particular, the number of clauses in both clausal and prime difference exhibits a similar phase transition with their satisfiability for both random 3-CNF theories and 2-CNF theories. However, for random Horn theories, while the number of clauses in clausal difference exhibits a similar coarse phase transition with its satisfiability, the number of clauses in prime difference behaves quite differently from that of clausal difference. It reveals a new character of satisfiability phase transition for random 3-CNF and 2-CNF theories, that their evolutions will introduce more different knowledge nearby their phase transition thresholds.

Key words:

logical difference; clausal difference; prime difference; computational complexity; phase transition; knowledge management

CLC number：TP301；TP391

Document code：A

The notion of difference plays an important role in evolving logical systems, and knowledge bases in particular, which generally suffers from dynamic continuous changes ［1］. For example, in belief revision ［2］, one may be interested in the difference of a knowledge base after it has been revised. In software engineering, one generally uses version control systems, such as CVS and RCS, to track difference of evolving source codes and documents in software development. To capture version difference of an ontology system［3］ from different perspectives, Konev et al. formally proposed three notions of differences for description logic ε-syntactic difference, structural difference and logical difference［4-6］. It is possible that different syntax may result in the same structure, and different structures may lead to the same logical consequences. Their notions of logical difference are closely related to that of forgetting in description logic［7-9］ and inseparability［10-11］. Thus, it is useful for ontology engineering as well［1,12-13］. The notion of logical difference was also extended to fuzzy ε+［14］.

Informally, over a signature , the logical difference between two knowledge bases Σ and Π is the set of formulas over that are derivable from Σ but not from Π, or vice versa. The notion of logical difference is also useful in many other domains, such as decision making and negotiation［15］.

Let us consider the following “argument” scenario: Bob is negotiating with his daughter Annie on her breakfast-milk, oat meal and cookie. Bob prefers to a healthy breakfast, while Annie prefers to a tasty one. Both agree to have only two breakfast items. Bob has the following knowledge about these breakfast items: if the breakfast includes milk and oat meal then it is healthy; if the breakfast includes milk and cookie, then it is tasty; if the breakfast includes oat meal and cookie, then it is either healthy or tasty. Now Annie wants to change her ordinary breakfast choice of milk and (oat meal or cookie) to oat meal and (milk or cookie).

Suppose what Bob really concerns on breakfast is its quality, i.e., any consequence on “good health” and “good taste”. Should Bob agree with her request? A positive answer can be drawn in a formal analysis, cf. Example 1.

While Konev and others concentrated on the difference of (lightweight) ontology ε and , that is a subclass of first-order logic, and there is little literature on logical difference in propositional logic, to our best knowledge. A distinguishing exception is the notion of model difference which owes to Eiter et al.［16］. Informally, the model difference of two formulas φ and ψ is defined as φ∧ψ, whose models are exactly the models of φ but ψ. It is apparent that the model difference cannot capture the difference in terms of logical consequences, i.e., the formulas that are logical consequence of φ，but not of ψ.

To capture such logical difference in the evolving propositional theories (knowledge bases), three notions of differences are proposed in the paper: logical difference, clausal difference and prime difference. Formally, let φ and ψ be two propositional formulas and A a signature, viz, a set of propositional atoms.

· The logical difference of φ from ψ over A consists of the formulas over A that are logical consequence of φ but not of ψ, i.e. {ξVar(ξ)罙,φ 郸,and ψ ξ}, whose elements are differences of φ from ψ over A, where Var (ξ) denotes the set of atoms occurring in ξ.

· The clausal difference of φ from ψ over A consists of the clauses in the logical consequence of φ from ψ over A, i.e.{ξ is a clause Var(ξ)罙,φ 郸, and ψ ξ}.

· The prime difference of φ from ψ over A consists of the prime clauses in the clausal difference of φ from ψ over A.

Intuitively speaking, when the knowledge base ψ is evolved to φ by adding some new knowledge, the difference ξ of φ from ψ over A means that the new information ξ over A is enhanced. Similarly in the case that ψ is evolved to φ by removing or revising some existing knowledge in ψ the difference ξ of φ from ψ over A means that the information ξ over A is missed.

From the perspective of computation, the following problems are usually concerned in knowledge bases evolving. Let φ and ψ be two formulas, α a clause, l a literal and A a signature.

· Does φ have no clausal difference from ψ over A?

· Does the clausal difference of φ from ψ over A contain a clause mentioning l?

· Does every clause in the clausal difference of φ from ψ over A contain l?

· Does α belong to the clausal difference of φ from ψ over A?

These problems are proved to be generally intractable in the second level of polynomial hierarchy. When both φ and ψ are Horn, these complexity issues are one level below that of general case, respectively. When both φ and ψ are 2-CNFs, all of these problems are tractable. Moreover, we can show that these problems have the same complexities for the cases of logical difference and prime difference.

The algorithm LDFF is proposed and implemented for computing clausal difference and prime difference, which is based on an implementation of MINISAT ［17］. It is evaluated on random 3-CNF, 2-CNF and Horn theories. Experimental results show that the number of clausal differences is generally much larger than that of prime difference. More interestingly, it reveals a phase transition on the number of clauses of both clausal and prime differences between random 3-CNF and 2-CNF theories respectively, which seemly agrees with the phase transition of its satisfiability respectively. However, on random Horn theories, while the number of clauses in clausal difference exhibits a phase transition that looks consistent with the coarse phase transition of Horn satisfiability, it behaves quite differently from that of prime difference.

The rest of the paper is organized as follows. Preliminary notions and notations of propositional logic are presented in the next section. The notions of logical, clausal and prime differences are presented in Section 2, which also includes some basic properties about these notions. The computational complexity results are presented in Section 3. Experimental results are reported in Section 4. A formal comparison with the notion of model difference is presented in Section 5. Finally, concluding remarks are discussed in Section 6. This paper is a substantial extension of ［18］.

1 Preliminaries

We assume a propositional language with an underlying finite nonempty signature ={x1,…,xn}, whose elements are propositional atoms/variables. The formulas of are built from using the connectives ,∧,∨, andas usual. A theory is a finite set of formulas. The notions of interpretation, model, entailment and equivalence of are defined in the standard manner.

Let Γ be a theory. The set of atoms appearing in Γ is denoted by Var(Γ ). The set of models of Γ is denoted by Mod (Γ ). The set of logical consequences of Γ, i.e.,{ξΓ 郸蝳, is denoted by Cn(Γ ). Γ is closed if Γ=Cn(Γ ).

A literal is an atom x(called positive literal) or its negation x(called negative literal). The complement of a literal l, written l , is the literal l with double negation eliminated, i.e., x is replaced by x. A clause α is a disjunction l1∨…∨lm(m≥0)of pairwise different literals. The length of α is m, written α=m. It is an empty clause, denoted by ⊥，if m=0; it is a tautology clause, denoted by , if it contains a pair of complementary literals. Note that Var()=Var(⊥)= .

A clause α is reduced if it is of the form , ⊥ or l1∨…∨lm(m≥0)containing no complementary literals. In the following we assume clauses are reduced unless stated explicitly. For convenience, the above clause α is alternatively written as the set {l1,…,lm}. The set of positive (resp. negative) literals occurring in α is denoted by Pos(α) (resp. Neg(α)). A clause α is Horn if Pos(α)≤1. Two clauses α1 and α2 are resolvable if there exists a literal l such that l∈α1 and l∈α2. In this case, the resolvent of α1 and α2 is the clause α1∪α2{l,l}.

A formula is in conjunctive normal form (CNF) if it is a conjunction of clauses. A Horn formula is a conjunction of Horn clauses. A clausal theory is a finite set of clauses. A k-CNF formula is a CNF formula such that each of its clauses has length k. By k-SAT we mean the satisfiability of k-CNF theories. An exact k-CNF formula is a k-CNF formula such that each of its reduced clauses has length k. In the following, we abuse the notion of k-CNF as exact k-CNF for convenience, unless explicitly stated otherwise.

Let A , α be a clause and φ be a formula.

· α is an implicate of φ if φ 郸.

· α is an A-implicate of φ if it is an implicate of φ and Var(α)罙.

· α is a prime implicate of φ whenever it is an implicate of φ and no other implicate α′ of φ satisfying α′α.

· α is a A-prime implicate of φ whenever it is an A-implicate of φ and no other A-implicate α′ of φ satisfying α′α. The set of all of the A-prime implicates of φ is denoted by PIA(φ) . In the case Var(φ)罙 or A=, it is written as PI(φ) for convenience.

It is clear that if φ is unsatisfiable then ⊥ is the unique A-prime implicate of φ for any A. It is well-known that the number of prime implicates of a formula φ is, in general, exponentially larger than the size of φ［19］(A formulas with n variables may have 3n implicates or Ω(3n/n) prime implicates), even if φ is a Horn formula［20］. However, if φ is a 2-CNF formula then the number of prime implicates of φ is bounded by O(φ2) , cf. Theorem 4 of ［21］.

In the following we recall the notion of forgetting in propositional logic. Let V and φ a formula. A formula ψ is a result of forgetting V from φ if M 郸 whenever M′ 郸 for some M′ with MV=M′V. The forgetting result always exists, and it is unique up to equivalence ［22-23］. For this reason, we denote it by Forget(φ,V). A syntactic counterpart definition for forgetting is as follows:

· Forget(φ,)=φ.

· Forget(φ,{x})=φ［/x］∨φ［⊥/x］, where φ［t/x］is the formula obtained from φ by replacing every occurrence of x with t and t ∈{,⊥}.

· Forget(φ,V∪{x})=Forget(Forget(φ,{x}),V).

It is well-known that PIA(φ)≡Forget(φ,A) where A is the complement of A, i.e., A. The notion of forgetting is also related with the notion of irrelevance and independence. Formally, given A, a formula φ is A-irrelevant if there is a formula φ′ such that Var(φ′)∩A= and φ≡φ′. The following results are well-known ［22-24］, that demonstrate a close relationship among the notions of A-prime implicate, forgetting and irrelevance.

Proposition 1

Let φ,ψ be formulas and A. Then

(i)φ≡PI(φ).

(ii)φ≡ψ iff PI(φ)=PI(ψ).

(iii)Forget(φ,A)≡{ψVar(ψ)罙 and φ 郸讅 .

(iv)PIA(φ)≡Forget(φ,A).

(v)PIA(φ)=PIA(ψ) iff Forget(φ,A)≡Forget(ψ,A).

(vi)φ is a A-irrelevant iff φ≡Forget(φ,A).

It is known that, given a formula φ and a clause α,φ 郸 if and only if there exists a clause β∈PI (φ) such that β 郸, cf. Proposition 3.4 of ［24］. This is easily extended to forgetting and A-prime implicates.

Lemma 1

Let φ be a formula, A and α a clause. Then Forget (φ,A) 郸 iff 靓隆 PIA(φ) such that β 郸.

Proof:

Forget(φ,A) 郸

iff 靓隆蔖I (Forget(φ,A))such that by β 郸 Proposition 3.4 of ［24］

iff 靓隆蔖I (PIA(φ))such that β 郸 by (iv) and (ii) of Proposition 1

iff 靓隆蔖IA(φ) such that β 郸.

2 Propositional Logical Differences

In this section we present the formal definitions of three notions of difference and explore their properties.

2.1 Logical difference and clausal difference

Firstly, let us start with logical difference.

Definition 1 (Logical difference)

Let φ，ψ be two formulas/theories, and A. The logical difference of φ from ψ over A, written DiffLA(φ,ψ), is the set of formulas ξs such that Var (ξ)罙，φ 郸 and ψξ, i.e.,

DiffLA(φ,ψ)={ξVar(ξ)罙,φ 郸,ψξ} (1)

It is evident that DiffLA(φ,ψ)=DiffLA(φ′,ψ′) if φ≡φ′ and ψ≡ψ′. Intuitively, the logical difference consists of all of the logical consequences of φ over that are not logical consequence of ψ. For instance, DiffLA(x1,x2)≡x1 if A contains x1 , while DiffLA(x1,x2)≡ if A does not contains x1. It is obvious that, in the case A=, DiffLA(φ,ψ) is {⊥} whenever φ 怠 and ψ⊥，and otherwise. For this reason, we assume A≠ in what follows, unless explicitly stated otherwise. For a theory Γ, we denote Γ|A={ξ∈ΓVar(ξ)罙}. Then DiffLA(φ,ψ)=［Cn (φ)Cn (ψ)］|A=［Cn (φ)］|ACn (ψ).

Example 1

Considering the “argument” scenario in Introduction, for simplicity we denote “milk”, “oat meal”, “cookie”, “good healthy” and “good tasty” by m,ot,c,gh and gt respectively. Now, Bobs knowledge (or theory) Σ about Annies breakfast consists of the following formulas:

m∧ot→gh∧c

m∧c→gt∧ot

ot∧c→(gt∨gh)∧m

m∧(ot∨c)

The above last line encodes Annies ordinary breakfast. If Bob agrees Annies request, then the knowledge Σ is changed to Σ′ by replacing the above last line with ot∧(m∨c). It is not difficult to check that Σ (gh∨gt)∧(c∨ot) and Σ′ (gh∨gt)∧(c∨m). Furthermore, over the signature A={gh,gt}, DiffLA(Σ,Σ′)=DiffLA(Σ′,Σ)=. It means that the quality of breakfast does not change even if Bob agrees with Annies request. Thus, Bob should agree with Annie to change her breakfast choice.

Please note that the theory of logical difference is not closed. For instance, 麯iffL(⊥,) though DiffL(⊥,)≡. In fact, 麯iffLA(φ,ψ) for any formulas φ， ψ and A. The next proposition demonstrates the close relationship between forgetting and logical difference.

Proposition 2

Let φ，ψ be two formulas and A. Then

(i)DiffLA(φ,ψ) is A-irrelevant.

(ii)If ψ1,ψ2 are in DiffLA(φ,ψ) then ψ1∧ψ2 is.

(iii)DiffLA(φ,ψ)={α|Var(α)罙,Forget(φ,A) 郸 and ψα}.

(iv)Forget(φ,A) 礑iffLA(φ,ψ).

(v)DiffLA(φ,ψ) 礔orget (φ,A) iff ψ Forget (φ,A).

(vi)DiffLA(φ,ψ)≡ iff ψ 礔orget(φ,A).

(vii)DiffLA(φ,ψ)≡⊥ iff φ≡⊥ and ψⅰ.

(viii)DiffLA(φ,ψ)=DiffLA(ψ,φ)= iff Forget(φ,A)≡Forget(ψ,A).

Proof:

(i) and (ii) follow directly from Definition 1.

(iii)It follows from the fact φ 郸 iff Forget(φ,A) 郸 when ψ is A-irrelevant.

(iv) It follows from (iii).

(v) It follows from (iii) and (iv).

(vi) It follows from (iv) and (v).

(vii) It is evident.

(vii)It follows from (vi).

In the above proposition, (iv) and (v) indicate that Forget(φ,A) is the strongest theory in DiffLA(φ,ψ) whenever ψ Forget(φ,A). In this case Forget(φ,A) is not a tautology. Item (viii) implies an equivalent condition that φ and ψ are equivalent over signature A, i.e., φ 郸 if and only if ψ 郸 for each formula α with Var (α)罙.

From practice point of view, propositional knowledge bases are generally represented in some regular forms, such as CNF, Horn and so on. In this sense, one may be interested in clauses in logical difference. This motivates the following notion of clausal differences.

Definition 2 (Clausal differences)

Let φ,ψ be two formulas/theories, and A. The clausal difference of φ from ψ over A, written DiffCA(φ,ψ), is the set of clauses ξs such that Var (ξ)罙, φ 郸 and ψξ, i.e.,

{ξ is a clauseVar(ξ)罙,φ 郸,ψξ}(2)

It is evident that DiffCA(φ,ψ)罝iffLA(φ,ψ) and DiffCA(φ,ψ)=DiffCA(φ′,ψ′) if φ≡φ′,ψ≡ψ′. The following example demonstrates more relationships between clausal difference and logical difference.

Example 2

Let φ=x1∧x2,φ′={x1,x2},ψ=x1∧x2,ψ′={x1,x2} and A={x1,x2}. It is evident that φ≡φ′ and ψ≡ψ′. It is not difficult to verify that the following hold:

· DiffLA(φ,ψ)=DiffLA(φ′,ψ)=DiffLA(φ′,ψ′)≡{φ} and x2麯iffLA(φ,ψ).

· DiffLA（ψ,φ)=DiffLA(ψ′,φ)=DiffLA(ψ′,φ′)≡{ψ} and x2麯iffLA(ψ,φ).

· DiffCA(φ,ψ)=DiffCA(φ′,ψ)=DiffCA(φ′,ψ′)={x1,x1∨x2}.

· DiffCA(ψ,φ)=DiffCA(ψ′,φ)=DiffCA(ψ′,φ′)={x1,x1∨x2}.

It is clear that DiffCA(φ,ψ)DiffLA(φ,ψ) and DiffCA(φ,ψ) DiffLA(φ,ψ). The clausal difference DiffCA(φ,ψ) is not closed by the fact x1∨x2麯iffCA(φ,ψ) and DiffCA(φ,ψ) 祒1. Similarly, the logical difference DiffLA(φ,ψ) is not closed because of x1∨x2麯iffLA(φ,ψ) and DiffLA(φ,ψ) 祒1.

The next example shows that the logical difference DiffA(φ,ψ) is possible not a Horn (resp. 2-CNF) theory even if φ is a Horn (resp. 2-CNF) theory.

Example 3

Let φ=x1∧x2,ψ=x1∧x2∧x3 and A= {x1,x2,x3}. Evidently, φ and ψ are both Horn and 2-CNF. The clause ξ=x1∨x2∨x3 is neither Horn, nor 2-CNF, but it belongs to DiffCA(φ,ψ). Thus, DiffCA(φ,ψ) is neither a Horn theory, nor a 2-CNF theory.

The following proposition illustrates the relationship between logical difference and clausal difference.

Proposition 3

Let φ,ψ be two theories and A.

(i)DiffCA(φ,ψ)罝iffLA(φ,ψ).

(ii)DiffCA(φ,ψ)={ξ is a clause|ξ∈DiffLA(φ,ψ)}.

(iii)DiffLA(φ,ψ)= iff DiffCA(φ,ψ)=.

(iv)DiffCA(φ,)≡DiffLA(φ,)≡Forget (φ,A)≡PIA(A).

Proof:

(i)It follows directly from the definitions of Eqs (1) and (2).

(ii)Let α be a clause such that Var(α)罙. We have

α∈DiffCA(φ1,φ2)

iff φ1 郸 and φ2α2

iff α∈DiffLA(φ1,φ2) and α is a clause.

(iii)It is sufficient to show that DiffCA(φ,ψ)= implies DiffLA(φ,ψ)= by (i). Suppose that DiffCA(φ,ψ)= and there is a formula ξ∈DiffLA(φ,ψ). Let ξ≡ξ1∧…∧ξk where each ξi(1≤i≤k) is a clause. We have the following:

ξ∈DiffLA(φ,ψ)

荭1∧…∧ξk ∈DiffLA(φ,ψ)

荭爪1∧…∧ξk

荭爪蝘 for some i(1≤i≤k)

荭蝘 ∈DiffCA(φ,ψ) by φ 郸蝘,ψξi, and Var(ξi)罙.

A contradiction follows. Thus, DiffCA(φ,ψ)= implies DiffLA(φ,ψ)=.

(iv)By (i) and (iv) of Proposition 1 and (i) of the proposition, it is sufficient to show DiffCA(φ,) 礟IA(φ)DiffLA(φ,).

In the case φ≡, the above relationship holds evidently. Suppose φ and α is a clause such that Var(α)罙. We have

α∈PIA(φ)

荭郸 and α

荭痢蔇iffCA(φ,). Thus, DiffCA(φ,) 礟IA(φ).

Secondly, let DiffLA(φ,)≡α1∧…∧αk where αi(1≤i≤k) are clauses. It follows Var(αi )罙 and φ 郸羒 for each i(1≤i≤k). By Lemma 1, there is β∈PIA(φ)such that β 郸羒.Thus,PIA(φ) 礑iffLA(φ,).

Item (iv) of the above proposition shows a sufficient condition under which logical difference is equivalent to clausal difference.In terms of item (iii), clausal difference is sufficient to reveal the emptiness of logical difference. In other words, there is no logical difference for two formulas if and only if they have no clausal difference. The next proposition illustrates further some connections between clausal difference and forgetting.

Proposition 4

Let φ,ψ be formulas and A. Then

(i)DiffCA(φ,ψ) is A-irrelevant.

(ii)Forget (φ,A)DiffCA(φ,ψ).

(iii)DiffCA(φ,ψ)= iff ψ 礔orget(φ,A).

(iv)DiffCA(φ,ψ)≡Forget (φ,A) iff ψα for every α∈PIA(φ).

Proof:

(i)It follows from the definition of clausal difference since every clause in DiffCA(φ,ψ) does not mention any atom from A.

(ii)It follows from Definition 2 and the fact that Forget(φ,A)α if and only if φ 郸 whenever α is A-irrelevant.

(iii)It follows from (ii) of the proposition and Definition 2.

(iv)It follows from (vi) of Proposition 2 and (iv) of Proposition 3.

2.2 Prime difference and its approximation

Though clausal differences are not closed, removing subsumed clauses from a clausal difference preserves logical equivalence. This motivates the following notion of prime difference.

Definition 3

Let φ，ψ be two formulas/theories, and A. The prime difference of φ from ψ over A, written DiffPA(φ,ψ)，is the set of prime clauses of DiffCA(φ,ψ).

It is clear that DiffPA(φ,ψ)罝iffCA(φ,ψ). For the theories φ,φ′,ψ,ψ′ and the signature A in Example 2, we have

DiffPA(φ,ψ)=DiffPA(φ′,ψ′)={ x1}

and

DiffPA(ψ,φ)=DiffPA(ψ′,φ′)={ x1}

To compute the prime difference in terms of its definition, one needs to compute clausal difference at first. As we will see in the following, there exists an equivalent definition which needs not computing DiffCA(φ,ψ) as shown by the following proposition.

Proposition 5

Let φ,ψ be two formulas/theories, and A. Then

DiffPA(φ,ψ)={ξ∈PIA(φ)|ψξ}(3)

Proof:

Let X=DiffPA(φ,ψ) and Y={α∈PIA(φ)|ψα}.

()α∈DiffPA(φ,ψ)

荭痢蔇iffCA(φ,ψ),β∈DiffCA(φ,ψ) s.t βα

軻ar(α)罙,φ 郸,ψα,β:βα & φ 郸

荭痢蔖IA(φ) and ψα

荭痢蕒α∈PIA(φ)|ψα}. Thus, X罽.

()YX≠

α:α∈Y and a黊

α:α∈PIA(φ),ψα and a黊

α:Var(α)罙, φ 郸,α′:φ 郸痢 & α′α,ψα and a黊

α:α∈DiffCA(φ,ψ),a黊 and α′:φ 郸痢 & α′α

β∈DiffCA(φ,ψ) s.t. βα, and α′:φ 郸痢 & α′α and Var(α)罙

there is a clause β:(φ 郸 & βα) and α′:(φ 郸痢& α′α) and Var(α)罙, which is a contradiction.

The next proposition demonstrates a close connection between prime difference and clausal difference.

Proposition 6

Let φ,ψ be two formulas and A.

(i)DiffPA(φ,ψ)≡DiffCA(φ,ψ).

(ii)DiffPA(φ,)=PIA(φ).

(iii)DiffCA(φ,ψ)= iff DiffPA(φ,ψ)= iff DiffLA(φ,ψ)=.

Proof:

(i)It follows from the fact

PI(DiffCA(φ,ψ))≡DiffCA(φ,ψ)

(ii)It follows from (ii) of the proposition, (iii) of Proposition 3 and Lemma 1.

(iii)It follows from (ii) of the proposition and (iv) of Proposition 3.

Item (ii) of the above proposition shows that PIA(φ){ξ∈PIA(φ)|ψ 郸蝳 can be taken as a “minimal approximation” to DiffPA(φ,ψ). We call it an “approximation” because DiffPA(φ,ψ) is not closed under logical consequence, viz, DiffPA(φ,ψ) 郸 does not imply ξ∈DiffPA(φ,ψ).

Recall that prime implicates can be obtained from the resolution proof. Formally speaking, the consensus of two resolvable clauses α1 and α2 is their resolvent, unless the resolvent is a tautology. In this case, the consensus of α1 and α2 is undefined. A consensus proof is a resolution proof in which the consensus rule (that is a restriction of resolution) is applied at each step. For a clausal theory φ,C(φ)denotes the set of clauses that has a consensus proof from φ. It is proved that PI (φ)=min(C(φ), )，cf. Proposition 3.5 of ［24］, where

min(S, )={α∈S|α′∈S s.t. α′ 郸羮

for a clausal theory S. This can be extended to A-prime implicates.

Lemma 2

Let φ be a clausal theory, A and α a clause with Var(α)罙.

(i)(A-covering): φ 郸 iff 靓隆蔖IA(φ) s.t. β 郸.

(ii)(A-saturation): PIA(φ)=min(CA(φ), ).

where CA(φ) denotes the set of clauses γ that has a consensus proof from φ and Var(γ)罙.

Proof:

(i)It follows from(iv) of Proposition 1 and Lemma 1.

(ii)PIA(φ)=PI(PIA(φ))

=PI(Forget(φ,A)) by(iv) of proposition 1

=PI(S) where S is a clausal theory equivalent to Forget(φ,A)

=min(C(S), ) by Proposition 3.5 of ［24］

=min(CA(S), ) since S is A-irrelevant.

A formula φ is Horn (resp. 2-CNF) expressible if there is a Horn (resp. 2-CNF) formula ψ such thatφ≡ψ. Please note here that the formulas and ⊥ are both Horn and 2-CNF. With the above lemma, we can show the categoricity of clausal difference for Horn and 2-CNF theories.

Theorem 1

Let φ,ψ be two formulas/theories and A.

(i)If φ is Horn expressible then DiffCA(φ,ψ) is Horn expressible.

(ii)If φ is 2-CNF expressible then DiffCA(φ,ψ) is 2-CNF expressible.

Proof:

(i)φ is Horn expressible

there is a Horn theory φ′ such that φ′≡φ

輒in(CA(φ′), ) is a Horn theory

軵ICA(φ′) is a Horn theory by A-saturation of

Lemma 2

軩iff PA(φ′,ψ) is a Horn theory by Proposition 5

軩iffCA(φ,ψ) is Horn expressible by Definition 3

(ii)It is similar to that of (i) due to the resolvent of two 2-CNF clauses is also a 2-CNF clause.

Note that prime difference is a subset of clausal difference preserving logical equivalence. However, it is possibly not minimal in the sense that removing some clauses from prime difference may still preserve logical equivalence of clausal difference. For instance, let φ1=(x1∨x3)∧(x2∨x3),φ2= and A={x1,x2,x3}. It is not difficult to see DiffPA(φ1,φ2)≡{x1∨x3,x2∨x3,x1∨x2}. The clause x1∨x2 in Diff PA(φ1,φ2) is redundant in the sense that DiffPA(φ1,φ2)≡{ x1∨x3 ,x2∨x3 }. This motivates the following notion of approximation of clausal difference.

Definition 4

Let Σ be a set of clauses. A set Γ of clauses is an equivalent subset of Σ if Γ力 and Γ≡Σ; it is a minimal equivalent subset of Σ if it is an equivalent subset of Σ and Γ′ is not an equivalent subset of Σ for any Γ′Γ.

The least equivalent subset is called irredundant equivalent subset by Paolo Liberatore ［25］. Formally, a clausal theory Σ is an irredundant equivalent subset(IES) of a clausal theory Γ if Σ力,Σ≡Γ and Σ is irredundant, i.e., Σ{γ}γ for each γ∈Σ. It was showed that a clausal theory Γ has a unique IES if and only if {α∈ΓΓ{α} 郸羮郸, cf. Lemma 2 of ［25］.

In what follows minimal equivalent subsets of prime difference are called minimal difference. Formally, let φ and ψ be two formulas and A. A clausal theory Σ is a minimal difference of φ from ψ over A whenever Σ is a minimal equivalent subset of DiffPA(φ,ψ). It is denoted by DiffMA(φ,ψ). It is evident that Σ ≡DiffPA(φ,ψ) for every minimal difference Σ of φ from ψ over A. The next example demonstrates that the minimal difference is possibly not unique even if φ is a Horn formula or 2-CNF formula.

Example 4

Lets consider the following two cases:

(i) Let φ1={x1∨x4,x1∨x4,x2∨x5,x2∨x3∨x4,x1∨x2∨x3},φ2=and A={xi|1≤i≤5}. Evidently, φ1 is Horn. We have

DiffPA(φ1,φ2)=φ1∪{x3∨x4∨x5,

x1∨x3∨x5}

which has two minimal equivalent subsets:

φ1{x2∨x3∨x4}

and

φ1{x1∨x2∨x3}

(ii)Let φ1={x1∨x2,x1∨x3,x2∨x3,x2∨x4,x1∨x4},φ2= and A={xi|1≤i≤4}. Both φ1 and φ2are 2-CNFs. We have

DiffPA(φ1,φ2)=φ1∪{ x1∨x4 ,x2∨x4 ,x1∨x2,x3∨x4}

which has two minimal equivalent subsets:

φ1{x1∨x3}

and

φ1{x2∨x3}

3 Algorithms and Computational Complexities

In the following we present an algorithm to compute clausal difference, prime difference and minimal difference. The computational complexities for various deciding problems are considered as well.

3.1 An algorithm for computing logical differences

Algorithm 1 enumerates all of the candidate clauses over relevant signature in increasing order of clause length. Recall that PIA(φ′)=DiffPA(φ,)≡Forget (φ,A) in terms of （iii） of Proposition 3 and （iii） of Proposition 6, the algorithm can be used to compute (A-) prime implicates and forgetting results as well.

Input: Two formulas φ and ψ,A罙 and τ

Output: DiffCA(φ,ψ) if τ=0,DiffPA(φ,ψ) if τ=1, and DiffMA(φ,ψ) if τ=2

1 if ψ≡⊥ then return 0;

2 if φ≡⊥ then Σ←{⊥}; else Σ←;

3 if τ=0 then len←A;

4 else

5 if Σ={⊥} then return Σ;

6 if φ is 2-CNF then len←2;

7 end

8 for i←1 to len do

9 foreach clause α over A with α=i do

10 if φα or ψ 郸 then continue;

11 if τ=0 then Σ←Σ∪{α};

12 if τ=1 & β∈Σ s.t.βα then

Σ←Σ∪{α};

13 if τ=2 Σα then Σ←Σ∪{α};

14 end

15 end

16 if τ=2 then

17 foreach α∈Σ do

18 if Σ－{α} 郸 then Σ←Σ－{α}

19 end

20 end

21 return Σ;

Algorithm 1: An algorithm for computing difference

Proposition 7

Algorithm 1 is sound.

Proof:

It is clear that α≤A for each α∈DiffCA(φ,ψ) and α≤2 if α∈DiffPA(φ,ψ) by (ii) of Theorem 1. Thus, it is sufficient to enumerate all the clauses up to the upper bound length len. It follows that the algorithm outputs DiffCA(φ,ψ) in the case τ=0.

For the case τ=1, we show that α is a prime clause of DiffPA(α,ψ) for each α∈Σ. Suppose that α∈Σ is not a prime clause of DiffPA(α,ψ). We have that there is a clause β such that DiffPA(φ,ψ) 郸 and βα

荭<α

荭隆师 or 靓隆洹师 and β′β by the algorithm

荭联Σ. A contradiction follows.

The proof for the case of τ=2 is similar.

The following example shows that how the algorithm can compute the prime difference.

Example 5

Let φ={x1∨x2,x1∨x4,x2∨x3,x2∨x3∨x4},ψ= and A={xi1≤i≤4}. It is tedious but not difficult to check the algorithm:

· i=1, no clause with length i belongs to DiffPA(φ,ψ), i.e., Σ=;

· i=2, the first three clauses in φ are added into Σ;

· i=3, the last clause in φ is added into Σ;

· i=4, there is no clause α with α=4 and α∈DiffPA(φ,ψ), but Σα.

Now, we can see that the clause x1∨x2 of Σ is redundant.

Please note that the number of implicates and prime implicates is bounded by O(3n) and Ω(3n/n)respectively ［19］, where n=. Thus, this algorithm is in exponential time and space in worst case. However, when both φ and ψ are 2-CNF formulas, this algorithm will be in polynomial time for computing prime difference and minimal difference because its number of prime difference is bounded by O(n2).

Please note that, over the signature A with A=n, the number of nonempty and non-tautology clauses with k positive literals are Ckn×(∑n－ki=0Cin－k)=Ckn×2n－kwhere Cmn=n!/[m!(n－m)!] and n!=1×2×…×n. Thus, the total number of non-tautology and nonempty clauses over A is ∑nj=0(Cjn×2n－j)=(2+1)n=3n and the number of prime implicates is bounded by O(3n/n)［19］. It implies that the total number of satisfiability calls is bounded by O(3n). In the case that the satisfiability of both φ and ψ are tractable, e.g., there are linear time algorithms for Horn ［26］ and 2-CNF ［27-28］ formulas, the computational expense of this algorithm is bounded by O(3n(φ+ψ)) when evaluating clausal difference.

Note further that the total number of non-empty and non-tautology 2-CNF clauses over A is C2n(C02+C12+C22)=4C2n=2n(n－1), while the total number of non-empty and non-tautology Horn clauses of A is ∑ni=1Cin+C1n∑n－1i=0Cin－1=2n+n2n－1－1, which is bounded by O(n2n－1). Thus, using the algorithm to compute prime difference DiffPA(φ,ψ) when both φ and ψ are Horn formulas, the computation expense of this algorithm is bounded by O(n2n－1(φ+ψ+n22n－1)) according to (i) of Theorem 1. Here the term n2n－1 is the upper bound of subsumption checking, i.e., 靓隆师 such that βα where Σ≤n2n－1 and checking βα is bounded by O(n). In this case, the clauses in the condition of for-each loop (line 12) are restricted to Horn ones. Similarly, when both φ and ψ are 2-CNF formulas, the computational expense of the algorithm for DiffPA(φ,ψ) is bounded by O(n2(φ+ψ+nn2)) in terms of (ii) of Theorem 1, thus it is polynomial. In this case, the condition A of the for-each loop (line 6) is replaced by 2.

3.2 Complexities

Note that it is BH2-complete to decide if a clause α is a A-prime implicate of a formula φ over a signature A［24］. Namely, it is as hard as the satis-fiability and deduction (i.e., both NP-hard and co-NP-hard) but not much harder since in BH2. This problem falls into P if φ is a Horn formula. In general, there are exponential number of (A-)prime implicates of a formula φ even if φ is a Horn formula［20］.

Now we consider the following deciding problems on clausal difference. Given two formulas φ and ψ a clause ξ, a literal l and A, we define

membership：ξ∈DiffCA(φ,ψ)?

vacuity：DiffCA(φ,ψ)=?

relevance：l∈α for some α∈DiffCA(φ,ψ)?

necessity：l∈α for every α∈DiffCA(φ,ψ)?

The next theorem shows that these problems are intractable in general.

Theorem 2

(i)vacuity is Πp2-complete.

(ii)relevance is Σp2-complete.

(iii)necessity is Πp2-complete.

(iv)membership is BH2-complete.

Proof:

Let φ,ψ be two arbitrary formulas and A.

(i)Membership: If DiffA (φ,ψ)= then there is a clause β with Var(β)罙 such that φ 郸 and ψβ.Thus, if an NP-oracle is available then guessing the clause β with Var(β)罙 and checking the conditions φ 郸 and ψβ can be done in polynomial time in the size of the problem. Thus, the problem is in Πp2.

Hardness: Let φ be a formula over X∪Y with X∩Y=.

DiffX(φ,)=

iff 礔orget(φ,Y) by (iii) of Proposition 4

iff Forget (φ,Y)≡

iff 蠿鯵φ is valid, which is Πp2-complete

Thus, it is Πp2-hard even if ψ≡.

(ii)Membership: If there is a clause α∈DiffA (φ,ψ) containing l then one can guess the clause α and check the conditions Var(α)罙,φ 郸 and ψα. Both the guessing and checking can be done in polynomial in the size of the problem when an NP-oracle is available. Thus, the problem is in Πp2.

Hardness: Let φ be a formula over X∪Y with X∩Y=,x a fresh atom and X′=X∪{x}. The following holds:

There is a clause α containing x in DiffX′ (φ∨x,)

iff there is a non-tautology clause α s.t. Var(α)罼′, x∈Var(α) and φ∨x 郸

iff there is a non-tautology clause α′ s.t. Var(α′)罼, α=α′∨x and φ 郸痢 since x黇ar (φ)∪Var(α)

iff 靓痢:α′∈DiffX (φ,)

iff DiffX(φ,)≠, which is Σp2-hard by (i)

Thus, this problem is Σp2-hard even if ψ≡.

(iii)Membership: If there is a clause α containing no l in DiffA(φ,ψ) then one can guess the clause α and check if φ 郸 and ψα. Both the guessing and checking can be done in polynomial time in the size of the problem when an NP-oracle is available. Thus, this problem is in Πp2.

Hardness: Let φ be a formula over X∪Y with X∩Y=,x a fresh atom and X′=X∪{x}. The following holds:

There is a non-tautology clause α in DiffX′(φ∧x,) containing no x

iff there is a non-tautology clause α containing no x such that φ∧x 郸

iff there is a non-tautology clause α containing no x such that φ 郸 due to x黇ar(φ)∪Var(α)

iff 靓:α∈DiffX (φ,)

iff DiffA(φ,)≠, which is Σp2-hard by (i)

Thus, this problem is Πp2-hard even if ψ≡.

(iv)Membership: If α∈DiffA (φ,ψ) then Var (α)罙, (a) φ 郸 and (b) ψα. Checking the second condition (a) is in co-NP, while checking the last condition (b) is in NP. Thus, the problem is in BH2.

Hardness: Recall that it is BH2-complete to decide if φ is satisfiable while ψ is unsatisfiable for two given formulas φ and ψ. Furthermore, for a formulaand a fresh variable x黇ar(),祒 if and only ifis unsatisfiable.

Thus, 〈ψ,φ〉is BH2-hard

iff ψ is satisfiable and φ is unsatisfiable

iff ψx and φ 祒 where x is a fresh variable

iff x∈Diff{x}(φ∧x,ψ)

Thus, this problem is BH2-hard.

The next theorem shows that the complexities of vacuity, relevance, necessity and membership for Horn theories is one level below that of general case, respectively.

Theorem 3

(i)vacuity is co-NP-complete for Horn theories.

(ii)relevance is NP-complete for Horn theories.

(iii)necessity is co-NP-complete for Horn theories.

(iv)membership is in P for Horn theories.

Proof:

Let φ,ψ be two arbitrary Horn theories and A.

(i)Membership: if DiffA(φ,ψ)= then there exists a clause α with Var (α)罙 such that α∈DiffA(φ,ψ), i.e., (a) φ 郸 and (b) ψα. Note that φ∧α is also a Horn formula, whose satisfiability problem is in P, φ 郸 iff ψα is unsatisfiable, and φα iff φ∧α is satisfiable. Thus, one can efficiently guess the clause α by an NP-oracle and checking the conditions (a) and (b) in polynomial time in the size of the problem. Thus, the problem is in co-NP.

Hardness: It follows from (vi) of Proposition 2 and the facts that DiffA(φ,ψ)= iff ψForget(φ,A), which is co-NP-complete (cf. (i) of Theorem 11 in ［29］).

(ii)Membership: If there exists a clause α∈DiffA (φ,ψ)containing l then one can guess the clause α and check the conditions Var(α)罙,φ 郸 and ψα. The checking task is evidently tractable since both φ and ψ are Horn formulas. Thus, the problem is in NP.

Hardness: Let φ,ψ be two Horn formulas over X∪Y with X∩Y=,x a fresh atom, X′=X∪{x} and φ′ be the Horn formula obtained from φ by replace each Horn clause α of φ by α∨x. It is clear that φ′ is a Horn formula.

DiffX′ (φ′,ψ) contains a non-tautology (reduced) clause α∨x

iff Var(α)罼, φ′ 郸痢舩 and ψα∨x

iff Var(α)罼, Forget(φ′,Y) 郸痢舩 and Forget (ψ,Y)α∨x

iff Var(α)罼, Forget (φ,Y)∨xα∨x and Forget (ψ,Y)α since φ′≡φ∨x and Forget (肌纽肌,X)≡Forget (,X)∨Forget (肌,X)

iff Var(α)罼, Forget(φ,Y) 郸 and Forget(ψ,Y)α since Forget(φ,Y)∨x 郸痢舩 iff Forget (φ,Y) 郸

iff 靓:α∈DiffA (φ,ψ)

iff DiffX (φ,ψ)=, which is NP-hard by (i)

Thus, this problem is NP-hard.

(iii)Membership: It follows from the membership of (i).

Hardness: Let x be a fresh atom. Note that DiffA(φ∧x,) has a clause containing no x if and only if DiffA(φ,)≠, which is NP-complete by (i). Thus, this problem is co-NP-hard.

(iv)It follows from the facts that both φ∧α and ψ∧α are Horn formulas, whose satisfiability problem is tractable.

Recall that the satisfiability of 2-CNF theories is tractable and every prime implicate of a 2-CNF theory containing at most two literals. For a given signature A with A=k, there are at most C12k+C22k=2k2+k number of such clauses. It implies that the vacuity, relevance, necessity and membership problems are tractable in this situation.

Corollary 1

The vacuity, relevance, necessity and membership problems are all tractable for 2-CNF theories.

As the clauses in logical difference are exactly the same clauses in clausal difference, the next corollary follows.

Corollary 2

The vacuity, relevance, necessity and membership for logical difference has the same complexity to that of clausal difference respectively.

4 Experiments

For evaluation purpose, a prototype LDIFF was implemented to compute clausal, prime and minimal differences of CNF theories in C++. It employs miniSAT 2.2 as an internal SAT solver, and the combinatorial routine COMBO as a clause enumerator. Experiments were conducted on a workstation running Linux 3.19.5 with GCC 4.9.2, 32GB memory and 8 Intel(R) Core (TM)i7-4770K CPUs @ 3.50GHz. The experimental data are also accessible along with the LDIFF.

To experimentally investigate the logical difference between two random clause theories, we consider the well-known 3-CNF, 2-CNF and 3Horn (each Horn clause has 3 literals) theories. Experiments were conducted on clausal, prime and minimal differences over a random signature A with A∈{10,15}. and the number n of propositional variables of random clause theories ranges from 20, 40, 80 to 160. All the random 2/3-CNF theories are generated by an adapted C program ksat.c, which makes use of the package SPRNG ［30］.

4.1 3-CNFs

The problem 3-SAT is one of the most well-studied problems in computer science due to its theoretical and practical interest. Random 3-CNFs have a well- known satisfiability phase transition. Its rigorous lower bound and upper bound for the ratio of clauses-to-variables is 3.42 and 4.506 respectively［31-34］. Kirkpatrick and Selman［35］ discover a critical ratio 4.17(±0.5) for the easy-hard-easy satisfiability distribution. The notions of clausal and prime differences allow us to explore the phase transition of random 3-SAT from the perspective of logical difference.

The number m of clauses of tested theories is chosen according to the ratio m/n from 0.2 to 8 with interval 0.2. In each case, 100 trails were conducted and experimental results in average is reported. The memory cost of these experiments is about 7.0 MiB and the worst running time is about 1 300 s where n=160 and m/n=3.8, at which the most time-consuming satisfiability checking occurs. The number of clausal difference and prime difference of n∈{40,160} with clause length in ［0,15］ are reported in Fig.1 and Fig.2 respectively. The other cases can be found in along with LDIFF.

It can be seen that, (a) the number of clausal difference is about 100 times larger than the number of prime difference; (b) both the numbers of clausal difference and prime difference approach peaks sharply when n increasing; (c) the peak of the number of clausal difference appears nearby m/n=3.8 with clause length 9 for n=160, and m/n=3.2 with clause length 5 for n=40, while the peak of the number of prime difference appears nearby m/n=3.6 with clause length 8 for n=160, and m/n=3.2 with clause length 4 for n=40. This phenomenon looks like a phase transition for clausal and prime difference with individual clausal length.

To systematically explore the behavior of the overall numbers of clauses in logical difference, each number x of clauses is normalized by min-max to x* , i.e.,

x*=(x－min(X))/(max(X)－min(X))

where X is the set of numbers of clauses in difference. The normalized numbers of clauses in clausal difference and prime difference for all of the tested cases were reported in Fig.3. As can be seen, with n increasing, the normalized number of clauses in clausal difference and prime difference approaches and then deviates their peaks sharply. Secondly, the peaks of normalized number of clauses in clausal difference and prime difference shifts from a smaller n to a larger one. It also exhibits a phase transition like phenomenon for the overall clausal and prime differences.

It seems that there is a critical point, where these curves seemly intersect at which peaks wont pass over. It is around 4.3 for clausal difference, and around 4.2 for prime difference.

This is similar to Fig.7 about the peak-normalized mean numbers of prime implicates of length 2 in［36］. The results also partially indicate a potential reason for the phase transition and easy-hard-easy distribution of random 3-SAT when n gets larger because it is more difficult to distinguish two random 3-CNF theories once there are larger number of clauses in clausal/prime difference.

Experiments were also conducted on clausal, prime and minimal differences between random 3-CNF theories over a random relevant signature A with A=10. The above similar phenomena with A=15 have been observed from the experimental results, whose statistical figures are accessible at the site of LDIFF.

4.2 2-CNFs

The 2-CNF (or Krom) theories is another well-studied CNF theories. It is in linear time to check whether a 2-CNF theory is satisfiable ［28,37］, and it is tractable to compute a forgetting result for 2-CNF theories ［38］. It was also showed that there is a phrase transition for the satisfiability of random 2-CNF theories with threshold 1 which separates its satisfiability and unsatisfiability ［39］. Accordingly, the number m of clauses of random 2-CNF theories is chosen according to the ratio m/n which ranges from 0.1 to 3 with interval 0.1, where n∈{20，40,80，160}.

Following the above experiments on computing logical difference for random 3-CNF theories, we tested the same cases for random 2-CNF theories. The worst time cost is about 50 seconds when n=160 and m/n ranging from 0.1 to 0.7 for computing clausal difference and minimal difference. When computing the prime differences, the worst time cost is 0.03 s. The overall memory cost is about 7.0 MiB for all the cases.

The min-max normalized numbers of clauses in clausal and prime differences are shown in Fig.4. It is remarkably to note that the normalized numbers of clauses in prime difference faithfully respect the threshold which separates satisfiability and unsatis-fiability of random 2-CNF theories since their peaks are close to 1 very well. The experimental results for the case of A=10 also exhibits the similar phe-nomenon above. In addition, note that the length of clauses in prime/minimal difference is no larger than 2, the number of clauses in clausal difference is about one thousand times larger than that of prime/minimal difference, while it is about 100 times in the case of random 3-CNF theories.

4.3 Horn theories

Note that there are 2n(n+2)－1 Horn clauses over with =n and it is commonly known that Horn satisfiability generally lacks a sharp phase transition ［40］. While random Horn theories with variable-clause-length model and allowing repetition of clauses have a coarse threshold: for every constant c>0 and every function μ(n)=2n(c+o(1)),

limn→∞Prop肌师(n,μ(n))［ is satisfiable］

=1－F(e－c)(4)

where Ω(n,μ(n)) is the collection of Horn theories with n variables and μ(n) Horn clauses, and F(x)=(1－x)(1－x2)(1－x4)…(1－x2k)…. It means that the probability of satisfiability is a continuous function of the parameters of the model ［40］. The values of 1－F(e－c)with k=100 is shown in Tab.1.

For evaluation purpose, we computed clausal, prime and minimal differences between random Horn theories with n∈{10,15} variables and c·2n number of Horn clauses over A=10 where c ranges from 0.09 to 0.99 with interval 0.045 and from 1.19 to 5.00 with interval 0.2. Note that it generated about 300 MiB and 13 GiB of random Horn theories in all for n=10 and n=15 respectively. The worst time cost for computing clausal differences is 0.2 s when n=10 and 0.8 s when n=15. The memory cost is about 6.8 MiB for all cases. The normalized numbers of clauses in clausal difference and prime difference are shown in Fig.5.

Unlike the cases for random 3-CNF theories and 2-CNF theories, there is quite different behavior for random Horn theories. On the one hand, only the numbers of clauses in clausal difference seemly respects to the coarse satisfiability phase transition. On the other hand, the numbers of clauses in prime difference behaves quite different from that of clausal difference since it monotonically decreases. The variable-clause-length model without repeating clauses was also tested. It exhibits similar result to the above. We also tested the logical difference between random 3Horn theories with n∈{20，40,80，160} in which each clause has length 3 and the number of

clauses ranges from 0.1·n to 8·n where n is the number of variables. It also exhibits a quite different behavior between the clausal difference and prime difference.

5 Related Work

Eiter et al proposed a notion of difference which is taken as a logical operator ［16］. Formally, the difference of two formulas φ and ψ is defined as the formula φ∧ψ which is denoted by φψ, One can see that the following holds:

Mod(φψ)=Mod(φ)∩Mod (ψ)

=Mod(φ)Mod(ψ)

The difference of two Horn formulas φ and ψ is investigated particularly in ［16］. It shows that φψ is possibly not Horn expressible even if both φ and ψ are Horn formulas. However, as shown by Theorem 1, the clausal difference of φ and ψ is always Horn expressible for any A.

Recall that φ 礑iffLA(φ,ψ) and φ 礑iffCA(φ,ψ) for every formulas φ and ψ and A. Accordingly, the following corollary easily follows:

Corollary 3

Let φ，ψ be two formulas and A. It holds that φψ 礑iffLA(φ,ψ) and φψ 礑iffCA(φ,ψ).

The next proposition indicates that, under some nontrivial conditions, the logical/clausal difference is identical to Eiter et als difference.

Proposition 8

Let φ,ψ be two formulas and A=Var(φ) where ψ is satisfiable and φ 郸. The following statements hold:

(i)DiffLA(φ,ψ)≡φψ

(ii)If ψα for every α∈PI(φ) then Diff CA(φ,ψ)≡φψ

Proof:

(i)φ 郸

荭郸

荭爪 due to the satisfiability of ψ

荭铡蔇iffLA(φ,ψ) by A=Var(φ) and

Definition 1

荭铡訢iffLA(φ,ψ)

荭铡摩住訢iffLA(φ,ψ) since φ 郸

荭誠ψ≡DiffLA(φ,ψ).

(ii)It follows from (i) and Definition 2.

As demonstrated by Example 2, it is possible that φψDiffLA(φ,ψ) if the condition of (ii) in Proposition 8 does not hold.

6 Conclusion and Future Work

In this paper three logically equivalent differences for propositional logic were proposed-logical difference, clausal difference and prime difference. Their properties and computational complexities were explored.

In particular, we showed that (1) these notions of difference are closely connected with the well-known notion of forgetting in propositional logic, (2) the clausal difference of 2-CNF and Horn theories are categorical, i.e., their clausal difference is still 2-CNF (resp. Horn) expressible when the first being compared theories are 2-CNF (resp. Horn) theories, (3) the problems of vacuity, relevance, necessity and membership are intractable for CNF and Horn theories, while they are tractable for 2-CNF theories.

Comprehensive experimental results revealed some interesting phase transition phenomena for clausal difference and prime difference of random 3-CNF theories and random 2-CNF theories. The number of clausal and prime difference of two theories depends not only on the size of relevant signature, i.e., the larger size of relevant signature the greater number of clausal and prime difference. It also depends on the ratio of the number of clauses to the size of underlying signature.

These notions of difference are defined for propositional logic. It is a challenge to extend these notions to first-order logic and nonmonotonic logics, such as default logic, circumscription and autoepistemic logics. It deserves our endeavor to deeply explore the phase transition of clausal/prime difference of random 3-CNF theories and 2CNF theories. It is also worthwhile to investigate the relationship between the phase transition of 3-SAT and that of clausal/prime difference as well.

References：

[1]AZHAO Y Z, ALGHAMDI G, SCHMIDT R A, et al. Tracking logical difference in large-scale ontologies: a forgetting-based approach［J］. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33: 3116-3124.

［2］ GFRDENFORS P. Belief revision［M］. Cambridge: Cambridge University Press, 2003.

［3］ NOY N F, MUSEN M A. Promptdiff: a fixed-point algorithm for comparing ontology versions［C］// Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence. Edmonton: AAAI Press, 2002: 744-750.

［4］ KONEV B, WALTHER D, WOLTER F. The logical difference problem for description logic terminologies［C］// Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 2008: 259-274.

［5］ KONEV B, LUDWIG M, WALTHER D, et al. The logical difference for the lightweight description logic ε［J］. Journal of Artificial Intelligence Research, 2012, 44: 633-708.

［6］ SOLIMANDO A, JIMINEZ-RUIZ E, GUERRINI G. Minimizing conservativity violations in ontology alignments: algorithms and evaluation［J］. Knowledge and Information Systems, 2017, 51(3): 775-819.

［7］ WANG Z, WANG K W, TOPOR R, et al. Forgetting for knowledge bases in DL-Lite［J］. Annuals of Mathematics and Artificial Intelligence. 2010, 58(1/2): 117-151.

［8］ KONTCHAKOV R, WOLTER F, ZAKHARYASCHEV M. Logic-based ontology comparison and module extraction, with an application to DL-Lite［J］. Artificial Intelligence, 2010, 174(15): 1093-1141.

［9］ THOMAS E, GABRIELE K I. A brief survey on forgetting from a knowledge representation and reasoning perspective［J］. Künstliche Intelligenz, 2019, 33(1): 9-33.

［10］KONEV B, LUTZ C, WALTHER D, et al. Model-theoretic inseparability and modularity of description logic ontologies［J］. Artificial Intelligence, 2013, 203: 66-103.

［11］BOTOEVA E, KONTCHAKOV R, RYZHIKOV V, et al. When are description logic knowledge bases indistinguishable［C］// Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015. Buenos Aires: AAAI Press, 2015: 4240-4246.

［12］ROMERO A A, KAMINSKI M, GRAU B C, et al. Module extraction in expressive ontology languages via datalog reasoning［J］. Journal of Artificial Intelligence Research, 2016, 55: 499-564.

［13］ZHAO Y Z, SCHMIDT R A. Role forgetting for ALCOQH()-ontologies using an Ackermann-based approach［C］// Proceedings of the 26th International Joint Conference on Artificial Intelligence, IJCAI17. Melbourne: AAAI Press, 2017: 1354-1361.

［14］FENG S S, ZHANG Y G, OUYANG D D, et al. The logical difference for fuzzy ε+ ontologies［C］// Pro-ceedings of the 23rd International Workshop on Descrip-tion Logics (DL 2010). Waterloo: IEEE Computer Society, 2010: 457-463.

［15］SHUMAN S I. Justification of judicial decisions［J］. California Law Review, 1971, 59(3): 715-732.

［16］EITER T, IBARAKI T, MAKINO K. On the difference of horn theories［J］. Journal of Computer System Science, 2020, 61(3): 478-507.

［17］CLAESSEN K, EEN N, SHEERAN M, et al. SAT-solving in practice, with a tutorial example from supervisory control［J］. Discrete Event Dynamic Systems, 2009, 19(4): 495-524.

［18］WANG Y S, LIU H, ZHANG Y, et al. Logical difference of propositional theories［C］// The 13th International FLINS Conference on Data Science and Knowledge Engineering for Sensing Decision Support. Northern Ireland: World Scientific, 2018: 798-805.

［19］CHANDRA A K, MARKOWSKY G. On the number of prime implicants［J］. Discrete Mathematics, 1978, 24(1): 7-11.

［20］BOROS E, CRAMA Y, HAMMER P L. Polynomial-time inference of all valid implications for horn and related formulae［J］. Annals of Mathematics and Artificial Intelligence, 1990, 1: 21-32.

［21］DECHTER R, RISH I. Directional resolution: the Davis Putnam procedure, revisited［C］// Proceedings of the 4th International Conference on Principles of Knowledge Representation and Reasoning (KR94). Bonn: Morgan Kaufmann, 1994: 134-145.

［22］LIN F Z, REITER R. Forget it［C］// In Proceedings of the AAAI Fall Symposium on Relevance. New Orleans: AAAI Press, 2019: 154-159.

［23］LANG J, LIBERATORE P, MARQUIS P. Propositional independence: formula-variable independence and forgetting［J］. Journal of Artificial Intelligence Research, 2003, 18: 391-443.

［24］MARQUIS P. Handbook of defeasible reasoning and uncertainty management systems: algorithms for defeasible and uncertain reasoning［M］. Amsterdam: Kluwer Academic Publishers, 1999.

［25］LIBERATORE P. Redundancy in logic I: CNF pro-positional formulae［J］. Artificial Intelligence, 2005, 163(2): 203-232.

［26］DOWLING W F, GALLIER J H. Linear-time algorithms for testing the satisfiability of propositional horn formulae［J］. Journal of Logic Programming, 1984, 1(3): 267-284.

［27］EVEN S, ITAI A, SHAMIR A. On the complexity of timetable and multicommodity flow problems［J］. SIAM Journal on Computing, 1976, 5(4): 691-703.

［28］ASPVALL B, PLASS M F, TARJAN R E. A linear-time algorithm for testing the truth of certain quantified Boolean formulas［J］. Information Processing Letters, 1979, 8(3): 121-123.

［29］WANG Y S. On forgetting in tractable propositional fragments ［DB/OL］. (2015-02-10)［2023-11-14］. https://arxiv.org/abs/1502.02799.

［30］MASCAGNI M, SRINIVASAN A. Algorithm 806: SPRNG: a scalable library for pseudorandom number generation［J］. ACM Transactions on Mathematical Software, 2000, 26: 436-461.

［31］MITCHELL D, SELMAN B, LEVESQUE H. Hard and easy distributions of SAT problems［C］// Proceedings of the 10th National Conference on Artificial Intelligence. San Jose: AAAI Press, 1992: 459-465.

［32］DUBOIS O, BOUFKHAD Y, MANDLER J. Typical random 3-SAT formulae and the satisfiability threshold［C］// Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms. San Francisco: ACM/SIAM, 2000: 126-127.

［33］MIZARD M, PARISI G, ZECCHINA R. Analytic and algorithmic solution of random satisfiability problems［J］. Science, 2002, 297: 812-815.

［34］BIROLI G, COCCO S, MONASSON R. Phase transitions and complexity in computer science: an overview of the statistical physics approach to the random satisfiability problem［J］. Physica A: Statistical Mechanics and its Applications, 2002, 306: 381-394.

［35］KIRKPATRICK S, SELMAN B. Critical behavior in the satisfiability of random Boolean expressions［J］. Science, 1994, 264: 1297-1301.

［36］SCHRAG R, CRAWFORD J M. Implicates and prime implicates in random 3-SAT［J］. Artificial Intelligence, 1996, 81(1/2): 199-222.

［37］KROM M R. The decision problem for formulas in prenex conjunctive normal form with binary disjunctions［J］. The Journal of Symbolic Logic, 1970, 35(2): 210-216.

［38］GOERDT A. A threshold for unsatisfiability［J］. Journal of Computer System Science, 1996, 53(3): 469-486.

［39］BOLLOBBS B, BORGS C, CHAYES J T, et al. The scaling window of the 2-SAT transition［J］. Random Struct & Algorithms, 2001, 18(3): 201-256.

［40］ISTRATE G. The phase transition in random horn satisfiability and its algorithmic implications［J］. Random Struct & Algorithms, 2002, 20(4): 483-506.

（責任编辑：周晓南）

摘要：

逻辑差概念在表征基于逻辑的知识库中起到重要作用，这些知识库持续受到动态变化的影响，它们之间存在实质性差异。这一概念与遗忘密切相关，它在各种逻辑中得到了广泛探讨。针对命题理论的相关符号，提出了3种差概念——逻辑差异、子句差异和素子句差异，以分别捕获逻辑推理、子句推理和素子句推理的差异；研究了它们的性质和计算复杂性。结果表明，涉及逻辑差的各种决策问题在多项式层次结构中比相应的可满足性问题高一个层次，除了2-CNF理论，其相关决策问题是易处理的。随机3-CNF、2-CNF和Horn理论的大量实验结果揭示了子句差和素子句差的一些有趣现象：在随机3-CNF理论和2-CNF理论中，子句和素子句差的子句数量都表现出与它们的可满足性类似的相变特征。然而，在随机Horn理论中，尽管子句差的子句数量表现出与其可满足性类似的相变，但素子句差的子句数量与子句差情形十分不同，这些结果揭示了随机命题知识库演化中其可满足性相变现象的新特征：在相变阈值附近的知识库演变会产生更多的差异。

关键词：

逻辑差；子句差；素子句差；计算复杂性；相变；知识管理

王以松，贵州大学计算机科学与技术学院教授,硕士、博士研究生导师，贵州省优秀青年科技人才，香港科技大学、加拿大Alberta大学博士后；北京大学、波茨坦大学、德州理工大学、西悉尼大学、格里菲斯大学、莱斯特大学访问学者。Annals of Mathematics and Artificial Intelligence 刊物Associate Editor，多次担任AAAI、IJCAI等会议程序委员；曾获KR-2006的Ray Reiter最佳论文，指导博士研究生黄羿获ILP-2016最佳学生论文。