Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATP finds a non-reconstructible solution for a false statement (minor) #113

Closed
sliedes opened this issue Oct 25, 2021 · 3 comments
Closed
Assignees
Labels
invalid This doesn't seem right

Comments

@sliedes
Copy link

sliedes commented Oct 25, 2021

Hi,

I assume this is something that should not happen: The hammer tactic reports that CVC4 found a solution for a goal which is provably false (and where nothing in the context allows deriving False). Then, hammer fails to reconstruct the proof, as it obviously should.

I'm sorry that this example is a bit long; I removed the parts of my proof that it didn't need, and this is what is left. I included the proofs of the lemmas to demonstrate that there is nothing inconsistent there. I added comments and Compute statements at the end demonstrating that the goal after the "exists" statement should be unprovable.

(Furthermore, and I assume this is a different issue, if I execute everything in CoqIde straight to the end, including the "hammer." statement, coqproofworker.opt segfaults (no OOM). So what I do in coqtop to demonstrate this is to execute up to the point just before the "hammer." statement, wait until everything before that has been evaluated, and then proceed to the hammer statement.)

I'm using coq 8.13.2 with coq-hammer 1.3.2+8.13 from opam.

Require Import Arith.
Require Import Psatz.
Require Import FunInd.
Require Import Recdef.
Require Import ssreflect ssrfun ssrbool.
Require Import Unicode.Utf8.
Require Import Coq.Program.Basics.

From Hammer Require Import Tactics Hammer Reflect.

Set Hammer GSMode 5.
Set Hammer ATPLimit 20.
Set Hammer ReconstrLimit 5.

Set Implicit Arguments.
Unset Strict Implicit.
Unset Printing Implicit Defensive.

Definition log3_iter_internal_round tup :=
    match tup with
    | (log_up_counter, up_counter, dist_next) =>
            match dist_next with
            | O => ((S log_up_counter), (S up_counter), (2 * up_counter + 1))
            | S dist_next' => (log_up_counter, (S up_counter), dist_next')
            end
    end.

Definition log3_iter (down_counter log_up_counter up_counter dist_next : nat) :=
    match iter (nat*nat*nat) down_counter log3_iter_internal_round (log_up_counter, up_counter, dist_next) with
    | (val, _, _) => val
    end.

Fixpoint log3_iter_alternative down_counter log_up_counter up_counter dist_next :=
    match down_counter with
    | O => log_up_counter
    | S down_counter' => match dist_next with
                            | O => log3_iter_alternative down_counter' (S log_up_counter) (S up_counter) (2 * up_counter + 1)
                            | S dist_next' => log3_iter_alternative down_counter' log_up_counter (S up_counter) dist_next'
                            end
    end.

Definition log3_nat (n : nat) : nat := log3_iter (Nat.pred n) 0 1 1.

Lemma iter_round : ∀ (A : Type) (n : nat) (f : A -> A) (x0 : A), iter A (S n) f x0 = f (iter A n f x0).
Proof.
    auto.
Qed.

Lemma iter_plus : ∀ {A : Set} (n1 n2 n3 : nat) (x0 : A) (f : A -> A),
    n1 = n2 + n3 -> iter A n1 f x0 = iter A n2 f (iter A n3 f x0).
Proof.
    induction n1. sauto lq: on. sauto q: on.
Qed.

Lemma log3_iter_alternative_equal : ∀ down_counter log_up_counter up_counter dist_next,
    log3_iter_alternative down_counter log_up_counter up_counter dist_next =
    log3_iter down_counter log_up_counter up_counter dist_next.
Proof.
    intro.
    set (fst3 (tup : (nat*nat*nat)) := let '(p, _, _) := tup in p).
    assert (∀ (tup: (nat*nat*nat)), (let '(p, _, _) := tup in p) = fst3 tup) as Fst3. auto.

    set (fiter dc logup up dn := fst3 (iter (nat * nat * nat) dc log3_iter_internal_round (logup, up, dn))).
    unfold log3_iter.

    induction down_counter. sfirstorder.
    intros.

    assert (∀ down_counter log_up_counter up_counter dist_next,
            fst3 (iter (nat * nat * nat) down_counter log3_iter_internal_round (log_up_counter, up_counter, dist_next)) =
            fiter down_counter log_up_counter up_counter dist_next) as Fiter. auto.
    rewrite -> Fst3, -> Fiter.
    simpl. destruct dist_next.
    -   (* dist_next = 0 *)
        unfold log3_iter_alternative.
        rewrite -> IHdown_counter. repeat rewrite -> Fst3. unfold fiter. f_equal.
        rewrite (iter_plus (A:=nat*nat*nat) (n2:=down_counter) (n3:=1) (log_up_counter, up_counter, 0)). lia. auto.
    -   (* dist_next > 0 *)
        unfold log3_iter_alternative.
        rewrite -> IHdown_counter. repeat rewrite -> Fst3. unfold fiter. f_equal.
        rewrite (iter_plus (A:=nat*nat*nat) (n2:=down_counter) (n3:=1) (log_up_counter, up_counter, S dist_next)). lia. auto.
Qed.

Lemma log3_iter_alternative_spec : ∀ down_counter log_up_counter up_counter dist_next,
    3^(S log_up_counter) = dist_next + up_counter + 1 →
    dist_next < 2*3^log_up_counter →
    let s := log3_iter_alternative down_counter log_up_counter up_counter dist_next in
    3^s <= down_counter + up_counter < 3^(S s).
Proof.
    assert (∀ z, (3^z + (3^z + 3^z)) = 3*3^z); try lia.
    induction down_counter.
    -   do 3 intro. intros EQ LT. simpl.
        rewrite <- plus_n_O, -> H, <- Nat.pow_succ_r'.
        split; try lia.
        assert (up_counter = 3^(S log_up_counter) - dist_next - 1); try lia.
        rewrite -> H0, -> Nat.pow_succ_r'. lia.
    -   do 3 intro. intros EQ LT.
        destruct dist_next.
        +   rewrite -> plus_Sn_m, -> plus_n_Sm.
            apply IHdown_counter; try lia. simpl.
            repeat rewrite <- plus_n_O.
            rewrite -> H, <- Nat.pow_succ_r', -> EQ.
            lia.
        +   rewrite -> plus_Sn_m, -> plus_n_Sm.
            apply IHdown_counter.
            all: lia.
Qed.

Lemma log3_iter_spec : ∀ down_counter log_up_counter up_counter dist_next,
    3^(S log_up_counter) = dist_next + up_counter + 1 →
    dist_next < 2*3^log_up_counter →
    let s := log3_iter down_counter log_up_counter up_counter dist_next in
    3^s <= down_counter + up_counter < 3^(S s).
Proof.
    intros. subst s.
    rewrite <- log3_iter_alternative_equal. apply log3_iter_alternative_spec; auto.
Qed.

Lemma log3_nat_spec : ∀ n, 0<n → 3^(log3_nat n) <= n < 3^S (log3_nat n).
Proof.
    intros.
    set (s := log3_nat n).
    replace n with (Nat.pred n + 1); try lia.
    apply log3_iter_spec; auto.
Qed.

Lemma log3_nat_mono_weak : ∀ n, log3_nat n <= log3_nat (S n).
Proof.
    intros. unfold log3_nat, log3_iter. destruct n. auto. simpl Nat.pred.
    rewrite (iter_plus (A:=(nat*nat*nat)) (n1:=S n) (n2:=1) (n3:=n) (0, 1, 1)). lia.
    sauto q: on.
Qed.

Lemma log3_nat_mono : ∀ n m, n <= m -> log3_nat n <= log3_nat m.
Proof.
    intros.
    replace m with ((m-n)+n); try lia.
    induction (m-n). auto.
    hauto use: Nat.lt_succ_r, Nat.add_succ_l, Nat.le_lt_trans, Nat.add_comm, log3_nat_mono_weak.
Qed.

Lemma log3_nat_0 : ∀ n, n < 3 <-> log3_nat n = 0.
Proof.
    intros.
    split.
    -   sauto lq: on.
    -   intros Log0. do 3 (destruct n; auto).
        assert (2 < S (S (S n))). lia. apply log3_nat_mono in H.
        replace (log3_nat 3) with 1 in H; compute; lia.
Qed.

Lemma log3_nat_mul3 : ∀ a, 2 < a -> log3_nat (3*a) = S (log3_nat a).
Proof.
    intros a Gt2. do 3 (destruct a; try lia). clear Gt2.
    set (n := S (S (S a))).
    assert (0 < n) as Gt0; try lia.
    pose proof (log3_nat_spec Gt0) as [Spec1_1 Spec1_2].
    clear Gt0. assert (0 < 3*n) as Gt0. lia.
    pose proof (log3_nat_spec Gt0) as [Spec2_1 Spec2_2].
    clear Gt0.

    pose proof (log3_nat_0 (3*n)) as log3_3_pos.

    replace (log3_nat (3*n)) with (S (Nat.pred (log3_nat (3*n)))) in Spec2_1; try lia.

    rewrite -> Nat.pow_succ_r', <- Nat.mul_le_mono_pos_l in Spec2_1; auto.
    rewrite -> Nat.pow_succ_r', <- Nat.mul_lt_mono_pos_l in Spec2_2; auto.

    assert (3^log3_nat n < 3^log3_nat (3*n)) as Lt1. lia.
    assert (3^Nat.pred (log3_nat (3*n)) < 3^S (log3_nat n)) as Lt2. lia.

    rewrite <- Nat.pow_lt_mono_r_iff in Lt1, Lt2; auto; lia.
Qed.

Lemma log3_nat_pow3 : ∀ a, log3_nat (3^a) = a.
Proof.
    intros. induction a. compute. auto.
    destruct a. auto.
    rewrite -> Nat.pow_succ_r', log3_nat_mul3; auto.
    hauto use: Nat.le_gt_cases, log3_nat_0 unfold: lt.
Qed.

Lemma log3_nat_succ_same_within_pow3 : ∀ a, 0 < a ->
    log3_nat (S a) = log3_nat a <-> ∀ p, 3^p ≠ (S a).
Proof.
    intros a Gt0. destruct a. lia.
    pose proof (log3_nat_spec Gt0) as [Spec1_1 Spec1_2]. clear Gt0.
    assert (0 < S (S a)) as Gt0; try lia. pose proof (log3_nat_spec Gt0) as [Spec2_1 Spec2_2]. clear Gt0.
    split.
    -   intros Eq p.
        hcrush use: Nat.lt_neq, Eq, log3_nat_pow3, Nat.succ_lt_mono unfold: lt.
    -   intros Neq.
        unfold log3_nat, log3_iter. simpl Nat.pred. rewrite iter_round.
        case_eq (iter (nat * nat * nat) a log3_iter_internal_round (0, 1, 1)). intros [log_up_counter up_counter] dist_next Iter.
        case_eq dist_next; auto.
        intro Dist_next. rewrite Dist_next in Iter.
        unfold log3_nat, log3_iter in Spec1_1, Spec1_2, Spec2_1, Spec2_2.
        simpl Nat.pred in Spec1_1, Spec1_2, Spec2_1, Spec2_2.
        rewrite Iter in Spec1_1, Spec1_2.
        rewrite -> iter_round, -> Iter in Spec2_1, Spec2_2.
        simpl in Spec2_1, Spec2_2, Spec1_2, Spec2_2.
        replace (3 ^ log_up_counter + (3 ^ log_up_counter + (3 ^ log_up_counter + 0))) with (3*3^log_up_counter) in
            Spec2_1, Spec2_2, Spec1_2; try lia.
        replace (3 * 3 ^ log_up_counter + (3 * 3 ^ log_up_counter + (3 * 3 ^ log_up_counter + 0)))
                with ((3*(3*3^(log_up_counter)))) in Spec2_2; try lia.
        rewrite <- Nat.pow_succ_r' in Spec1_2, Spec2_1, Spec2_2. rewrite <- Nat.pow_succ_r' in Spec2_2.
                assert (S (S a) = 3^(S log_up_counter)). lia. hauto lq: on.
Qed.

Lemma log3_nat_eq_succ_is_pow3 : ∀ a, log3_nat (S a) = S (log3_nat a) -> ∃ b, S a = 3^b.
Proof.
    intros a Eq.

    (* The goal is, I believe, true until this exists instantiates b with something that makes it false: *)
    exists (S (log3_nat a)).

    (* Now, the goal of S a = 3^S (log3_nat a) is false: For example, with a=0,
            S 0 = 3^S (log3_nat 0) <-> 1 = 3^1 <-> 1 = 3. *)
    Compute (S 0 = 3^S (log3_nat 0)).

    (* The only assumption in the context is Eq : log3_nat (S a) = S (log3_nat a), which is not contradictory; eg.
       log3_nat (S 2) = S (log3_nat 2). *)
    Compute (log3_nat (S 2) = S (log3_nat 2)).

    (* Furthermore, there are no axioms in this file, and all lemmas have been duly proved, so there should be nothing in the
       context to derive False.

       Nevertheless, hammer purports to find a solution using CVC4, which it then fails to reconstruct. *)

    hammer.

Here's what hammer outputs:

Extracting features...
Running provers (5 threads)...
CVC4 (knn-256) succeeded
Minimizing dependencies...
- dependencies: Arith.PeanoNat.Nat.lt_succ_diag_r, Arith.PeanoNat.Nat.pow_0_r, Eq, Arith.PeanoNat.Nat.neq_0_lt_0,
                Bug.log3_nat_pow3, Bug.log3_nat_succ_same_within_pow3, Arith.PeanoNat.Nat.lt_neq
- inversions: Init.Datatypes.nat
Reconstructing the proof...
Trying reconstruction batch 1...
Trying reconstruction batch 2...
Trying reconstruction batch 3...
Trying reconstruction batch 4...
Trying reconstruction batch 5...
Trying reconstruction batch 6...
Trying reconstruction batch 7...
Hammer failed: proof reconstruction failed.
@lukaszcz
Copy link
Owner

I assume this is something that should not happen: The hammer tactic reports that CVC4 found a solution for a goal which is provably false (and where nothing in the context allows deriving False). Then, hammer fails to reconstruct the proof, as it obviously should.

This can sometimes happen because the translation is neither sound nor complete for efficiency/practicability reasons. It is the reconstruction backend which ultimately ensures that nothing false can be proven. However, this should be very rare. I'll look into the example to see where it goes wrong exactly and if it's worth the effort to fix it.

(Furthermore, and I assume this is a different issue, if I execute everything in CoqIde straight to the end, including the "hammer." statement, coqproofworker.opt segfaults (no OOM). So what I do in coqtop to demonstrate this is to execute up to the point just before the "hammer." statement, wait until everything before that has been evaluated, and then proceed to the hammer statement.)

Yes, CoqIDE sometimes has problems executing code with "hammer" in it in one go (seems related to issue #86). However, this is a minor issue, because "hammer" is supposed to be used interactively -- you should never leave it in your scripts in normal usage.

@lukaszcz lukaszcz self-assigned this Oct 25, 2021
@lukaszcz lukaszcz changed the title hammer finds a "solution" for a provably false goal (then fails to reconstruct) ATP finds a non-reconstructible solution for a false statement (minor) Oct 25, 2021
@lukaszcz
Copy link
Owner

Well, actually, you're wrong - your example is (classically) provable.

(* Now, the goal of S a = 3^S (log3_nat a) is false: For example, with a=0,
        S 0 = 3^S (log3_nat 0) <-> 1 = 3^1 <-> 1 = 3. *)
Compute (S 0 = 3^S (log3_nat 0)).

Wrong counterexample, because with a = 0 your assumption Eq doesn't hold.

(* The only assumption in the context is Eq : log3_nat (S a) = S (log3_nat a), which is not contradictory; eg.
   log3_nat (S 2) = S (log3_nat 2). *)

But log3_nat (S 0) = 0 <> S (log3_nat 0) = 1

Proof for b = log3_nat a:

Since log3_nat (S a) = S (log3_nat a), we have log3_nat (S a) <> (log3_nat a). In particular, a > 0. Thus, classically, there is p with 3^p = S a, by log3_nat_succ_same_within_pow3. Then p = log3_nat (S a) by log3_nat_pow3. Thus p = S (log3_nat a) by the assumption Eq. Hence, S a = 3 ^ S (log3_nat a).

Please think through your examples more thoroughly before opening an issue.

@lukaszcz lukaszcz added the invalid This doesn't seem right label Oct 25, 2021
@sliedes
Copy link
Author

sliedes commented Oct 25, 2021

Oh, indeed. I'm sorry!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid This doesn't seem right
Projects
None yet
Development

No branches or pull requests

2 participants