Fix issue #5.
Berghel's & Roach's implementation assumes that the source string
is shorter than or equal to the target in length. This patch will
enforce this, both by:

    - asserting this condition in `berghel_roach()`, and
    - changing `read_corpus` to always sort the two strings
      by length before returning them
sp1ff committed Jul 2, 2024
1 parent efa50bb commit 3b480ab
Showing 3 changed files with 24 additions and 3 deletions.
3 changes: 3 additions & 0 deletions src/br.cc
@@ -83,6 +83,9 @@ berghel_roach(const std::string &A,

size_t m = A.length();
size_t n = B.length();
// If this assertion fires, the caller has violated our precondition that
// `A` be less than or equal to `B` in length.
assert(m <= n);
// The minimal p will be at the end of diagonal k
ptrdiff_t k = n - m;
ptrdiff_t p = k;
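
One plausible reason the new assertion matters is visible in the line that follows it: `n - m` is computed on unsigned `size_t` values, so it wraps around when `m > n`. The sketch below is an illustration only; `starting_diagonal` is a hypothetical helper that is not part of this commit, and it reproduces just the `k = n - m` step to show what a caller who violates the precondition would get.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of the commit: computes the starting
// diagonal the same way the hunk above does, k = n - m.
std::ptrdiff_t starting_diagonal(const std::string &A, const std::string &B) {
  std::size_t m = A.length();
  std::size_t n = B.length();
  // n - m is evaluated in unsigned arithmetic. When m > n it wraps around,
  // and converting the result to ptrdiff_t gives a negative diagonal
  // (guaranteed since C++20, and the behaviour of mainstream compilers
  // before that).
  std::ptrdiff_t k = n - m;
  return k;
}
```

With `A` shorter than `B` the diagonal is the non-negative length difference; with the arguments reversed it becomes negative, which is the situation `assert(m <= n)` is there to catch in debug builds.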
2 changes: 2 additions & 0 deletions src/br.hh
@@ -34,6 +34,8 @@
* \param B [in] the second of the two strings whose Damerau-Levenshtein
* distance is to be computed
*
* \pre \a A shall be less than or equal to \a B in length
*
* \param D [in] the known Damerau-Levenshtein distance between \a A & \a B
*
* \param verb [in] if true, produce verbose progress messages on \c stdout
22 changes: 19 additions & 3 deletions src/dl.cc
@@ -51,6 +51,9 @@ typedef std::tuple<std::string, std::string, size_t> test_case;
* \param pout [in,out] A forward output iterator to which each test case read
* from \a pth shall be copied
*
* \post In each triple, the first string shall be less than or equal to the
* second in length
*
*
* In order to focus on the underlying algorithms for computing
* Damerau-Levenshtein distance, I've kept the format of test cases, and the
@@ -74,6 +77,13 @@ typedef std::tuple<std::string, std::string, size_t> test_case;
\endcode
*
*
* N.B. this function will swap the two strings, if necessary, so that the
* second is at least as long as the first. This ensures that the precondition
* imposed by berghel_roach is met "up-front", so we don't have to clutter
* the implementation. I may come back & just update the implementation,
* but for now I just want to address issues #3 & #5.
*
*
*/

template <typename FOI> // Forward Output Iterator
@@ -99,9 +109,15 @@ read_corpus(const std::filesystem::path &pth, FOI pout)
throw std::runtime_error(stm.str());
}

- *pout++ = make_tuple(line.substr(0, idx0),
- line.substr(idx0 + 1, idx1 - idx0 - 1),
- (size_t)stoi(line.substr(idx1 + 1)));
+ if (idx0 < idx1 - idx0 - 1) {
+ *pout++ = make_tuple(line.substr(0, idx0),
+ line.substr(idx0 + 1, idx1 - idx0 - 1),
+ (size_t)stoi(line.substr(idx1 + 1)));
+ } else {
+ *pout++ = make_tuple(line.substr(idx0 + 1, idx1 - idx0 - 1),
+ line.substr(0, idx0),
+ (size_t)stoi(line.substr(idx1 + 1)));
+ }
}

}
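
The length-ordering step in `read_corpus` could also be written as a small standalone helper, which avoids repeating the `make_tuple` call in both branches. This is a sketch under assumptions, not the author's code: `make_ordered_case` is a hypothetical name, and the `test_case` typedef is repeated here only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>

typedef std::tuple<std::string, std::string, std::size_t> test_case;

// Hypothetical helper, not part of the commit: builds a test case whose
// first string is never longer than its second, swapping if necessary.
test_case make_ordered_case(std::string a, std::string b, std::size_t d) {
  if (a.length() > b.length()) {
    std::swap(a, b);
  }
  // Mirrors the \post documented for read_corpus.
  assert(a.length() <= b.length());
  return std::make_tuple(std::move(a), std::move(b), d);
}
```

For example, `make_ordered_case("kitten", "cat", 5)` would yield the triple `("cat", "kitten", 5)`. Unlike the patch, whose `else` branch also swaps two strings of equal length (harmlessly), this sketch swaps only when the first string is strictly longer.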
