From 8c84d94fa7b2e7e6ad6ee69359169f94822a572a Mon Sep 17 00:00:00 2001
From: Gwyneth Llewelyn
Date: Mon, 15 Apr 2024 14:56:10 +0100
Subject: [PATCH] feat!: Support for XML/HTML tags on the API calls and a myriad of other things (#20)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* :memo: Add badge to README.md
* :pencil: change usage text
* :muscle: Delete `stdin` flag
* :memo: Update README
* ⤴ Update version of library
* 🥷 divide package
* :bug: fix test and ci
* :muscle: Add test
* 🐛 Fix default config directory permission to 0755

To avoid "permission denied"

* Add renovate.json (#11)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* chore: use conventional commit (#15)
* Update module github.com/mattn/go-isatty to v0.0.19 (#12)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* fix(deps): update module github.com/urfave/cli/v2 to v2.25.7 (#13)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* chore(deps): update actions/checkout action to v4 (#16)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* chore(deps): update actions/setup-go action to v4 (#17)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* chore(deps): update goreleaser/goreleaser-action action to v5 (#18)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* build: update golang version
* ci: update ci
* ci: fix go-version
* fix(deps): update module github.com/mattn/go-isatty to v0.0.20 (#19)

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* chore: update .goreleaser
* build: update go.sum
* Chore: small fixes

Mostly English spelling and getting rid of deprecated Go functions.

* chore: refactor code for returning error message

The idea now is to rely more on `net/http` and less on our own internal table.
* chore: major code refactoring

This essentially separates the actual API call from the translator bits, so we can now work on the extra nifty features we need.

* feat: add simple function to return usage
* feat: adding usage call
* chore: refactor code to avoid object ambiguities

New code requires passing structs representing JSON objects, instead of relying on loose interface conversions which may fail. Stricter is better!

* chore: add help for languages
* fix: add = to flag `type` usage line
* feat: adding autocomplete files
* doc: mentioned the autocompletion feature
* fix: correctly display versions and build dates

Note: I don’t know where the “builtBy” parameter comes from; currently, it needs to be force-pushed at build time with a -X tag to the linker.

* chore: refactor more code, add option for glossary

This was mostly meant as an experiment which can later be copied & pasted for other very similar options. apiCall() gained a new parameter, the method (because some things in the API stupidly use GET and not POST)

* docs: better organise the explanations, add links
* chore: refactoring code, moving setup to “Before”
* feat: add more flag support, upgrade dependencies
* fix: revert changes: init must be done in main()

I’ve attempted to do all initialisation chores under the “Before:” for the main cli.App loop. However, this wasn’t retrieving the data properly. Moving everything back to where it was in main().
* feat: major refactoring, we’ll get rid of settings

In essence, we can use and reuse the DeepLClient type/object as the ‘de facto’ settings structure, we just need to find a way to read/save settings (possibly with cli-altsrv)

* chore: adding ChatGPT-generated texts for testing
* feat: adding support for (simple) debugging
* test: test data in XML
* docs: update README with latest changes
* add readline package
* fix: interactive prompt now works with readline
* Update README.md

Correction submitted by @coderabbitai

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore: upgrade to latest versions

yadda yadda

* Bug: fix expected error text for DEEPL_TOKEN

I had changed the text in main.go, but forgot to update it in main_test.go

* Docs: add backticks on comments
* Docs: changes suggested by @coderabbitai
* Docs: comments ending with period
* Fix: match correct error text (changed on main.go)
* Bug: possible scoping issues with deeplToken (?)

Not confirmed. But… this way we can be sure that it gets properly initialised and not scooped up into the “wrong” place…

* Chore: bump year to 2024
* Fix: add timeout as per @coderabbitai suggestion
* Fix: add try-catch as per @coderabbitai suggestion
* Chore: add test for Exists; err.Error() is redundant

… at least, when called with the text formatting functions derived from the `fmt` package.

* Docs: add comment
* Fix: check for edge case of empty string

As suggested by @coderabbitai

Also: return nil and not []string{}; we’re supposed to check for the `err` code, and nil is returned avoiding memory allocation of something that will never be used…

* Chore: use http.StatusXXX instead of numbers

It’s more idiomatic that way, even if not necessarily easier to read (everybody knows their HTTP error codes by heart, right? no?
well, then perhaps it’s better to follow the usual practice of Go’s core developers…) * Bug: missing `)` * Bug: fix a testing bug The reason for it was that a potential JSON error was not being correctly caught; this was flagged by the test suite, and therefore I sort of fixed it. Now it correctly passes all tests it’s supposed to pass :) * Doc: fix stupid typo * Docs: add comment made by @coderabbitai Future TODO — have `Languages()` optionally reply in structured formats. * Fix: error checking for writing configuration file Caught by @coderabbitai --------- Co-authored-by: mochi-MizLab Co-authored-by: Osamu Takiya Co-authored-by: Omochice <44566328+Omochice@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> --- .editorconfig | 23 ++ .github/workflows/ci.yml | 2 +- .gitignore | 58 +++ LICENSE | 2 +- README.md | 106 +++-- autocomplete/bash_autocomplete | 35 ++ autocomplete/powershell_autocomplete.ps1 | 13 + autocomplete/zsh_autocomplete | 20 + deepl/api.go | 109 +++++ deepl/deepl.go | 118 +++--- deepl/deepl_test.go | 135 ++++-- deepl/status.go | 117 ++++++ go.mod | 9 +- go.sum | 18 +- main.go | 503 +++++++++++++++++++---- main_test.go | 48 ++- test data/test-1.txt | 1 + test data/test-1.xml | 19 + test data/test-2.txt | 1 + 19 files changed, 1097 insertions(+), 240 deletions(-) create mode 100644 .editorconfig create mode 100644 autocomplete/bash_autocomplete create mode 100644 autocomplete/powershell_autocomplete.ps1 create mode 100644 autocomplete/zsh_autocomplete create mode 100644 deepl/api.go create mode 100644 deepl/status.go create mode 100644 test data/test-1.txt create mode 100644 test data/test-1.xml create mode 100644 test data/test-2.txt diff --git a/.editorconfig b/.editorconfig new file mode 100644 index 0000000..903cb36 --- /dev/null +++ b/.editorconfig @@ -0,0 +1,23 @@ +# top-most EditorConfig file +root = 
true + +# Unix-style newlines with a newline ending every file +[*] +end_of_line = lf +charset = utf-8 +indent_style = tab +indent_size = tab +tab_width = 4 +trim_trailing_whitespace = true + +# The property below is not yet universally supported +[*.md] +max_line_length = 108 +word_wrap = true +# Markdown sometimes uses two spaces at the end to +# mark soft line breaks +trim_trailing_whitespace = false + +[*.css] +indent_style = space +indent_size = 2 diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 48a3e78..8d03a46 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -19,4 +19,4 @@ jobs: - name: Dependencies run: go get -v -t -d ./... - name: Go test - run: go test -v ./... + run: go test -v -timeout 30m ./... diff --git a/.gitignore b/.gitignore index 849ddff..b6a6dba 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,59 @@ +~* +# Stupid macOS temporary files + +# General +.DS_Store +.AppleDouble +.LSOverride + +# Icon must end with two \r +Icon + + +Icon? 
+ +# Thumbnails +._* +nohup.out + +# Files that might appear in the root of a volume +.DocumentRevisions-V100 +.fseventsd +.Spotlight-V100 +.TemporaryItems +.Trashes +.VolumeIcon.icns +.com.apple.timemachine.donotpresent + +# Directories potentially created on remote AFP share +.AppleDB +.AppleDesktop +Network Trash Folder +Temporary Items +.apdisk + +# Stuff from the Nova editor +.nova +node_modules +package.json +package-lock.json +.eslintrc.yml +.prettierrc.json +.env + +# +logs +testdata + +profile.out +coverage.html +coverage.txt +delay.txt + +*.log + +# executables +deepl-translate-cli + +# originally by @Omochice dist/ diff --git a/LICENSE b/LICENSE index 5ed4456..d7bfc4a 100644 --- a/LICENSE +++ b/LICENSE @@ -1,6 +1,6 @@ MIT License -Copyright (c) 2021 Omochice +Copyright (c) 2021,2024 Omochice Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal diff --git a/README.md b/README.md index 8e36752..1679a0d 100644 --- a/README.md +++ b/README.md @@ -1,67 +1,125 @@ [![go-test](https://github.com/Omochice/deepl-translate-cli/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/Omochice/deepl-translate-cli/actions/workflows/ci.yml) [![goreleaser](https://github.com/Omochice/deepl-translate-cli/actions/workflows/autorelease.yml/badge.svg)](https://github.com/Omochice/deepl-translate-cli/actions/workflows/autorelease.yml) -# ✍️Deepl translate cli +# ✍️ [DeepL](https://www.deepl.com) Translate CLI (Unofficial) ![sampleMovie](https://i.gyazo.com/09a4801d44e85980f83666dceda0166e.gif) - ## Installation -### By [github release page](https://github.com/Omochice/deepl-translate-cli/releases) +### Via the [GitHub release page](https://github.com/Omochice/deepl-translate-cli/releases) 1. Download zipped file from [Releases](https://github.com/Omochice/deepl-translate-cli/releases). 2. Unzip downloaded file. -3. Move executable file into directory in PATH. 
(like `$HOME/.local/bin/`) +3. Move the executable file into a directory in your `PATH` (e.g., `$HOME/.local/bin/`). ### By `go install` -```sh + +```console go install github.com/Omochice/deepl-translate-cli@latest ``` -## Usage +## Basic usage -1. Get deepl access token. See [here](https://www.deepl.com/docs-api). +1. First, [get a DeepL access token](https://www.deepl.com/docs-api). It looks like a [UUID](https://en.wikipedia.org/wiki/Universally_unique_identifier) with the characters `:fx` appended to it. -2. Set access token as `DEEPL_TOKEN` +2. Assign the access token to the `DEEPL_TOKEN` environment variable. - ex. in `Bash`. + e.g., in `bash`: - ```bash - export DEEPL_TOKEN + ```console + export DEEPL_TOKEN= ``` -3. On the first run, if `/.config/deepl-translate-cli/setting.json` does not exist, make it automatically. +3. On the first run, if `$HOME/.config/deepl-translate-cli/setting.json` does not exist, it gets automatically created. + + The format of the settings file is as shown below: - The format of setting file is below. ```json { - "source_lang": "FILLIN", - "target_lang": "FILLIN" + "source_lang": "FILLIN", + "target_lang": "FILLIN" } ``` - For write setting file, see [this page](https://www.deepl.com/docs-api/translating-text/request/). + For all existing languages that can be translated, as well as their identifying tags, see [this page](https://www.deepl.com/docs-api/translating-text/request/). You can also query the server directly: + ```console + deepl-translate-cli languages -4. If file path is not specified, load text from STDIN. + ``` - Currentry, only one path can be specified as argument. +4. If the filename path is not specified, text is read from `STDIN`. + Currently, only one file path can be specified as an argument. +- If you want to select `source_lang`/`target_lang` _without_ using the settings file, you can use the command-line parameters `--source_lang (-s)` and `--target_lang (-t)` instead.
-- If you want to use `source_lang`/`target_lang` without using setting file, try to use `--source_lang (-s)` or `target_lang (-t)` argument. - ```console cat | deepl-translate-cli --source_lang ES --target_lang DE ``` -- If you use Pro plan, use `--pro` flag to switch endpoint URL. - - _this feature is not tested because I use free plan._ - +- If you are a Pro plan user, switch to the correct endpoint URL with the `--pro` flag. + + _**Note**: This feature has not been tested, because the developers only have a free plan._ + ```console cat | deepl-translate-cli --pro + ``` + +- Note that it's also possible to run `deepl-translate-cli` in interactive mode, when the input comes from a TTY and not a pipe. In this case, only the first sentence typed (terminated by pressing **ENTER**) will be sent via the API for translation. The before-mentioned flags will also be available in this mode. + +## More advanced usage + +`deepl-translate-cli` now includes more commands, namely, + +- `deepl-translate-cli usage` which will query DeepL to return the number of characters still available for translations. +- `deepl-translate-cli languages` will show the languages currently supported by DeepL. By default, only the _source_ languages are listed; with the `--type target` flag, it will also show those languages (and variants) that are available as translation targets. +- `deepl-translate-cli glossary-language-pairs` retrieves the list of language pairs supported by the glossary feature. Right now, it only does that — you cannot use glossaries yet. + +DeepL is also able to translate structured text, i.e. text inside HTML or XML tags. This requires using a few more parameters; see `./deepl-translate-cli translate --help` for a list of all the options. While all are supported and sent to DeepL for processing, there are many possible combinations (some of which make no sense) which haven't been thoroughly tested. 
+ +## Shell autocompletion (⚠️ experimental) + +Under the `autocomplete` folder are three scripts to enable auto-completion (for `bash`, `zsh`, and PowerShell). To use these, do the following (the example is for `bash`): + +```console +PROG=deepl-translate-cli source autocomplete/bash_autocomplete +``` + +## ⚠️ Warning! ⚠️ + +If you run the tests, these may actually use your API Token, and consume some of your monthly credits! + +Make sure you call `deepl-translate-cli usage` every now and then, to be sure you're well within your limits (half a million characters per month for free accounts; however, unlike other services, Unicode characters just count as one character each!). + +## TODO + +- Support uploading documents for translation (the API allows that as well) +- Better configuration/settings support (the system, as it is now, offers too few choices) +- Make calls purely in JSON (as opposed to using `application/x-www-form-urlencoded` to post data, while retrieving the results in JSON) +- Write tests! 
+- Add more glossary-related options + +## Known bugs 🪳 + +- When trying to run help _without_ a valid authentication token (which will be the case), the error message is confusing +- Help formatting is quite a bit off on many of the (larger) entries +- Wrong orders of parameters/commands give unexpected errors +- You can only give _one_ filename as input (to do more, you'll have to use shell scripting to browse through all files and feed them to `deepl-translate-cli`) +- The interactive command has some annoying quirks and just translates one single (non-structured) sentence; additionally, it has a _huge_ overhead (but it sort of works) + +## Building + +If you wish to embed the build's author in the executable binary (to distinguish _your_ build from someone else's), you can build this with + +```console +go build -ldflags "-X main.TheBuilder=" +``` + +## Disclaimer + +None of the developers are affiliated with [DeepL](https://www.deepl.com/) and this code should not be considered to represent an endorsement by DeepL or any of its affiliates, partners or subsidiaries. It is released in the hope that it might be helpful to the Go programming community (which lacks official support by DeepL at the time of writing), without any warranty whatsoever (see [LICENSE](./LICENSE) for more information). diff --git a/autocomplete/bash_autocomplete b/autocomplete/bash_autocomplete new file mode 100644 index 0000000..a45d380 --- /dev/null +++ b/autocomplete/bash_autocomplete @@ -0,0 +1,35 @@ +#! /bin/bash + +: ${PROG:=$(basename ${BASH_SOURCE})} + +# Macs have bash3 for which the bash-completion package doesn't include +# _init_completion. This is a minimal version of that function.
+_cli_init_completion() { + COMPREPLY=() + _get_comp_words_by_ref "$@" cur prev words cword +} + +_cli_bash_autocomplete() { + if [[ "${COMP_WORDS[0]}" != "source" ]]; then + local cur opts base words + COMPREPLY=() + cur="${COMP_WORDS[COMP_CWORD]}" + if declare -F _init_completion >/dev/null 2>&1; then + _init_completion -n "=:" || return + else + _cli_init_completion -n "=:" || return + fi + words=("${words[@]:0:$cword}") + if [[ "$cur" == "-"* ]]; then + requestComp="${words[*]} ${cur} --generate-shell-completion" + else + requestComp="${words[*]} --generate-shell-completion" + fi + opts=$(eval "${requestComp}" 2>/dev/null) + COMPREPLY=($(compgen -W "${opts}" -- ${cur})) + return 0 + fi +} + +complete -o bashdefault -o default -o nospace -F _cli_bash_autocomplete $PROG +unset PROG diff --git a/autocomplete/powershell_autocomplete.ps1 b/autocomplete/powershell_autocomplete.ps1 new file mode 100644 index 0000000..5c70a28 --- /dev/null +++ b/autocomplete/powershell_autocomplete.ps1 @@ -0,0 +1,13 @@ +$fn = $($MyInvocation.MyCommand.Name) +$name = $fn -replace "(.*)\.ps1$", '$1' +Register-ArgumentCompleter -Native -CommandName $name -ScriptBlock { + param($commandName, $wordToComplete, $cursorPosition) + $other = "$wordToComplete --generate-shell-completion" + Try { + Invoke-Expression $other | ForEach-Object { + [System.Management.Automation.CompletionResult]::new($_, $_, 'ParameterValue', $_) + } + } Catch { + Write-Error "Error generating completions: $_" + } + } \ No newline at end of file diff --git a/autocomplete/zsh_autocomplete b/autocomplete/zsh_autocomplete new file mode 100644 index 0000000..622143d --- /dev/null +++ b/autocomplete/zsh_autocomplete @@ -0,0 +1,20 @@ +#compdef $PROG + +_cli_zsh_autocomplete() { + local -a opts + local cur + cur=${words[-1]} + if [[ "$cur" == "-"* ]]; then + opts=("${(@f)$(${words[@]:0:#words[@]-1} ${cur} --generate-shell-completion)}") + else + opts=("${(@f)$(${words[@]:0:#words[@]-1} --generate-shell-completion)}") + fi + + 
if [[ "${opts[1]}" != "" ]]; then + _describe 'values' opts + else + _files + fi +} + +compdef _cli_zsh_autocomplete $PROG diff --git a/deepl/api.go b/deepl/api.go new file mode 100644 index 0000000..877270f --- /dev/null +++ b/deepl/api.go @@ -0,0 +1,109 @@ +package deepl + +import ( + "encoding/json" + "fmt" + "io" + "net/http" + "net/url" + "os" + "strings" +) + +// Generic API call, takes method, URL parameters and a JSON object to fill, +// validates & parses the response and unmarshals it into the JSON object, +// or throws an error. +// NOTE: Closes the HTTP response that was opened. +func (c *DeepLClient) apiCall(method string, params url.Values, jsonObject any) error { + // If we're debugging, show what was printed out: + if c.Debug > 1 { + fmt.Fprintf(os.Stderr, "Values being called using %q to API endpoint (%s): %q\n", + method, + c.Endpoint, + params.Encode(), + ) + } + + // http.PostForm() unfortunately doesn't allow us to set headers, and we need to send the authorization + // in the headers, not in the body... (gwyneth 20231104) + client := &http.Client{} + req, err := http.NewRequest(method, c.Endpoint, strings.NewReader(params.Encode())) + if err != nil { + return err + } + req.Header.Add("Content-Type", "application/x-www-form-urlencoded") + req.Header.Add("Authorization", "DeepL-Auth-Key " + c.AuthKey) + resp, err := client.Do(req) + if err != nil { + return err + } + defer resp.Body.Close() + + if err := validateResponse(resp); err != nil { + return err + } + err = parseResponse(resp.Body, jsonObject) + if err != nil { + return err + } + return nil +} + +// Validates the response based on its status code, decoding the returned JSON. +// If the status code is "normal", does nothing (`resp` remains untouched and open). +func validateResponse(resp *http.Response) error { + // http.StatusOK - 200; http.StatusMultipleChoices - 300 + if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices { + // Parsed JSON error data. 
+ var data map[string]interface{} + baseErrorText := fmt.Sprintf("Invalid response [%d %s]", + resp.StatusCode, + statusText(resp.StatusCode)) + // NOTE: code simplification, we now just use the "standard" error codes from `net/http`. + // NOTE: on the following code, @Omochice opted for skipping the traditional JSON object struct, + // going directly for the semi-raw map[string]interface{} reply instead. (gwyneth 20231103) + e := json.NewDecoder(resp.Body).Decode(&data) + if e != nil { + // Added the response body as suggested by @coderabbitai + return fmt.Errorf("%s, JSON decoding error was: %s [data received: %v]", baseErrorText, e, resp.Body) + } else { + return fmt.Errorf("%s, %s", baseErrorText, data["message"]) + } + } + return nil +} + +// Generic API response parser, returns whatever the JSON object was. +// The `jsonObject` to be passed can be anything; returns error from parsing, +// or nil if all's ok. +func parseResponse(resp io.ReadCloser, jsonObject any) error { + body, err := io.ReadAll(resp) + + if err != nil { + return fmt.Errorf("%s (occurred while parsing response)", err.Error()) + } + err = json.Unmarshal(body, &jsonObject) + if err != nil { + return fmt.Errorf("%s (occurred while parsing response)", err.Error()) + } + return nil +} + +// Returns an error string based on the HTTP status code, as defined +// by the `net/http` package, plus the additional error(s) from the DeepL API. +// SEE: https://www.deepl.com/docs-api/api-access/error-handling +// NOTE: Replaces former mechanism with a map. +func statusText(statusCode int) string { + // see if Go knows about this error code: + respString := http.StatusText(statusCode) // returns empty string if unknown error code. + if respString == "" { + // currently, the DeepL API only adds error code 456, but it may add more in the future... + switch statusCode { + case 456: + respString = "Quota exceeded. The character limit has been reached." + default: + respString = "Unknown HTTP error." 
+ } + } + return respString +} diff --git a/deepl/deepl.go b/deepl/deepl.go index e6477c9..f50a826 100644 --- a/deepl/deepl.go +++ b/deepl/deepl.go @@ -1,20 +1,31 @@ package deepl import ( - "encoding/json" "fmt" - "io/ioutil" "net/http" "net/url" + "strconv" ) type DeepL interface { - Translate(Text string, sourceLang string, targetLang string) + Translate(Text string) } type DeepLClient struct { - Endpoint string - AuthKey string + Endpoint string `json:"endpoint"` // API endpoint, which differs between the Free and the Pro plans. + AuthKey string `json:"authkey"` // API token, looks like a UUID with ":fx". appended to it. + SourceLang string `json:"source_lang"` + TargetLang string `json:"target_lang"` + LanguagesType string `json:"type"` // For the "languages" utility call, either "source" or "target". + IsPro bool `json:"-"` + TagHandling string `json:"tag_handling"` // "xml", "html". + SplitSentences string `json:"split_sentences"` // "0", "1", "norewrite". + PreserveFormatting string `json:"preserve_formatting"` // "0", "1". + OutlineDetection int `json:"outline_detection"` // Integer; 0 is default. + NonSplittingTags string `json:"non_splitting_tags"` // List of comma-separated XML tags. + SplittingTags string `json:"splitting_tags"` // List of comma-separated XML tags. + IgnoreTags string `json:"ignore_tags"` // List of comma-separated XML tags. + Debug int `json:"debug"` // Debug/verbosity level, 0 is no debugging. } type DeepLResponse struct { @@ -22,24 +33,38 @@ type DeepLResponse struct { } type Translated struct { - DetectedSourceLaguage string `json:"detected_source_language"` - Text string `json:"text"` + DetectedSourceLanguage string `json:"detected_source_language"` + Text string `json:"text"` } -func (c *DeepLClient) Translate(text string, sourceLang string, targetLang string) ([]string, error) { +// API call to translate text from sourceLang to targetLang. 
+func (c *DeepLClient) Translate(text string) ([]string, error) { + // @coderabbitai suggested to test for `text` being empty. + // This should _not_ happen but it's nevertheless a good idea! (gwyneth 20240412) + if len(text) == 0 { + return nil, fmt.Errorf("received empty string for translation") + } + + // TODO(gwyneth): Make the call with JSON, it probably makes much more sense that way. params := url.Values{} - params.Add("auth_key", c.AuthKey) - params.Add("source_lang", sourceLang) - params.Add("target_lang", targetLang) - params.Add("text", text) - resp, err := http.PostForm(c.Endpoint, params) + params.Add("auth_key", c.AuthKey) + params.Add("source_lang", c.SourceLang) + params.Add("target_lang", c.TargetLang) + params.Add("type", c.LanguagesType) + params.Add("tag_handling", c.TagHandling) + params.Add("split_sentences", c.SplitSentences) + params.Add("preserve_formatting", c.PreserveFormatting) + params.Add("outline_detection", strconv.Itoa(c.OutlineDetection)) + params.Add("non_splitting_tags", c.NonSplittingTags) + params.Add("splitting_tags", c.SplittingTags) + params.Add("ignore_tags", c.IgnoreTags) + params.Add("text", text) - if err := ValidateResponse(resp); err != nil { - return []string{}, err - } - parsed, err := ParseResponse(resp) + var parsed DeepLResponse + + err := c.apiCall(http.MethodPost, params, &parsed) if err != nil { - return []string{}, err + return nil, err } r := []string{} for _, translated := range parsed.Translations { @@ -47,57 +72,10 @@ func (c *DeepLClient) Translate(text string, sourceLang string, targetLang strin } return r, nil } - -var KnownErrors = map[int]string{ - 400: "Bad request. Please check error message and your parameters.", - 403: "Authorization failed. Please supply a valid auth_key parameter.", - 404: "The requested resource could not be found.", - 413: "The request size exceeds the limit.", - 414: "The request URL is too long. 
You can avoid this error by using a POST request instead of a GET request, and sending the parameters in the HTTP body.", - 429: "Too many requests. Please wait and resend your request.", - 456: "Quota exceeded. The character limit has been reached.", - 503: "Resource currently unavailable. Try again later.", - 529: "Too many requests. Please wait and resend your request.", -} // this from https://www.deepl.com/docs-api/accessing-the-api/error-handling/ - -func ValidateResponse(resp *http.Response) error { - if resp.StatusCode < 200 || resp.StatusCode >= 300 { - var data map[string]interface{} - baseErrorText := fmt.Sprintf("Invalid response [%d %s]", - resp.StatusCode, - http.StatusText(resp.StatusCode)) - if t, ok := KnownErrors[resp.StatusCode]; ok { - baseErrorText += fmt.Sprintf(" %s", t) - } - e := json.NewDecoder(resp.Body).Decode(&data) - if e != nil { - return fmt.Errorf("%s", baseErrorText) - } else { - return fmt.Errorf("%s, %s", baseErrorText, data["message"]) - } - } - return nil -} - -func ParseResponse(resp *http.Response) (DeepLResponse, error) { - var responseJson DeepLResponse - body, err := ioutil.ReadAll(resp.Body) - - if err != nil { - err := fmt.Errorf("%s (occurred while parse response)", err.Error()) - return responseJson, err - } - err = json.Unmarshal(body, &responseJson) - if err != nil { - err := fmt.Errorf("%s (occurred while parse response)", err.Error()) - return responseJson, err - } - return responseJson, err -} - -func GetEndpoint(IsPro bool) string { - if IsPro { - return "https://api.deepl.com/v2/translate" +// Returns the base DeepL API endpoint for either the Free or the Pro Plan (if IsPro is true). 
+func GetEndpoint(isPro bool) string { + if isPro { + return "https://api.deepl.com/v2" } - return "https://api-free.deepl.com/v2/translate" + return "https://api-free.deepl.com/v2" } diff --git a/deepl/deepl_test.go b/deepl/deepl_test.go index 3e33b2f..73c3ce0 100644 --- a/deepl/deepl_test.go +++ b/deepl/deepl_test.go @@ -4,7 +4,7 @@ import ( "bytes" "encoding/json" "fmt" - "io/ioutil" + "io" "net/http" "reflect" "strings" @@ -21,19 +21,19 @@ func TestValidateResponse(t *testing.T) { t.Fatalf("Error within json.Marshal") } testResponse := http.Response{ - Status: "200 test", - StatusCode: 200, - Body: ioutil.NopCloser(bytes.NewBuffer(testBody)), + Status: fmt.Sprintf("%d %s", http.StatusOK, http.StatusText(http.StatusOK)), // 200 OK + StatusCode: http.StatusOK, + Body: io.NopCloser(bytes.NewBuffer(testBody)), } for c := 100; c < 512; c++ { if http.StatusText(c) == "" { - // unused stasus code + // unused status code continue } testResponse.Status = fmt.Sprintf("%d test", c) testResponse.StatusCode = c - err := ValidateResponse(&testResponse) + err := validateResponse(&testResponse) if c >= 200 && c < 300 { errorText = "If status code is between 200 and 299, no error should be returned" if err != nil { @@ -42,19 +42,19 @@ func TestValidateResponse(t *testing.T) { continue } if err == nil { - t.Fatalf("If Status code is not 200 <= c < 300, should occur error\nStatus code: %d, Response: %v", + t.Fatalf("If status code is not 200 <= c < 300, an error should occur\nStatus code: %d, Response: %v", c, testResponse) } else { if !strings.Contains(err.Error(), http.StatusText(c)) { errorText = fmt.Sprintf("Error text should include Status Code(%s)\nActual: %s", http.StatusText(c), err.Error()) + t.Fatalf("%s", errorText) // was this missing? 
} - if statusText, ok := KnownErrors[c]; ok { // errored - if !strings.Contains(err.Error(), statusText) { - errorText = fmt.Sprintf("If stasus code is knownded, the text should include it's error text\nExpected: %s\nActual: %s", - statusText, err.Error()) - t.Fatalf("%s", errorText) - } + statusText := fmt.Sprintf("%d", c) // NOTE: we want to make sure this is a string, not a (single) rune (gwyneth 20231101) + if !strings.Contains(err.Error(), statusText) { + errorText = fmt.Sprintf("If status code is known, the text should include its error text\nExpected: %s\nActual: %s", + statusText, err.Error()) + t.Fatalf("%s", errorText) } } } @@ -62,13 +62,13 @@ func TestValidateResponse(t *testing.T) { invalidResp := http.Response{ Status: "444 not exists error", StatusCode: 444, - Body: ioutil.NopCloser(bytes.NewBuffer([]byte("test"))), + Body: io.NopCloser(bytes.NewBuffer([]byte("test"))), } - err = ValidateResponse(&invalidResp) + err = validateResponse(&invalidResp) if err == nil { - t.Fatalf("If status code is invalid, should occur error\nActual: %d", invalidResp.StatusCode) + t.Fatalf("If status code is invalid, an error should occur\nActual: %d", invalidResp.StatusCode) } else if !strings.HasSuffix(err.Error(), "]") { - t.Fatalf("If body is invalid as json, error is formated as %s", + t.Fatalf("If body is invalid as JSON, error is formatted as %s", "`Invalid response [statuscode statustext]`") } @@ -76,21 +76,21 @@ func TestValidateResponse(t *testing.T) { validResp := http.Response{ Status: "444 not exists error", StatusCode: 444, - Body: ioutil.NopCloser(bytes.NewBuffer([]byte(fmt.Sprintf(`{"message": "%s"}`, expectedMessage)))), + Body: io.NopCloser(bytes.NewBuffer([]byte(fmt.Sprintf(`{"message": "%s"}`, expectedMessage)))), } - err = ValidateResponse(&validResp) + err = validateResponse(&validResp) if err == nil { - t.Fatalf("If is status code invalid, should occur error\nActual: %d", validResp.StatusCode) + t.Fatalf("If the status code is invalid, an error 
should occur\nActual: %d", validResp.StatusCode) } else if !strings.HasSuffix(err.Error(), expectedMessage) { - t.Fatalf("If body is valid as json, error suffix should have `%s`\nActual: %s", + t.Fatalf("If the body is valid JSON, the error suffix should have `%s`\nActual: %s", expectedMessage, err.Error()) } } -func TestParseResponse(t *testing.T) { +func TestParseTranslationResponse(t *testing.T) { baseResponse := http.Response{ - Status: "200 test", - StatusCode: 200, + Status: fmt.Sprintf("%d %s", http.StatusOK, http.StatusText(http.StatusOK)), // 200 OK + StatusCode: http.StatusOK, // 200 } { input := map[string][]map[string]string{ @@ -107,15 +107,15 @@ func TestParseResponse(t *testing.T) { if err != nil { t.Fatal("Error within json.Marshal") } - baseResponse.Body = ioutil.NopCloser(bytes.NewBuffer(b)) - res, err := ParseResponse(&baseResponse) + baseResponse.Body = io.NopCloser(bytes.NewBuffer(b)) + var trans DeepLResponse + err = parseResponse(baseResponse.Body, &trans) if err != nil { - t.Fatalf("If input is valid, should not occur error\n%s", err.Error()) + t.Fatalf("If the input is valid, no errors should occur\n%s", err.Error()) } - - if len(res.Translations) != len(input["Translations"]) { + if len(trans.Translations) != len(input["Translations"]) { t.Fatalf("Length of result.Translations should be equal to input.Translations\nExpected: %d\nActual: %d", - len(res.Translations), len(input["Translations"])) + len(trans.Translations), len(input["Translations"])) } } { @@ -132,19 +132,80 @@ func TestParseResponse(t *testing.T) { if err != nil { t.Fatal("Error within json.Marshal") } - baseResponse.Body = ioutil.NopCloser(bytes.NewBuffer(b)) - res, err := ParseResponse(&baseResponse) + baseResponse.Body = io.NopCloser(bytes.NewBuffer(b)) + var trans DeepLResponse + err = parseResponse(baseResponse.Body, &trans) if err != nil { - t.Fatalf("If input is valid, should not occur error\n%s", err.Error()) + t.Fatalf("If the input is valid, no error should 
occur\n%s", err.Error()) } - if len(res.Translations) != len(input["Translations"]) { + if len(trans.Translations) != len(input["Translations"]) { t.Fatalf("Length of result.Translations should be equal to input.Translations\nExpected: %d\nActual: %d", - len(res.Translations), len(input["Translations"])) + len(trans.Translations), len(input["Translations"])) } - resType := reflect.ValueOf(res.Translations[0]).Type() + resType := reflect.ValueOf(trans.Translations[0]).Type() expectedNumOfField := 2 if resType.NumField() != expectedNumOfField { - t.Fatalf("Length of Translated's filed should be equal %d\nActual: %d", expectedNumOfField, resType.NumField()) + t.Fatalf("Length of translated field should be equal to %d\nActual: %d", expectedNumOfField, resType.NumField()) } } } + +// added by @coderabbitai +func TestValidateResponseEdgeCases(t *testing.T) { + // Test case: Empty response body + emptyResp := http.Response{ + Status: fmt.Sprintf("%d %s", http.StatusNoContent, http.StatusText(http.StatusNoContent)), // 204 No Content + StatusCode: http.StatusNoContent, + Body: io.NopCloser(bytes.NewBuffer([]byte(""))), + } + err := validateResponse(&emptyResp) + if err != nil { + t.Errorf("Expected no error for empty response body, got: %v", err) + } + + // Test case: Non-JSON content in response body + // Note: this test has to be done with status code below 200 or over 300! 
(gwyneth 20240413) + nonJSONResp := http.Response{ + // Status: "200 OK", + // StatusCode: 200, + Status: fmt.Sprintf("%d %s", http.StatusNotFound, http.StatusText(http.StatusNotFound)), // 404 Not Found + StatusCode: http.StatusNotFound, + Body: io.NopCloser(bytes.NewBuffer([]byte("Not a JSON response"))), + } + err = validateResponse(&nonJSONResp) + if err == nil { + t.Errorf("Expected error for non-JSON response body; got no error instead") + } else if !strings.Contains(err.Error(), "JSON decoding error") { + t.Errorf("Unexpected error for non-JSON response body: %v", err) + } + // Test case: JSON response with unexpected fields + unexpectedJSONResp := http.Response{ + Status: fmt.Sprintf("%d %s", http.StatusOK, http.StatusText(http.StatusOK)), // 200 OK + StatusCode: http.StatusOK, // 200 + Body: io.NopCloser(bytes.NewBuffer([]byte(`{"unexpected": "field"}`))), + } + err = validateResponse(&unexpectedJSONResp) + if err != nil { + t.Errorf("Expected no error for JSON with unexpected fields, got: %v", err) + } + + // Test case: Network error simulation (this would typically be handled outside this function, but included for completeness). + // This case would need to mock network failure, which is outside the scope of validateResponse function as it does not handle network operations directly. +} + +// A preliminary test for the Translate() function. I'm no good at this, so... baby steps. +// (gwyneth 20240412) +func TestTranslate(t *testing.T) { + var deeplTestClient DeepLClient + + // Translating an empty string should give an error. 
+ translateds, err := deeplTestClient.Translate("") + if err == nil { + t.Errorf("Expected an error from attempt to translate empty string") + } else { + t.Logf("✅🆗 Attempt to translate an empty string returned expected error %v", err) + } + if translateds != nil { + t.Errorf("The translation ought to have been empty, but got the following instead: %v", translateds) + } +} diff --git a/deepl/status.go b/deepl/status.go new file mode 100644 index 0000000..9a82daa --- /dev/null +++ b/deepl/status.go @@ -0,0 +1,117 @@ +// This file handles non-translation calls, namely for usage, languages supported, and so forth. +package deepl + +import ( + // "encoding/json" + "fmt" + "net/http" + "net/url" +) + +type DeepLUsageResponse struct { + CharacterCount int `json:"character_count,omitempty"`// Characters translated so far in the current billing period. + CharacterLimit int `json:"character_limit,omitempty"`// Current maximum number of characters that can be translated per billing period. + DocumentLimit int `json:"document_limit,omitempty"` // Current maximum number of documents that can be translated per billing period. + DocumentCount int `json:"document_count,omitempty"` // Documents translated so far in the current billing period. + TeamDocumentLimit int `json:"team_document_limit,omitempty"` // Current maximum number of documents that can be translated by the team per billing period. + TeamDocumentCount int `json:"team_document_count,omitempty"` // Documents translated by all users in the team so far in the current billing period. +} + +// Check Usage and Limits — +// Retrieve usage information within the current billing period together with the corresponding account limits. 
+func (c *DeepLClient) Usage() (string, error) { + params := url.Values{} + params.Add("auth_key", c.AuthKey) + + var resp DeepLUsageResponse + + err := c.apiCall(http.MethodPost, params, &resp) + if err != nil { + return "", err + } + + return fmt.Sprintf( + "Character Count: %d; Character Limit: %d; Document Limit: %d; Document Count: %d; Team Document Limit: %d; Team Document Count: %d.", + resp.CharacterCount, + resp.CharacterLimit, + resp.DocumentLimit, + resp.DocumentCount, + resp.TeamDocumentLimit, + resp.TeamDocumentCount), + nil +} + +// The /languages API call returns an array of language/name pairs and a flag +// indicating if this language has support for formal/informal differences. +type DeepLLanguagesResponse struct { + Language string `json:"language"` + Name string `json:"name"` + SupportsFormality bool `json:"supports_formality"` +} + +// Retrieve Supported Languages — +// Retrieve the list of languages that are currently supported for translation, either as source or target language, respectively. +func (c *DeepLClient) Languages() (string, error) { + params := url.Values{} + params.Add("auth_key", c.AuthKey) + // note: translationType was already parsed + if len(c.LanguagesType) > 0 { + params.Add("type", c.LanguagesType) + } + + var langs []DeepLLanguagesResponse + + err := c.apiCall(http.MethodPost, params, &langs) + if err != nil { + return "", err + } + + // TODO(gwyneth): @coderabbitai suggests that this should be returned in different formats, + // namely, structured ones. One possibility would be to have different methods, _or_ pass + // a parameter (e.g., formatJSON, formatXML, formatYAML, formatUnstructured) and spew out + // whatever format is appropriate. + // Currently, the only use case for this function is inside a CLI, where the human reader + // will very likely prefer to read unstructured (but pretty-printed) language pairs... 
+ // (gwyneth 20230413) + var r string + for _, lang := range langs { + r += lang.Language + ": " + lang.Name + if lang.SupportsFormality { + r += " (+ formality)" + } + r += "\n" + } + return r, nil +} + + +// Structure for getting an array of glossary pairs of supported languages +type DeepLGlossaryPairsResponse struct { + SupportedLanguages []GlossaryPair `json:"supported_languages"` +} + +// One pair of supported glossary languages. +type GlossaryPair struct { + SourceLang string `json:"source_lang"` + TargetLang string `json:"target_lang"` +} + +// List language pairs supported by glossaries — +// Retrieve the list of language pairs supported by the glossary feature. +func (c *DeepLClient) GlossaryLanguagePairs() (string, error) { + params := url.Values{} + params.Add("auth_key", c.AuthKey) + + var langPairs DeepLGlossaryPairsResponse + + err := c.apiCall(http.MethodGet, params, &langPairs) + if err != nil { + return "", err + } + + var r string + for _, langPair := range langPairs.SupportedLanguages { + r += langPair.SourceLang + " ⇒ " + langPair.TargetLang + "\n" + } + return r, nil +} \ No newline at end of file diff --git a/go.mod b/go.mod index 098d3fb..2e4ef2e 100644 --- a/go.mod +++ b/go.mod @@ -3,13 +3,14 @@ module github.com/Omochice/deepl-translate-cli go 1.21.3 require ( + github.com/lmorg/readline v0.0.0-20210316231630-be4b7d79fc3a github.com/mattn/go-isatty v0.0.20 - github.com/urfave/cli/v2 v2.25.7 + github.com/urfave/cli/v2 v2.27.1 ) require ( - github.com/cpuguy83/go-md2man/v2 v2.0.2 // indirect + github.com/cpuguy83/go-md2man/v2 v2.0.4 // indirect github.com/russross/blackfriday/v2 v2.1.0 // indirect - github.com/xrash/smetrics v0.0.0-20201216005158-039620a65673 // indirect - golang.org/x/sys v0.6.0 // indirect + github.com/xrash/smetrics v0.0.0-20240312152122-5f08fbb34913 // indirect + golang.org/x/sys v0.19.0 // indirect ) diff --git a/go.sum b/go.sum index d1916e6..8cb37a8 100644 --- a/go.sum +++ b/go.sum @@ -1,12 +1,24 @@ 
-github.com/cpuguy83/go-md2man/v2 v2.0.2 h1:p1EgwI/C7NhT0JmVkwCD2ZBK8j4aeHQX2pMHHBfMQ6w= -github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= +github.com/cpuguy83/go-md2man/v2 v2.0.3 h1:qMCsGGgs+MAzDFyp9LpAe1Lqy/fY/qCovCm0qnXZOBM= +github.com/cpuguy83/go-md2man/v2 v2.0.3/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= +github.com/cpuguy83/go-md2man/v2 v2.0.4 h1:wfIWP927BUkWJb2NmU/kNDYIBTh/ziUX91+lVfRxZq4= +github.com/cpuguy83/go-md2man/v2 v2.0.4/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= +github.com/lmorg/readline v0.0.0-20210316231630-be4b7d79fc3a h1:xiaNGfMrhqdK89RvLnc6WAe4Pe/HKaluZmipsYF3E0o= +github.com/lmorg/readline v0.0.0-20210316231630-be4b7d79fc3a/go.mod h1:DidBcghSSS00hj06CkehMGDhZ/25vZDdydkeTVUy6lY= github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY= github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y= github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk= github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM= github.com/urfave/cli/v2 v2.25.7 h1:VAzn5oq403l5pHjc4OhD54+XGO9cdKVL/7lDjF+iKUs= github.com/urfave/cli/v2 v2.25.7/go.mod h1:8qnjx1vcq5s2/wpsqoZFndg2CE5tNFyrTvS6SinrnYQ= +github.com/urfave/cli/v2 v2.27.1 h1:8xSQ6szndafKVRmfyeUMxkNUJQMjL1F2zmsZ+qHpfho= +github.com/urfave/cli/v2 v2.27.1/go.mod h1:8qnjx1vcq5s2/wpsqoZFndg2CE5tNFyrTvS6SinrnYQ= github.com/xrash/smetrics v0.0.0-20201216005158-039620a65673 h1:bAn7/zixMGCfxrRTfdpNzjtPYqr8smhKouy9mxVdGPU= github.com/xrash/smetrics v0.0.0-20201216005158-039620a65673/go.mod h1:N3UwUGtsrSj3ccvlPHLoLsHnpR27oXr4ZE984MbSER8= -golang.org/x/sys v0.6.0 h1:MVltZSvRTcU2ljQOhs94SXPftV6DCNnZViHeQps87pQ= +github.com/xrash/smetrics v0.0.0-20240312152122-5f08fbb34913 h1:+qGGcbkzsfDQNPPe9UDgpxAWQrhbbBXOYJFQDq/dtJw= +github.com/xrash/smetrics v0.0.0-20240312152122-5f08fbb34913/go.mod 
h1:4aEEwZQutDLsQv2Deui4iYQ6DWTxR14g6m8Wv88+Xqk= +golang.org/x/sys v0.0.0-20210316164454-77fc1eacc6aa/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs= golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= +golang.org/x/sys v0.14.0 h1:Vz7Qs629MkJkGyHxUlRHizWJRG2j8fbQKjELVSNhy7Q= +golang.org/x/sys v0.14.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= +golang.org/x/sys v0.19.0 h1:q5f1RH2jigJ1MoAWp2KTp3gm5zAGFUTarQZ5U386+4o= +golang.org/x/sys v0.19.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= diff --git a/main.go b/main.go index 849db74..eafa6c1 100644 --- a/main.go +++ b/main.go @@ -1,62 +1,143 @@ +// Original code by @Omochice +// With some extra tweaks by Gwyneth Llewelyn package main import ( "encoding/json" "fmt" - "io/ioutil" + "io" "log" "os" "path/filepath" "runtime" "runtime/debug" + "time" "github.com/Omochice/deepl-translate-cli/deepl" - + "github.com/lmorg/readline" "github.com/mattn/go-isatty" "github.com/urfave/cli/v2" ) +// versionInfoType holds the relevant information for this build. +// It is meant to be used as a cache. +type versionInfoType struct { + version string // Runtime version. + commit string // Commit revision number. + dateString string // Commit revision time (as a RFC3339 string). + date time.Time // Same as before, converted to a time.Time, because that's what the cli package uses. + builtBy string // User who built this (see note). + goOS string // Operating system for this build (from runtime). + goARCH string // Architecture, i.e., CPU type (from runtime). + goVersion string // Go version used to compile this build (from runtime). + init bool // Have we already initialised the cache object? +} + +// NOTE: I don't know where the "builtBy" information comes from, so, right now, it gets injected +// during build time, e.g. 
`go build -ldflags "-X main.TheBuilder=gwyneth"` (gwyneth 20231103) + var ( - version = "dev" - commit = "none" - date = "unkdown" - buildBy = "unkdown" + versionInfo versionInfoType // cached values for this build. + TheBuilder string // to be overwritten via the linker command `go build -ldflags "-X main.TheBuilder=gwyneth"`. + debugLevel int // verbosity/debug level. ) -func getVersion() string { - if version != "" { - return version +// Initialises the versionInfo variable. +func initVersionInfo() error { + if versionInfo.init { + // already initialised, no need to do anything else! + return nil } - i, ok := debug.ReadBuildInfo() + // get the following entries from the runtime: + versionInfo.goOS = runtime.GOOS + versionInfo.goARCH = runtime.GOARCH + versionInfo.goVersion = runtime.Version() + + // attempt to get some build info as well: + buildInfo, ok := debug.ReadBuildInfo() if !ok { - return "unknown" + return fmt.Errorf("no valid build information found") + } + versionInfo.version = buildInfo.Main.Version + + // Now dig through settings and extract what we can... + + var vcs, rev string // Name of the version control system (very likely Git) and the revision. + for _, setting := range buildInfo.Settings { + switch setting.Key { + case "vcs": + vcs = setting.Value + case "vcs.revision": + rev = setting.Value + case "vcs.time": + versionInfo.dateString = setting.Value + } } - return i.Main.Version + versionInfo.commit = "unknown" + if vcs != "" { + versionInfo.commit = vcs + } + if rev != "" { + versionInfo.commit += " [" + rev + "]" + } + // attempt to parse the date, which comes as a string in RFC3339 format, into a time.Time: + var parseErr error + if versionInfo.date, parseErr = time.Parse(time.RFC3339, versionInfo.dateString); parseErr != nil { + // Note: we can safely ignore the parsing error: either the conversion works, or it doesn't, and we + // cannot do anything about it... 
(gwyneth 20231103) + // However, the AI revision bots dislike this, so we'll assign the current date instead. + versionInfo.date = time.Now() + + if debugLevel > 1 { + fmt.Fprintf(os.Stderr, "date parse error: %v\n", parseErr) + } + } + + // NOTE: I have no idea where the "builtBy" info is supposed to come from; + // the way I do it is to force the variable with a compile-time option. (gwyneth 20231103) + versionInfo.builtBy = TheBuilder + + return nil } +// Internal settings, to be filled by LoadSettings(), and which gets saved to a file to +// be reused on subsequent calls. +// NOTE: This might become utterly different if we implement settings stored via +// the github.com/urfave/cli-altsrc package. (gwyneth 20231103) type Setting struct { - AuthKey string `json:"-"` - SourceLang string `json:"source_lang"` - TargetLang string `json:"target_lang"` - IsPro bool `json:"-"` + AuthKey string `json:"-"` // API token, looks like a UUID with ":fx". + SourceLang string `json:"source_lang"` + TargetLang string `json:"target_lang"` + LanguagesType string `json:"type"` // For the "languages" utility call, either "source" or "target". + IsPro bool `json:"-"` + TagHandling string `json:"tag_handling"` // "xml", "html". + SplitSentences string `json:"split_sentences"` // "0", "1", "nonewlines". + PreserveFormatting string `json:"preserve_formatting"` // "0", "1". + OutlineDetection int `json:"outline_detection"` // Integer; 0 is default. + NonSplittingTags string `json:"non_splitting_tags"` // List of comma-separated XML tags. + SplittingTags string `json:"splitting_tags"` // List of comma-separated XML tags. + IgnoreTags string `json:"ignore_tags"` // List of comma-separated XML tags. + Debug int `json:"debug"` // Debug/verbosity level, 0 is no debugging. } +// Open the settings file, or, if it doesn't exist, create it first. +// TODO: Probably change all this to use github.com/urfave/cli-altsrc instead. 
func LoadSettings(setting Setting, automake bool) (Setting, error) { if setting.AuthKey == "" { - return setting, fmt.Errorf("No deepl token is set.") + return setting, fmt.Errorf("no DeepL token is set; use the environment variable `DEEPL_TOKEN` to set it") } if setting.TargetLang == "" || setting.SourceLang == "" { homeDir, err := os.UserHomeDir() configPath := filepath.Join(homeDir, ".config", "deepl-translate-cli", "setting.json") - // if eigher is not set, load file. + // if either is not set, load file. if err != nil { return setting, err } - bytes, err := ioutil.ReadFile(configPath) + bytes, err := os.ReadFile(configPath) if err != nil { - errStr := fmt.Errorf("Not exists such file. %s\n\tauto make it, please write it. ", configPath) + errStr := fmt.Errorf("settings file does not exist. %s\n\tIt was autogenerated, please edit it to reflect your preferences", configPath) if automake { err := InitializeConfigFile(configPath) if err != nil { @@ -66,18 +147,20 @@ func LoadSettings(setting Setting, automake bool) (Setting, error) { return setting, errStr } if err := json.Unmarshal(bytes, &setting); err != nil { - return setting, fmt.Errorf("%s (occurred while loading setting.json)", err.Error()) + return setting, fmt.Errorf("%s (occurred while loading `setting.json`)", err.Error()) } if setting.SourceLang == "FILLIN" || setting.TargetLang == "FILLIN" { - return setting, fmt.Errorf("Did write config file? (%s)", configPath) + return setting, fmt.Errorf("did you edit the config file? (%s)", configPath) } } if setting.SourceLang == setting.TargetLang { - return setting, fmt.Errorf("Equal source lang(%s) and target lang(%s)", setting.SourceLang, setting.TargetLang) + return setting, fmt.Errorf("cannot have identical source lang(%s) and target lang(%s)", setting.SourceLang, setting.TargetLang) } return setting, nil } +// Attempts to create the directory for the configuration file, with a minimalist configuration if successful. 
+// If creating the directory (or the file within) fails, then abort and return error. func InitializeConfigFile(ConfigPath string) error { if err := os.MkdirAll(filepath.Dir(ConfigPath), 0755); err != nil { return err @@ -99,102 +182,348 @@ func InitializeConfigFile(ConfigPath string) error { return err } - out.Write(([]byte)(decoded)) + if _, err := out.Write(([]byte)(decoded)); err != nil { + return fmt.Errorf("failed to write to config file: %s", err) + } return nil } +// TODO: Try to use "github.com/urfave/cli/v3" in the future... +// TODO: @urfave has his own library to deal with configuration files, cli-altsrc. +// It's obscure and sparsely documented (see ). +// But it's probably far more flexible than the simplistic scheme used here. (gwyneth 20231103) func main() { + // Global settings for this cli app. + var setting Setting + // DeepL Token, usually coming from the environment variable `DEEPL_TOKEN`. + var deeplToken string + + // Set up the version/runtime/debug-related variables, and cache them: + initVersionInfo() + + // Test if the authentication can work or not, depending if we got the token + // set as an environment variable. + deeplToken, ok := os.LookupEnv("DEEPL_TOKEN") + if !ok { + fmt.Fprintln(os.Stderr, "Please first set your DeepL authentication key using the environment variable DEEPL_TOKEN.") + os.Exit(1) // NOTE: the cli.Exit() function cannot be used here, because cli is not initialized yet. + // return cli.Exit(fmt.Sprintln("Please first set your DeepL authentication key using the environment variable DEEPL_TOKEN."), 1) + } + // Generic error variable to work around scoping issues. + var err error + // Configure all settings from the very start, because we need the authkey & endpoint + // for all other calls, not just translations. 
+ setting, err = LoadSettings( + Setting{ + AuthKey: deeplToken, + }, + true) + if err != nil { + fmt.Fprintf(os.Stderr, "cannot init settings, error was: %q", err) + os.Exit(1) + // return cli.Exit(fmt.Sprintf("cannot init settings, error was: %q", err), 1) + } + + // start app app := &cli.App{ Name: "deepl-translate-cli", - Usage: "Translate sentences.", - UsageText: "deepl-translate-cli [-s|-t] ", + Usage: "Translate sentences, using the DeepL API.", + UsageText: "deepl-translate-cli [-s|-t][--pro] trans [--tag_handling [xml|html]] \ndeepl-translate-cli usage\ndeepl-translate-cli languages [--type=[source|target]]\ndeepl-translate-cli glossary-language-pairs", Version: fmt.Sprintf( "%s (rev %s) [%s %s %s] [build at %s by %s]", - getVersion(), - commit, - runtime.GOOS, - runtime.GOARCH, - runtime.Version(), - date, - buildBy, + versionInfo.version, + versionInfo.commit, + versionInfo.goOS, + versionInfo.goARCH, + versionInfo.goVersion, + versionInfo.dateString, // Date as string in RFC3339 notation. + versionInfo.builtBy, // see note at the top... ), + DefaultCommand: "translate", // to avoid breaking compatibility with earlier versions. + EnableBashCompletion: true, + Compiled: versionInfo.date, // Converted from RFC3339 Authors: []*cli.Author{ { Name: "Omochice", + Email: "somewhere@here.jp", + }, + { + Name: "Gwyneth Llewelyn", + Email: "gwyneth.llewelyn@gwynethllewelyn.net", }, }, - + Copyright: "© 2021-2024 by Omochice. All rights reserved. 
Freely distributed under an MIT license.\nThis software is neither affiliated with nor endorsed by DeepL SE.", Flags: []cli.Flag{ &cli.StringFlag{ Name: "source_lang", Aliases: []string{"s"}, - Usage: "Set source language without using setting file", + Usage: "Set source language without using the settings file", + Value: "EN", + Destination: &setting.SourceLang, }, - &cli.StringFlag{ Name: "target_lang", Aliases: []string{"t"}, - Usage: "Set target language without using setting file", + Usage: "Set target language without using the settings file", + Value: "JA", + Destination: &setting.TargetLang, + }, + &cli.BoolFlag{ + Name: "pro", + Usage: "Use Pro plan's endpoint?", + Value: false, + Destination: &setting.IsPro, }, &cli.BoolFlag{ - Name: "pro", - Usage: "use pro plan's endpoint", + Name: "debug", + Aliases: []string{"d"}, + Usage: "Debugging; repeating the flag increases verbosity.", + Count: &debugLevel, }, }, - Action: func(c *cli.Context) error { - setting, err := LoadSettings(Setting{ - SourceLang: c.String("source_lang"), - TargetLang: c.String("target_lang"), - AuthKey: os.Getenv("DEEPL_TOKEN"), - IsPro: c.Bool("pro"), - }, true) - if err != nil { - return err - } + Commands: []*cli.Command{ + { + Name: "translate", + Aliases: []string{"trans"}, + Usage: "Basic translation of a set of Unicode strings into another language", + Description: "Text to be translated.\nOnly UTF-8-encoded plain text is supported. 
May contain multiple sentences, but the total request body size must not exceed 128 KiB (128 · 1024 bytes).\nPlease split up your text into multiple calls if it exceeds this limit.", + Category: "Translations", + Flags: []cli.Flag{ + &cli.StringFlag{ + Name: "tag_handling", + Usage: "Set to XML or HTML in order to do more advanced parsing (empty means just using the plain text variant)", + Aliases: []string{"tag"}, + Value: "", + Destination: &setting.TagHandling, + Action: func(c *cli.Context, v string) error { + switch v { + case "xml", "html": + return nil + default: + return fmt.Errorf("tag_handling must be either `xml` or `html` (got: %s)", v) + } + }, + }, + &cli.StringFlag{ + Name: "split_sentences", + Usage: "Sets whether the translation engine should first split the input into sentences. For text translations where `tag_handling` is not set to `html`, the default value is `1`, meaning the engine splits on punctuation and on newlines.\nFor text translations where `tag_handling=html`, the default value is `nonewlines`, meaning the engine splits on punctuation only, ignoring newlines.\n\nThe use of `nonewlines` as the default value for text translations where `tag_handling=html` is new behavior that was implemented in November 2022, when HTML handling was moved out of beta.\n\nPossible values are:\n\n * `0` - no splitting at all, whole input is treated as one sentence\n * `1` (default when `tag_handling` is not set to `html`) - splits on punctuation and on newlines\n * `nonewlines` (default when `tag_handling=html`) - splits on punctuation only, ignoring newlines\n\nFor applications that send one sentence per text parameter, we recommend setting `split_sentences` to `0`, in order to prevent the engine from splitting the sentence unintentionally.\n\nPlease note that newlines will split sentences when `split_sentences=1`. 
We recommend cleaning files so they don't contain breaking sentences or setting the parameter `split_sentences` to `nonewlines`.", + Aliases: []string{"split"}, + Value: "", // NOTE: default value should depend on `tag_handling`. + Destination: &setting.SplitSentences, + Action: func(c *cli.Context, v string) error { + switch v { + case "0", "1", "nonewlines": + return nil + default: + return fmt.Errorf("split_sentences can only be 0, 1, or `nonewlines` (got: %s)", v) + } + }, + }, + &cli.StringFlag{ + Name: "preserve_formatting", + Usage: "Sets whether the translation engine should respect the original formatting, even if it would usually correct some aspects. Possible values are:\n * `0` (default)\n * `1`\n\nThe formatting aspects affected by this setting include:\n * Punctuation at the beginning and end of the sentence\n * Upper/lower case at the beginning of the sentence", + Aliases: []string{"preserve"}, + Value: "0", + Destination: &setting.PreserveFormatting, + Action: func(c *cli.Context, v string) error { + switch v { + case "0", "1": + return nil + default: + return fmt.Errorf("preserve_formatting can only be 0 or 1 (got: %s)", v) + } + }, + }, + &cli.IntFlag{ + Name: "outline_detection", + Usage: "The automatic detection of the XML structure won't yield best results in all XML files. You can disable this automatic mechanism altogether by setting the `outline_detection` parameter to `false` and selecting the tags that should be considered structure tags. 
This will split sentences using the `splitting_tags` parameter.", + Aliases: []string{"outline"}, + Value: 0, + Destination: &setting.OutlineDetection, + }, + &cli.StringFlag{ + Name: "non_splitting_tags", + Usage: "Comma-separated list of XML tags which never split sentences.", + Aliases: []string{"never"}, + //Value: [""], + Destination: &setting.NonSplittingTags, + }, + &cli.StringFlag{ + Name: "splitting_tags", + Usage: "Comma-separated list of XML tags which always cause splits.", + Aliases: []string{"always"}, + //Value: [""], + Destination: &setting.SplittingTags, + }, + &cli.StringFlag{ + Name: "ignore_tags", + Usage: "Comma-separated list of XML tags which will always be ignored.", + Aliases: []string{"ignore"}, + //Value: [""], + Destination: &setting.IgnoreTags, + }, + }, + Action: func(c *cli.Context) error { +/* + if c.String("source_lang") != "" { + setting.SourceLang = c.String("source_lang") + } + if c.String("target_lang") != "" { + setting.TargetLang = c.String("target_lang") + } + if c.Bool("pro") { + setting.IsPro = true + } +*/ + // TODO(gwyneth): Create constants for debugging levels. + if debugLevel > 1 { + fmt.Fprintf(os.Stderr, "Number of args (Narg): %d, c.Args.Len(): %d\n", c.NArg(), c.Args().Len()) + } + // The captured sentence for translation, unprocessed; it can come from different sources! + var rawSentence string + if c.NArg() == 0 { + // no filename path passed; read from STDIN (TTY or pipe) + if isatty.IsTerminal(os.Stdin.Fd()) { + // is not pipe (i.e. 
TTY) + // NOTE: This seems not to work very well...(gwyneth 20231101) + // fmt.Scan(&rawSentence) + // Replaced it by using a readline (from a library), but it might really be overkill, since + rl := readline.NewInstance() + rawSentence, err = rl.Readline() + if err != nil { + return err + } + } else { + // is pipe + pipeIn, err := io.ReadAll(os.Stdin) + if err != nil { + return err + } + rawSentence = string(pipeIn) + } + } else { + if c.NArg() >= 2 { + return fmt.Errorf("cannot specify multiple file paths") + } + f, err := os.Open(c.Args().First()) + if err != nil { + return err + } + b, err := io.ReadAll(f) + if err != nil { + return err + } + rawSentence = string(b) + } + + client := deepl.DeepLClient{ + Endpoint: deepl.GetEndpoint(c.Bool("pro")) + "/translate", + AuthKey: deeplToken, + SourceLang: c.String("source_lang"), + TargetLang: c.String("target_lang"), + LanguagesType: c.String("type"), + IsPro: c.Bool("pro"), + TagHandling: c.String("tag_handling"), + SplitSentences: c.String("split_sentences"), + PreserveFormatting: c.String("preserve_formatting"), + OutlineDetection: c.Int("outline_detection"), + NonSplittingTags: c.String("non_splitting_tags"), + SplittingTags: c.String("splitting_tags"), + IgnoreTags: c.String("ignore_tags"), + Debug: debugLevel, + } - var rawSentense string - if c.NArg() == 0 { - if isatty.IsTerminal(os.Stdin.Fd()) { - // is not pipe - fmt.Scan(&rawSentense) - } else { - // is pipe - pipeIn, err := ioutil.ReadAll(os.Stdin) + // Simplified call to Translate, now everything is passed via the DeepLClient + // initialisation. 
+ translateds, err := client.Translate(rawSentence) if err != nil { return err } - rawSentense = string(pipeIn) - } - } else { - if c.NArg() >= 2 { - return fmt.Errorf("Cannot specify multiple file paths.") - } - f, err := os.Open(c.Args().First()) - if err != nil { - return err - } - b, err := ioutil.ReadAll(f) - if err != nil { - return err - } - rawSentense = string(b) - } - - client := deepl.DeepLClient{ - Endpoint: deepl.GetEndpoint(c.Bool("pro")), - AuthKey: setting.AuthKey, - } - translateds, err := client.Translate(rawSentense, setting.SourceLang, setting.TargetLang) - if err != nil { - return err - } - for _, translated := range translateds { - fmt.Print(translated) - } - return nil + for _, translated := range translateds { + fmt.Print(translated) + } + return nil + }, + }, + { + Name: "usage", + Aliases: []string{"u"}, + Usage: "Check usage and limits", + Description: "Retrieve usage information within the current billing period together with the corresponding account limits.", + Category: "Utilities", + Action: func(c *cli.Context) error { + client := deepl.DeepLClient{ + Endpoint: deepl.GetEndpoint(c.Bool("pro")) + "/usage", + AuthKey: setting.AuthKey, + } + s, err := client.Usage() + if err != nil { + return err + } + fmt.Println(s) + return nil + }, + }, + { + // TODO: make a call to languages and store the valid pairs retrieved, + // so that we can later validate them. (gwyneth 20231105) + Name: "languages", + // Aliases: []string{"l"}, + Usage: "Retrieve supported languages", + Description: "Retrieve the list of languages that are currently supported for translation, either as source or target language, respectively.", + Category: "Utilities", + Flags: []cli.Flag{ + &cli.StringFlag{ + Name: "type", + Usage: "`TYPE` sets whether source or target languages should be listed. 
Possible options are:\n`source`: For languages that can be used in the `source_lang` parameter of translate requests.\n`target`: For languages that can be used in the `target_lang` parameter of translate requests.\n", + Value: "source", + DefaultText: "source", + Action: func(c *cli.Context, v string) error { + switch v { + case "source", "target": + return nil + default: + return fmt.Errorf("type must be either `source` or `target` (got: %s)", v) + } + }, + }, + }, + Action: func(c *cli.Context) error { + client := deepl.DeepLClient{ + Endpoint: deepl.GetEndpoint(c.Bool("pro")) + "/languages", + AuthKey: setting.AuthKey, + LanguagesType: c.String("type"), + } + s, err := client.Languages() + if err != nil { + return err + } + fmt.Println(s) + return nil + }, + }, + { + Name: "glossary-language-pairs", + // Aliases: []string{"l"}, + Usage: "List language pairs supported by glossaries", + Description: "Retrieve the list of language pairs supported by the glossary feature.", + Category: "Glossary", + Action: func(c *cli.Context) error { + client := deepl.DeepLClient{ + Endpoint: deepl.GetEndpoint(c.Bool("pro")) + "/glossary-language-pairs", + AuthKey: setting.AuthKey, + } + s, err := client.GlossaryLanguagePairs() + if err != nil { + return err + } + fmt.Println(s) + return nil + }, + }, }, } - err := app.Run(os.Args) + err = app.Run(os.Args) if err != nil { log.Fatal(err) } diff --git a/main_test.go b/main_test.go index 4852346..5bed313 100644 --- a/main_test.go +++ b/main_test.go @@ -1,7 +1,6 @@ package main import ( - "io/ioutil" "os" "path/filepath" "testing" @@ -14,7 +13,7 @@ func TestLoadsettings(t *testing.T) { var errorText string // - errorText = "The function should not overload SourceLang / TargetLang if eigher is not set." + errorText = "The function should not overload SourceLang / TargetLang if either is not set." 
setting = Setting{ AuthKey: "test", SourceLang: "EN", @@ -23,17 +22,17 @@ func TestLoadsettings(t *testing.T) { } actual, err = LoadSettings(setting, false) if err != nil { - t.Fatalf(errorText+"\n%#v", err) + t.Fatalf(errorText + "\n%#v", err) } if setting.AuthKey != actual.AuthKey || setting.SourceLang != actual.SourceLang || setting.TargetLang != actual.TargetLang { - t.Fatalf(errorText+"\nExpected: %#v\nActual: %#v", setting, actual) + t.Fatalf(errorText + "\nExpected: %#v\nActual: %#v", setting, actual) } // - errorText = "The function should occur error if AuthKey is not set." - expectedErrorText := "No deepl token is set." // DRY... + errorText = "The function should return an error if AuthKey is not set." + expectedErrorText := "no DeepL token is set; use the environment variable `DEEPL_TOKEN` to set it" // DRY... setting = Setting{ AuthKey: "", SourceLang: "EN", @@ -44,11 +43,11 @@ func TestLoadsettings(t *testing.T) { if err == nil { t.Fatalf(errorText+"\nReturned: %#v", actual) } else if err.Error() != expectedErrorText { - t.Fatalf(errorText+"\nExpected: %s\nActual: %s", expectedErrorText, err.Error()) + t.Fatalf(errorText+"\nExpected: %s\nActual: %s", expectedErrorText, err) } // - errorText = "The function should occur error if SourceLang == TargetLang" + errorText = "The function should return an error if SourceLang == TargetLang" setting = Setting{ AuthKey: "test", SourceLang: "EN", @@ -57,28 +56,51 @@ func TestLoadsettings(t *testing.T) { } actual, err = LoadSettings(setting, false) if err == nil { - t.Fatalf(errorText+"\nInputed: %#v", setting) + t.Fatalf(errorText+"\nInput: %#v", setting) } } +// Internal function to test if a file exists or not; this function will *also* be tested below. func Exists(filename string) bool { _, err := os.Stat(filename) return err == nil } +// This test function first creates a temporary file and checks if Exists correctly identifies it as existing.
It then checks a non-existent file path to ensure Exists returns false as expected. +// As suggested by @coderabbitai (gwyneth 20230413) +func TestExists(t *testing.T) { + // Test with a file that does exist + tempFile, err := os.CreateTemp("", "exist_test") + if err != nil { + t.Fatalf("Failed to create temporary file: %s", err) + } + defer os.Remove(tempFile.Name()) // Clean up after the test + + if !Exists(tempFile.Name()) { + t.Errorf("Exists should return true for existing file: %s", tempFile.Name()) + } + tempFile.Close() // Release the file handle; removal is already deferred above + + // Test with a file that does not exist + nonExistentFile := filepath.Join(os.TempDir(), "non_existent_file.txt") + if Exists(nonExistentFile) { + t.Errorf("Exists should return false for non-existing file: %s", nonExistentFile) + } +} + func TestInitializeConfigFile(t *testing.T) { - dir, err := ioutil.TempDir("", "example") + dir, err := os.MkdirTemp("", "example") if err != nil { - t.Fatalf("Error occurred in iotuil.Tempdir") + t.Fatalf("Error occurred in os.MkdirTemp: %s", err) } defer os.RemoveAll(dir) p := filepath.Join(dir, "config.json") // will success if err := InitializeConfigFile(p); err != nil { - t.Fatalf("The function should not occur error\nActual: %s", err.Error()) + t.Fatalf("The function should not return an error\nActual: %s", err) } if !Exists(p) { - t.Fatalf("The function should make configfile(%s)", p) + t.Fatalf("The function should have created the config file: %q", p) } } diff --git a/test data/test-1.txt b/test data/test-1.txt new file mode 100644 index 0000000..6690590 --- /dev/null +++ b/test data/test-1.txt @@ -0,0 +1 @@ +The Peculiar Adventures of Sir Fluffington and the Quantum Teapot: In the quaint village of Bumbleberry, Sir Fluffington, a dapper hedgehog with a penchant for solving riddles, embarked on an improbable quest.
Armed with a rusty teaspoon and a pocket-sized quantum teapot, he set out to unravel the mysteries of interdimensional biscuit crumbs. His faithful sidekick, Professor Whiskerfluff, a perspicacious tabby cat, rode shotgun atop a levitating cucumber. Together, they navigated wormholes made of rainbow floss, dodging existential squirrels and befriending sentient marshmallow clouds. Their mission? To locate the elusive “Eureka Biscuit,” rumored to grant omniscience and a lifetime supply of crumpets. Alas, the teapot malfunctioned, and they found themselves in a parallel universe where socks had feelings and gravity whispered limericks. Undeterred, Sir Fluffington adjusted his monocle, patted the cucumber for luck, and declared, “Onward, my dear Whiskerfluff! Let us sally forth into the quizzical unknown!” \ No newline at end of file diff --git a/test data/test-1.xml b/test data/test-1.xml new file mode 100644 index 0000000..c758770 --- /dev/null +++ b/test data/test-1.xml @@ -0,0 +1,19 @@ + + + + + + + +]> + + The Peculiar Adventures of Sir Fluffington and the Quantum Teapot + + In the quaint village of Bumbleberry, Sir Fluffington, a dapper hedgehog with a penchant for solving riddles, embarked on an improbable quest. Armed with a rusty teaspoon and a pocket-sized quantum teapot, he set out to unravel the mysteries of interdimensional biscuit crumbs. + His faithful sidekick, Professor Whiskerfluff, a perspicacious tabby cat, rode shotgun atop a levitating cucumber. Together, they navigated wormholes made of rainbow floss, dodging existential squirrels and befriending sentient marshmallow clouds. + Their mission? To locate the elusive “Eureka Biscuit,” rumored to grant omniscience and a lifetime supply of crumpets. + Alas, the teapot malfunctioned, and they found themselves in a parallel universe where socks had feelings and gravity whispered limericks. 
+ Undeterred, Sir Fluffington adjusted his monocle, patted the cucumber for luck, and declared, “Onward, my dear Whiskerfluff! Let us sally forth into the quizzical unknown!” + + diff --git a/test data/test-2.txt b/test data/test-2.txt new file mode 100644 index 0000000..8e2e48f --- /dev/null +++ b/test data/test-2.txt @@ -0,0 +1 @@ +The Curious Case of the Disappearing Mustache Orchestra: In the bustling city of Whimsyburg, where rainbows wore top hats and pigeons tap-danced, the renowned Mustache Orchestra held nightly concerts. Led by Maestro Snickerdoodle, a walrus with a handlebar mustache that could conduct symphonies, the orchestra played instruments made of moonbeams and hiccup harmonicas. Their melodies swirled through the air, turning lampposts into jazz pianos and fire hydrants into operatic sopranos. But one fateful Tuesday, during their rendition of “Sousaphones in Space,” the mustaches vanished! Panic ensued. Maestro Snickerdoodle twirled his invisible mustache in distress, while the flutes hiccupped mournfully. Detective Whiskerbottom, a persnickety sloth in a deerstalker hat, took the case. Clutching a magnifying glass and a bag of enchanted pretzels, he interrogated the cello strings and questioned the trombone valves. Suspects included a mischievous moonbeam and a rogue rainbow, both with alibis involving cosmic cupcakes. As the city held its breath, Detective Whiskerbottom vowed, “Fear not, citizens! We shall unravel this follicular enigma and restore harmony to our whimsical metropolis!” \ No newline at end of file