Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transmit R session's LANG to the child process for the test runner #2488

Merged
merged 1 commit into from
Mar 20, 2024

Conversation

jennybc
Copy link
Member

@jennybc jennybc commented Mar 19, 2024

Companion to posit-dev/ark#270.

Addresses #1808 (comment).

How to test:

Pre-requisite: you will need to be on macOS to truly test this. (Windows is actually (!!) unaffected by this problem, but I have still tested these 2 PRs together on Window and all seems well.)

Get the source of this R package: https://github.com/jennybc/localetest.
Note this is a public package, but I don't explain its purpose anywhere. It just looks like an experiment.
Perhaps using usethis::create_from_github("jennybc/localetest").
This package contains exactly 1 snapshot test, specific for macOS, the only affected OS.

Run the test suite via devtools::test() in the Console or via Cmd + Shift + T (basically use any method other than Positron's test explorer).
The test should pass.

Now run the test suite via Positron's test explorer (accessible via the flask icon).
With a Positron dev build from this PR (along with the amalthea PR), the test should also pass.

With a Positron release build, the snapshot test should fail and will look something like this (i.e. the LANG env var is undefined and several locale categories are "C" instead of "en_US.UTF-8".

Snapshot of code has changed (variant 'darwin'):
old[1:11] vs new[1:11]
  Code
    Sys.getenv("LANG")
  Output
-   [1] "en_US.UTF-8"
+   [1] ""
  Code
    Sys.getlocale()
  Output
-   [1] "C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8"
+   [1] "C"
  Code
    Sys.getlocale("LC_COLLATE")
and 1 more ...

old[13:23] vs new[13:23]
  Code
    Sys.getlocale("LC_CTYPE")
  Output
-   [1] "en_US.UTF-8"
+   [1] "C"
  Code
    Sys.getlocale("LC_MONETARY")
  Output
-   [1] "en_US.UTF-8"
+   [1] "C"
  Code
    Sys.getlocale("LC_NUMERIC")
and 1 more ...

     old                        | new                            
[25] Code                       | Code                       [25]
[26]   Sys.getlocale("LC_TIME") |   Sys.getlocale("LC_TIME") [26]
[27] Output                     | Output                     [27]
[28]   [1] "en_US.UTF-8"        -   [1] "C"                  [28]

* Run `testthat::snapshot_accept('darwin/locale')` to accept the change.
* Run `testthat::snapshot_review('darwin/locale')` to interactively review the change.

@@ -97,24 +97,20 @@ export async function runThatTest(

const wd = testingTools.packageRoot.fsPath;
LOGGER.info(`Running devtools call in working directory ${wd}`);
const locale = await getLocale();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typical value for locale on macOS;

{
  LANG: "en_US.UTF-8",
  LC_ALL: "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8",
  LC_COLLATE: "en_US.UTF-8",
  LC_CTYPE: "en_US.UTF-8",
  LC_MONETARY: "en_US.UTF-8",
  LC_NUMERIC: "C",
  LC_TIME: "en_US.UTF-8",
  LC_MESSAGES: "en_US.UTF-8",
  LC_PAPER: "",
  LC_MEASUREMENT: "",
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And here's how it looks in my Windows VM:

2024-03-19 14:17:47.351 [info] Locale info from active R session: {
  "LANG": "en_US.UTF-8",
  "LC_ALL": "LC_COLLATE=English_United States.utf8;LC_CTYPE=English_United States.utf8;LC_MONETARY=English_United States.utf8;LC_NUMERIC=C;LC_TIME=English_United States.utf8",
  "LC_COLLATE": "English_United States.utf8",
  "LC_CTYPE": "English_United States.utf8",
  "LC_MONETARY": "English_United States.utf8",
  "LC_NUMERIC": "C",
  "LC_TIME": "English_United States.utf8",
  "LC_MESSAGES": "",
  "LC_PAPER": "",
  "LC_MEASUREMENT": ""
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, here is what I see:

2024-03-20 13:15:59.975 [info] Locale info from active R session: {
  "LANG": "en_US.UTF-8",
  "LC_ALL": "en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8",
  "LC_COLLATE": "en_US.UTF-8",
  "LC_CTYPE": "en_US.UTF-8",
  "LC_MONETARY": "en_US.UTF-8",
  "LC_NUMERIC": "C",
  "LC_TIME": "en_US.UTF-8",
  "LC_MESSAGES": "en_US.UTF-8",
  "LC_PAPER": "",
  "LC_MEASUREMENT": ""
}

shell: true,
env: {
...process.env,
LANG: locale['LANG']
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I originally thought I would set individual components, such as LC_CTYPE and LC_TIME, but reading about related child_process.spawn() + locale issues in VS Code and extensions suggested that LANG is the best place to intervene. It appears to lead to correct values of other aspects of the locale and strikes me as the cleanest solution.

@jennybc
Copy link
Member Author

jennybc commented Mar 19, 2024

Recording more background here in case we need to revisit this and also so I can close my browser tabs.

Here's a precise description of the problem. The info below applies specifically to the environment when running testthat tests. It's not exactly what you get with Sys.getlocale(), because testthat, for example, sets LC_COLLATE to "C".

category macOS Positron R Console & Terminal macOS Positron child_process Windows Positron R Console & Terminal Windows child_process
LC_COLLATE C C C C
LC_CTYPE en_US.UTF-8 C English_United States.utf8 English_United States.utf8
LC_MONETARY en_US.UTF-8 C English_United States.utf8 English_United States.utf8
LC_NUMERIC C C C C
LC_TIME en_US.UTF-8 C English_United States.utf8 English_United States.utf8

Background on "Internationalization Variables" (see section 8.2 especially) for POSIX.1-2017:

https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08

Good overview of LANG and more detailed categories, such as LC_COLLATE and friends.


Example of a past issue around child processes and LANG in VS Code itself or in extensions:

microsoft/vscode#85675

That issue had a fix applied, which seemed to cause other troubles, and was eventually reverted. My main take-away is that we aren't the first to encounter locale troubles with child_process.spawn() and what I'm doing is reasonable.

@jennybc jennybc force-pushed the feature/r-session-locale branch from 63b0ce2 to 71aaf07 Compare March 19, 2024 19:21
@jennybc jennybc marked this pull request as ready for review March 19, 2024 22:05
@jennybc jennybc requested a review from juliasilge March 19, 2024 22:05
@juliasilge
Copy link
Contributor

When I use the current release build, I see devtools::test() pass and the test explorer fail. ✅

I installed both this PR and the amalthea PR, and then I see both devtools::test() and the test explorer pass. ✅

passing-test

@jennybc jennybc merged commit 18916f1 into main Mar 20, 2024
1 check passed
@jennybc jennybc deleted the feature/r-session-locale branch March 20, 2024 19:23
@jennybc jennybc mentioned this pull request Mar 28, 2024
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants