Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace all backends by Process.spawn for portability, reliability and simplicity #175

Merged
merged 4 commits into from
Jan 7, 2024

Conversation

eregon
Copy link
Contributor

@eregon eregon commented Feb 15, 2021

From #172

@coveralls
Copy link

coveralls commented Feb 15, 2021

Coverage Status

Coverage increased (+2.4%) to 92.727% when pulling 0eff297 on eregon:process-spawn into 4422792 on enkessler:master.

@eregon eregon mentioned this pull request Feb 16, 2021
@eregon
Copy link
Contributor Author

eregon commented Feb 16, 2021

TravisCI passes, except on JRuby: https://travis-ci.org/github/enkessler/childprocess/builds/759130946

@eregon
Copy link
Contributor Author

eregon commented Feb 16, 2021

TravisCI now passes with the latest JRuby, as good as master + actually it also works on ruby-head:
https://travis-ci.org/github/enkessler/childprocess/builds/759183880
vs
https://travis-ci.org/github/enkessler/childprocess/builds/759135035

@eregon eregon force-pushed the process-spawn branch 6 times, most recently from 974a16a to 22a811b Compare June 20, 2021 17:10
@eregon eregon marked this pull request as ready for review June 20, 2021 17:13
@eregon
Copy link
Contributor Author

eregon commented Jun 20, 2021

This is almost ready now.
CI: https://github.com/eregon/childprocess/runs/2869763551?check_suite_focus=true
Linux & macOS pass.

Windows has only 2 failures:

Failures:

  1) ChildProcess lets a detached child live on
     Failure/Error: expect(alive?(p_pid)).to_not be true

       expected not #<TrueClass:20> => true
                got #<TrueClass:20> => true

       Compared using equal?, which compares object identity.
     # ./spec/childprocess_spec.rb:213:in `block (2 levels) in <top (required)>'

  2) ChildProcess kills the full process tree
     Failure/Error: raise msg

     RuntimeError:
       timed out after 3 seconds:

       expected: false
            got: true

       (compared using eql?)

       Diff:
       @@ -1 +1 @@
       -false
       +true
     # ./spec/spec_helper.rb:197:in `wait_until'
     # ./spec/childprocess_spec.rb:291:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Finished in 22.11 seconds (files took 0.83769 seconds to load)
66 examples, 2 failures

The first one might be some issue with Process.detach on Windows.
The second is the issue that there is no builtin way to kill a process group on Windows.

JRuby on Linux & macOS has only 1 failure:

   1) ChildProcess can unset env vars
     Failure/Error: expect(child_env).to_not have_key('CHILDPROCESS_UNSET')
       expected `{"ACCEPT_EULA"=>"Y", "AGENT_TOOLSDIRECTORY"=>"/opt/hostedtoolcache", "ANDROID_HOME"=>"/usr/local/lib/...", "XDG_CONFIG_HOME"=>"/home/runner/.config", "_"=>"/home/runner/.rubies/jruby-9.2.19.0/bin/bundle"}.has_key?("CHILDPROCESS_UNSET")` to be falsey, got true
     # ./spec/childprocess_spec.rb:161:in `block in <main>'
     # ./spec/childprocess_spec.rb:150:in `block in <main>'

Which is most likely a bug of JRuby (since that works fine on CRuby).

JRuby on Windows doesn't seem to work well, due to another JRuby bug, Process.waitpid2 seems to return nil for the status, which must be a JRuby bug.

JoshCheek added a commit to JoshCheek/seeing_is_believing that referenced this pull request Jun 21, 2021
mscottford added a commit to corgibytes/freshli-agent-java that referenced this pull request Dec 15, 2022
Ruby version 3.1 causes failures on Windows. This appears to be caused by the `childprocess` gem, and the way it launches executables. The error shows up when the Cucumber/Aruba library using `childprocess` to launch a program. `childprocess` is throwing an exception when it attempts to get a OS file handle for the file that output should be redirected to. The function call is returning -1, but the Windows "last error" isn't being set. This causes `childprocess` to generate an exception message that reads, "Windows says that the process completed successfully, but it did not". The process that is being referred to is the attempt to obtain the file handle for the file that should be used for streaming output. It is a little confusing, because at first glance, it seems like it might be referring to the process that should be launched.

There is an open pull request (enkessler/childprocess#175) and corresponding issue that might resolve the issue (enkessler/childprocess#172). For now, running on 3.0 seems like a decent enough work-around.

An issue describing this problem, with detailed steps to repeat it, will be created in the `childprocess` GitHub project. It will also be brought to the attention of the Cucumber/Aruba project that is using the `childprocess` library.
@eregon eregon force-pushed the process-spawn branch 2 times, most recently from 16ddc23 to f7471c8 Compare February 21, 2023 17:19
@eregon
Copy link
Contributor Author

eregon commented Feb 21, 2023

Update: CI is green on Linux and macOS.
https://github.com/eregon/childprocess/actions/runs/4235205251/jobs/7358538501

On Windows there is a single failure:

   1) ChildProcess kills the full process tree
     Failure/Error: raise msg

     RuntimeError:
       timed out after 3 seconds:

       expected: false
            got: true

       (compared using eql?)

       Diff:
       @@ -1 +1 @@
       -false
       +true
     # ./spec/spec_helper.rb:197:in `wait_until'
     # ./spec/childprocess_spec.rb:291:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Note that keeping the original code does not solve this.
Instead, the original code for Windows seems to have a permission issue, at least in GitHub Actions:
https://github.com/eregon/childprocess/actions/runs/4235173639/jobs/7358466488


  1) ChildProcess kills the full process tree
     Failure/Error: ok or raise LaunchError, Lib.last_error_message

     ChildProcess::LaunchError:
       Access is denied. (5)
     # ./lib/childprocess/windows/process_builder.rb:89:in `create_process'
     # ./lib/childprocess/windows/process_builder.rb:34:in `start'
     # ./lib/childprocess/windows/process.rb:70:in `launch_process'
     # ./lib/childprocess/abstract_process.rb:81:in `start'
     # ./spec/childprocess_spec.rb:284:in `block (3 levels) in <top (required)>'
     # ./spec/childprocess_spec.rb:281:in `block (2 levels) in <top (required)>'

Maybe https://stackoverflow.com/a/61113184/388803 is an option.

JRuby on Windows is so broken it can't even load rspec, so I ignore that for now.

I gave it a try to keep the existing logic for JRuby and Windows backends, but that seems to only results in more test failures, so I don't see the point: https://github.com/eregon/childprocess/tree/process-spawn-non-windows https://github.com/eregon/childprocess/actions/runs/4235173639

@eregon
Copy link
Contributor Author

eregon commented Dec 13, 2023

Given the latest release is 4.1.0, I think this change would deserve its own major version, i.e. 5.0.0.

@kaspergrubbe
Copy link

Thank you so much on your work on this! :) I don't how many are out there, but I like the interfaces provided by this gem, and depend on it for my private tooling.

Copy link
Collaborator

@sds sds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your effort, @eregon. Sorry for the delay in response.

Can we rebase this PR and remove the JRuby/Windows tests that we can't get working? Would rather have a working CI build for macOS/Linux and just cut a new major version with a heads up to folks that support for these other OSes & Ruby runtimes may not be complete.

Thank you!

@sds sds mentioned this pull request Dec 20, 2023
@dentarg
Copy link

dentarg commented Dec 29, 2023

This would also solve issues like #186?

With this branch, I was able to use childprocess in the docker image ghcr.io/graalvm/truffleruby-community:23.1.0-debian on my Apple silicon Mac where CHILDPROCESS_POSIX_SPAWN didn't get it done ("posix_spawn is not yet supported on aarch64-linux (aarch64-linux), falling back to default implementation")

@dentarg
Copy link

dentarg commented Jan 4, 2024

I found a difference between this PR and the latest release: keys in the environment hash now needs to be strings

From docker run -it --rm -v $(pwd):/app -w /app ruby:3.2.2 bash

root@14303433afe8:/app# cat minimal_repro.rb
require "bundler/inline"

gemfile do
  if ENV.key?("PR")
    gem "childprocess", github: "https://github.com/enkessler/childprocess/pull/175"
  else
    source "https://rubygems.org"
    gem "childprocess"
    gem "ffi" if ENV.key?("CHILDPROCESS_POSIX_SPAWN")
  end
end

process = ChildProcess.build("ruby", "--help")
process.environment.merge!({ FOO: "123" })
process.start
root@14303433afe8:/app# ruby minimal_repro.rb
root@14303433afe8:/app# PR=1 ruby minimal_repro.rb
/usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/process_spawn_process.rb:73:in `spawn': no implicit conversion of Symbol into String (TypeError)

        @pid = ::Process.spawn(@environment, *args, options)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	from /usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/process_spawn_process.rb:73:in `launch_process'
	from /usr/local/bundle/bundler/gems/childprocess-f7471c8a90c1/lib/childprocess/abstract_process.rb:81:in `start'
	from minimal_repro.rb:14:in `<main>'

@dentarg
Copy link

dentarg commented Jan 4, 2024

Actually, that might not be a difference between this PR and the latest release. It is just that this PR is the only way for me to use childprocess with truffleruby and aarch64. I suspect the error above happens with the latest release and CHILDPROCESS_POSIX_SPAWN.

@eregon eregon force-pushed the process-spawn branch 2 times, most recently from db3769e to 228ec6d Compare January 4, 2024 18:23
@dentarg
Copy link

dentarg commented Jan 4, 2024

I suspect the error above happens with the latest release and CHILDPROCESS_POSIX_SPAWN

It doesn't blow up, but childprocess 4.1.0 has a similar problem with CHILDPROCESS_POSIX_SPAWN on TruffleRuby/JRuby, if you pass symbol keys in the env, they wont be used if the ENV already has that key (on CRuby the value you pass is used). sinatra/sinatra@0d5b0f6

@eregon eregon changed the title Add backend based on Process.spawn Replace all backends by Process.spawn for portability, reliability and simplicity Jan 4, 2024
@dentarg
Copy link

dentarg commented Jan 4, 2024

It doesn't blow up, but childprocess 4.1.0 has a similar problem with CHILDPROCESS_POSIX_SPAWN on TruffleRuby/JRuby, if you pass symbol keys in the env, they wont be used if the ENV already has that key (on CRuby the value you pass is used).

Not a concern for this PR, as it removes all usage of ffi (🤩) but maybe something you want to look into for TruffleRuby in general @eregon? Or maybe such differences with ffi is expected between CRuby/TruffleRuby.

@eregon
Copy link
Contributor Author

eregon commented Jan 4, 2024

This is ready now, the CI tests every Ruby from 2.4 (the required_ruby_version) to 3.3 as well as jruby and truffleruby.

JRuby on Windows fails to load rspec, which is fully independent from this gem so I excluded that.
TruffleRuby does not support Windows, so that's also excluded.

I managed to fix the kill process tree feature on Windows by using taskkill, that's a bit heavy but seems reasonable enough for killing an entire process tree.
That means this PR should be fully compatible with the existing childprocess, or at least all tests pass, on all platforms.

@eregon
Copy link
Contributor Author

eregon commented Jan 4, 2024

@dentarg

Not a concern for this PR, as it removes all usage of ffi (🤩) but maybe something you want to look into for TruffleRuby in general @eregon? Or maybe such differences with ffi is expected between CRuby/TruffleRuby.

That's because

envp = Envp.new(ENV.to_hash.merge(@environment))
does not do coercion.
It's the typical problem of having many backends, they get out of sync.
This PR does the environment coercion correctly for all platforms.

@eregon eregon requested a review from sds January 4, 2024 18:51
@eregon eregon force-pushed the process-spawn branch 2 times, most recently from 17ee66b to 50f3aad Compare January 4, 2024 18:58
eregon added 2 commits January 4, 2024 20:00
* Process.spawn requires only very minimal variants between platforms.
* Remove all other backends, Process.spawn is enough.
* Inspired by https://stackoverflow.com/a/61113184/388803
* Using a subprocess for that is a bit heavy, OTOH we are already creating
  subprocess in ChildProcess and it only happens for a leader process on Windows.
* There seems to be no other alternative which works reliably or it would incur huge complexity.
dentarg added a commit to dentarg/sinatra that referenced this pull request Jan 5, 2024
Use the awesome pull request by eregon: enkessler/childprocess#175
that uses native Process.spawn on all platforms

JVM rubies can't use the default fork+exec approach in childprocess, so
we fall back to CHILDPROCESS_POSIX_SPAWN, but that is not available on
aarch64 (me using Docker on Apple silicon). Annoying when debugging
tests.

(This should still work even after the PR is merged and the branch removed)
dentarg added a commit to sinatra/sinatra that referenced this pull request Jan 5, 2024
Use the awesome pull request by eregon: enkessler/childprocess#175
that uses native Process.spawn on all platforms

JVM rubies can't use the default fork+exec approach in childprocess, so
we fall back to CHILDPROCESS_POSIX_SPAWN, but that is not available on
aarch64 (me using Docker on Apple silicon). Annoying when debugging
tests.

(This should still work even after the PR is merged and the branch
removed)
@sds sds merged commit 82afeeb into enkessler:master Jan 7, 2024
28 of 29 checks passed
dliappis added a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
github-actions bot pushed a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)
github-actions bot pushed a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)
github-actions bot pushed a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)
dliappis added a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
dliappis added a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
dliappis added a commit to elastic/logstash that referenced this pull request Jan 8, 2024
This commit pins the `childprocess` gem to version `4` since version `5.0.0` of
enkessler/childprocess#175 seems to have broken JRuby support for spawning.

Closes #15757

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
(cherry picked from commit 9f1d55c)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants