Spec update 20241105 #1210

headius · 2024-11-06T00:37:43Z

No description provided.

Previously, a TypeError was not raised if there were no thread variables, because the conversion to symbol was done after that check. Convert to symbol before checking for whether thread variables are set to make the behavior consistent. Fixes [Bug #20606]

[Feature #19236] When building a large hash, pre-allocating it with enough capacity can save many re-hashes and significantly improve performance. ``` /opt/rubies/3.3.0/bin/ruby --disable=gems -rrubygems -I./benchmark/lib ./benchmark/benchmark-driver/exe/benchmark-driver \ --executables="compare-ruby::../miniruby-master -I.ext/common --disable-gem" \ --executables="built-ruby::./miniruby --disable-gem" \ --output=markdown --output-compare -v $(find ./benchmark -maxdepth 1 -name 'hash_new' -o -name '*hash_new*.yml' -o -name '*hash_new*.rb' | sort) compare-ruby: ruby 3.4.0dev (2024-03-25T11:48:11Z master f53209f023) +YJIT dev [arm64-darwin23] last_commit=[ruby/irb] Cache RDoc::RI::Driver.new (ruby/irb#911) built-ruby: ruby 3.4.0dev (2024-03-25T15:29:40Z hash-new-rb 77652b08a2) +YJIT dev [arm64-darwin23] warming up... | |compare-ruby|built-ruby| |:-------------------|-----------:|---------:| |new | 7.614M| 5.976M| | | 1.27x| -| |new_with_capa_1k | 13.931k| 15.698k| | | -| 1.13x| |new_with_capa_100k | 124.746| 148.283| | | -| 1.19x| ```

[Bug #20593] It's fairly common to use `format` to interpolate a number of values into a user provided strings. The arguments not matching are a problem when they are positional, but when they are named, it's absolutely fine and we shouldn't emit a warning.

…Xcode 16 beta

* "Allow ambiguosity of `return` line" 65b991bc8571b7b718fc22bd33a43c4d269bf52d * "Move to test/.excludes-prism" 3b4ff810d2fefdf0194bd774bc04f6f17e2ccae7 * "Pending `EVENT_RETURN` settracefunc tests with Prism" a7f33c99c69e3cc62b7a24ce35f51f76cc5bfaa2

Prepare for ruby/uri#107

…r#size What a "word" is when talking about sizes is confusing because it's a highly overloaded term. Intel, Microsoft, and GDB are just a few vendors that have their own definition of what a "word" is. Specs that used the "wordsize" guard actually were mostly testing for the size of the C `long` fundamental type, so rename the guard for clarity. Also, get the size of `long` directly from RbConfig instead of guessing using Integer#size. Integer#size is not guaranteed to have anything to do with the `long` type.

There is no guarantee that Integer#size will continue to return `sizeof(long)` for small integers. Use the `l!` specifier for Array#pack instead. It is a public interface that has a direct relationship with the `long` type.

The spec is actually testing a behaviour stemming from NUM2INT(), and since `sizeof(long)>=sizeof(int)`, `min_long-1` always makes NUM2INT() raise `RangeError`.

Make Range#step to consistently use + for iteration [Feature #18368] Previously, non-numerics expected step to be integer, and iterated with begin#succ, skipping over step value steps. Since this commit, numeric and non-numeric iteration behaves the same way, by using + operator.

[Feature #20707] Converting Time into RFC3339 / ISO8601 representation is an significant hotspot for applications that serialize data in JSON, XML or other formats. By moving it into core we can optimize it much further than what `strftime` will allow. ``` compare-ruby: ruby 3.4.0dev (2024-08-29T13:11:40Z master 6b08a50a62) +YJIT [arm64-darwin23] built-ruby: ruby 3.4.0dev (2024-08-30T13:17:32Z native-xmlschema 34041ff71f) +YJIT [arm64-darwin23] warming up...... | |compare-ruby|built-ruby| |:-----------------------|-----------:|---------:| |time.xmlschema | 1.087M| 5.190M| | | -| 4.78x| |utc_time.xmlschema | 1.464M| 6.848M| | | -| 4.68x| |time.xmlschema(6) | 859.960k| 4.646M| | | -| 5.40x| |utc_time.xmlschema(6) | 1.080M| 5.917M| | | -| 5.48x| |time.xmlschema(9) | 893.909k| 4.668M| | | -| 5.22x| |utc_time.xmlschema(9) | 1.056M| 5.707M| | | -| 5.40x| ```

[Feature #20702] Works the same way than `Hash#fetch_values` for for array.

[Feature #20594] A handy method to construct a string out of multiple chunks. Contrary to `String#concat`, it doesn't do any encoding negociation, and simply append the content as bytes regardless of whether this result in a broken string or not. It's the caller responsibility to check for `String#valid_encoding?` in cases where it's needed. When passed integers, only the lower byte is considered, like in `String#setbyte`.

This is a static function only called in two places (rb_to_id and rb_to_symbol), and in both places, both symbols and strings are allowed. This makes the error message consistent with rb_check_id and rb_check_symbol. Fixes [Bug #20607]

The absence of either the integer or fractional part should be allowed.

[Feature #20205] The warning now suggests running with --debug-frozen-string-literal: ``` test.rb:3: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information) ``` When using --debug-frozen-string-literal, the location where the string was created is shown: ``` test.rb:3: warning: literal string will be frozen in the future test.rb:1: info: the string was created here ``` When resurrecting strings and debug mode is not enabled, the overhead is a simple FL_TEST_RAW. When mutating chilled strings and deprecation warnings are not enabled, the overhead is a simple warning category enabled check. Co-authored-by: Jean Boussier <byroot@ruby-lang.org> Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org> Co-authored-by: Jean Boussier <byroot@ruby-lang.org>

[Bug #20803] `abc` is used a lot across the ruby spec suite, if another test runs before this spec is loaded and create this symbol dynamically (`"abc".to_sym`) the spec will fail. So it's preferable to use a symbol name that is very unlikely to be used elsewhere to avoid flakes.

* YJIT: Replace Array#each only when YJIT is enabled * Add comments about BUILTIN_ATTR_C_TRACE * Make Ruby Array#each available with --yjit as well * Fix all paths that expect a C location * Use method_basic_definition_p to detect patches * Copy a comment about C_TRACE flag to compilers * Rephrase a comment about add_yjit_hook * Give METHOD_ENTRY_BASIC flag to Array#each * Add --yjit-c-builtin option * Allow inconsistent source_location in test-spec * Refactor a check of BUILTIN_ATTR_C_TRACE * Set METHOD_ENTRY_BASIC without touching vm->running

* It was calculating remaining time once up-front, properly decrementing remaining nanos for the loop, but continuing to sleep based on the original calculation * Now it calculates the remainder on each iteration, based on the current decremented nanoseconds * Also moves the spec to a more appropriate shared ruby spec location

andrykonchin · 2024-11-06T11:03:00Z

Fixed issues with core/encoding/compatible_spec.rb file.

headius · 2024-11-06T17:35:19Z

@eregon:

anything relying on sleep is extremely brittle and will likely cause transients

It doesn't appear to be transient, though. It consistently passes on Windows and Linux and consistently fails on MacOS.

And it's hard to spec sleep without relying on sleep.

seems unrelated to Thread#wakeup and seems about Kernel#sleep instead

Agreed, I will move it.

If it passes reliably on Linux you could guard it

I will do this for now and continue investigating the CI MacOS failures separately.

So far the person syncing the specs also fixed the failures

I think it would be better going forward if the person that introduced a breaking change corrects that change.

@andrykonchin Thank you for correcting those issues!

This is really a sleep spec, so it should be tested under the Kernel#sleep specs. Fails consistently on MacOS (in CI) and passes consistently on Linux and Windows so omit MacOS for now.

eregon · 2024-11-07T12:38:57Z

It doesn't appear to be transient, though. It consistently passes on Windows and Linux and consistently fails on MacOS.

As I suspected, it fails in TruffleRuby's CI on Linux, both on CRuby and TruffleRuby.
I think we should just remove the spec if it's not made more stable (e.g. with the idea in #1210 (comment)).
I'll quarantine it now in truffleruby's copy.

eregon · 2024-11-07T12:43:44Z

And it also has caused failures in CRuby: ruby/ruby@7c893c8
So, yeah, any transient spec relying on sleep timing will be removed or quarantined, it's almost guaranteed to fail in my long-term experience with those.
If it fails on one platform due to timing it's very likely to fail on other platforms too.

* See discussion on ruby/spec#1210

headius · 2024-11-07T13:00:35Z

Have you considered the possibility that it might actually just be broken in CRuby and TruffleRuby? It has been passing in JRuby's CI consistently for several weeks.

Regardless, it is behavior that was important enough for @ioquatix to file issues with both TruffleRuby and JRuby. Perhaps he can weigh in on how to verify the behavior in a more reliable way.

headius · 2024-11-07T13:05:34Z

I should also point out that in the pr where I am investigating the failure, CRuby isn't just off a little bit. It's off by an order of magnitude (8-35x off from expected). Something is causing a 0.1ms sleep to take way longer on Darwin. This is not just an inconvenience for testing. I think it exposes a real problem.

eregon · 2024-11-07T13:14:02Z

man nanosleep on Linux makes it pretty clear there are no guarantees (e.g. nanosleep() suspends the execution of the calling thread until either at least the time specified in *req has elapsed, or ...) and on a busy system sleep calls take longer than what's asked.
Also I guess JRuby uses java.lang.Thread#sleep(long, int) like TruffleRuby does.

Anyway, feel free to try to improve this spec or check CRuby's sleep impl.
The issue is adding a transiently failing spec is causing havoc everywhere.

headius · 2024-11-07T13:21:01Z

And I'm saying it wasn't transiently failing when I merged it over. It was consistently passing on Linux and Windows and consistently failing on Darwin by a wide margin. I went with your suggestion to leave it enabled only on the passing platforms and opened a draft PR to investigate further.

I don't care if it is quarantined for now but I believe it is wrong to dismiss the failures as being due solely to unreliable sleep times. Being off by an order of magnitude is not just unreliable, it's broken.

ioquatix · 2024-11-07T20:47:52Z

IIRC, the issue was sleep(x) for some values of x < 1ms (again, IIRC), was the same as sleep(0) on JRuby and TruffleRuby due to the internal implementation truncating values. If used in a loop, this behaviour can cause the loop to execute far more rapidly than expected. I think the expectation is that sleep(x) sleeps for >= a wall clock time of x.

If your sleep(x) is sleeping for less than x, or consistently more than 10x, I think the implementation may have issues. It's true that a over-subscribed system may put a chunky factor on sleep(x), so it's not good to rely in sleep(x) <= fx where f is some factor like 2 or 3. In practice, as has been discussed and shown by tests time and time again, it's extremely likely to be problematic (i.e. "flaky tests")

      (actual_duration > 0.01).should == true # 100 * 0.0001 => 0.01
      (actual_duration < 1).should == true

The first assertion seems totally reasonable to me, although technically it could be >= if we are pedantic.

The second assertion is frustratingly problematic. On an oversubscribed system, sleep could be out by a large factor 10-100x, especially on GitHub runners, and especially the macOS runners, which seem to be far more over-subscribed in practice, leading to a lot of scheduling issues.

The only reasonable way I've found to deal with this is to try and detect how over-subscribed the system is and try to compensate for it, when dealing with these kinds of tests. The other option is to simply remove assertions on upper bounds. Trying to add a multiplier of 100x is still bound to be flaky on some edge cases (bell curve etc).

As a counter point, on a normal under-subscribed system (not overloaded), sleep(x) should be within some reasonable factor of x, maybe at most 10%. It's also true, there are sometimes bugs in the implementation that cause that factor to be large (invalid scheduling, problematic implementation, etc). Therefore, this kind of test will catch those kinds of problems. So I do believe that there is value in such tests, just that it's extremely hard to express without being flaky.

Bear in mind that adding a 100x factor may also hide bugs... In other words, the assertion, when run on an under-subscribed system was better for detecting issues. It's an extremely tricky issue. Maybe we need to differentiate between "hard failure" (assertion) and some kind of informational "something looks wrong".

(Sorry, I'm not sure this really helps much, I've been here multiple times with async's scheduler and I understand the pain and frustration).

headius · 2024-11-07T21:08:50Z

@ioquatix thank you for the input! I agree that the excessively long sleep times on MRI seem like they're a problem to be investigated. For now, I have simply removed the < 0.03 check since the purpose of this spec and fix was to check that sleeps were long enough, not whether they are too long.

I won't have time to look further into why MRI failed the 0.03 check but I think somebody should.

eregon · 2024-11-07T21:09:23Z

@ioquatix that makes a lot of sense and matches my experiences as well.
@headius removed the upper bound in https://github.com/ruby/spec/pull/1212/files so that should no longer fail transiently based on how oversubscribed is the system.

@headius Did you get some conflicts when rebasing specs? Anything non-trivial?
Sometimes it can be not so easy to resolve and typically one needs to look at the changes to that file on 2 branches to see what each commit meant to do.

ioquatix · 2024-11-07T21:14:25Z

the excessively long sleep times on MRI seem like they're a problem to be investigated

Is this when running locally, e.g. under-subscribed system, or when running in CI, where we can expect it to be "problematic"?

eregon · 2024-11-07T21:14:29Z

FWIW in oracle/truffleruby#2716 (comment) I made some measurements.
It looks like sub-millisecond sleeps are very imprecise, on all Ruby implementations, most likely because the OS doesn't give much guarantees there.

headius · 2024-11-07T22:07:34Z

Did you get some conflicts when rebasing specs? Anything non-trivial?

@eregon There were a few small conflicts, I think both TruffleRuby and CRuby added the same specs. Nothing that was difficult to resolve.

Is this when running locally, e.g. under-subscribed system, or when running in CI, where we can expect it to be "problematic"?

@ioquatix Locally for me on MacOS and in CI on Linux and Windows it passed consistently. CI on MacOS consistently failed, with 100 sleeps of 0.1ms taking anywhere from 80ms to 350ms.

Apparently TruffleRuby also had issues.

JRuby passes it consistently in CI on all platforms we test (mostly Linux).

headius · 2024-11-07T22:31:57Z

It looks like sub-millisecond sleeps are very imprecise

Perhaps I'm looking at it wrong, but it seems like only TruffleRuby is having difficulties there. JRuby, now fixed, always has a quantum under in the range of 0.003 to 0.005 on my M1 Mac.

It's also hard to say it's the sleep that's imprecise in your testing. If it warms up does it stabilize for TR?

CRuby is always around 0.003 except when it GCs (looping over the example 10k times).

eregon · 2024-11-08T19:56:58Z

It's the same for all Rubies, in theory they should all print 0.10... but they don't (this is on a pretty idle Linux desktop):

$ ruby -v -rbenchmark -e '3.times { p Benchmark.realtime { 1000.times { sleep 0.0001 } } }'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
0.15385151699956623
0.1538250559988228
0.1538307459995849

$ ruby -v -rbenchmark -e '3.times { p Benchmark.realtime { 1000.times { sleep 0.0001 } } }'
jruby 9.4.9.0-SNAPSHOT (3.1.4) 2024-11-04 746b84aa02 OpenJDK 64-Bit Server VM 17.0.12+7 on 17.0.12+7 +jit [x86_64-linux]
0.16409351599941147
0.15885490400069102
0.15931887199985795

$ ruby -v -rbenchmark -e '3.times { p Benchmark.realtime { 1000.times { sleep 0.0001 } } }'
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM Native [x86_64-linux]
0.16984848700121802
0.16197076400021615
0.16043854000054125

$ ruby -v -rbenchmark -e '3.times { p Benchmark.realtime { 1000.times { sleep 0.0001 } } }'
truffleruby 24.1.1, like ruby 3.2.4, Oracle GraalVM JVM [x86_64-linux]
0.16442835199995898
0.16147058500064304
0.15863075099878188

eregon · 2024-11-08T20:08:47Z

Using strace -T (+ --follow-forks on JRuby), they all do:

futex(0x822210, FUTEX_WAKE_PRIVATE, 1)  = 0 <0.000016>
futex(0x823260, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=11789, tv_nsec=488181439}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) <0.000148>
futex(0x822210, FUTEX_WAKE_PRIVATE, 1)  = 0 <0.000015>
futex(0x823260, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=11789, tv_nsec=488404312}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) <0.000148>
futex(0x822210, FUTEX_WAKE_PRIVATE, 1)  = 0 <0.000015>
futex(0x823260, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=11789, tv_nsec=488626766}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) <0.000149>
futex(0x822210, FUTEX_WAKE_PRIVATE, 1)  = 0 <0.000016>
futex(0x823260, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=11789, tv_nsec=488848929}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out) <0.000149>

Notice how each sleep is too long by about 50% / 0.00005s.

eregon · 2024-11-08T20:11:46Z

My take on this is we are not using real-time operating systems, so sleep() is at least as much as passed, with pretty much no guaranteed upper bound under load or when the sleep amount is small (e.g. below 1ms).

ioquatix · 2024-11-08T21:01:49Z

Nice analysis! It's great to see the original problem I had (sleep(small x) = no op) gone and all platforms behaving very similarly.

jeremyevans and others added 30 commits November 5, 2024 11:29

Pend some tests because these are not working with macOS 15 beta and …

838758c

…Xcode 16 beta

Revert pending EVENT_RETURN tests

c2ec9df

* "Allow ambiguosity of `return` line" 65b991bc8571b7b718fc22bd33a43c4d269bf52d * "Move to test/.excludes-prism" 3b4ff810d2fefdf0194bd774bc04f6f17e2ccae7 * "Pending `EVENT_RETURN` settracefunc tests with Prism" a7f33c99c69e3cc62b7a24ce35f51f76cc5bfaa2

Split URI::Parser examples with RFC2396 and RFC3986

6f6918a

Prepare for ruby/uri#107

Stop depending on Integer#size to return sizeof(long)

a076c75

There is no guarantee that Integer#size will continue to return `sizeof(long)` for small integers. Use the `l!` specifier for Array#pack instead. It is a public interface that has a direct relationship with the `long` type.

Make Integer#round spec work regardless of pointer size

b3bf3f2

The spec is actually testing a behaviour stemming from NUM2INT(), and since `sizeof(long)>=sizeof(int)`, `min_long-1` always makes NUM2INT() raise `RangeError`.

Range#step: restore legacy behavior for String ranges

29950c6

Implement Array#fetch_values

4887390

[Feature #20702] Works the same way than `Hash#fetch_values` for for array.

Skip failing examples related with ruby/ruby@d81b058

06e6412

Update exception message in string_for_symbol

e18ddeb

This is a static function only called in two places (rb_to_id and rb_to_symbol), and in both places, both symbols and strings are allowed. This makes the error message consistent with rb_check_id and rb_check_symbol. Fixes [Bug #20607]

Add anonymous module name spec.

36b253d

Temporary disabled RactorLocalSingleton from constant leak check.

45fa1bb

Define RactorLocalSingleton on .mspec.constants

6555ac7

Update spec/ruby/ for colon-style hash inspect

671b997

Better handling of timeout in rb_io_maybe_wait_*. (#9531)

0067f7f

Update io_spec.c to use rb_io_maybe_wait* if possible. (#11792)

0c6daf4

[Bug #20705] Update strtod implementation

87ffbe2

The absence of either the integer or fractional part should be allowed.

Support IO#timeout for rsock_connect. (#11880)

3c35c8d

[Bug #20797] Check seconds in UTC offset as well as minutes

5e2e976

[Bug #20797] Yet another test

51bda14

andrykonchin added 3 commits November 6, 2024 12:56

Fix specs for Encoding.compatible? with --enable-frozen-string-literal

0c1e155

Fix Rubocop warnings for core/encoding/compatible_spec.rb

7723ed9

Fix Rubocop warnings for core/module/name_spec.rb

1474df8

headius added 2 commits November 6, 2024 11:39

Move sleep ns precision spec to kernel and omit darwin

3cc8136

This is really a sleep spec, so it should be tested under the Kernel#sleep specs. Fails consistently on MacOS (in CI) and passes consistently on Linux and Windows so omit MacOS for now.

Merge branch 'master' into spec_update_20241105

287ac86

headius merged commit a230079 into ruby:master Nov 6, 2024
14 checks passed

headius deleted the spec_update_20241105 branch November 6, 2024 17:53

headius mentioned this pull request Nov 6, 2024

Update specs for 3.4 behavior changes #1207

Closed

graalvmbot pushed a commit to oracle/truffleruby that referenced this pull request Nov 7, 2024

Quarantine unreliable spec

87f41df

* See discussion on ruby/spec#1210

matzbot pushed a commit to ruby/ruby that referenced this pull request Nov 7, 2024

Quarantine unreliable spec

6385a28

* See discussion on ruby/spec#1210

eregon mentioned this pull request Nov 7, 2024

Spec update from MRI #1208

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Spec update 20241105 #1210

Spec update 20241105 #1210

headius commented Nov 6, 2024

andrykonchin commented Nov 6, 2024

headius commented Nov 6, 2024

eregon commented Nov 7, 2024

eregon commented Nov 7, 2024 •

edited

Loading

headius commented Nov 7, 2024

headius commented Nov 7, 2024

eregon commented Nov 7, 2024

headius commented Nov 7, 2024 •

edited

Loading

ioquatix commented Nov 7, 2024 •

edited

Loading

headius commented Nov 7, 2024

eregon commented Nov 7, 2024

ioquatix commented Nov 7, 2024

eregon commented Nov 7, 2024 •

edited

Loading

headius commented Nov 7, 2024

headius commented Nov 7, 2024

eregon commented Nov 8, 2024 •

edited

Loading

eregon commented Nov 8, 2024

eregon commented Nov 8, 2024

ioquatix commented Nov 8, 2024

Spec update 20241105 #1210

Spec update 20241105 #1210

Conversation

headius commented Nov 6, 2024

andrykonchin commented Nov 6, 2024

headius commented Nov 6, 2024

eregon commented Nov 7, 2024

eregon commented Nov 7, 2024 • edited Loading

headius commented Nov 7, 2024

headius commented Nov 7, 2024

eregon commented Nov 7, 2024

headius commented Nov 7, 2024 • edited Loading

ioquatix commented Nov 7, 2024 • edited Loading

headius commented Nov 7, 2024

eregon commented Nov 7, 2024

ioquatix commented Nov 7, 2024

eregon commented Nov 7, 2024 • edited Loading

headius commented Nov 7, 2024

headius commented Nov 7, 2024

eregon commented Nov 8, 2024 • edited Loading

eregon commented Nov 8, 2024

eregon commented Nov 8, 2024

ioquatix commented Nov 8, 2024

eregon commented Nov 7, 2024 •

edited

Loading

headius commented Nov 7, 2024 •

edited

Loading

ioquatix commented Nov 7, 2024 •

edited

Loading

eregon commented Nov 7, 2024 •

edited

Loading

eregon commented Nov 8, 2024 •

edited

Loading