From 7285f2ec75bd8d8426474dccbc1674b921fe576a Mon Sep 17 00:00:00 2001 From: Will Childs-Klein Date: Wed, 18 Oct 2023 17:40:52 -0400 Subject: [PATCH] Add support for BIO_FP_TEXT in file BIOs (#1153) OpenSSL defines a "close flag" `BIO_FP_TEXT` which is used by callers to indicate that file-backed BIOs should be opened in "text" mode as opposed to the default of raw "binary" mode. Of platforms currently supported by AWS-LC, this is only meaningful on windows. From linux's `fopen` [man page](https://man7.org/linux/man-pages/man3/fopen.3.html): ``` The mode string can also include the letter 'b' either as a last character or as a character between the characters in any of the two-character strings described above. This is strictly for compatibility with ISO C and has no effect; the 'b' is ignored on all POSIX conforming systems, including Linux. (Other systems may treat text files and binary files differently, and adding the 'b' may be a good idea if you do I/O to a binary file and expect that your program may be ported to non-UNIX environments.) ``` And from [Windows](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?view=msvc-170#generic-text-routine-mappings): ``` t Open in text (translated) mode. Carriage return-line feed (CR-LF) combinations are translated into single line feeds (LF) on input and LF characters are translated to CR-LF combinations on output. Also, CTRL+Z is interpreted as an end-of-file character on input. b Open in binary (untranslated) mode; translations involving carriage-return and line feed characters are suppressed. ``` The terminology here is a bit confusing. In text mode, "Input" here refers to reading files and "output" refers to writing files to disk. So, if an application is using conventional unixy LF line endings (`\n`) and then writes a line of text to a file, the Windows CRT will instead write a CRLF line ending (`\r\n`). When reading from a file, the CRT will translate CRLF back to LF transparently to the application. In binary mode, what you see is what you get and no translation occurs. OpenSSL supports setting text mode for a variety of BIO operations (including [when opening files with `fopen`](https://github.com/openssl/openssl/blob/3d254b31344e82b8f10fda8bab196757a377eb63/crypto/bio/bss_file.c#L267-L290)), but only sets this flag internally [when calling `BIO_new_fp`](https://github.com/search?q=repo%3Aopenssl/openssl%20BIO_FP_TEXT&type=code). Note that `BIO_new_fp` operates on open `FILE *`s, and [does not itself open the file](https://github.com/openssl/openssl/blob/3d254b31344e82b8f10fda8bab196757a377eb63/crypto/bio/bss_file.c#L207-L266). Instead, it uses the Windows CRT-specific function `_setmode` to set binary/text mode on the already opened file. To implement this functionality in AWS-LC, we only set translation mode in functions [using the `BIO_C_SET_FILE_PTR` control directive](https://github.com/aws/aws-lc/blob/main/crypto/bio/file.c#L191C10-L191C28): `BIO_set_fp` and `BIO_new_fp`. AWS-LC's BIO functions that call `fopen` along the way [do not provide the caller a parameter for specifying flags](https://github.com/aws/aws-lc/blob/main/crypto/bio/file.c#L287-L306), so we're limited here by the interface. Finally, it's a little odd to refer to `BIO_FP_TEXT` as a "close flag", as the behavior it determines has nothing to do with how or if the file is closed. However, this is how OpenSSL's code refers to it, so it's the terminology we use here. ## Links Documentation links for the Windows-specific functions used in this change can be found here: - [`_setmode`](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170): used to set translation mode on an open file - [`_fileno`](https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fileno?view=msvc-170): used to convert a `FILE *` to a file descriptor --- crypto/bio/bio_test.cc | 99 ++++++++++++++++++++++++++++++++++++++++++ crypto/bio/file.c | 12 +++++ include/openssl/bio.h | 30 +++++++++---- 3 files changed, 133 insertions(+), 8 deletions(-) diff --git a/crypto/bio/bio_test.cc b/crypto/bio/bio_test.cc index 9266581381..e0df1a139d 100644 --- a/crypto/bio/bio_test.cc +++ b/crypto/bio/bio_test.cc @@ -154,6 +154,105 @@ TEST(BIOTest, Printf) { } } +TEST(BIOTest, CloseFlags) { +#if defined(OPENSSL_ANDROID) + // On Android, when running from an APK, |tmpfile| does not work. See + // b/36991167#comment8. + GTEST_SKIP(); +#endif + + // unique_ptr will automatically call fclose on the file descriptior when the + // variable goes out of scope, so we need to specify BIO_NOCLOSE close flags + // to avoid a double-free condition. + using TempFILE = std::unique_ptr; + + const char *test_str = "test\ntest\ntest\n"; + + // Assert that CRLF line endings get inserted on write and translated back out on + // read for text mode. + TempFILE text_bio_file(tmpfile(), fclose); + ASSERT_TRUE(text_bio_file); + bssl::UniquePtr text_bio(BIO_new_fp(text_bio_file.get(), BIO_NOCLOSE | BIO_FP_TEXT)); + int bytes_written = BIO_write(text_bio.get(), test_str, strlen(test_str)); + EXPECT_GE(bytes_written, 0); + ASSERT_TRUE(BIO_flush(text_bio.get())); + ASSERT_EQ(0, BIO_seek(text_bio.get(), 0)); // 0 indicates success here + char contents[256]; + OPENSSL_memset(contents, 0, sizeof(contents)); + int bytes_read = BIO_read(text_bio.get(), contents, sizeof(contents)); + EXPECT_GE(bytes_read, bytes_written); + EXPECT_EQ(test_str, std::string(contents)); + + // Windows should have translated '\n' to '\r\n' on write, so validate that + // by opening the file in raw binary mode (i.e. no BIO_FP_TEXT). + bssl::UniquePtr text_bio_raw(BIO_new_fp(text_bio_file.get(), BIO_NOCLOSE)); + ASSERT_EQ(0, BIO_seek(text_bio.get(), 0)); // 0 indicates success here + OPENSSL_memset(contents, 0, sizeof(contents)); + bytes_read = BIO_read(text_bio_raw.get(), contents, sizeof(contents)); + EXPECT_GT(bytes_read, 0); +#if defined(OPENSSL_WINDOWS) + EXPECT_EQ("test\r\ntest\r\ntest\r\n", std::string(contents)); +#else + EXPECT_EQ(test_str, std::string(contents)); +#endif + + // Assert that CRLF line endings don't get inserted on write for + // (default) binary mode. + TempFILE binary_bio_file(tmpfile(), fclose); + ASSERT_TRUE(binary_bio_file); + bssl::UniquePtr binary_bio(BIO_new_fp(binary_bio_file.get(), BIO_NOCLOSE)); + bytes_written = BIO_write(binary_bio.get(), test_str, strlen(test_str)); + EXPECT_EQ((int) strlen(test_str), bytes_written); + ASSERT_TRUE(BIO_flush(binary_bio.get())); + ASSERT_EQ(0, BIO_seek(binary_bio.get(), 0)); // 0 indicates success here + OPENSSL_memset(contents, 0, sizeof(contents)); + bytes_read = BIO_read(binary_bio.get(), contents, sizeof(contents)); + EXPECT_GE(bytes_read, bytes_written); + EXPECT_EQ(test_str, std::string(contents)); + + // This test is meant to ensure that we're correctly handling a ftell/fseek + // bug on Windows documented and reproduced here: + // https://developercommunity.visualstudio.com/t/fseek-ftell-fail-in-text-mode-for-unix-style-text/425878 + long pos; + char b1[256], b2[256]; + binary_bio.reset(BIO_new_fp(binary_bio_file.get(), BIO_NOCLOSE)); + ASSERT_EQ(0, BIO_seek(binary_bio.get(), 0)); // 0 indicates success here + BIO_gets(binary_bio.get(), b1, sizeof(b1)); + pos = BIO_tell(binary_bio.get()); + BIO_gets(binary_bio.get(), b1, sizeof(b1)); + BIO_seek(binary_bio.get(), pos); + BIO_gets(binary_bio.get(), b2, sizeof(b2)); + EXPECT_EQ(std::string(b1), std::string(b2)); + + // Assert that BIO_CLOSE causes the underlying file to be closed on BIO free + // (ftell will return < 0) + FILE *tmp = tmpfile(); + ASSERT_TRUE(tmp); + BIO *bio = BIO_new_fp(tmp, BIO_CLOSE); + EXPECT_EQ(0, BIO_tell(bio)); + // save off fd to avoid referencing |tmp| after free and angering valgrind + int tmp_fd = fileno(tmp); + EXPECT_LT(0, tmp_fd); + EXPECT_TRUE(BIO_free(bio)); + // Windows CRT hits an assertion error and stack overflow (exception code + // 0xc0000409) when calling _tell or lseek on an already-closed file + // descriptor, so only consider the non-Windows case here. +#if !defined(OPENSSL_WINDOWS) + EXPECT_EQ(-1, lseek(tmp_fd, 0, SEEK_CUR)); + EXPECT_EQ(EBADF, errno); // EBADF indicates that |BIO_free| closed the file +#endif + + // Assert that BIO_NOCLOSE does not close the underlying file on BIO free + tmp = tmpfile(); + ASSERT_TRUE(tmp); + bio = BIO_new_fp(tmp, BIO_NOCLOSE); + EXPECT_EQ(0, BIO_tell(bio)); + EXPECT_TRUE(BIO_free(bio)); + EXPECT_TRUE(tmp); + EXPECT_EQ(0, ftell(tmp)); // 0 indicates file is still open + EXPECT_EQ(0, fclose(tmp)); // 0 indicates success for fclose +} + TEST(BIOTest, ReadASN1) { static const size_t kLargeASN1PayloadLen = 8000; diff --git a/crypto/bio/file.c b/crypto/bio/file.c index 8ba9c544c2..c1ea7bfc36 100644 --- a/crypto/bio/file.c +++ b/crypto/bio/file.c @@ -79,6 +79,11 @@ #include #include +#if defined(OPENSSL_WINDOWS) +#include +#include +#endif // OPENSSL_WINDOWS + #include #include @@ -193,6 +198,13 @@ static long file_ctrl(BIO *b, int cmd, long num, void *ptr) { b->shutdown = (int)num & BIO_CLOSE; b->ptr = ptr; b->init = 1; +#if defined(OPENSSL_WINDOWS) + // Windows differentiates between "text" and "binary" file modes, so set + // the file to text mode if caller specifies BIO_FP_TEXT flag. + // + // https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170#remarks + _setmode(_fileno(b->ptr), num & BIO_FP_TEXT ? _O_TEXT : _O_BINARY); +#endif break; case BIO_C_SET_FILENAME: file_free(b); diff --git a/include/openssl/bio.h b/include/openssl/bio.h index b0e8694737..cc296db116 100644 --- a/include/openssl/bio.h +++ b/include/openssl/bio.h @@ -395,11 +395,6 @@ OPENSSL_EXPORT int BIO_read_asn1(BIO *bio, uint8_t **out, size_t *out_len, // // |BIO_ctrl_pending| returns the number of bytes currently stored. -// BIO_NOCLOSE and |BIO_CLOSE| can be used as symbolic arguments when a "close -// flag" is passed to a BIO function. -#define BIO_NOCLOSE 0 -#define BIO_CLOSE 1 - // BIO_s_mem returns a |BIO_METHOD| that uses a in-memory buffer. OPENSSL_EXPORT const BIO_METHOD *BIO_s_mem(void); @@ -451,6 +446,23 @@ OPENSSL_EXPORT int BIO_set_mem_buf(BIO *bio, BUF_MEM *b, int take_ownership); OPENSSL_EXPORT int BIO_set_mem_eof_return(BIO *bio, int eof_value); +// BIO close flags. +// +// These can be used as symbolic arguments when a "close flag" is passed to a +// BIO function. + +// BIO_NOCLOSE will not close the underlying file on BIO free +#define BIO_NOCLOSE 0 + +// BIO_CLOSE will close the underlying file on BIO free +#define BIO_CLOSE 1 + +// BIO_FP_TEXT will cause the file to be treated as a text file instead of the +// default behavior of treating it as a raw binary file. This is only relevant +// on Windows due to CRLF endings. +#define BIO_FP_TEXT 0x10 + + // File descriptor BIOs. // // File descriptor BIOs are wrappers around the system's |read| and |write| @@ -505,7 +517,8 @@ OPENSSL_EXPORT BIO *BIO_new_file(const char *filename, const char *mode); // BIO_new_fp creates a new file BIO that wraps the given |FILE|. If // |close_flag| is |BIO_CLOSE|, then |fclose| will be called on |stream| when -// the BIO is closed. +// the BIO is closed. If |close_flag| is |BIO_FP_TEXT|, the file will be set as +// a text file after opening (only on Windows). OPENSSL_EXPORT BIO *BIO_new_fp(FILE *stream, int close_flag); // BIO_get_fp sets |*out_file| to the current |FILE| for |bio|. It returns one @@ -513,8 +526,9 @@ OPENSSL_EXPORT BIO *BIO_new_fp(FILE *stream, int close_flag); OPENSSL_EXPORT int BIO_get_fp(BIO *bio, FILE **out_file); // BIO_set_fp sets the |FILE| for |bio|. If |close_flag| is |BIO_CLOSE| then -// |fclose| will be called on |file| when |bio| is closed. It returns one on -// success and zero otherwise. +// |fclose| will be called on |file| when |bio| is closed. If |close_flag| is +// |BIO_FP_TEXT|, the file will be set as a text file (only on Windows). It +// returns one on success and zero otherwise. OPENSSL_EXPORT int BIO_set_fp(BIO *bio, FILE *file, int close_flag); // BIO_read_filename opens |filename| for reading and sets the result as the