Add toSnakeCase method to StringUtils #1310

ShailendraRathore

PR Description: Add toSnakeCase Method to StringUtils

Summary

This PR adds a toSnakeCase(String str) method to the StringUtils class in Apache Commons Lang. The method converts camelCase or PascalCase strings to snake_case. It treats a run of consecutive uppercase letters as a single word, so an acronym followed by a capitalized word is split only at the word boundary, and it inserts an underscore only where a new word starts.

Why This Change is Needed

String case transformations are common wherever different domains follow different naming conventions (e.g., snake_case for database fields, camelCase for variables in code). StringUtils currently has no direct method for converting camel case to snake case, so developers have to write their own. This change provides a tested camelCase-to-snake_case conversion in the library itself.

Method Behavior

The toSnakeCase method:

  1. Converts camelCase or PascalCase strings (e.g., "camelCase", "CamelCase") to snake_case (e.g., "camel_case").
  2. Treats consecutive uppercase letters as one word: "JSONParser" becomes "json_parser", with no underscore between the letters of the acronym.
  3. Returns null for null inputs and an empty string for empty inputs.

Examples

StringUtils.toSnakeCase(null)                   = null
StringUtils.toSnakeCase("")                     = ""
StringUtils.toSnakeCase("simple")               = "simple"
StringUtils.toSnakeCase("camelCase")            = "camel_case"
StringUtils.toSnakeCase("CamelCase")            = "camel_case"
StringUtils.toSnakeCase("thisIsATest")          = "this_is_a_test"
StringUtils.toSnakeCase("JSONParser")           = "json_parser"
StringUtils.toSnakeCase("already_snake_case")   = "already_snake_case"
StringUtils.toSnakeCase("A")                    = "a"
StringUtils.toSnakeCase("multipleUPPERCases")   = "multiple_upper_cases"
StringUtils.toSnakeCase("HelloWorld")           = "hello_world"

Implementation Details and Optimizations

  1. Tracking Previous Character: The method keeps the previous character in a prevChar variable, so it does not need to call str.charAt(i - 1) on every iteration.
  2. Handling the First Character Separately: The result is initialized with the lowercase version of the first character, so the output never starts with an underscore.
  3. Minimized Conditional Checks: An underscore is added in only two cases: at a transition from a lowercase letter to an uppercase letter, and before the last letter of an uppercase run when a lowercase letter follows it (the "P" in "JSONParser").
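
The diff itself is not reproduced in this description, so the following is only a minimal sketch of how the three points above could fit together. It is not the code from the PR: the class name SnakeCaseSketch and the local variable names other than prevChar are hypothetical, and the exact conditions are an assumption chosen to match the examples listed earlier.

```java
/**
 * Hypothetical stand-alone sketch of a camelCase/PascalCase to snake_case
 * conversion. Not the actual StringUtils code from this PR.
 */
public final class SnakeCaseSketch {

    private SnakeCaseSketch() {
        // utility class, no instances
    }

    public static String toSnakeCase(final String str) {
        // null stays null, "" stays ""
        if (str == null || str.isEmpty()) {
            return str;
        }
        final StringBuilder result = new StringBuilder(str.length() + 8);
        // The first character is handled separately so that the
        // output never starts with an underscore.
        char prevChar = str.charAt(0);
        result.append(Character.toLowerCase(prevChar));

        for (int i = 1; i < str.length(); i++) {
            final char c = str.charAt(i);
            if (Character.isUpperCase(c)) {
                // Case 1: lowercase -> uppercase transition ("camelCase").
                final boolean afterLower = Character.isLowerCase(prevChar);
                // Case 2: last letter of an uppercase run that is followed
                // by a lowercase letter (the 'P' in "JSONParser").
                final boolean endsUpperRun = Character.isUpperCase(prevChar)
                        && i + 1 < str.length()
                        && Character.isLowerCase(str.charAt(i + 1));
                if (afterLower || endsUpperRun) {
                    result.append('_');
                }
                result.append(Character.toLowerCase(c));
            } else {
                // Lowercase letters, digits and underscores pass through.
                result.append(c);
            }
            prevChar = c;
        }
        return result.toString();
    }
}
```

In this sketch, digits and existing underscores are copied unchanged, which is why "already_snake_case" comes back as-is. How the real method treats digits or non-ASCII letters is not stated in this description.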

Unit Tests

Unit tests have been added in StringUtilsTest to verify the correctness of toSnakeCase:

  • Conversion of standard camel case and Pascal case strings.
  • Accurate handling of consecutive uppercase letters.
  • Edge cases, including null, empty strings, single uppercase letters, and already snake case strings.

Example Test Cases

assertEquals("camel_case", StringUtils.toSnakeCase("camelCase"));
assertEquals("json_parser", StringUtils.toSnakeCase("JSONParser"));
assertEquals("simple", StringUtils.toSnakeCase("simple"));
assertEquals("this_is_a_test_string", StringUtils.toSnakeCase("thisIsATestString"));
assertEquals("", StringUtils.toSnakeCase(""));
assertEquals("a", StringUtils.toSnakeCase("A"));
assertEquals("hello_world", StringUtils.toSnakeCase("HelloWorld"));
assertEquals("multiple_upper_cases", StringUtils.toSnakeCase("multipleUPPERCases"));
assertNull(StringUtils.toSnakeCase(null));
assertEquals("already_snake_case", StringUtils.toSnakeCase("already_snake_case"));

Additional Notes

The implementation makes a single pass over the input, so it runs in O(n) time for a string of length n. The added functionality aligns with Apache Commons Lang's design philosophy of providing simple, well-tested solutions for common string transformations.

@garydgregory
Member

@ShailendraRathore
LANG is the wrong component for text-like processing. We've migrated this type of functionality to TEXT. See also apache/commons-text#552
