Enhancing Variable Naming #181
Comments
Similar/related:
This sounds reasonable to me. Maybe we could also add an option/config to allow this to be specified/tweaked by individuals as needed.
This sounds like a cool approach. It sounds similar to what a lot of the good AI coding assistants do to provide context. For example, Aider has a blog post about implementing its 'repo map'. 'Stack Graphs' is another potentially interesting technology in this space; it is part of what powers GitHub's 'precise code navigation'. I also recently collated a few notes, links, and references on stack graphs and related libraries.
It'd probably be better to raise this as a separate bug issue rather than grouping it within this one, as it's more likely to be seen and fixed quickly that way.
First of all, I'd like to thank the team for developing such a valuable tool. The integration of humanify for JavaScript deobfuscation is incredibly useful, and I appreciate the work that has been done.
However, using the local version of humanify (with the 2b model), I've encountered a few issues that could be resolved to improve the tool's functionality, especially when working with large, complex codebases.
Repeated variable names
In large codebases with many variables, humanify frequently assigns the same name to different variables, which causes confusion and reduces readability, especially when several variables receive generic names such as _______variable.
We should introduce a check to prevent duplication of variable names in the same scope. If a name has already been used, the system could ask for confirmation, ensuring that names remain unique and clear.
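A minimal sketch of such a uniqueness check follows. It is only an illustration: `makeUniqueName` and the `usedInScope` set are hypothetical names, not part of humanify, and instead of asking for confirmation or re-prompting the model it falls back to a numeric suffix, which is a simpler technique swapped in for the example.

```typescript
// Hypothetical helper: not part of humanify's actual code.
// Returns `proposed` if it is free in the scope, otherwise the first
// suffixed variant (name2, name3, ...) that is not taken yet.
function makeUniqueName(proposed: string, usedInScope: Set<string>): string {
  let candidate = proposed;
  let suffix = 2;
  // Append an increasing numeric suffix until the name is free in this scope.
  while (usedInScope.has(candidate)) {
    candidate = `${proposed}${suffix}`;
    suffix += 1;
  }
  // Record the chosen name so later renames in the same scope avoid it.
  usedInScope.add(candidate);
  return candidate;
}

// Usage: one set of already-used names per scope.
const used = new Set<string>(["counter", "index"]);
const first = makeUniqueName("counter", used);
const second = makeUniqueName("counter", used);
const third = makeUniqueName("total", used);
```

A suffix keeps the rename deterministic; a real fix might prefer to ask the model for a different name first and use the suffix only as a last resort.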
Limiting the length of variable names
Currently, variable names are limited to 12 characters, which can have the effect of truncating the names, making them more difficult to understand.
We should raise the limit to 25 characters, which would allow more descriptive names. In addition, we should ensure that names are valid ASCII identifiers and avoid cutting words at arbitrary points, for better readability.
If the model cannot give a concise enough name, we can add a condition to explicitly ask it to give a shorter name.
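The validation and retry idea could look roughly like the sketch below. Everything in it is an assumption made for illustration: `MAX_NAME_LENGTH`, `isAcceptableName`, `pickName`, and the `suggest` callback (a synchronous stand-in for the model call, which would be asynchronous in practice) are hypothetical names. The check covers length and ASCII identifier syntax only, not reserved words.

```typescript
// Limit proposed in this issue; not humanify's current value.
const MAX_NAME_LENGTH = 25;

// ASCII-only JavaScript identifier: a letter, `_` or `$` first,
// then letters, digits, `_` or `$`.
const ASCII_IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

function isAcceptableName(name: string, maxLength: number = MAX_NAME_LENGTH): boolean {
  return name.length <= maxLength && ASCII_IDENTIFIER.test(name);
}

// `suggest` stands in for the model call (hypothetical). It receives the
// attempt number so the caller can tighten the prompt on each retry.
function pickName(
  suggest: (attempt: number) => string,
  maxAttempts: number = 3
): string | undefined {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const name = suggest(attempt);
    if (isAcceptableName(name)) return name;
  }
  // No acceptable answer: let the caller keep the old name rather than truncate.
  return undefined;
}

// Usage: the first answer is too long, the second one fits.
const chosen = pickName((attempt) =>
  attempt === 0 ? "userAuthenticationTokenRefreshIntervalMs" : "tokenRefreshIntervalMs"
);
```

Rejecting an over-long name and asking again avoids the mid-word truncation described above, at the cost of extra model calls.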
Insufficient Context for Variable Naming
When variables are defined in isolation (e.g., in very large blocks of declarations), the model struggles to assign meaningful names due to the lack of context:
```js
var context = this;
var counter;
var generateCodes;
var d = 0;
var k = 0;
var g = 0;
var e = 8;
var b = -4;
var h = 0;
var n = 2;
var m = 0;
var q = 0.5;
var r = 0;
var s = 15;
var l = [];
var v = 99999;
var C;
var z;
var A = 0;
var G = false;
var J = false;
var K = false;
```
Providing the model with additional context from the sections of the code where the variables are actually used would likely improve the accuracy of the names. (We could walk the AST around the declaration and collect several snippets where the variable is modified or read.) This would let the model base its decisions on actual usage patterns rather than on the initial definitions alone.
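To make the idea concrete, here is a rough sketch of gathering usage snippets for one identifier. It is deliberately simplified: it scans the source text line by line with a whole-word match instead of walking the AST, so it ignores scoping and shadowing. A real implementation would more likely resolve references through the parser's scope information. `collectUsageContext`, `radius`, and `maxSnippets` are hypothetical names chosen for the example.

```typescript
// Hypothetical helper, not humanify's implementation.
// Returns up to `maxSnippets` windows of `radius` lines around each line
// that mentions `identifier` as a whole word. Text-based, not scope-aware.
function collectUsageContext(
  source: string,
  identifier: string,
  radius: number = 1,
  maxSnippets: number = 3
): string[] {
  const lines = source.split("\n");
  // Escape regex metacharacters such as `$`, which is legal in identifiers.
  const escaped = identifier.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Lookarounds instead of \b so that names containing `$` match correctly.
  const wholeWord = new RegExp(`(?<![\\w$])${escaped}(?![\\w$])`);
  const snippets: string[] = [];
  let lastCovered = -1; // last line already included in a snippet
  for (let i = 0; i < lines.length && snippets.length < maxSnippets; i++) {
    if (i <= lastCovered || !wholeWord.test(lines[i])) continue;
    const start = Math.max(0, i - radius);
    const end = Math.min(lines.length - 1, i + radius);
    snippets.push(lines.slice(start, end + 1).join("\n"));
    lastCovered = end;
  }
  return snippets;
}

// Usage: `q` is declared in a block of declarations and used further down.
const source = [
  "var q = 0.5;",
  "var r = 0;",
  "function step() {",
  "  r += q;",
  "  if (r > 1) r = 0;",
  "}",
].join("\n");
const snippets = collectUsageContext(source, "q", 1);
```

The snippets could then be appended to the prompt alongside the declaration, with `maxSnippets` keeping the total size within the context window.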
Edit: the '--contextSize' parameter is meant to enlarge the program's context window. However, it seems to have been declared as an optional boolean flag rather than as an option that takes an integer value (see humanify/src/commands/local.ts, lines 24 to 28 in 706108e).
Based on the commander.js documentation, it should be:
```ts
.option(
  "-c, --contextSize <contextSize>",
  "The context size to use for the LLM",
  `${DEFAULT_CONTEXT_WINDOW_SIZE}`
)
```
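Note that an option declared with `<contextSize>` reaches the program as a string, so the value still has to be converted to a number somewhere. One possible way, sketched below, is a small parser passed to commander's custom option processing. `parseContextSize` is a hypothetical name and the error wording is illustrative; the wiring in the trailing comment is a suggestion, not what humanify currently does.

```typescript
// Hypothetical parser; the name and error message are illustrative only.
// Accepts plain positive decimal integers such as "2048".
function parseContextSize(value: string): number {
  const parsed = Number.parseInt(value, 10);
  // The round-trip comparison rejects inputs like "12abc" that parseInt
  // would otherwise accept by silently dropping the trailing characters.
  if (!Number.isInteger(parsed) || parsed <= 0 || String(parsed) !== value.trim()) {
    throw new Error(`--contextSize expects a positive integer, got "${value}"`);
  }
  return parsed;
}

// Possible wiring (commander takes the parser as the third argument and the
// default value as the fourth):
//   .option(
//     "-c, --contextSize <contextSize>",
//     "The context size to use for the LLM",
//     parseContextSize,
//     DEFAULT_CONTEXT_WINDOW_SIZE
//   )
```

Failing early on a malformed value gives a clearer message than passing a non-numeric context size through to the model loader.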
Thank you for considering these suggestions. I look forward to future improvements.