Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add user callback for handling invalid characters. #283

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

yosion-p
Copy link
Contributor

hi buddy,this is work-in progress.
It might not be easy to get the user to write a callback function, but I think we can at least do it first.

I have noticed that there are many issues about callback(#53) at present,
It may be difficult to solve all the problems, but we can take them apart step by step,
So now I have taken the first step (at least we can solve #210 for now),
and the first step now seems easy(or too easy🤣).

First, we declare and tell the user what custom rules can be processed:

iconv.invalidChar;
iconv.ignore = '';
// ...other rules
// Sorry, I don't know how to describe it here, maybe TS would be better

Then, the options argument can be passed in when the user calls public API, like:

iconv.decode(Buffer, encoding,{
  invalidChar: function(){/**/},
  ignore: someRules,
  // ...other rules
})

Finally, we pick it up at decode:

iconv.decode = function decode(buf, encoding, options) {
// ...
if(options && options.invalidChar)  // ...
// ...
}

and the encoding is written similarly.

In short, we accept some special rules that user-written callback functions.

@yosion-p
Copy link
Contributor Author

HI,bro @ashtuchkin
Do you mind if I try to do that?

@ashtuchkin
Copy link
Owner

Hey, I think it's a good intention, but it's also a pretty complex topic (that's why I didn't implement it back in the days) and I don't have enough time these days to actively review or discuss design. Maybe in a couple months when the load on my day job decreases..

I don't think your current approach would work because '�' character can be in the input and we don't want to call the callback in that case.

@yosion-p
Copy link
Contributor Author

yosion-p commented Oct 8, 2021

Hi bro, sincerely speaking, since Github is an open source community, i'd love to contribute, no matter it's easy or hard.

I'm just back from the holiday and got the email. But I'm a little confused.

If we're just trying to solve #210 , this would work,
because the user wants to return as custom expectations.

Now that I'm back, I'm thinking about this PR(or a related set of questions).

because '�' character can be in the input and we don't want to call the callback in that case.

May I just throw an error instead of calling the callback when '�' character be inputed?

@ashtuchkin
Copy link
Owner

ashtuchkin commented Oct 9, 2021

i'd love to contribute, no matter it's easy or hard.

I understand that, and I think it's a great intention and a great way to learn. I appreciate that you're doing this. The problem is that it would require time commitment from my side to take this PR to a level at which I would feel comfortable merging it. This issue is more complex than your previous pull request. I don't have a good design in my head that would resolve it in a good way, so to even start code reviewing I would need to spend time thinking about it / designing it. I don't think an easy solution exists here that would not harm baseline performance and would avoid adding unexpected behavior.

May I just throw an error instead of calling the callback when '�' character be inputed?

No, that would be unexpected behavior for the library users and would potentially break existing use cases.

@yosion-p
Copy link
Contributor Author

More than a month ago, I analyzed almost 40 issues and sorted them into an excel.
I think we can mainly focus on solving these issues:

  1. Callback related: Change default character #210, 127, 83, 81, 73, 55, 53
  2. Coding related: Improper encoding / decoding of some special 7-bit values in cp437, macintosh #257, 231, 218, 215, 201, 145, 129, 109, but for some scattered coding, you want to build an external NPM package to solve it, right? (Mechanism to add encodings from external npm packages #253)
  3. There are issues might be related to the bottom layer of node or Angular: 230, 213, 118, I think we might not be able to solve them.
  4. Features, such as uint8array, eg Mechanism to add encodings from external npm packages #253 mentioned above.

what do u think?

For the second point, about #253, I think this issue is of great significance.
(In case you are distracted by multiple messages, I will not leave messages at 253.)
If I can decouple an independent NPM package, many related coding problems can be solved.
And completing 253 won't affect your next release.

Initially, I'll expose two interfaces (or methods) that allow iconv calls and adds encoding,

Create a test harness that would help authors make sure their codecs work on the whole range of supported environments.

I'm not particularly good at reporting this, but I don't think testing is a particularly important work in the early stage.

Recently, I have plenty of free time,
I'm keen to contribute for this great project,
I learned a lot of knowledge about coding and made related notes which i think is great.
I don't know when will you release the next version,
I'd also love to write UTs and contribute on others besides assist solving the current issues.

@keidarcy
Copy link

Hope we have this callback to handle invalid characters

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants