-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[$1000] Web - Inconsistent parsing behavior on copying and editing an invalid URL with HTTP protocol #23535
Comments
Triggered auto assignment to @isabelastisser ( |
Bug0 Triage Checklist (Main S/O)
|
ProposalPlease re-state the problem that we are trying to solve in this issue.Inconsistent parsing behavior on copying and editing an invalid URL with HTTP protocol What is the root cause of that problem?The frontend recognizes the invalid URL, however subsequently it gets replaced by the backend. What changes do you think we should make in order to solve the problem?We should correct the regular expression for detecting URLs to make it consistent with the backend. What alternative solutions did you explore? (Optional) |
ProposalPlease re-state the problem that we are trying to solve in this issue.Web - Inconsistent parsing behavior on copying and editing an invalid URL with HTTP protocol What is the root cause of that problem?The root cause of the problem is to add there regular expression like that function ChatMessage({ message }) {
const isValidURL = /^(?:(?:https?|ftp):\/\/)?[\w-]+(?:\.[\w-]+)+(?:[\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])?$/i.test(message);
return (
<div>
{isValidURL ? (
<a href={message} target="_blank" rel="noopener noreferrer">
{message}
</a>
) : (
message
)}
</div>
);
} What changes do you think we should make in order to solve the problem?We can make changes in What alternative solutions did you explore? (Optional) |
Job added to Upwork: https://www.upwork.com/jobs/~01e96ccfaa92b9a019 |
Current assignee @isabelastisser is eligible for the External assigner, not assigning anyone new. |
Triggered auto assignment to Contributor-plus team member for initial proposal review - @abdulrahuman5196 ( |
ProposalPlease re-state the problem that we are trying to solve in this issue.URLs with protocol added (e.g. What is the root cause of that problem?A "loose" regex was introduced by this PR in order to handle localhost-style URLs. The PR's implementation applies the loose regex to any URL which has a protocol (e.g.
As you can see
This results in invalid URLs being briefly shown as links by FE (as it parses them as valid links), before the BE strips the URL from the link (as it is invalid). What changes do you think we should make in order to solve the problem?We don't really need a completely separate regex for these localhost-style URLs, as it results in quite a bit of code duplication for little benefit and makes it more difficult to maintain than one single regex. When we're talking about localhost-style URLs, we're really just wanting to turn almost any URL with a protocol into a clickable link, regardless of TLD presence/validity. The exceptions would include invalid domains, such as those that start or end in a hyphen, or invalid characters like We can therefore adjust the regex in
The second part is the one that handles localhost-style URLs, and just adds a bit of regex to prevent ending a URL with a We then just need to tweak
This passes all existing tests here, as well as the following additional tests:
What alternative solutions did you explore? (Optional)We can also just keep the loose regex to support localhost-style URLs, but tighten it back up so that it still blocks incorrect values, such as leading (and ending) hyphens or underscores. This is easy to do by just reusing the domain part of
|
@isabelastisser, @abdulrahuman5196 Uh oh! This issue is overdue by 2 days. Don't forget to update your issues! |
Reviewing today |
ProposalPlease re-state the problem that we are trying to solve in this issue.The string (http://_) is parsed as Url, but it doesn't have the valid TLD. What is the root cause of that problem?LOOSE_URL_WEBSITE_REGEX is allowing underscore(_) which is not valid on domain name. Additionally for domain label, starting and ending with hyphen is not allowed, but the regex validates those cases What changes do you think we should make in order to solve the problem?To fix the root cause, we should tweak a bit to block underscore, and starting and ending with hyphen on LOOSE_URL_WEBSITE_REGEX. while doing that, we can combine URL_WEBSITE_REGEX and LOOSE_URL_WEBSITE_REGEX into one regex pattern, so it makes maintaining consistency rather than maintaining 2 regex of URL validation. a more tweak is to add total/each label 's length limitation. According to RFC1034, each label is zero to 63 octets in length, and the total number of octets that represent a domain name is limited to 255.
and remove capturing parenthesis on MARKDOWN_URL_REGEX. or we can remove variable MARKDOWN_URL_REGEX and directly use URL_REGEX.
it's tested with following cases
and does not make original autotest cases failed. |
ProposalPlease re-state the problem that we are trying to solve in this issue.String "http://_" get parsed as Url but it doesn't have the valid TLD so it should be processed as the text What is the root cause of that problem?In the lib expensify-common has expensifyMark.js which process the comment based on regex rules on these rules "autolink" rules regex is accepting this as valid url and processing it What changes do you think we should make in order to solve the problem?We have update the regular expression to check for the TLD as well |
📣 @abdullah1111! 📣
|
This comment was marked as off-topic.
This comment was marked as off-topic.
✅ Contributor details stored successfully. Thank you for contributing to Expensify! |
📣 It's been a week! Do we have any satisfactory proposals yet? Do we need to adjust the bounty for this issue? 💸 |
Hey @abdulrahuman5196, can you please review the proposals? Thanks! |
Hi @abdulrahuman5196, please review the proposals so we can move this along. |
Hey @abdulrahuman5196, can you please review the proposal so we can move this along? Thanks! |
@isabelastisser @abdulrahuman5196 this issue is now 4 weeks old and preventing us from maintaining WAQ, can you:
Thanks! |
Current assignee @abdulrahuman5196 is eligible for the Internal assigner, not assigning anyone new. |
@abdulrahuman5196, can you please review the updated proposal today? Thanks! |
Reviewing now |
I need to spend more time reviewing if this doesn't bring back the original localhost issue and keeping the new one strict. I am little worried on this, so will test and update here @jjcoffee
@Antasel I not sure if I could consider the proposal as new one, its almost similar to @jjcoffee 's proposal, |
@abdulrahuman5196 Take your time! Let me know if you need anything more from me 🙂 Side-note: BE doesn't have any length limitations when it filters URLs so it's more of a nice-to-have to add it on FE. Localhost URLs also don't have length limitations so the test case in @Antasel's proposal is technically incorrect, so this URL:
works in Chrome. |
Hi @jjcoffee Im not sure what you are taking about, have you taged me by mistake? |
@isabelastisser @jjcoffee @abdulrahuman5196 it's because Backend did not use FILTER_VALIDATE_URL to filter url . (more precisely, FILTER_VALIDATE_URL is limiting total lengh to 253 )
@jjcoffee didn't you test FILTER_VALIDATE_URL when commenting it ? |
@abdullah1111 Damn sorry, meant to tag @abdulrahuman5196 - typing too fast 😅 |
@abdulrahuman5196, what are the next steps here? Do we need to choose a proposal? This issue was changed to Internal. Should we have an Expensify engineer take over? |
No need for internal engineer yet. I have to re-evaluate the proposals |
Bump @abdulrahuman5196 |
Will be working on proposal review today |
@isabelastisser I am not seeing this issue anymore on staging. Can you check again? |
Hey @abdulrahuman5196, thanks for the heads up; I just tried and can no longer reproduce this in staging. I will process your payment in Upwork for your time reviewing the proposals and @kerupuksambel for reporting the issue. |
Payment summary: @abdulrahuman5196 $500 for reviewing the proposals. The offers were sent in Upwork, please accept them, and I will process the payments. Thanks! |
@isabelastisser Offer accepted. Thank you! |
@isabelastisser @abdulrahuman5196 Screencast.from.6-9-23.04.31.00.webm |
If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!
Action Performed:
Expected Result:
Either the URL isn't parsed at all (since it doesn't have the valid TLD and isn't a reserved name (like localhost for example)), or the behavior on both editing and upon copying should be consistent
Actual Result:
he message before edit and copied message is different (Message before edit message shown as "http://_", copied message shown as "http://_")
Workaround:
Unknown
Platforms:
Which of our officially supported platforms is this issue occurring on?
Version Number: 1.3.46.7
Reproducible in staging?: n/a
Reproducible in production?: n/a
If this was caught during regression testing, add the test name, ID and link from TestRail:
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Notes/Photos/Videos: Any additional supporting documentation
2nd.-.Invalid.URL.Parsing.mp4
Expensify/Expensify Issue URL:
Issue reported by: @kerupuksambel
Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1689596021560619
View all open jobs on GitHub
Upwork Automation - Do Not Edit
The text was updated successfully, but these errors were encountered: