Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for milliseconds formatted as %f in strptime(fmt). #1413

Closed
wants to merge 1 commit into from

Conversation

reidwagner
Copy link

Addresses #1409.

The milliseconds should be added at the end of the fmt string. They're chopped off of the original fmt string which is passed to strptime(3). As before, if the format does not match, e.g. milliseconds in date but not fmt string, a jv error is returned.

I also moved some of the cleanup in case of an error to a goto label to reduce redundancy.

Please let me know if I can change or improve anything!

… of the date format string, e.g. "%Y-%m-%dT%H:%M:%S.%fZ".
char *fmtdup = strdup(fmt);

int ms = -1;
char *pf = strstr(fmtdup, "%f");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Watch out! % can be escaped with a preceding (unescaped) %. You have to step over the format string character (well, byte) by character (byte).

@nicowilliams
Copy link
Contributor

I wonder if we may not have to implement strptime() from scratch, rather than depend on the OS strptime(). That may be advisable as to all system date functions, since they tend to vary somewhat by OS. That would be painful though, especially as it means having to deal with timezones, which we should absolutely not want to do! So I guess we'll hack it somewhat like your PR.

int ms = -1;
char *pf = strstr(fmtdup, "%f");
if (pf) {
char *msfmtloc = pf - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buffer underrun if the format string starts with "%f".

char *msfmtloc = pf - 1;
char delim = *msfmtloc;
*msfmtloc = '\0';
char *msloc = strrchr(inputdup, delim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assumption here is that delim will not be used for anything else after the %f... I'm not very comfortable with this.

@nicowilliams
Copy link
Contributor

Note that strptime() returns a pointer to the first character not consumed by strptime(). So I think what we want to do is something like this:

  • implement a function that supports all the format string features that strptime() supports and which half-parses the input string, only skipping each field as it goes
  • if and when it reaches %f this function should
    • parse the field
    • construct new format and input strings with the %f and milliseconds field removed, respectively
  • call the real strptime() with the possibly-new format and input strings to do the real parsing

That's a lot of work. In the mean time users can use regular expressions to parse and fixup timestamps.

Lastly, %f... is that used by any implementation anywhere? We don't want to conflict with C library implementations nor POSIX...

@nicowilliams
Copy link
Contributor

We're in luck as to %f: it's already in use in various places to mean "milliseconds" (e.g., Python 2.6 and up).

@nicowilliams
Copy link
Contributor

(Hit wrong button; didn't mean to close this!)

@nicowilliams
Copy link
Contributor

Bah, Ruby uses %L and %N. R uses %OSn (which conflicts with glibc!).

What a mess.

@reidwagner
Copy link
Author

Thanks for the fast response. I'm going to try working out the plan you proposed and address the code comments.

@c0b
Copy link

c0b commented Dec 13, 2017

have progress since May?

@reidwagner
Copy link
Author

I haven't gone any further.

@nicowilliams
Copy link
Contributor

I'd definitely like to complete this. I can't promise that I will soon though :(

@spier
Copy link

spier commented Feb 9, 2018

I just also ran into this issue with parsing milliseconds in jq.
Is there anything that I could do to help with this PR?

@reidwagner
Copy link
Author

reidwagner commented Feb 19, 2018

I'm taking a fresh look at this using the plan put forth by @nicowilliams.

@reidwagner reidwagner closed this Mar 3, 2018
@reidwagner reidwagner reopened this Mar 3, 2018
@reidwagner
Copy link
Author

I was hoping to find a clever way to simply parse out %f from the format string without paying attention to the values of each field. I.e. determining the delimiters and matching fields until %f. And you can get pretty close making some assumptions. But there's no two ways around something like "Februaryr%Br%Y". So yeah, it would have to support every feature as you said.

Because I probably don't have the bandwidth for that I'm going to close this to encourage anyone else who wants to try.

@reidwagner reidwagner closed this Mar 17, 2018
@reidwagner reidwagner deleted the milliseconds branch March 17, 2018 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants