Can't Implement [table_data Due to Tab Being a Record-Separator #17

Open
tajmone opened this issue May 15, 2021 · 5 comments
Labels
⭐ PML syntax Topic: PML syntax definition for ST4 ⚠️ blocked Action blocked until PMLC is updated/fixed

Comments

@tajmone
Owner

tajmone commented May 15, 2021

The PML Reference Manual says of table_data:

Cell values in a row are separated by a comma or a TAB character.

This is a problem with Sublime Text if the user (or the PML package) has set it to use spaces for indentation, because the editor will automatically convert all Tabs to spaces, both while typing and at save time.
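
To make the failure mode concrete, here is a minimal Python sketch. It is not PML's actual parser: `split_cells` and `tabs_to_spaces` are hypothetical helpers, and the tab-to-spaces step only approximates what an editor does when it enforces space indentation.

```python
def split_cells(row: str) -> list[str]:
    """Hypothetical splitter: cells separated by a TAB (not PML's real parser)."""
    return [cell.strip() for cell in row.split("\t")]


def tabs_to_spaces(text: str, tab_size: int = 4) -> str:
    """Approximates an editor setting that replaces every TAB with spaces."""
    return text.expandtabs(tab_size)


original = "red apple\tgreen pear\tblue plum"
converted = tabs_to_spaces(original)

print(len(split_cells(original)))   # 3 cells
print(len(split_cells(converted)))  # 1 cell: the separators are gone
```

After the conversion nothing in the text distinguishes a separator from an ordinary space inside a cell, so the original cell boundaries cannot be recovered from the file alone.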

Until PML removes Tabs as CSV separators, it's going to be problematic to implement table_data — not to mention testing it.

There are already plans to fix this in future PML editions, providing alternative separators as well as introducing a new attribute to specify a custom record-separator.

Implementation of table_data should be blocked and postponed until a new PML release fixes this.

UPDATE: PML 2.3.0 (see Changelog) partly fixed this by also allowing the vertical pipe and semicolon characters as row separators.

This mitigates the problem by giving users alternatives to Tabs, but it does not help with third-party documents that do use Tabs as separators: if the user has set ST to enforce spaces, all separators will be lost at the first save operation.

@tajmone tajmone added ⭐ PML syntax Topic: PML syntax definition for ST4 ⚠️ blocked Action blocked until PMLC is updated/fixed labels May 15, 2021
@pml-lang
Collaborator

In the current PML version the user can use a comma or a TAB to separate cells. PML chooses the separator based on the first separator encountered in the data. The only sound reason for choosing a TAB would be that some cells contain commas.
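
The "first separator encountered" rule can be sketched as follows. This is only an illustration of the rule as described above, not PML's implementation, and `detect_separator` is a hypothetical name.

```python
from typing import Optional


def detect_separator(data: str, candidates: str = ",\t") -> Optional[str]:
    """Return the first candidate separator found in the data, or None.

    Sketch of the rule described above: whichever of comma or TAB
    appears first decides how every cell is split.
    """
    for char in data:
        if char in candidates:
            return char
    return None


print(repr(detect_separator("a, b, c")))    # ','
print(repr(detect_separator("a b\tc, d")))  # '\t'
print(repr(detect_separator("a b c")))      # None
```

Under this rule, a TAB-separated table whose Tabs have been converted to spaces either falls back to the comma (if any cell contains one) or has no detectable separator at all.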

providing alternative separators as well as introducing a new attribute to specify a custom record-separator.

Yes. It will work like this in a future version:

  • By default, cells are separated by commas, and rows by new lines.
  • An additional parameter will allow the user to select a different cell separator (can be one or more characters). Example: cell_sep = "|". A TAB separator will also be supported (if the user explicitly asks for it), but then a warning should be emitted to warn the user about the potential problems he/she or other people with other editor configurations might encounter.
  • An additional parameter will allow the user to select a different row separator (can be one or more characters). Example: row_sep = "\n===\n".
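
As a rough sketch of how the planned parameters could behave (an assumption based only on the description above, not the future PML implementation), the Python function below takes `cell_sep` and `row_sep` arguments and warns when a TAB is requested explicitly. `parse_table` is a hypothetical name.

```python
import warnings


def parse_table(data: str, cell_sep: str = ",", row_sep: str = "\n") -> list[list[str]]:
    """Split table data into rows and cells using configurable separators."""
    if "\t" in cell_sep:
        # Mirrors the proposed behaviour: TAB is allowed, but flagged as risky.
        warnings.warn("TAB separators may be converted to spaces by some editors")
    rows = [row for row in data.split(row_sep) if row.strip()]
    return [[cell.strip() for cell in row.split(cell_sep)] for row in rows]


# Defaults: commas between cells, new lines between rows.
print(parse_table("1, 2\n3, 4"))

# Custom multi-character separators, as in the examples above.
print(parse_table("a|b\n===\nc|d", cell_sep="|", row_sep="\n===\n"))
```

Making the separators explicit removes the need to guess them from the data, which is what makes the TAB case detectable enough to warn about.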

An insert_table node will also be added. This allows table data to be imported from files or URLs (e.g. a CSV file). This node will also have cell_sep and row_sep parameters, as well as other parameters to make table importing as flexible as possible.

@tajmone
Owner Author

tajmone commented May 18, 2021

The upcoming additional parameters are very good.

A TAB separator will also be supported (if the user explicitly asks for it), but then a warning should be emitted to warn the user about the potential problems he/she or other people with other editor configurations might encounter.

That's an excellent idea!

The only sound reason for choosing a TAB would be that some cells contain commas.

I was wondering why with the comma separator cells with commas can't be handled by enclosing them within double quotes:

[table_data
    cell 1 without commas, "cell 2, with comma", cell 3
table_data]

This is usually accepted in standard CSV files, where the enclosing double-quote delimiters are stripped from the results:

  • cell 1 without commas
  • cell 2, with comma
  • cell 3
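
For comparison, this is how Python's standard `csv` module handles the same row. It shows conventional CSV quoting only and says nothing about how PML parses or will parse `table_data`.

```python
import csv
import io

row = 'cell 1 without commas, "cell 2, with comma", cell 3'

# skipinitialspace=True lets the reader recognise a quote that follows
# the space after a comma, as in the row above.
reader = csv.reader(io.StringIO(row), skipinitialspace=True)
cells = [cell.strip() for cell in next(reader)]

for cell in cells:
    print(cell)
```

The quoted cell keeps its inner comma and loses its enclosing double quotes, giving the three cells listed above.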

@pml-lang
Collaborator

why with the comma separator cells with commas can't be handled by enclosing them within double quotes

Enclosing cells with double quotes (or single quotes) is yet another feature that will be supported in an upcoming version.

@tajmone
Owner Author

tajmone commented Jul 13, 2022

Problem Mitigated But Not Solved!

@pml-lang, the addition of v-pipe and semicolon as row-separators (PML 2.3.0) does mitigate the problem on the writer's side since it allows more choices to avoid tabs.

Beware, though, that it doesn't solve the original problem: if a user has configured ST to use only spaces for indentation, the editor will still replace all existing Tabs at save time, which means that any edits to third-party PML sources using Tabs will end up corrupting [table_data nodes!

This is true not just for ST but for any editor which supports indentation settings and enforcing them, not to mention code linting tools and plug-ins.
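
One way to spot at-risk documents before opening them in such an editor is to scan for Tabs inside table_data blocks. The Python sketch below is a rough heuristic, not part of PML or the ST package: it assumes the `[table_data ... table_data]` delimiters used in the example earlier in this thread, and `table_rows_with_tabs` is a hypothetical helper.

```python
import re

# Assumes blocks are delimited as in the example earlier in this thread.
TABLE_DATA = re.compile(r"\[table_data\b(.*?)table_data\]", re.DOTALL)


def table_rows_with_tabs(source: str) -> list[str]:
    """Return table_data rows that contain a TAB between cells.

    Leading and trailing whitespace is stripped first, so Tabs used
    purely for indentation are not reported.
    """
    hits = []
    for block in TABLE_DATA.finditer(source):
        for line in block.group(1).splitlines():
            stripped = line.strip()
            if "\t" in stripped:
                hits.append(stripped)
    return hits


document = "[table_data\n    a\tb\n    c, d\ntable_data]\n"
print(table_rows_with_tabs(document))  # ['a\tb']
```

A non-empty result means that saving the file with Tab-to-space conversion enabled would destroy those separators.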

IMO, relying on Tabs as a separator is a bad choice that can lead to disruption of documents in collaborative editing or derivative projects. What makes this issue particularly noxious is that the user will be oblivious to the damage taking place, since it happens silently in the background, and by the time he/she realizes it, it will be too late (the original separators can't be restored unless you've kept a backup or are version controlling the document).

The number of people preferring space indentation over tabs is significant enough to expect that many (if not most) such users will take advantage of their editor's indentation-enforcing settings, which could result in huge frustration when working with PML documents.

I strongly advise dropping the Tab separator altogether — it's not worth keeping it seeing the potential damage it can do; and relying on Tabs for syntax semantics is not a wise choice either.

@pml-lang
Collaborator

I strongly advise dropping the Tab separator altogether — it's not worth keeping it seeing the potential damage it can do; and relying on Tabs for syntax semantics is not a wise choice either.

Yes, I fully agree.

I've already removed the TAB separator in my local dev branch. TAB will no longer be supported in version 3.


2 participants