main: introduce --_makeTagEntryReflection-<LANG> option to filter tags #3027

Draft
wants to merge 5 commits into
base: master

masatake
Member

@masatake masatake commented May 15, 2021

See #3020.

The new option --_makeTagEntryReflection-<LANG> allows users to add optscript code which ctags runs when making a tag.
In the optscript code, you can modify the tag, make an extra tag derived from it, and/or suppress the tag from being printed.

Let's see examples.

input.md:

# TITLE LEVEL 1

## TITLE LEVEL 2

## ANOTHERTITLE LEVEL 2

# ANOTHER TITLE LEVEL 1

mymd.ctags

--_extradef-Markdown=withfname,appending input filename
--extras-Markdown=+{withfname}
--_makeTagEntryReflection-Markdown={{
    /Markdown.withfname _extraenabled {
        . :extras {
            /Markdown.withfname _amember not
        } {
            true
        } ifelse
        {
            % Make the original tag invisible
            . _markplaceholder

            mark . :name (@) . :input _buildstring
            . :kind
            . _tagloc _tag dup /Markdown.withfname _markextra
            _commit pop
        } if
    } if
}}

the command session:

$  ./ctags --options=mymd.ctags  --fields=+'{extras}' -o - input.md
ANOTHER TITLE LEVEL 1@input.md	input.md	/^# ANOTHER TITLE LEVEL 1$/;"	c	extras:withfname
ANOTHERTITLE LEVEL 2@input.md	input.md	/^## ANOTHERTITLE LEVEL 2$/;"	s	extras:withfname
TITLE LEVEL 1@input.md	input.md	/^# TITLE LEVEL 1$/;"	c	extras:withfname
TITLE LEVEL 2@input.md	input.md	/^## TITLE LEVEL 2$/;"	s	extras:withfname

The input file name is appended to the names of original tags.
The script hides the original tags.
If you don't want to hide them, remove the line . _markplaceholder in mymd.ctags.
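
To make the control flow of the script easier to follow, here is a rough Python model of it. This is only a sketch, not ctags code: the `Tag` class, the `reflect` function, and the `enabled` flag are hypothetical names invented for illustration. The check on `extras` is read here as a guard that keeps the script from deriving yet another tag from one it has just made; that reading is an assumption based on the script above, not on the ctags source.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    # Hypothetical stand-in for a ctags tag entry.
    name: str
    input: str
    kind: str
    extras: set = field(default_factory=set)
    placeholder: bool = False  # models the effect of `_markplaceholder`


def reflect(tag, enabled=True):
    """Return the tags that would be printed for one original tag."""
    # Mirrors: /Markdown.withfname _extraenabled { ... } if
    # and the `_amember not` test on the tag's extras.
    if not enabled or "withfname" in tag.extras:
        return [tag]
    # Mirrors: . _markplaceholder (hide the original tag).
    tag.placeholder = True
    # Mirrors: mark . :name (@) . :input _buildstring ... _tag ... _markextra
    derived = Tag(f"{tag.name}@{tag.input}", tag.input, tag.kind, {"withfname"})
    return [t for t in (tag, derived) if not t.placeholder]


if __name__ == "__main__":
    for t in reflect(Tag("TITLE LEVEL 1", "input.md", "c")):
        print(t.name, t.input, t.kind, sorted(t.extras))
```

Dropping the `tag.placeholder = True` line in this model corresponds to removing `. _markplaceholder` from mymd.ctags: both the original and the derived tag are then returned.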

TODO:

  • initForeignRefTagEntry used in rpmspec parser crashes with this pull request.
  • performance evaluation and optimization. This pull request increases sub parser calls.
  • test cases
  • reconsider the name of the option
  • introduce --_makeTagEntryNotification-<LANG> option and support it in optlib2c (?, should I do this when we really need it?).
  • documentation for ctags.io

@masatake masatake marked this pull request as draft May 15, 2021 18:43
@masatake
Member Author

@rickalex21, could you try this pull request?

$ git clone https://github.com/masatake/ctags.git
$ cd ctags
$ git checkout -b masatake-makeTagEntryReflection master
$ git pull https://github.com/masatake/ctags.git makeTagEntryReflection
$ bash ./autogen.sh
$ ./configure; make
$ ./ctags --quiet --options=NONE --version

--quiet --options=NONE keeps ctags from reading your ~/.ctags.d/*.ctags files.

I will merge this pull request, but not soon, because it includes one critical bug.

@masatake
Member Author

Without this pull request:

[jet@living]~/var/codebase% ./codebase ctags C
version: 6124a60c
features: +wildcards +regex +iconv +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript
log: results/6124a60c,C...................,..........,time......,default...,2021-05-16-17:46:07.log
tagsoutput: /dev/null
cat: code/qemu/perf.data.old: Permission denied
cat: code/qemu/perf.data: Permission denied
cmdline: + u-ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=C -o - -R code/linux code/php-src code/qemu code/r-source code/ruby
31391 files, 21700950 lines (613548 kB) scanned in 15.4 seconds (39951 kB/s)
1285017 tags added to tag file

real    0m16.377s
user    0m15.800s
sys     0m0.538s
+ set +x

With this pull request:

[jet@living]~/var/codebase% ./codebase ctags C
version: 34d67c4f
features: +wildcards +regex +iconv +option-directory +xpath +json +interactive +sandbox +yaml +packcc +optscript
log: results/34d67c4f,C...................,..........,time......,default...,2021-05-16-17:46:53.log
tagsoutput: /dev/null
cat: code/qemu/perf.data.old: Permission denied
cat: code/qemu/perf.data: Permission denied
cmdline: + u-ctags --quiet --options=NONE --sort=no --options=profile.d/maps --totals=yes --languages=C -o - -R code/linux code/php-src code/qemu code/r-source code/ruby
31391 files, 21700950 lines (613548 kB) scanned in 15.5 seconds (39637 kB/s)
1285017 tags added to tag file

real    0m16.487s
user    0m15.935s
sys     0m0.512s
+ set +x

I repeated the test. With this pull request, ctags runs slightly slower, but the impact is acceptable.
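
For a sense of scale, the relative slowdown can be worked out from the two `time` reports above. The short Python sketch below uses only the `real` and `user` figures quoted in this comment; the helper name `slowdown_percent` is made up for illustration, and a single pair of runs says nothing about run-to-run noise.

```python
def slowdown_percent(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Wall-clock and user CPU seconds taken from the two runs above.
real = slowdown_percent(16.377, 16.487)
user = slowdown_percent(15.800, 15.935)

print(f"real: +{real:.2f}%")
print(f"user: +{user:.2f}%")
```

Both figures come out below one percent for this C-only corpus.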

@masatake masatake changed the title main: introduce --_makeTagEntryReflection-<LANG> to filer tags main: introduce --_makeTagEntryReflection-<LANG> option to filter tags May 16, 2021
@codecov

codecov bot commented May 16, 2021

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.52%. Comparing base (df13338) to head (ff6417f).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3027      +/-   ##
==========================================
+ Coverage   85.50%   85.52%   +0.02%     
==========================================
  Files         237      237              
  Lines       57042    57088      +46     
==========================================
+ Hits        48771    48822      +51     
+ Misses       8271     8266       -5     


@leleliu008
Member

OpenBSD has been updated from 6.8 to 6.9, and the new version of the OpenBSD repository has updated automake from 1.16.2 to 1.16.3.

I have made PR #3029 to resolve this.

@masatake
Member Author

@leleliu008 , thank you!
I will rebase this pull request.

@rickalex21

@masatake I did follow your steps but I can't seem to get the same output as you. This is the output that I'm getting:

Command

ctags/ctags --options=mymd.ctags  --fields=+'{extras}' -o - input.md

Output

ANOTHER TITLE LEVEL 1	input.md	/^# ANOTHER TITLE LEVEL 1$/;"	c
ANOTHERTITLE LEVEL 2	input.md	/^## ANOTHERTITLE LEVEL 2$/;"	s	chapter:TITLE LEVEL 1
TITLE LEVEL 1	input.md	/^# TITLE LEVEL 1$/;"	c
TITLE LEVEL 2	input.md	/^## TITLE LEVEL 2$/;"	s	chapter:TITLE LEVEL 1

I created a dummy markdown-test page here, and the generated myTags and files are here.

Ideally, I would like to do :tag introduction python or tag python introduction.

Another thing I was thinking about is that, how would one deal with same name files like README.md? For
example look at this layout from this repo.

├── code
│   ├── README.md
│   ├── lua.md
│   └── python.md
└── ssg
    ├── README.md
    ├── hugo.md
    ├── nested
    │   ├── README.md
    │   └── html.md
    └── vuepress.md

In that situation you would have to go based on the folder :tag ssg introduction but that's not a big
deal. I was more concerned about the files like :tag linux python and :tag linux lua. Both ways
are fine, one could also do:

:tag python linux

python.md

## Linux
You can install python in linux via a package manager or source code.

@masatake
Member Author

Looks strange.

ctags/ctags --options=mymd.ctags  --fields=+'{extras}' -o - input.md

Could you try the following command line?

ctags/ctags --options=NONE --options=mymd.ctags  --extras=+'{pseudo}' --fields=+'{extras}' -o - input.md 

I expect you will get:

ctags/ctags: Notice: No options will be read from files or environment
!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/;"	extras:pseudo
!_TAG_FILE_SORTED	1	/0=unsorted, 1=sorted, 2=foldcase/;"	extras:pseudo
!_TAG_OUTPUT_EXCMD	mixed	/number, pattern, mixed, or combineV2/;"	extras:pseudo
!_TAG_OUTPUT_FILESEP	slash	/slash or backslash/;"	extras:pseudo
!_TAG_OUTPUT_MODE	u-ctags	/u-ctags or e-ctags/;"	extras:pseudo
!_TAG_PATTERN_LENGTH_LIMIT	96	/0 for no limit/;"	extras:pseudo
!_TAG_PROC_CWD	/home/jet/var/ctags-github/	//;"	extras:pseudo
!_TAG_PROGRAM_AUTHOR	Universal Ctags Team	//;"	extras:pseudo
!_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/;"	extras:pseudo
!_TAG_PROGRAM_URL	https://ctags.io/	/official site/;"	extras:pseudo
!_TAG_PROGRAM_VERSION	5.9.0	/583114525/;"	extras:pseudo
ANOTHER TITLE LEVEL 1@input.md	input.md	/^# ANOTHER TITLE LEVEL 1$/;"	c	extras:withfname
ANOTHERTITLE LEVEL 2@input.md	input.md	/^## ANOTHERTITLE LEVEL 2$/;"	s	extras:withfname
TITLE LEVEL 1@input.md	input.md	/^# TITLE LEVEL 1$/;"	c	extras:withfname
TITLE LEVEL 2@input.md	input.md	/^## TITLE LEVEL 2$/;"	s	extras:withfname

@rickalex21

rickalex21 commented May 17, 2021

I tested it with my markdown repo and it looks like that works when I do the tag like this:

:tag linux@markdowndir/code/python.md

Is there a way where I can just do:

:tag linux python

OR

:tag python linux

@masatake This does look the same as yours. I also got this message: ctags: Notice: No options will be read from files or environment

Here is my output:

!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/;"	extras:pseudo
!_TAG_FILE_SORTED	1	/0=unsorted, 1=sorted, 2=foldcase/;"	extras:pseudo
!_TAG_OUTPUT_EXCMD	mixed	/number, pattern, mixed, or combineV2/;"	extras:pseudo
!_TAG_OUTPUT_FILESEP	slash	/slash or backslash/;"	extras:pseudo
!_TAG_OUTPUT_MODE	u-ctags	/u-ctags or e-ctags/;"	extras:pseudo
!_TAG_PATTERN_LENGTH_LIMIT	96	/0 for no limit/;"	extras:pseudo
!_TAG_PROC_CWD	/Users/ritchie/Public/	//;"	extras:pseudo
!_TAG_PROGRAM_AUTHOR	Universal Ctags Team	//;"	extras:pseudo
!_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/;"	extras:pseudo
!_TAG_PROGRAM_URL	https://ctags.io/	/official site/;"	extras:pseudo
!_TAG_PROGRAM_VERSION	5.9.0	/58311452/;"	extras:pseudo
ANOTHER TITLE LEVEL 1@input.md	input.md	/^# ANOTHER TITLE LEVEL 1$/;"	c	extras:withfname
ANOTHERTITLE LEVEL 2@input.md	input.md	/^## ANOTHERTITLE LEVEL 2$/;"	s	extras:withfname
TITLE LEVEL 1@input.md	input.md	/^# TITLE LEVEL 1$/;"	c	extras:withfname
TITLE LEVEL 2@input.md	input.md	/^## TITLE LEVEL 2$/;"	s	extras:withfname

@masatake
Member Author

@masatake This does look the same as yours.

Thank you for trying. Now we have confirmed that my change works as expected.

You had to add --options=NONE. It implies you did something in your .ctags.d/*.ctags files.
The options you specified in your .ctags files might conflict with mymd.ctags.
Revise your .ctags files.

I don't know vim well, so I cannot help you with its usage.
Transform your request to "what kind of tags output you need".
I know well about ctags and tags output. So I think I can help you in this aspect.

tags file:

ANOTHER TITLE LEVEL 1	input.md	/^# ANOTHER TITLE LEVEL 1$/;"	c

We call the field that holds ANOTHER TITLE LEVEL 1 the "name field".
As I showed, ctags has a scripting language, so you can arrange the name field as you want.
Appending @ and the file name is just an example of what u-ctags can do for your purpose.

Suppose "Linux" is a heading in "foo/bar/baz.md".
Which ones do you want to have in the tags file?

  • "Linux baz"
  • "Linux baz.md"
  • "Linux bar/baz"
  • "Linux bar/baz.md"
  • "Linux foo/bar/baz"
  • "Linux foo/bar/baz.md"

Instead of choosing some of them, you can write down the algorithm for making the extended names.
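
One way to write such an algorithm down is sketched below in Python. It is only an illustration of the naming choices listed above, not something ctags runs: the function name `candidate_names` and the `sep` parameter are hypothetical. It enumerates the heading followed by progressively longer tails of the input path, each with and without the file extension.

```python
from pathlib import PurePosixPath


def candidate_names(heading, path, sep=" "):
    """List extended tag names built from a heading and its input path."""
    parts = PurePosixPath(path).parts
    names = []
    for depth in range(1, len(parts) + 1):
        # Keep the last `depth` path components, e.g. "bar/baz.md".
        tail = PurePosixPath(*parts[-depth:])
        names.append(f"{heading}{sep}{tail.with_suffix('')}")  # without extension
        names.append(f"{heading}{sep}{tail}")                  # with extension
    return names


if __name__ == "__main__":
    for name in candidate_names("Linux", "foo/bar/baz.md"):
        print(name)
```

Picking one line of the printed list, or describing a rule that selects one, is enough to specify the wanted tags output.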

@rickalex21

@masatake

The options you specified in your .ctags files might conflict with mymd.ctags.
Revise your .ctags files.

I'm not sure what I need to revise I used the files that you provided input.md and mymd.ctags? The
only thing that's in my ~/.ctags.d/ is this:

├── defaultmd.ctags
├── frontmatter.ctags
├── markdownNew.ctags
├── md.ctags
├── simple-web.ctags
├── tags
├── txt.ctags
└── web.ctags

I don't think these conflict with anything unless I use them in the command line?

Transform your request to "what kind of tags output you need".
I know well about ctags and tags output. So I think I can help you in this aspect.

Sorry that I was not explicit. What I was trying to say is how can I get these tags?

  • "getting started lua"
  • "getting started python"
  • "installing python"
  • "installing lua"
  • "introduction lua"
  • "introduction python"

I edited the tags file to show the expected result:

!_TAG_FILE_FORMAT	2	/extended format; --format=1 will not append ;" to lines/;"	extras:pseudo
!_TAG_FILE_SORTED	1	/0=unsorted, 1=sorted, 2=foldcase/;"	extras:pseudo
!_TAG_OUTPUT_EXCMD	mixed	/number, pattern, mixed, or combineV2/;"	extras:pseudo
!_TAG_OUTPUT_FILESEP	slash	/slash or backslash/;"	extras:pseudo
!_TAG_OUTPUT_MODE	u-ctags	/u-ctags or e-ctags/;"	extras:pseudo
!_TAG_PATTERN_LENGTH_LIMIT	96	/0 for no limit/;"	extras:pseudo
!_TAG_PROC_CWD	/Users/ritchie/Public/testdir/	//;"	extras:pseudo
!_TAG_PROGRAM_AUTHOR	Universal Ctags Team	//;"	extras:pseudo
!_TAG_PROGRAM_NAME	Universal Ctags	/Derived from Exuberant Ctags/;"	extras:pseudo
!_TAG_PROGRAM_URL	https://ctags.io/	/official site/;"	extras:pseudo
!_TAG_PROGRAM_VERSION	5.9.0	/58311452/;"	extras:pseudo
getting Started lua	code/lua.md	/^## Getting Started$/;"	s	extras:withfname
getting started python	code/python.md	/^## Getting Started$/;"	s	extras:withfname
installing lua	code/lua.md	/^## Installing $/;"	s	extras:withfname
installing python	code/python.md	/^## Installing $/;"	s	extras:withfname
introduction lua	code/lua.md	/^## Introduction$/;"	s	extras:withfname
introduction python	code/python.md	/^## Introduction$/;"	s	extras:withfname
parent: 'code'@code/lua.md	code/lua.md	/^        parent: 'code'$/;"	s	extras:withfname
parent: 'code'@code/python.md	code/python.md	/^        parent: 'code'$/;"	s	extras:withfname

Out of these files:

code/python.md

## Introduction

Python is an interpreted high-level general-purpose programming language.

## Getting Started

In this section we will cover the following topics.

* Installing
* Configuring
* Hello World

## Installing 

Python usually comes pre-installed on most systems but you will need the latest version
of python. There are many ways to install the latest version.

code/lua.md

## Introduction

Lua is is a lightweight, high-level, multi-paradigm programming language designed
primarily for embedded use in applications.

## Getting Started

In this section we will cover the following topics.

* Installing
* Configuring
* Hello World

## Installing 

To get started with Lua, visit the [Lua Download Page](https://www.lua.org/download.html)
and follow the instructions or see the options below.

I don't know why it captured this:

parent: 'code'@code/python.md	code/python.md	/^        parent: 'code'$/;"	s	extras:withfname

I think it had something to do with the frontmatter in the file which I removed.

---
title: Python
draft: false
date: 2021-05-16T13:15:31-05:00
tags: ['code','python']
menu:
    main:
        parent: 'code'
---

Thanks for your help, I appreciate it.

@masatake
Member Author

masatake commented May 18, 2021

---
title: Python
draft: false
date: 2021-05-16T13:15:31-05:00
tags: ['code','python']
menu:
    main:
        parent: 'code'
---

Is this written in the standard markdown syntax?
I guess a dialect (or extension) of markdown is used here.
Could you tell me a web page where the syntax is explained?
I want to fix the markdown parser of ctags not to make the tag for "parent: 'code'".


I found it.

https://github.com/jekyll/jekyll/blob/6855200ebda6c0e33f487da69e4e02ec3d8286b7/docs/_docs/step-by-step/03-front-matter.md

@rickalex21

@masatake

Is this written in the standard markdown syntax?

No, it's not Markdown; it's front matter, written in YAML. It is special information (metadata) used by static site generators like Hugo and, as you mentioned, Jekyll. Pandoc, VuePress, and many others use it. This is extra information that is needed to create HTML documents.

Your default markdown.ctags ignores it because it does not match the patterns.

Could you tell me a web page where the syntax is explained?

You found it; that's a good example.

This is an example template:

 ---
title: "{{ replace .Name "-" " " | title }}"
date: {{ .Date }}
draft: false
tags: ['code','{{.Name}}']
menu:
    main:
        parent: 'code'
---

Then this would be created:

---
title: "Example"
date: 2021-05-17T23:20:21-05:00
draft: false
tags: ['code','example']
menu:
    main:
        parent: 'code'
---

## Introduction

Markdown starts here.

I want to fix the markdown parser of ctags not to make the tag for "parent: 'code'".

I agree that the command ctags/ctags --options=NONE --options=mymd.ctags --extras=+'{pseudo}' --fields=+'{extras}' -o - input.md should not capture the front matter; it should ignore
anything between the --- delimiters.

I already have another tags file that I create that captures the frontmatter (the 'code' and 'example' tags).

frontmatter.ctags

--langdef=frontmatter
--map-frontmatter=+.md

--kinddef-frontmatter=t,tags,front matter tags
--_tabledef-frontmatter=toplevel
--_tabledef-frontmatter=tag

--_mtable-regex-frontmatter=toplevel/\ntags:[ \t]*\[[ ]?['"]//{tenter=tag}

--_mtable-regex-frontmatter=toplevel/.//

--_mtable-regex-frontmatter=tag/([a-zA-Z0-9]+)/\1/t/
--_mtable-regex-frontmatter=tag/['"]\]//{tquit}


--_mtable-regex-frontmatter=tag/.//

--exclude=.git
--exclude=vim.md

@masatake
Copy link
Member Author

masatake commented May 18, 2021

Thank you.

Before introducing the --_makeTagEntryReflection-<LANG> option, which provides the features that fix your original issue, I would like to introduce a Frontmatter parser to ctags. Could you consider putting your frontmatter.ctags to your git repository at GitHub with the following header?

#
#  Copyright (c) 2012, YOUR NAME HERE
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301,
# USA.
#
#

If I can merge your frontmatter.ctags to our ctags source tree, you can remove it from your "~/.ctags.d".

@masatake
Member Author

I found that front matter has two notations: JSON and YAML.
ctags has parsers for JSON and YAML, so I wonder whether we can use them to process the front matter part.
The subparser mechanism can be used. However, this would be the first time we write a subparser based on multiple base parsers.

@rickalex21

rickalex21 commented May 18, 2021

Could you consider putting your frontmatter.ctags to your git repository at GitHub with the following header?

You want to use my frontmatter.ctags in your repo? I'm ok with that but that file is not optimized, do you think you can optimize it to make it faster? My *.ctags files are not as fast as yours.

JSON and yaml

Toml, yaml, and JSON. Toml and yaml are more popular for front matter.

@masatake
Member Author

You want to use my frontmatter.ctags in your repo? I'm ok with that but that file is not optimized, do you think you can optimize it to make it faster? My *.ctags files are not as fast as yours.

Yes, I would like to introduce your parser to ctags.
I will arrange the markdown parser to run the Frontmatter parser only on the area between --- and ---.
So even if the Frontmatter parser is slow, the impact is small.

In the future, I will rewrite the Frontmatter parser as a subparser of YAML.
The YAML parser is written in C, so I think it will be much faster.

However, the detailed parser implementation is not important here.
The most important thing in writing a parser is kind design: what we should extract from input files, and what kind we should assign to.
From your .ctags, I learn "tag" is one of the kind. How about title? I think users expect title is also extracted.

@masatake
Member Author

Your default markdown.ctags ignores it because it does not match the patterns.

It is not true. The markdown.ctags couldn't ignore effectively.

I opened #3031. With the change, the markdown parser can skip the frontmatter area.
If we merge #3031, ctags will no longer emit

parent: 'code'@code/python.md	code/python.md	/^        parent: 'code'$/;"	s	extras:withfname

You may want to extract "tags:" in the frontmatter area. Let's find a way to extract them after solving #3020.

@masatake
Member Author

I'm not sure what I need to revise I used the files that you provided input.md and mymd.ctags? The
only thing that's in my ~/.ctags.d/ is this:

├── defaultmd.ctags
├── frontmatter.ctags
├── markdownNew.ctags
├── md.ctags
├── simple-web.ctags
├── tags
├── txt.ctags
└── web.ctags

I don't think these conflict with anything unless I use them in the command line?

They are loaded automatically when ctags starts.
From the name of files, I guess they customize the markdown parser. I think anything can happen.
--options=NONE makes ctags not load the .ctags files.

I cannot say much without reading the files.
What I know now is that the following line of frontmatter.ctags can cause trouble:

--map-frontmatter=+.md

With this line, we cannot predict whether the Markdown parser or the frontmatter parser runs on a file with the .md suffix.

@masatake
Member Author

masatake commented May 18, 2021

Here is the new mymd.ctags:

--_extradef-Markdown=withfname,appending input filename
--extras-Markdown=+{withfname}
--_prelude-Markdown={{
    /dropext {
        ?. _strrchr {
            0 exch 0 string _copyinterval
        } if
    } def
    /basename {
        ?/ _strrchr {
            1 add dup 2 index length exch sub
            0 string _copyinterval
        } if
    } def
    /dirname {
        ?/ _strrchr {
            0 exch 0 string _copyinterval
        } {
            pop 0 string
        } ifelse
    } def
}}

--_makeTagEntryReflection-Markdown={{
    /Markdown.withfname _extraenabled {
        . :extras {
            /Markdown.withfname _amember not
        } {
            true
        } ifelse
        {
            % Make the original tag invisible
            . _markplaceholder

            mark
            . :name ( )
            . :input dirname length 0 gt {
                . :input dirname basename
                ?/
            } if
            . :input dropext basename _buildstring
            . :kind
            . _tagloc _tag dup /Markdown.withfname _markextra
            _commit pop
        } if
    } if
}}

tags output:

[jet@living]~/var/ctags-github%  ./ctags --quiet --options=NONE --options=mymd.ctags -R --fields=+'{extras}' -o - /tmp/markdown-test
Child H3 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^### Child H3$/;"	S	extras:withfname
Community code/python	/tmp/markdown-test/code/python.md	/^## Community$/;"	s	extras:withfname
Components ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^## Components$/;"	s	extras:withfname
Custom Markdown nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Custom Markdown$/;"	s	extras:withfname
Deep H5 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^##### Deep H5$/;"	T	extras:withfname
Features markdown-test/README	/tmp/markdown-test/README.md	/^### Features$/;"	S	extras:withfname
Getting Started code/lua	/tmp/markdown-test/code/lua.md	/^## Getting Started$/;"	s	extras:withfname
Getting Started code/python	/tmp/markdown-test/code/python.md	/^## Getting Started$/;"	s	extras:withfname
Getting Started ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^## Getting Started$/;"	s	extras:withfname
Getting Started ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^## Getting Started$/;"	s	extras:withfname
Header H3 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^### Header H3$/;"	S	extras:withfname
Header H4 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^#### Header H4$/;"	t	extras:withfname
Header H5 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^##### Header H5$/;"	T	extras:withfname
Header H6 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^###### Header H6$/;"	u	extras:withfname
Header2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Header2$/;"	s	extras:withfname
Header3 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^### Header3$/;"	S	extras:withfname
Header4 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^#### Header4$/;"	t	extras:withfname
Header5 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^##### Header5$/;"	T	extras:withfname
Header6 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^###### Header6$/;"	u	extras:withfname
Headers H2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Headers H2$/;"	s	extras:withfname
Html Test Page nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Html Test Page$/;"	s	extras:withfname
Installing code/lua	/tmp/markdown-test/code/lua.md	/^## Installing $/;"	s	extras:withfname
Installing code/python	/tmp/markdown-test/code/python.md	/^## Installing $/;"	s	extras:withfname
Installing ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^## Installing $/;"	s	extras:withfname
Installing ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^## Installing $/;"	s	extras:withfname
Introduction code/README	/tmp/markdown-test/code/README.md	/^## Introduction$/;"	s	extras:withfname
Introduction code/lua	/tmp/markdown-test/code/lua.md	/^## Introduction$/;"	s	extras:withfname
Introduction code/python	/tmp/markdown-test/code/python.md	/^## Introduction$/;"	s	extras:withfname
Introduction ssg/README	/tmp/markdown-test/ssg/README.md	/^## Introduction$/;"	s	extras:withfname
Introduction ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^## Introduction$/;"	s	extras:withfname
Introduction ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^## Introduction$/;"	s	extras:withfname
Linux code/lua	/tmp/markdown-test/code/lua.md	/^### Linux$/;"	S	extras:withfname
Linux code/python	/tmp/markdown-test/code/python.md	/^### Linux$/;"	S	extras:withfname
Linux ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^### Linux$/;"	S	extras:withfname
Linux ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^### Linux$/;"	S	extras:withfname
List Elements nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## List Elements$/;"	s	extras:withfname
Mac code/lua	/tmp/markdown-test/code/lua.md	/^### Mac$/;"	S	extras:withfname
Mac code/python	/tmp/markdown-test/code/python.md	/^### Mac$/;"	S	extras:withfname
Mac ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^### Mac$/;"	S	extras:withfname
Mac ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^### Mac$/;"	S	extras:withfname
Max Nest H6 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^###### Max Nest H6$/;"	u	extras:withfname
Nested Folder nested/README	/tmp/markdown-test/ssg/nested/README.md	/^## Nested Folder$/;"	s	extras:withfname
Nested H4 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^#### Nested H4$/;"	t	extras:withfname
Pandoc code/lua	/tmp/markdown-test/code/lua.md	/^## Pandoc$/;"	s	extras:withfname
Parent One H2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Parent One H2$/;"	s	extras:withfname
Parent Three H2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Parent Three H2$/;"	s	extras:withfname
Parent Two H2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## Parent Two H2$/;"	s	extras:withfname
Speed ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^## Speed$/;"	s	extras:withfname
Summary code/lua	/tmp/markdown-test/code/lua.md	/^## Summary$/;"	s	extras:withfname
Summary code/python	/tmp/markdown-test/code/python.md	/^## Summary$/;"	s	extras:withfname
Summary ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^## Summary$/;"	s	extras:withfname
Summary ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^## Summary$/;"	s	extras:withfname
The End H2 nested/html	/tmp/markdown-test/ssg/nested/html.md	/^## The End H2$/;"	s	extras:withfname
Topic code/README	/tmp/markdown-test/code/README.md	/^## Topic$/;"	s	extras:withfname
Topic ssg/README	/tmp/markdown-test/ssg/README.md	/^## Topic$/;"	s	extras:withfname
Windows code/lua	/tmp/markdown-test/code/lua.md	/^### Windows$/;"	S	extras:withfname
Windows code/python	/tmp/markdown-test/code/python.md	/^### Windows$/;"	S	extras:withfname
Windows ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^### Windows$/;"	S	extras:withfname
Windows ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^### Windows$/;"	S	extras:withfname
parent: "code" nested/html	/tmp/markdown-test/ssg/nested/html.md	/^        parent: "code"$/;"	s	extras:withfname
parent: 'code' code/lua	/tmp/markdown-test/code/lua.md	/^        parent: 'code'$/;"	s	extras:withfname
parent: 'code' code/python	/tmp/markdown-test/code/python.md	/^        parent: 'code'$/;"	s	extras:withfname
parent: 'code' ssg/README	/tmp/markdown-test/ssg/README.md	/^        parent: 'code'$/;"	s	extras:withfname
parent: 'ssg' nested/README	/tmp/markdown-test/ssg/nested/README.md	/^        parent: 'ssg'$/;"	s	extras:withfname
parent: 'ssg' ssg/hugo	/tmp/markdown-test/ssg/hugo.md	/^        parent: 'ssg'$/;"	s	extras:withfname
parent: 'ssg' ssg/vuepress	/tmp/markdown-test/ssg/vuepress.md	/^        parent: 'ssg'$/;"	s	extras:withfname
tags: ['code'] code/README	/tmp/markdown-test/code/README.md	/^tags: ['code']$/;"	s	extras:withfname
[jet@living]~/var/ctags-github% 

The parent: lines at the end should disappear once we merge #3031.
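
For readers not used to optscript, the name-building logic of the new mymd.ctags can be restated in Python. This is a sketch for explanation only: the three helpers mirror `dropext`, `basename`, and `dirname` from the prelude as they are written above (last `.` or `/` found by `_strrchr`), while `extended_name` is a hypothetical name for the part between `mark` and `_buildstring`.

```python
def dropext(s):
    """Cut the string at its last '.', if there is one."""
    i = s.rfind(".")
    return s[:i] if i >= 0 else s


def basename(s):
    """Keep what follows the last '/', or the whole string."""
    i = s.rfind("/")
    return s[i + 1:] if i >= 0 else s


def dirname(s):
    """Keep what precedes the last '/', or return an empty string."""
    i = s.rfind("/")
    return s[:i] if i >= 0 else ""


def extended_name(name, input_path):
    """Build "<name> <parent dir>/<file without extension>"."""
    pieces = [name, " "]
    directory = dirname(input_path)
    if len(directory) > 0:
        pieces += [basename(directory), "/"]
    pieces.append(basename(dropext(input_path)))
    return "".join(pieces)


if __name__ == "__main__":
    print(extended_name("Child H3", "/tmp/markdown-test/ssg/nested/html.md"))
    print(extended_name("Features", "/tmp/markdown-test/README.md"))
```

Only the immediate parent directory is used, which is why `ssg/nested/html.md` yields `nested/html` and the top-level `README.md` yields `markdown-test/README` in the output above. Two files with the same name under identically named parent directories would still collide.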

@masatake masatake mentioned this pull request May 18, 2021
@rickalex21

rickalex21 commented May 18, 2021

@masatake

From your .ctags, I learn "tag" is one of the kind. How about title? I think users expect title is also extracted.

I think it's up to the users to decide what they want to capture from front matter.
I only use it to keep track of my tags: ['lua','python']. The thing about front
matter is that it's custom; anyone can put anything in it. There could be unlimited kinds, as shown in
this example. The link is a YAML file, but it can be a Markdown file by adding ---.

---
title: Computer Services
pricing:
  enable : true
  title : Our Plans
  pricing_table :
    # pricing table loop
    - name : Free
      price : $99
      unit : month
      services:
        - 1GB Disk Space
        - 10 Email Account
        - Script Installer
        - 1 GB Storage
        - 10 GB Bandwidth
        - 24/7 Tech Support
      button:
        label : "Signup"
        link : "#"
---

## Welcome

Hello, welcome to our pricing plans...

Look at this
front matter example as well. It shows that people can add a lot of data to a Markdown document;
ignoring this information can surely speed up ctags.

It is not true. The markdown.ctags couldn't ignore effectively.

You're right. What I meant is that it's not captured by markdown.ctags because it's not valid
Markdown. However, ignoring the front matter (YAML, TOML, JSON) could speed up
markdown.ctags. What is important is the first line, which could be --- or +++.
It can also be JSON, but things get tricky because I don't think there is a standard. I can only speak
for Hugo; you can read more about Hugo front matter here.
JSON is not commonly used in front matter, but I can see how it could be of benefit when pulling data from an API.
The best way to ignore it would be to start at line 1; front matter usually starts at line one. That is
the convention.

This is in yaml:

---
title: "Testing"
date: 2021-05-18T10:12:21-05:00
# How about a comment here? Don't confuse me with markdown.
draft: false
tags: ['code','testing']
menu:
    main:
        parent: 'code'
---

## Testing

This is a test.

This is in json:

{
   "title": "I'm in Json",
   "date": "2021-05-18T15:12:21.000Z",
   "draft": false,
   "tags": [
      "code",
      "testing"
   ],
   "menu": {
      "main": {
         "parent": "code"
      }
   }
}

## Testing

This is a test.

This is in toml:

+++
title = 'Testing in Toml'
date = 2021-05-18T15:12:21.000Z
draft = false
tags = [ "code", "testing" ]
#comment here, don't confuse me with markdown

[menu.main]
parent = "code"
+++

## Testing

This is a test.

I confirm that toml, json, and yaml work in my single.html template with this:

{{ .Params.tags }}

Output:

[code testing] 

I cannot say much without reading the files.

The files are here.

frontmatter.ctags - This captures tags in front matter. For example, suppose you had this front
matter:

pandoc.md

---
title: "Pandoc"
date: 2021-03-07T22:30:18-05:00
draft: false
tags: ['code','tutorials','pandoc']
menu:
    main:
        parent: "code"
---

It would output this in your tags file.

tags-fm

code	/some/path/pandoc.md	/^tags: ['code','tutorials','pandoc']$/;"	t	line:5
pandoc	/some/path/pandoc.md	/^tags: ['code','tutorials','pandoc']$/;"	t	line:5
tutorials	/some/path/pandoc.md	/^tags: ['code','tutorials','pandoc']$/;"	t	line:5

The command to run this is like this:

ctags --fields=+n -f /some/path/content/en/.tags-fm --languages=frontmatter -R /some/path/content/en/

md.ctags - This should be taken out of ~/.ctags.d/ to avoid conflicts. I created it as
a custom markdown parser to try to fix #2965. Now that it has been fixed, I don't need
it anymore. It is too slow; the default markdown.ctags is faster.

web.ctags - At the time I wanted to tag my html, css, scss, and js files with comments. I'm not sure
if this was the best approach to navigation. I haven't been using them; I've been on VS Code doing
web development.

Let's say you want to tag 'main' in all your files.

// main.js main function
function main(){

This would produce a tags file like this:

main	/path/to/main.js	/^\/\/ main.js main function$/;"	t	line:11

The command to generate the tags is this:

ctags --fields=+n -f /some/path/.tags --languages=web -R /some/path/layouts -R /some/path/assets/sass -R /some/path/static

The other files I deleted I don't need anymore.

I also deleted the file default.md. At the time I wanted to add pandoc markdown with the extension .pdc
to the list, so I copied your markdown.ctags and added it like this.

--langdef=defaultmd
--map-defaultmd=+.md
--map-defaultmd=+.pdc

You had to add --options=NONE. It implies you did something in your .ctags.d/*.ctags files.

The reason this command was not working, ctags/ctags --options=mymd.ctags --fields=+'{extras}' -o - input.md, is the ~/.ctags.d/frontmatter.md. So basically,
any time I want to create my custom.ctags, I have to run it with --options=NONE --languages=custom?
Does my frontmatter.ctags also affect the built-in markdown.ctags?

options=NOE

I think you meant NONE.

Here is the new mymd.ctags:

It looks good but I see 3 issues with this.

  1. Why is it finding my Makefile? I'm sure there's a way to only include certain files (e.g., .markdown, .md, .pdc, .pandoc).
  2. The tag should change from Installing code/lua to Installing lua. Is this something
    that can be manipulated in the mymd.ctags? I'm not familiar with optscript syntax.
  3. Front matter should be ignored; you're going to work on this in Markdown: skip frontmatter area #3031.
Getting Started code/lua	code/lua.md	/^## Getting Started$/;"	s	extras:withfname
Getting Started code/python	code/python.md	/^## Getting Started$/;"	s	extras:withfname
Installing code/lua	code/lua.md	/^## Installing $/;"	s	extras:withfname
Installing code/python	code/python.md	/^## Installing $/;"	s	extras:withfname
Introduction README	README.md	/^## Introduction$/;"	s	extras:withfname
Introduction code/lua	code/lua.md	/^## Introduction$/;"	s	extras:withfname
Introduction code/python	code/python.md	/^## Introduction$/;"	s	extras:withfname
Topic README	README.md	/^## Topic$/;"	s	extras:withfname
all	Makefile	/^all:$/;"	t
tags: ['code'] README	README.md	/^tags: ['code']$/;"	s	extras:withfname

I don't know too much about programming, masatake, but maybe it's worth taking a look at a
syntax tree. Ctags is limited by how fast it can process a regex, but it gets the job done for
sure, and refactoring would be a lot of work. There are already parsers out there, like
tree-sitter, to get ideas from. Perhaps the tags could be created
from a syntax tree, I don't know...

The downside to that is that you're limited by how fast a program can create a syntax tree to
extract the information that you want, in this case the tags.

Every time ctags runs, the entire project is parsed again. I'm not sure if there is a workaround for
this. One option is to make ctags faster, which is what you have been working on. Another option
would be to think about how ctags could work only on the files that changed and not the entire project.
If I only change 3 files out of 100 and I run ctags, it runs on all 100 files, right?
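
One way to picture the "only the changed files" idea is to compare each source file's modification time with that of the tags file. The sketch below is an assumption-laden illustration in Python, not a ctags feature: the helper name files_newer_than and its parameters are hypothetical, and it only finds candidates for re-tagging. Removing stale entries from the existing tags file and merging in the new ones is left out.

```python
import os


def files_newer_than(tags_path, root, extensions=(".md",)):
    """List files under root modified after the tags file was written.

    If the tags file does not exist yet, every matching file is returned.
    """
    try:
        tags_mtime = os.path.getmtime(tags_path)
    except OSError:
        tags_mtime = None  # no tags file yet: everything needs tagging

    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            if tags_mtime is None or os.path.getmtime(path) > tags_mtime:
                changed.append(path)
    return sorted(changed)
```

With 100 markdown files of which 3 were edited after the last run, this returns those 3 paths, which could then be handed to ctags instead of the whole tree.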

If you want to use my frontmatter.ctags it's right here.

Thanks

@masatake masatake mentioned this pull request May 18, 2021
19 tasks
@masatake
Member Author

Too many topics are included in this discussion. I cannot handle everything at once.
Let's focus on the original issue.
For parsing the front matter area, use #3032 (after #3031 is merged).

  1. The tag should change from Installing code/lua to Installing lua. Is this something
    that can be manipulated in the mymd.ctags? I'm not familiar with optscript syntax.

Here is the new mymd.ctags.

--_extradef-Markdown=withfname,appending input filename
--extras-Markdown=+{withfname}
--_prelude-Markdown={{
    /dropext {
        ?. _strrchr {
            0 exch 0 string _copyinterval
        } if
    } def
    /basename {
        ?/ _strrchr {
            1 add dup 2 index length exch sub
            0 string _copyinterval
        } if
    } def
}}

--_makeTagEntryReflection-Markdown={{
    /Markdown.withfname _extraenabled {
        . :extras {
            /Markdown.withfname _amember not
        } {
            true
        } ifelse
        {
            % Make the original tag invisible
            . _markplaceholder

            mark
            . :name ( )
            . :input dropext basename _buildstring
            . :kind
            . _tagloc _tag dup /Markdown.withfname _markextra
            _commit pop
        } if
    } if
}}
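
For readers who, like the reporter, are not familiar with optscript, the two prelude procedures can be mirrored in Python. This is a reading aid only, assuming that `_strrchr` finds the last occurrence of a character and that `_copyinterval` copies a start/count slice into a new string; the function withfname is a hypothetical name for what the `_buildstring` line assembles.

```python
def dropext(path):
    """Cut the string at the last '.', if there is one."""
    i = path.rfind(".")
    return path[:i] if i != -1 else path


def basename(path):
    """Keep what follows the last '/', if there is one."""
    i = path.rfind("/")
    return path[i + 1:] if i != -1 else path


def withfname(tag_name, input_path):
    """Build the new tag name: original name, a space, then the file stem.

    The order matches the option file: dropext runs first, then basename.
    """
    return tag_name + " " + basename(dropext(input_path))


print(withfname("Installing", "code/lua.md"))
print(withfname("Introduction", "README.md"))
```

Because dropext runs before basename, a dot in a directory name is treated as the extension separator: under these assumptions, an input such as `a.b/c` is reduced to `a`. Swapping the two calls would avoid that.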

@rickalex21

@masatake Thanks, it looks like that will work but there are a few issues.

  • As I mentioned earlier it's parsing my Makefile instead of markdown only files. See the all Makefile tag.
  • Another thing is that it's not getting the full path, the full path is going to be needed to reference the file. Like this:
Introduction python	/Users/ritchie/Public/testdir/code/python.md	/^## Introduction$/;"	s	extras:withfname

This is my output:

Getting Started lua	code/lua.md	/^## Getting Started$/;"	s	extras:withfname
Getting Started python	code/python.md	/^## Getting Started$/;"	s	extras:withfname
Installing lua	code/lua.md	/^## Installing $/;"	s	extras:withfname
Installing python	code/python.md	/^## Installing $/;"	s	extras:withfname
Introduction README	README.md	/^## Introduction$/;"	s	extras:withfname
Introduction lua	code/lua.md	/^## Introduction$/;"	s	extras:withfname
Introduction python	code/python.md	/^## Introduction$/;"	s	extras:withfname
Topic README	README.md	/^## Topic$/;"	s	extras:withfname
all	Makefile	/^all:$/;"	t
tags: ['code'] README	README.md	/^tags: ['code']$/;"	s	extras:withfname

@masatake
Member Author

As I mentioned earlier it's parsing my Makefile instead of markdown only files. See the all Makefile tag.

Add --exclude=Makefile to the command line before input files or directories.

Another thing is that it's not getting the full path, the full path is going to be needed to reference the file. Like this:

Add --tag-relative=no to the command line before input files or directories.

The ctags(1) man page explains both options.

@masatake
Member Author

The find command combined with the -L - option may also be useful for passing a specific list of input files.
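
As an alternative to find, the file list can be produced by a small script. The following is a minimal sketch in Python; the helper name markdown_files is hypothetical, and the extension list is taken from the ones mentioned earlier in this thread (.markdown, .md, .pdc, .pandoc).

```python
import os

# Extensions mentioned earlier in the thread; adjust as needed.
MARKDOWN_EXTENSIONS = (".markdown", ".md", ".pdc", ".pandoc")


def markdown_files(root):
    """Return the sorted paths of all markdown-like files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(MARKDOWN_EXTENSIONS):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Printing one path per line, for example with `print("\n".join(markdown_files(".")))`, and piping that into ctags with -L - gives ctags only those files, so a Makefile in the same tree is never read.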

@rickalex21

@masatake I'm still getting relative paths, but that's ok: it doesn't matter because
the tags file will always be in the root directory, so files can be found with the relative path. A full
path is not needed. However, with the markdown.ctags a full path is given. This is my makefile
command:

all:
	../ctags/ctags --options=NONE --exclude=Makefile --tag-relative=no --quiet  --options=mymd.ctags -R --fields=+'{extras}'

@masatake
Member Author

I'm sorry. I read ctags(1) of Exuberant Ctags. What we should read is that of Universal Ctags.
You can use --tag-relative=never instead of --tag-relative=no.

@rickalex21

@masatake Thanks, that works. 👍

@masatake
Member Author

masatake commented May 23, 2021

I would like to add more code and documents.

@masatake
Member Author

Merging this pull request will take more time.
My concerns are:

  • performance overhead: this pull request requires extra code execution every time a tag is made, for all kinds of all languages.
  • the name of the option: in the future, --_makeTagEntryReflection-<LANG> will be part of an interface used by users, not developers, so we cannot change it later.

@@ -29,8 +29,6 @@ extern langType getSubparserLanguage (subparser *s);

/* A base parser doesn't have to call the following three functions.
Member Author

The comment must be updated.

@@ -348,6 +349,14 @@ static void cppInitCommon(langType clientLang,
: clientLang) & CORK_SYMTAB))
? makeMacroTable ()
: NULL;

if (Cpp.lang != Cpp.clientLang
Member Author

@masatake masatake Sep 30, 2021

Comments are needed.
Performance evaluation is needed for C input.

@@ -381,6 +390,14 @@ static void cppClearMacroInUse (cppMacroInfo **pM)

extern void cppTerminate (void)
{
if (Cpp.lang != Cpp.clientLang
Member Author

Comments are needed.

@@ -159,4 +159,28 @@ const char ctagsCommonPrelude []=
" } forall\n"
" pop\n"
"} __bddef\n"
"\n"
"(string end:string _ENDWITH boolean)\n"
"/_endwith {\n"
Member Author

_strendwith may be a better name.

Member Author

Using _strrstr is one of the ways to shorten the proc.
Writing in C is another choice.
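
The idea of building an "ends with" test on top of a reverse substring search can be illustrated as follows. This is a Python sketch of the logic only, not the optscript code under review; it assumes `_strrstr` reports the offset of the last occurrence, which `str.rfind` does in Python, and the name ends_with is hypothetical.

```python
def ends_with(s, suffix):
    """Return True if s ends with suffix, using only a reverse search."""
    start = len(s) - len(suffix)
    if start < 0:
        # Without this guard, a suffix one character longer than s would
        # compare "not found" (-1) with start (-1) and wrongly succeed.
        return False
    # If s ends with suffix, the last occurrence must begin exactly at start.
    return s.rfind(suffix) == start
```

The guard for a suffix longer than the string is the part that is easy to miss when shortening such a procedure.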

@@ -385,10 +386,17 @@ extern parserDefinition* RpmSpecParser (void)
"rpm-spec", /* the mode name in Emacs */
NULL };
parserDefinition* const def = parserNew ("RpmSpec");

static parserDependency dependencies [] = {
Member Author

The need for change must be explained in the commit log.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
TODO:
- write about the new option in docs/optlib.rst.
- add --_makeTagEntryNotification-<LANG>={{...}}.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>