This repository has been archived by the owner on Aug 21, 2024. It is now read-only.

add option use-git-cache which makes single git log ... call #87

Open
wants to merge 2 commits into
base: main
25 changes: 25 additions & 0 deletions README.md
@@ -25,6 +25,31 @@ last-modified-at:
date-format: '%d-%b-%y'
```

For sites with many documents that use `last_modified_at`, rendering may be
faster with `use-git-cache` enabled:

```yml
plugins:
- jekyll-last-modified-at

last-modified-at:
use-git-cache: true
```

If `use-git-cache` is `false` (the default), each committed file that uses
`last_modified_at` spawns a separate process to read its time data from the git
log. With 10 documents, that means 10 spawned calls.

If `use-git-cache` is `true`, a single spawned process reads the entire git log
history and caches the time data. The rest of the site generation process reads
from this cache, so 10 (or 1000) documents result in 1 spawned call. The cache
is flushed on site reset, so a long-lived server correctly reflects the
`last_modified_at` of files modified and committed while it is running.

Note: repositories with very large histories may see performance issues with
the cache, in which case the default behavior is likely preferable.
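
The difference between the two modes can be sketched as follows. This is a minimal illustration, not the plugin's code: `FakeGit`, `per_file_lookup`, and `cached_lookup` are hypothetical names, and each spawned `git log` process is simulated by incrementing a counter.

```ruby
# Hypothetical stand-in for a git repository; counts simulated process spawns.
class FakeGit
  attr_reader :spawn_count

  def initialize(history)
    @history = history # { "path" => "unix timestamp" }
    @spawn_count = 0
    @cache = nil
  end

  # Default mode: one simulated `git log -n 1 -- path` per call.
  def per_file_lookup(path)
    @spawn_count += 1
    @history[path]
  end

  # Cached mode: one simulated full-history read on first use, then hash reads.
  def cached_lookup(path)
    if @cache.nil?
      @spawn_count += 1
      @cache = @history.dup
    end
    @cache[path]
  end
end

paths = (1..10).map { |i| "docs/page#{i}.md" }
history = paths.to_h { |p| [p, '1621042992'] }

uncached = FakeGit.new(history)
paths.each { |p| uncached.per_file_lookup(p) }

cached = FakeGit.new(history)
paths.each { |p| cached.cached_lookup(p) }

puts "per-file spawns: #{uncached.spawn_count}"
puts "cached spawns:   #{cached.spawn_count}"
```

The trade-off noted above still applies: the single cached read walks the whole history, which may cost more than a handful of per-file calls in a very large repository.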

## Usage

There are a few ways to use this gem.
3 changes: 0 additions & 3 deletions lib/jekyll-last-modified-at.rb
@@ -9,8 +9,5 @@ module LastModifiedAt
autoload :Executor, 'jekyll-last-modified-at/executor'
autoload :Determinator, 'jekyll-last-modified-at/determinator'
autoload :Git, 'jekyll-last-modified-at/git'

PATH_CACHE = {} # rubocop:disable Style/MutableConstant
REPO_CACHE = {} # rubocop:disable Style/MutableConstant
end
end
48 changes: 23 additions & 25 deletions lib/jekyll-last-modified-at/determinator.rb
@@ -3,49 +3,47 @@
module Jekyll
module LastModifiedAt
class Determinator
attr_reader :site_source, :page_path
@repo_cache = {}
@path_cache = {}

This is actually a last-modified-time cache keyed on paths. `@last_mod_cache` perhaps?

class << self
# attr_accessor so we can flush externally
attr_accessor :repo_cache
attr_accessor :path_cache
end

attr_reader :site_source, :page_path, :use_git_cache
attr_accessor :format

def initialize(site_source, page_path, format = nil)
@site_source = site_source
@page_path = page_path
@format = format || '%d-%b-%y'
def initialize(site_source, page_path, format = nil, use_git_cache = false) # rubocop:disable Style/OptionalBooleanParameter
@site_source = site_source
@page_path = page_path
@format = format || '%d-%b-%y'
@use_git_cache = use_git_cache
end

def git
return REPO_CACHE[site_source] unless REPO_CACHE[site_source].nil?
return self.class.repo_cache[site_source] unless self.class.repo_cache[site_source].nil?

REPO_CACHE[site_source] = Git.new(site_source)
REPO_CACHE[site_source]
self.class.repo_cache[site_source] = Git.new(site_source)
self.class.repo_cache[site_source]
end

def formatted_last_modified_date
return PATH_CACHE[page_path] unless PATH_CACHE[page_path].nil?

last_modified = last_modified_at_time.strftime(@format)
PATH_CACHE[page_path] = last_modified
last_modified
last_modified_at_time.strftime(@format)
end

def last_modified_at_time
return self.class.path_cache[page_path] unless self.class.path_cache[page_path].nil?

raise Errno::ENOENT, "#{absolute_path_to_article} does not exist!" unless File.exist? absolute_path_to_article

Time.at(last_modified_at_unix.to_i)
self.class.path_cache[page_path] = Time.at(last_modified_at_unix.to_i)
self.class.path_cache[page_path]

Ruby returns the value of the last expression by default, and an assignment evaluates to the assigned value, so this last line is redundant, no?
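
The point can be checked in isolation. A method returns its last evaluated expression, and an assignment is itself an expression whose value is the assigned object, so ending on the assignment is enough. A minimal sketch; `TimeCache`, `verbose_fetch`, and `terse_fetch` are hypothetical names, not from the PR:

```ruby
class TimeCache
  def initialize
    @cache = {}
  end

  # Mirrors the shape in the diff: assign, then re-read the hash.
  def verbose_fetch(path, unix)
    return @cache[path] unless @cache[path].nil?

    @cache[path] = Time.at(unix.to_i)
    @cache[path]
  end

  # Equivalent here: `||=` assigns only on a miss and evaluates to the stored value.
  def terse_fetch(path, unix)
    @cache[path] ||= Time.at(unix.to_i)
  end
end

a = TimeCache.new.verbose_fetch('index.md', '1621042992')
b = TimeCache.new.terse_fetch('index.md', '1621042992')
puts a == b
```

Note that `||=` treats `nil` and `false` as misses; that is harmless here because the cached values are always `Time` objects.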

end

def last_modified_at_unix
if git.git_repo?
last_commit_date = Executor.sh(
'git',
'--git-dir',
git.top_level_directory,
'log',
'-n',
'1',
'--format="%ct"',
'--',
relative_path_from_git_dir
)[/\d+/]
last_commit_date = git.last_commit_date(relative_path_from_git_dir, use_git_cache)
# last_commit_date can be nil iff the file was not committed.
last_commit_date.nil? || last_commit_date.empty? ? mtime(absolute_path_to_article) : last_commit_date
else
61 changes: 61 additions & 0 deletions lib/jekyll-last-modified-at/git.rb
@@ -8,6 +8,7 @@ class Git
def initialize(site_source)
@site_source = site_source
@is_git_repo = nil
@lcd_cache = {}

What does "lcd" stand for here? As opposed to what other kind of cache?

I reckon it stands for "last commit date", per the next hunk.

end

def top_level_directory
@@ -33,6 +34,66 @@ def git_repo?
false
end
end

def last_commit_date(path, use_git_cache = false) # rubocop:disable Style/OptionalBooleanParameter
if use_git_cache
build_lcd_cache if @lcd_cache.empty?
@lcd_cache[path]
else
Executor.sh(
'git',
'--git-dir',
top_level_directory,
'log',
'-n',
'1',
'--format="%ct"',
'--',
path
)[/\d+/]
end
end

private

# generates hash of `path => unix time stamp (string)`
def build_lcd_cache
# example output:
#
# %jekyll-last-modified-at:1621042992
#
# Dockerfile.production
# %jekyll-last-modified-at:1621041929
#
# assets/css/style.52513a5600efd4015668ccb9b702256e.css
# assets/css/style.52513a5600efd4015668ccb9b702256e.css.gz
lines = Executor.sh(
'git',
'--git-dir',
top_level_directory,
'log',
'--name-only',
'--date=unix',
'--pretty=%%jekyll-last-modified-at:%ct'
)

lcd = nil
lines.split("\n").each do |line|
next if line.empty?

if line.start_with?('%jekyll-last-modified-at:')
# new record
lcd = line.split(':')[1]
next
end

# we already have it
next if @lcd_cache[line]

# we don't have it
@lcd_cache[line] = lcd
end
end
end
end
end
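
The parsing step in `build_lcd_cache` can be exercised without spawning git. The sketch below is a standalone approximation, not the PR's code: `parse_last_commit_dates` is a hypothetical helper that takes the log text as a string, and the sample input is made up for illustration. Because `git log` lists the newest commits first, the first timestamp seen for a path is its most recent commit.

```ruby
# Builds { path => unix timestamp string } from `git log --name-only` output
# whose commit lines are formatted as "<marker><unix time>".
def parse_last_commit_dates(log_output, marker: '%jekyll-last-modified-at:')
  cache = {}
  current = nil

  log_output.each_line(chomp: true) do |line|
    next if line.empty?

    if line.start_with?(marker)
      # Start of a new commit record: remember its timestamp.
      current = line.delete_prefix(marker)
      next
    end

    # Keep only the first (newest) timestamp recorded for each path.
    cache[line] ||= current
  end

  cache
end

sample = <<~LOG
  %jekyll-last-modified-at:1621042992

  Dockerfile.production
  %jekyll-last-modified-at:1621041929

  Dockerfile.production
  assets/css/style.css
LOG

dates = parse_last_commit_dates(sample)
puts dates.inspect
```

Here `Dockerfile.production` appears in both commits and keeps the newer timestamp, while `assets/css/style.css` gets the older one.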
14 changes: 12 additions & 2 deletions lib/jekyll-last-modified-at/hook.rb
@@ -6,11 +6,21 @@ module Hook
def self.add_determinator_proc
proc { |item|
format = item.site.config.dig('last-modified-at', 'date-format')
item.data['last_modified_at'] = Determinator.new(item.site.source, item.path,
format)
use_git_cache = item.site.config.dig('last-modified-at', 'use-git-cache')
item.data['last_modified_at'] = Determinator.new(item.site.source, item.relative_path,
format, use_git_cache)
}
end

Jekyll::Hooks.register :site, :after_reset do |site|
@khemarato, Nov 14, 2021

I personally didn't find this hook very helpful: it triggers a cache clear on every file change while I'm developing, which slows my incremental builds for no benefit, since uncommitted changes don't show up in `git log`.
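
One hypothetical way to address this concern is to flush only when the repository's HEAD commit has changed between resets, since only new commits can change `git log` output. The sketch below is an assumption, not part of the PR: `HeadAwareCache` and its `head_reader` callable are invented names, and in real use the callable might wrap something like `git rev-parse HEAD`.

```ruby
# Hypothetical cache that is cleared only when HEAD moves.
class HeadAwareCache
  attr_reader :flush_count

  # head_reader: any callable returning the current HEAD identifier (assumed).
  def initialize(head_reader)
    @head_reader = head_reader
    @last_head = nil
    @store = {}
    @flush_count = 0
  end

  def [](key)
    @store[key]
  end

  def []=(key, value)
    @store[key] = value
  end

  # Intended for a reset hook: keeps cached data unless a new commit appeared.
  def reset_if_head_changed
    head = @head_reader.call
    return false if head == @last_head

    @last_head = head
    @store = {}
    @flush_count += 1
    true
  end
end

head = 'abc123'
cache = HeadAwareCache.new(-> { head })

cache.reset_if_head_changed      # first reset records HEAD
cache['index.md'] = '1621042992'
cache.reset_if_head_changed      # file edited but not committed: cache kept
kept = cache['index.md']

head = 'def456'                  # a commit lands
cache.reset_if_head_changed      # HEAD moved: cache cleared

puts kept.inspect
puts cache['index.md'].inspect
```

This trades one cheap HEAD check per reset for avoiding a full history re-read during incremental builds; whether that is worthwhile depends on the repository size.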

use_git_cache = site.config.dig('last-modified-at', 'use-git-cache')
if use_git_cache
# flush the caches so we can detect commits while server is running
Determinator.repo_cache = {}
Determinator.path_cache = {}
end
end

Jekyll::Hooks.register :posts, :post_init, &Hook.add_determinator_proc
Jekyll::Hooks.register :pages, :post_init, &Hook.add_determinator_proc
Jekyll::Hooks.register :documents, :post_init, &Hook.add_determinator_proc