wip
newren committed Nov 23, 2024
1 parent 9ae4ae6 commit dbe0265
Showing 1 changed file with 231 additions and 0 deletions.
231 changes: 231 additions & 0 deletions Documentation/curated-examples-from-issues.md
@@ -0,0 +1,231 @@
# Curated examples from issues

Lots of people have filed issues against git-filter-repo, and many of
them boil down to questions of "How do I?" or "Why doesn't this work?"

I thought I'd collect a bunch of these as example repository filterings
that others may be interested in.

## Table of Contents

* [Adding files to root commits](#adding-files-to-root-commits)
* [Purge a large list of files](#purge-a-large-list-of-files)

## Adding files to root commits

<!-- https://github.com/newren/git-filter-repo/issues/21 -->

Here's an example that will take `/path/to/existing/README.md` and
store it as `README.md` in the repository, and take
`/home/myusers/mymodule.gitignore` and store it as `src/.gitignore` in
the repository:

```
git filter-repo --commit-callback "if not commit.parents: commit.file_changes += [
FileChange(b'M', b'README.md', b'$(git hash-object -w '/path/to/existing/README.md')', b'100644'),
FileChange(b'M', b'src/.gitignore', b'$(git hash-object -w '/home/myusers/mymodule.gitignore')', b'100644')]"
```

Alternatively, you could also use the [insert-beginning contrib script](../contrib/filter-repo-demos/insert-beginning).

## Purge a large list of files

<!-- https://github.com/newren/git-filter-repo/issues/63 -->

List the paths you want to delete in a file (one per line),
e.g. `../DELETED_FILENAMES.txt`, and then run

```
git filter-repo --invert-paths --paths-from-file ../DELETED_FILENAMES.txt
```

## Extracting a library to a separate repo

<!-- https://github.com/newren/git-filter-repo/issues/80 -->

```
git filter-repo \
--path src/some-folder/some-feature \
--path-rename src/some-folder/some-feature/:src/
```

## Replace words in all commit messages

<!-- https://github.com/newren/git-filter-repo/issues/83 -->

```
git filter-repo --message-callback 'return message.replace(b"stuff", b"task")'
```
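
Because `message` is a bytestring, the callback above does a plain byte-level substitution: every occurrence is replaced, including ones inside longer words. Here is a minimal sketch of that behavior outside of git-filter-repo. The names `rewrite_message` and `rewrite_whole_words` are hypothetical stand-ins for callback bodies, and the word-boundary variant is just one possible alternative:

```python
import re

def rewrite_message(message: bytes) -> bytes:
    # Same expression as the callback body above.
    return message.replace(b"stuff", b"task")

def rewrite_whole_words(message: bytes) -> bytes:
    # Alternative: only replace "stuff" when it stands alone as a word.
    return re.sub(rb"\bstuff\b", b"task", message)

sample = b"Fix stuff\n\nAlso clean up foodstuffs handling.\n"
print(rewrite_message(sample))
print(rewrite_whole_words(sample))
```

Both variants are case-sensitive, so `Stuff` is left alone.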

## Only keep files from two branches

<!-- https://github.com/newren/git-filter-repo/issues/91 -->

Let's say you know that the files currently present on two branches
are the only files that matter. Files that used to exist in either of
these branches, or files that only exist on some other branch, should
all be deleted from all versions of history. This can be accomplished
by getting a list of files from each branch, combining them, sorting
the list and picking out just the unique entries, then passing to
`--paths-from-file`:

```
git ls-tree -r --name-only ${BRANCH1} >../my-files
git ls-tree -r --name-only ${BRANCH2} >>../my-files
sort ../my-files | uniq >../my-relevant-files
git filter-repo --paths-from-file ../my-relevant-files
```
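
The combine, sort, and deduplicate step can also be sketched in Python, which may help if the list needs further massaging before it is handed to `--paths-from-file`. This is only an illustration: `combine_path_lists` is a hypothetical helper, and each input is assumed to contain one path per line.

```python
def combine_path_lists(*listings: bytes) -> bytes:
    """Merge newline-separated path listings into one sorted, duplicate-free list."""
    paths = set()
    for listing in listings:
        # Skip empty entries, e.g. the one produced by a trailing newline.
        paths.update(line for line in listing.split(b"\n") if line)
    return b"".join(path + b"\n" for path in sorted(paths))

# Made-up listings standing in for the output collected from each branch.
branch1 = b"README.md\nsrc/main.c\n"
branch2 = b"README.md\ndocs/guide.md\n"
print(combine_path_lists(branch1, branch2).decode())
```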

## Renormalize end-of-line characters and add a .gitattributes

<!-- https://github.com/newren/git-filter-repo/issues/122 -->

```
contrib/filter-repo-demos/lint-history dos2unix
[edit .gitattributes]
contrib/filter-repo-demos/insert-beginning .gitattributes
```

## Remove spaces at the end of lines

<!-- https://github.com/newren/git-filter-repo/issues/145 -->

Removing all spaces at the end of lines of non-binary files, including
stripping trailing carriage returns:

```
git filter-repo --replace-text <(echo 'regex:[\r\t ]+(\n|$)==>\n')
```
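
To see what that expression does before rewriting history, the pattern can be tried on a sample bytestring with Python's `re` module. This sketch assumes git-filter-repo applies the regex to each blob's bytes roughly the way `re.sub` does; `strip_trailing_whitespace` is a hypothetical helper name.

```python
import re

# Same pattern as in the --replace-text expression above.
TRAILING_WS = re.compile(rb"[\r\t ]+(\n|$)")

def strip_trailing_whitespace(data: bytes) -> bytes:
    # Replace each trailing run of CR/tab/space (plus the newline, if any)
    # with a single LF.
    return TRAILING_WS.sub(b"\n", data)

sample = b"first line  \r\nsecond\t\nthird"
print(strip_trailing_whitespace(sample))
```

In this sketch, data that ends in whitespace without a final newline gains one, because the `$` branch of the pattern is also replaced by `\n`.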

## Having both exclude and include rules for filenames

<!-- https://github.com/newren/git-filter-repo/issues/230 -->

If you want to have rules to both include and exclude filenames, you
can simply invoke `git filter-repo` multiple times. Alternatively,
you can dispense with `--path` arguments and instead use the more
generic `--filename-callback`. For example to include all files under
`src/` except for `src/README.md`:

```
git filter-repo --filename-callback '
  if filename == b"src/README.md":
    return None
  if filename.startswith(b"src/"):
    return filename
  return None'
```
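
Since the callback body is ordinary Python, its logic can be checked on sample paths before any history is rewritten. The sketch below wraps the same rules in a hypothetical function, `keep_src_except_readme`; git-filter-repo itself does not define it, and the sample paths are made up.

```python
from typing import Optional

def keep_src_except_readme(filename: bytes) -> Optional[bytes]:
    # Returning None drops the path from history.
    if filename == b"src/README.md":
        return None
    # Returning the name keeps the path.
    if filename.startswith(b"src/"):
        return filename
    return None

for path in [b"src/main.py", b"src/README.md", b"docs/index.md"]:
    print(path, "->", keep_src_except_readme(path))
```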

## Removing paths with a certain extension

<!-- https://github.com/newren/git-filter-repo/issues/274 -->

```
git filter-repo --invert-paths --path-glob '*.xsa'
```

or

```
git filter-repo --filename-callback '
  if filename.endswith(b".xsa"):
    return None
  return filename'
```

## Removing a directory

<!-- https://github.com/newren/git-filter-repo/issues/278 -->

```
git filter-repo --path node_modules/electron/dist/ --invert-paths
```

## Convert from NFD filenames to NFC

<!-- https://github.com/newren/git-filter-repo/issues/296 -->

Given that macOS normalizes UTF-8 filenames, and has historically
switched which kind of normalization it uses, users may have committed
files with alternative normalizations to their repository. To convert
filenames in NFD form to NFC, you could run

```
git filter-repo --filename-callback '
  try:
    return subprocess.check_output("iconv -f utf-8-mac -t utf-8".split(),
                                   input=filename)
  except Exception:
    return filename
'
```

or

```
git filter-repo --filename-callback '
  import unicodedata
  try:
    return unicodedata.normalize("NFC", filename.decode("utf-8")).encode("utf-8")
  except UnicodeDecodeError:
    return filename
'
```
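
To make the difference between the two forms concrete, here is a small standalone sketch of the `unicodedata` approach applied to a sample name. The function name `nfc_filename` and the sample filename are hypothetical; only the normalization call mirrors the callback above.

```python
import unicodedata

def nfc_filename(filename: bytes) -> bytes:
    try:
        text = filename.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: leave the name untouched.
        return filename
    return unicodedata.normalize("NFC", text).encode("utf-8")

nfd = "cafe\u0301.txt".encode("utf-8")  # "e" followed by a combining acute accent
nfc = "caf\u00e9.txt".encode("utf-8")   # precomposed "é"
print(nfd, "->", nfc_filename(nfd))
```

The two byte sequences display identically in most tools but differ on disk, which is why Git treats them as different paths.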

## Set the committer of the last few commits to myself

<!-- https://github.com/newren/git-filter-repo/issues/379 -->

```
git filter-repo --refs main~5..main --commit-callback '
  commit.committer_name = b"My Wonderful Self"
  commit.committer_email = b"my@self.org"
'
```

## Handling special characters, e.g. accents in names

<!-- https://github.com/newren/git-filter-repo/issues/383 -->

Since characters like ë and á are non-ASCII (multi-byte in UTF-8) and
Python won't allow you to place those directly in a bytestring literal
(e.g. b"Raphaël González"), you just need to make a normal string and
then convert it to a bytestring. For example, changing the author name
and email where the author email is currently `example@test.com`:

```
git filter-repo --refs main~5..main --commit-callback '
  if commit.author_email == b"example@test.com":
    commit.author_name = "Raphaël González".encode()
    commit.author_email = b"rgonzalez@test.com"
'
```
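
The string-to-bytestring conversion can be checked on its own in plain Python. The following sketch only illustrates the encoding step and is not part of git-filter-repo; the variable names are made up for the example.

```python
name = "Raphaël González"
encoded = name.encode()  # str.encode() defaults to UTF-8
print(len(name), "characters,", len(encoded), "bytes")
print(encoded)

# A bytes literal containing the same characters is rejected by the parser.
try:
    compile('b"Raphaël González"', "<example>", "eval")
    literal_accepted = True
except SyntaxError:
    literal_accepted = False
print("bytes literal accepted:", literal_accepted)
```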

<!-- https://github.com/newren/git-filter-repo/issues/420 -->
handling repository corruption (old original objects are corrupt)

<!-- https://github.com/newren/git-filter-repo/issues/427 -->
removing all files with a backslash in them (final example is best)

<!-- https://github.com/newren/git-filter-repo/issues/436 -->
replace a binary blob in history

<!-- https://github.com/newren/git-filter-repo/pull/542 -->
callback for lint-history

<!-- https://github.com/newren/git-filter-repo/issues/300 -->
using replace refs to delete old history

<!-- https://github.com/newren/git-filter-repo/issues/492 -->
replacing pngs with compressed alternative
(#537 also used a change.blob_id thingy)

<!-- https://github.com/newren/git-filter-repo/issues/490 -->
<!-- https://github.com/newren/git-filter-repo/issues/504 -->
need for a multi-step filtering to avoid path collisions or ordering issues

<!-- https://lore.kernel.org/git/CABPp-BFqbiS8xsbLouNB41QTc5p0hEOy-EoV0Sjnp=xJEShkTw@mail.gmail.com/ -->
Two things:

* textwrap.dedent
* easier example of using git-filter-repo as a library
