The git filter-repo tool and the BFG Repo-Cleaner rewrite your repository's history, which changes the SHAs of the existing commits that you alter and of all dependent commits. Changed commit SHAs can affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.
You can remove a file from the latest commit with git rm. For information on removing a file that was added with the latest commit, see "About large files on GitHub."
Warning: This article describes how to make commits with sensitive data unreachable from any branches or tags in your repository on GitHub.com. However, those commits may still be accessible in any clones or forks of your repository, directly via their SHA-1 hashes in cached views on GitHub, and through any pull requests that reference them. You cannot remove sensitive data from other users' clones of your repository, but you can permanently remove cached views and references to the sensitive data in pull requests on GitHub by contacting GitHub Support.
If the commit that introduced the sensitive data exists in any forks of your repository, it will remain accessible unless the fork owner removes the sensitive data from the fork or deletes the fork entirely.
Once you have pushed a commit to GitHub, you should consider any sensitive data in that commit compromised. If you committed a password, change it. If you committed a key, generate a new one. Removing the compromised data does not undo its initial exposure, especially in existing clones or forks of your repository.
Consider these limitations when deciding whether to rewrite your repository's history.
Purging a file from your repository's history
You can purge a file from your repository's history using either the git filter-repo tool or the open source BFG Repo-Cleaner tool.
Using BFG
The BFG Repo-Cleaner is a tool built and maintained by the open source community. It provides a faster, simpler alternative to git filter-repo for removing unwanted data.
To delete a file with sensitive data and leave the last commit unchanged, run the following command:
$ bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA
To replace all occurrences of the text listed in passwords.txt wherever it appears in your repository's history, run:
$ bfg --replace-text passwords.txt
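The article does not show what passwords.txt contains. The sketch below creates one, assuming the format described in the BFG Repo-Cleaner documentation: one expression per line, optionally followed by ==> and a replacement, with a regex: prefix for regular expressions. The secrets shown are hypothetical placeholders; check the BFG documentation before relying on the exact syntax.

```shell
# Create a hypothetical passwords.txt for use with: bfg --replace-text passwords.txt
# Assumed format (per the BFG documentation), one expression per line:
#   LITERAL                  -> replaced with the default marker ***REMOVED***
#   LITERAL==>REPLACEMENT    -> replaced with REPLACEMENT
#   regex:PATTERN==>TEXT     -> regular-expression match replaced with TEXT
cat > passwords.txt <<'EOF'
hunter2
s3cr3t-api-key==>REDACTED
regex:token=\w+==>token=
EOF
```

Keep this file outside the repository you are cleaning so that it is not committed itself.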
After the sensitive data is removed, you must force push your changes to GitHub. Force pushing rewrites the repository history, which removes the sensitive data from the commit history. A force push may overwrite commits that other people have based their work on.
$ git push --force
See the BFG Repo-Cleaner's documentation for full usage and download instructions.
Using git filter-repo
Warning: If you run git filter-repo after stashing changes, you won't be able to retrieve your changes with other stash commands. Before running git filter-repo, we recommend unstashing any changes you've made. To unstash the last set of changes you've stashed, run git stash show -p | git apply -R. For more information, see Git Tools - Stashing and Cleaning.
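As a quick pre-flight check for the warning above, you can confirm that nothing is stashed before rewriting history. This is a minimal sketch; the function name stash_is_empty is a hypothetical helper, not part of Git.

```shell
# Hypothetical helper: succeeds (exit status 0) when the repository in the
# current directory has no stash entries, and fails otherwise.
stash_is_empty() {
  [ -z "$(git stash list)" ]
}

# Usage, from inside the repository:
#   stash_is_empty || { echo "Unstash your changes first:"; git stash list; }
```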
To illustrate how git filter-repo works, we'll show you how to remove a file with sensitive data from your repository's history and add it to .gitignore so that it is not accidentally committed again.
-
Install the latest release of the git filter-repo tool. You can install git-filter-repo manually or by using a package manager. For example, to install the tool with HomeBrew, use the brew install command.
brew install git-filter-repo
For more information, see INSTALL.md in the newren/git-filter-repo repository.
-
If you don't already have a local copy of your repository with sensitive data in its history, clone the repository to your local computer.
$ git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY
> Initialized empty Git repository in /Users/YOUR-FILE-PATH/YOUR-REPOSITORY/.git/
> remote: Counting objects: 1301, done.
> remote: Compressing objects: 100% (769/769), done.
> remote: Total 1301 (delta 724), reused 910 (delta 522)
> Receiving objects: 100% (1301/1301), 164.39 KiB, done.
> Resolving deltas: 100% (724/724), done.
-
Navigate into the repository's working directory.
$ cd YOUR-REPOSITORY
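Because the git filter-repo run in the next step removes some configuration, such as the remote URL in .git/config, it can help to record the URL before rewriting. The following is a minimal sketch; save_remote_url and restore_remote are hypothetical helper names, and origin is assumed to be the remote you push to.

```shell
# Hypothetical helper: print the URL of a remote (default: origin)
# so that it can be re-added after the history rewrite.
save_remote_url() {
  git remote get-url "${1:-origin}"
}

# Hypothetical helper: re-add a remote under the given name and URL.
restore_remote() {
  git remote add "$1" "$2"
}

# Usage, from inside the repository:
#   url=$(save_remote_url origin)
#   ...run git filter-repo...
#   restore_remote origin "$url"
```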
-
Run the following command, replacing PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA with the path to the file you want to remove, not just its filename. These arguments will:
- Force Git to process, but not check out, the entire history of every branch and tag.
- Remove the specified file, as well as any empty commits generated as a result.
- Remove some configurations, such as the remote URL stored in the .git/config file. You may want to back up this file in advance so that you can restore it later.
- Overwrite your existing tags.
$ git filter-repo --invert-paths --path PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA
Parsed 197 commits
New history written in 0.11 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
Enumerating objects: 210, done.
Counting objects: 100% (210/210), done.
Delta compression using up to 12 threads
Compressing objects: 100% (127/127), done.
Writing objects: 100% (210/210), done.
Building bitmaps: 100% (48/48), done.
Total 210 (delta 98), reused 144 (delta 75), pack-reused 0
Completely finished after 0.64 seconds.
Note: If the file with sensitive data used to exist at any other paths (because it was moved or renamed), you must run this command on those paths as well.
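To find the other paths that the note refers to, one option is to list every path in the history whose name ends with the file name. This is a sketch under the assumption that the file kept its base name when it was moved; paths_in_history is a hypothetical helper, and a file that was renamed outright would need a different search.

```shell
# Hypothetical helper: list each distinct path, across all branches and tags,
# whose name ends with the given file name (for example secrets.txt).
paths_in_history() {
  git log --all --name-only --format= -- "*$1" | sed '/^$/d' | sort -u
}

# Usage, from inside the repository:
#   paths_in_history YOUR-FILE-WITH-SENSITIVE-DATA
```

Each path printed can then be passed to git filter-repo with its own --path argument.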
-
Add your file with sensitive data to .gitignore to ensure that you don't accidentally commit it again.
$ echo "YOUR-FILE-WITH-SENSITIVE-DATA" >> .gitignore
$ git add .gitignore
$ git commit -m "Add YOUR-FILE-WITH-SENSITIVE-DATA to .gitignore"
> [main 051452f] Add YOUR-FILE-WITH-SENSITIVE-DATA to .gitignore
>  1 files changed, 1 insertions(+), 0 deletions(-)
-
Double-check that you've removed everything you wanted to from your repository's history, and that all of your branches are checked out.
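One way to perform this check is to ask Git whether any commit reachable from a branch or tag still touches the path. The sketch below assumes a single path per call; path_absent_from_history is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only when no commit reachable from any ref
# (branches, tags, remotes) adds, changes, or deletes the given path.
path_absent_from_history() {
  [ -z "$(git log --all --oneline -- "$1")" ]
}

# Usage, from inside the repository:
#   path_absent_from_history PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA \
#     && echo "No remaining commits touch this path."
```

This only inspects the local repository; it says nothing about clones, forks, or cached views on GitHub.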
-
Once you're happy with the state of your repository, force push your local changes to overwrite your repository on GitHub.com, as well as all the branches you've pushed up. A force push is required to remove sensitive data from your commit history.
$ git push origin --force --all
> Counting objects: 1074, done.
> Delta compression using 2 threads.
> Compressing objects: 100% (677/677), done.
> Writing objects: 100% (1058/1058), 148.85 KiB, done.
> Total 1058 (delta 590), reused 602 (delta 378)
> To https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
>  + 48dc599...051452f main -> main (forced update)
-
To remove the sensitive file from your tagged releases, you'll also need to force push against your Git tags:
$ git push origin --force --tags
> Counting objects: 321, done.
> Delta compression using up to 8 threads.
> Compressing objects: 100% (166/166), done.
> Writing objects: 100% (321/321), 331.74 KiB | 0 bytes/s, done.
> Total 321 (delta 124), reused 269 (delta 108)
> To https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
>  + 48dc599...051452f main -> main (forced update)
Fully removing the data from GitHub
After using either the BFG tool or git filter-repo to remove the sensitive data and pushing your changes to GitHub, you must take a few more steps to fully remove the data from GitHub.
-
Contact GitHub Support and ask them to remove cached views and references to the sensitive data in pull requests on GitHub. Provide the name of the repository and/or a link to the commit you need removed.
-
Tell your collaborators to rebase, not merge, any branches they created off of your old (tainted) repository history. A single merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging.
-
After some time has passed and you're confident that the BFG tool or git filter-repo had no unintended side effects, you can force all objects in your local repository to be dereferenced and garbage collected with the following commands (using Git 1.8.5 or newer):
$ git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
$ git reflog expire --expire=now --all
$ git gc --prune=now
> Counting objects: 2437, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (1378/1378), done.
> Writing objects: 100% (2437/2437), done.
> Total 2437 (delta 1461), reused 1802 (delta 1048)
Note: You can also achieve this by pushing your filtered history to a new or empty repository and then making a fresh clone from GitHub.
Preventing accidental commits in the future
There are a few simple tricks to avoid committing data that you do not want to commit:
- Use a visual program such as GitHub Desktop or gitk to commit changes. As a rule, visual programs make it easier to view files that will be added, deleted, and changed with each commit.
- Avoid the catch-all commands git add . and git commit -a on the command line; use git add filename and git rm filename to stage each file individually instead.
- Use git add --interactive to review and stage each individual change within each file.
- Use git diff --cached to review the changes that you have staged for commit. This is the exact diff that git commit will produce as long as you don't use the -a flag.
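The staged-changes review described in the last tip can also be partly automated. The following is a minimal sketch that lists staged paths matching patterns you consider sensitive; staged_sensitive_files is a hypothetical helper, and the patterns in the usage line are only examples. It could be run by hand before committing or, as one possible setup, from a pre-commit hook.

```shell
# Hypothetical helper: print every staged path that matches one of the given
# shell patterns. Prints nothing when no staged path matches.
staged_sensitive_files() {
  git diff --cached --name-only | while IFS= read -r path; do
    for pattern in "$@"; do
      case "$path" in
        # $pattern is intentionally unquoted so it is treated as a glob.
        $pattern) printf '%s\n' "$path"; break ;;
      esac
    done
  done
}

# Usage, from inside the repository:
#   staged_sensitive_files '*.pem' '.env' 'passwords.txt'
```

Matching on file names catches only known file types; it does not inspect file contents, so reviewing git diff --cached remains worthwhile.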