Commit graph

9 commits

Author SHA1 Message Date
Gusted aa7346c007 Refactor LFS GC functions
- Remove options that currently aren't set
on `GarbageCollectLFSMetaObjectsOptions` and
`IterateLFSMetaObjectsForRepoOptions`.
- Simplify `IterateRepositoryIDsWithLFSMetaObjects` and
`IterateLFSMetaObjectsForRepo`.
- `IterateLFSMetaObjectsForRepo` was previously able to get in a
loop (`gc-lfs` doctor check was able to reproduce this) because the code
expected that the records would be updated to not match the SQL query,
but that wasn't the case. Simply enforce that only records higher than
the latest `id` from the previous iteration are allowed.
- For `gc-lfs` doctor check this was because `UpdatedLessRecentlyThan`
option was not set, which caused that records just marked as active in
the iteration weren't being filtered.
- Add unit tests
- Most likely a regression from 2cc3a6381c.
- The bug with `gc-lfs` was found on Codeberg.

(cherry picked from commit 7ffa7f5bce)
2024-04-06 07:41:40 +00:00
Lunny Xiao 6905540088
Use the database object format name but not read from git repoisitory everytime and fix possible migration wrong objectformat when migrating a sha256 repository (#29294)
Now we can get object format name from git command line or from the
database repository table. Assume the column is right, we don't need to
read from git command line every time.

This also fixed a possible bug that the object format is wrong when
migrating a sha256 repository from external.

<img width="658" alt="image"
src="https://github.com/go-gitea/gitea/assets/81045/6e9a9dcf-13bf-4267-928b-6bf2c2560423">

(cherry picked from commit b79c30435f439af8243ee281310258cdf141e27b)

Conflicts:
	routers/web/repo/blame.go
	services/agit/agit.go
	context
2024-02-26 22:30:26 +01:00
Lunny Xiao 5f82ead13c
Simplify how git repositories are opened (#28937)
## Purpose
This is a refactor toward building an abstraction over managing git
repositories.
Afterwards, it does not matter anymore if they are stored on the local
disk or somewhere remote.

## What this PR changes
We used `git.OpenRepository` everywhere previously.
Now, we should split them into two distinct functions:

Firstly, there are temporary repositories which do not change:

```go
git.OpenRepository(ctx, diskPath)
```

Gitea managed repositories having a record in the database in the
`repository` table are moved into the new package `gitrepo`:

```go
gitrepo.OpenRepository(ctx, repo_model.Repo)
```

Why is `repo_model.Repository` the second parameter instead of file
path?
Because then we can easily adapt our repository storage strategy.
The repositories can be stored locally, however, they could just as well
be stored on a remote server.

## Further changes in other PRs
- A Git Command wrapper on package `gitrepo` could be created. i.e.
`NewCommand(ctx, repo_model.Repository, commands...)`. `git.RunOpts{Dir:
repo.RepoPath()}`, the directory should be empty before invoking this
method and it can be filled in the function only. #28940
- Remove the `RepoPath()`/`WikiPath()` functions to reduce the
possibility of mistakes.

---------

Co-authored-by: delvh <dev.lh@web.de>
2024-01-27 21:09:51 +01:00
Adam Majer cbf923e87b
Abstract hash function usage (#28138)
Refactor Hash interfaces and centralize hash function. This will allow
easier introduction of different hash function later on.

This forms the "no-op" part of the SHA256 enablement patch.
2023-12-13 21:02:00 +00:00
Zettat123 f3ed0ef692
Fix bugs in LFS meta garbage collection (#26122)
This PR

- Fix #26093. Replace `time.Time` with `timeutil.TimeStamp`
- Fix #26135. Add missing `xorm:"extends"` to `CountLFSMetaObject` for
LFS meta object query
- Add a unit test for LFS meta object garbage collection
2023-07-26 07:02:53 +00:00
wxiaoguang f0bde0e4f9
Simplify the LFS GC logger usage (#25717)
Remove unnecessary `if opts.Logger != nil` checks.

* For "CLI doctor" mode, output to the console's "logger.Info".
* For "Web Task" mode, output to the default "logger.Debug", to avoid
flooding the server's log in a busy production instance.

Co-authored-by: Giteabot <teabot@gitea.io>
2023-07-06 16:52:41 +00:00
zeripath 2cc3a6381c
Add cron method to gc LFS MetaObjects (#22385)
This PR adds a task to the cron service to allow garbage collection of
LFS meta objects. As repositories may have a large number of
LFSMetaObjects, an updated column is added to this table and it is used
to perform a generational GC to attempt to reduce the amount of work.
(There may need to be a bit more work here but this is probably enough
for the moment.)

Fix #7045

Signed-off-by: Andrew Thornton <art27@cantab.net>
2023-01-16 13:50:53 -06:00
Jason Song 7adc2de464
Use context parameter in models/git (#22367)
After #22362, we can feel free to use transactions without
`db.DefaultContext`.

And there are still lots of models using `db.DefaultContext`, I think we
should refactor them carefully and one by one.

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2023-01-09 11:50:54 +08:00
zeripath 651fe4bb7d
Add doctor command for full GC of LFS (#21978)
The recent PR adding orphaned checks to the LFS storage is not
sufficient to completely GC LFS, as it is possible for LFSMetaObjects to
remain associated with repos but still need to be garbage collected.

Imagine a situation where a branch is uploaded containing LFS files but
that branch is later completely deleted. The LFSMetaObjects will remain
associated with the Repository but the Repository will no longer contain
any pointers to the object.

This PR adds a second doctor command to perform a full GC.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-12-15 20:44:16 +00:00