Support compression for Actions logs to save storage space and
bandwidth. Inspired by
https://github.com/go-gitea/gitea/issues/24256#issuecomment-1521153015
The biggest challenge is that the compression format should support
[seekable](https://github.com/facebook/zstd/blob/dev/contrib/seekable_format/zstd_seekable_compression_format.md).
So when users are viewing a part of the log lines, Gitea doesn't need to
download the whole compressed file and decompress it.
That means gzip cannot help here. And I did research, there aren't too
many choices, like bgzip and xz, but I think zstd is the most popular
one. It has an implementation in Golang with
[zstd](https://github.com/klauspost/compress/tree/master/zstd) and
[zstd-seekable-format-go](https://github.com/SaveTheRbtz/zstd-seekable-format-go),
and what is better is that it has good compatibility: a seekable format
zstd file can be read by a regular zstd reader.
This PR introduces a new package `zstd` to combine and wrap the two
packages, to provide a unified and easy-to-use API.
And a new setting `LOG_COMPRESSION` is added to the config, although I
don't see any reason why not to use compression, I think's it's a good
idea to keep the default with `none` to be consistent with old versions.
`LOG_COMPRESSION` takes effect for only new log files, it adds `.zst` as
an extension to the file name, so Gitea can determine if it needs
decompression according to the file name when reading. Old files will
keep the format since it's not worth converting them, as they will be
cleared after #31735.
<img width="541" alt="image"
src="https://github.com/user-attachments/assets/e9598764-a4e0-4b68-8c2b-f769265183c9">
(cherry picked from commit 33cc5837a655ad544b936d4d040ca36d74092588)
Conflicts:
assets/go-licenses.json
go.mod
go.sum
resolved with make tidy
Part of #24256.
Clear up old action logs to free up storage space.
Users will see a message indicating that the log has been cleared if
they view old tasks.
<img width="1361" alt="image"
src="https://github.com/user-attachments/assets/9f0f3a3a-bc5a-402f-90ca-49282d196c22">
Docs: https://gitea.com/gitea/docs/pulls/40
---------
Co-authored-by: silverwind <me@silverwind.io>
(cherry picked from commit 687c1182482ad9443a5911c068b317a91c91d586)
Conflicts:
custom/conf/app.example.ini
routers/web/repo/actions/view.go
trivial context conflict
Fix#30243
We only checking unit disabled when detecting workflows, but not in
runner `FetchTask`.
So if a workflow was detected when action unit is enabled, but disabled
later, `FetchTask` will still return these detected actions.
Global setting: repo.ENABLED and repository.`DISABLED_REPO_UNITS` will
not effect this.
(cherry picked from commit d872ce006c0400edb10a05f7555f9b08070442e3)
When we pick up a job, all waiting jobs should firstly be ordered by
update time,
otherwise when there's a running job, if I rerun an older job, the older
job will run first, as it's id is smaller.
Close#24544
Changes:
- Create `action_tasks_version` table to store the latest version of
each scope (global, org and repo).
- When a job with the status of `waiting` is created, the tasks version
of the scopes it belongs to will increase.
- When the status of a job already in the database is updated to
`waiting`, the tasks version of the scopes it belongs to will increase.
- On Gitea side, in `FeatchTask()`, will try to query the
`action_tasks_version` record of the scope of the runner that call
`FetchTask()`. If the record does not exist, will insert a row. Then,
Gitea will compare the version passed from runner to Gitea with the
version in database, if inconsistent, try pick task. Gitea always
returns the latest version from database to the runner.
Related:
- Protocol: https://gitea.com/gitea/actions-proto-def/pulls/10
- Runner: https://gitea.com/gitea/act_runner/pulls/219
Fix#25451.
Bugfixes:
- When stopping the zombie or endless tasks, set `LogInStorage` to true
after transferring the file to storage. It was missing, it could write
to a nonexistent file in DBFS because `LogInStorage` was false.
- Always update `ActionTask.Updated` when there's a new state reported
by the runner, even if there's no change. This is to avoid the task
being judged as a zombie task.
Enhancement:
- Support `Stat()` for DBFS file.
- `WriteLogs` refuses to write if it could result in content holes.
---------
Co-authored-by: Giteabot <teabot@gitea.io>
close#24540
related:
- Protocol: https://gitea.com/gitea/actions-proto-def/pulls/9
- Runner side: https://gitea.com/gitea/act_runner/pulls/201
changes:
- Add column of `labels` to table `action_runner`, and combine the value
of `agent_labels` and `custom_labels` column to `labels` column.
- Store `labels` when registering `act_runner`.
- Update `labels` when `act_runner` starting and calling `Declare`.
- Users cannot modify the `custom labels` in edit page any more.
other changes:
- Store `version` when registering `act_runner`.
- If runner is latest version, parse version from `Declare`. But older
version runner still parse version from request header.
The name of the job or step comes from the workflow file, while the name
of the runner comes from its registration. If the strings used for these
names are too long, they could cause db issues.