Merging PR may fail because of various problems. The pull request may
have a dirty state because there is no transaction when merging a pull
request. ref
https://github.com/go-gitea/gitea/pull/25741#issuecomment-2074126393
This PR moves all database update operations to post-receive handler for
merging a pull request and having a database transaction. That means if
database operations fail, then the git merging will fail, the git client
will get a fail result.
There are already many tests for pull request merging, so we don't need
to add a new one.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
(cherry picked from commit ebf0c969403d91ed80745ff5bd7dfbdb08174fc7)
Conflicts:
modules/private/hook.go
routers/private/hook_post_receive.go
trivial conflicts because
263a716cb5 * Performance optimization for git push (#30104)
was not cherry-picked and because of
998a431747 Do not update PRs based on events that happened before they existed
(cherry picked from commit eb792d9f8a)
(cherry picked from commit ec3f5f9992d7ff8250c044a4467524d53bd50210)
It is possible to change some repo settings (its visibility, and
template status) via `git push` options: `-o repo.private=true`, `-o
repo.template=true`.
Previously, there weren't sufficient permission checks on these, and
anyone who could `git push` to a repository - including via an AGit
workflow! - was able to change either of these settings. To guard
against this, the pre-receive hook will now check if either of these
options are present, and if so, will perform additional permission
checks to ensure that these can only be set by a repository owner or
an administrator. Additionally, changing these settings is disabled for
forks, even for the fork's owner.
There's still a case where the owner of a repository can change the
visibility of it, and it will not propagate to forks (it propagates to
forks when changing the visibility via the API), but that's an
inconsistency, not a security issue.
Signed-off-by: Gergely Nagy <forgejo@gergo.csillger.hu>
Signed-off-by: Earl Warren <contact@earl-warren.org>
(cherry picked from commit 8eba631f8d)
`log.Xxx("%v")` is not ideal, this PR adds necessary context messages.
Remove some unnecessary logs.
Co-authored-by: Giteabot <teabot@gitea.io>
(cherry picked from commit 83f83019ef3471b847a300f0821499b3896ec987)
Conflicts:
- modules/util/util.go
Conflict resolved by picking `util.Iif` from 654cfd1dfbd3f3f1d94addee50b6fe2b018a49c3
(cherry picked from commit 492d116b2a468991f44d6d37ec33f918ccbe4514)
Conflicts:
modules/util/util.go
trivial context conflict as the commit is picked from https://codeberg.org/forgejo/forgejo/pulls/3212
* Split TestPullRequest out of AddTestPullRequestTask
* A Created field is added to the Issue table
* The Created field is set to the time (with nano resolution) on creation
* Record the nano time repo_module.PushUpdateOptions is created by the hook
* The decision to update a pull request created before a commit was
pushed is based on the time (with nano resolution) the git hook
was run and the Created field
It ensures the following happens:
* commit C is pushed
* the git hook queues AddTestPullRequestTask for processing and returns with success
* TestPullRequest is not called yet
* a pull request P with commit C as the head is created
* TestPullRequest runs and ignores P because it was created after the commit was received
When the "created" column is NULL, no verification is done, pull
requests that were created before the column was created in the
database cannot be newer than the latest call to a git hook.
Fixes: https://codeberg.org/forgejo/forgejo/issues/2009
(cherry picked from commit 998a431747)
Conflicts:
models/forgejo_migrations/migrate.go
see https://codeberg.org/forgejo/forgejo/pulls/3165#issuecomment-1755941
services/pull/pull.go
trivial conflicts
- Currently protected branch rules do not apply to admins, however in
some cases (like in the case of Forgejo project) you might also want to
apply these rules to admins to avoid accidental merges.
- Add new option to configure this on a per-rule basis.
- Adds integration tests.
- Resolves#65
Regression of #29493. If a branch has been deleted, repushing it won't
restore it.
Lunny may have noticed that, but I didn't delve into the comment then
overlooked it:
https://github.com/go-gitea/gitea/pull/29493#discussion_r1509046867
The additional comments added are to explain the issue I found during
testing, which are unrelated to the fixes.
(cherry picked from commit f371f84fa3456c2a71470632b6458d81e4892a54)
Unlike other async processing in the queue, we should sync branches to
the DB immediately when handling git hook calling. If it fails, users
can see the error message in the output of the git command.
It can avoid potential inconsistency issues, and help #29494.
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
(cherry picked from commit 25b842df261452a29570ba89ffc3a4842d73f68c)
Conflicts:
routers/web/repo/wiki.go
services/repository/branch.go
services/repository/migrate.go
services/wiki/wiki.go
also apply to Forgejo specific usage of the refactored functions
Since `modules/context` has to depend on `models` and many other
packages, it should be moved from `modules/context` to
`services/context` according to design principles. There is no logic
code change on this PR, only move packages.
- Move `code.gitea.io/gitea/modules/context` to
`code.gitea.io/gitea/services/context`
- Move `code.gitea.io/gitea/modules/contexttest` to
`code.gitea.io/gitea/services/contexttest` because of depending on
context
- Move `code.gitea.io/gitea/modules/upload` to
`code.gitea.io/gitea/services/context/upload` because of depending on
context
(cherry picked from commit 29f149bd9f517225a3c9f1ca3fb0a7b5325af696)
Conflicts:
routers/api/packages/alpine/alpine.go
routers/api/v1/repo/issue_reaction.go
routers/install/install.go
routers/web/admin/config.go
routers/web/passkey.go
routers/web/repo/search.go
routers/web/repo/setting/default_branch.go
routers/web/user/home.go
routers/web/user/profile.go
tests/integration/editor_test.go
tests/integration/integration_test.go
tests/integration/mirror_push_test.go
trivial context conflicts
also modified all other occurrences in Forgejo specific files
Now we can get object format name from git command line or from the
database repository table. Assume the column is right, we don't need to
read from git command line every time.
This also fixed a possible bug that the object format is wrong when
migrating a sha256 repository from external.
<img width="658" alt="image"
src="https://github.com/go-gitea/gitea/assets/81045/6e9a9dcf-13bf-4267-928b-6bf2c2560423">
(cherry picked from commit b79c30435f439af8243ee281310258cdf141e27b)
Conflicts:
routers/web/repo/blame.go
services/agit/agit.go
context
## Purpose
This is a refactor toward building an abstraction over managing git
repositories.
Afterwards, it does not matter anymore if they are stored on the local
disk or somewhere remote.
## What this PR changes
We used `git.OpenRepository` everywhere previously.
Now, we should split them into two distinct functions:
Firstly, there are temporary repositories which do not change:
```go
git.OpenRepository(ctx, diskPath)
```
Gitea managed repositories having a record in the database in the
`repository` table are moved into the new package `gitrepo`:
```go
gitrepo.OpenRepository(ctx, repo_model.Repo)
```
Why is `repo_model.Repository` the second parameter instead of file
path?
Because then we can easily adapt our repository storage strategy.
The repositories can be stored locally, however, they could just as well
be stored on a remote server.
## Further changes in other PRs
- A Git Command wrapper on package `gitrepo` could be created. i.e.
`NewCommand(ctx, repo_model.Repository, commands...)`. `git.RunOpts{Dir:
repo.RepoPath()}`, the directory should be empty before invoking this
method and it can be filled in the function only. #28940
- Remove the `RepoPath()`/`WikiPath()` functions to reduce the
possibility of mistakes.
---------
Co-authored-by: delvh <dev.lh@web.de>
The 4 functions are duplicated, especially as interface methods. I think
we just need to keep `MustID` the only one and remove other 3.
```
MustID(b []byte) ObjectID
MustIDFromString(s string) ObjectID
NewID(b []byte) (ObjectID, error)
NewIDFromString(s string) (ObjectID, error)
```
Introduced the new interfrace method `ComputeHash` which will replace
the interface `HasherInterface`. Now we don't need to keep two
interfaces.
Reintroduced `git.NewIDFromString` and `git.MustIDFromString`. The new
function will detect the hash length to decide which objectformat of it.
If it's 40, then it's SHA1. If it's 64, then it's SHA256. This will be
right if the commitID is a full one. So the parameter should be always a
full commit id.
@AdamMajer Please review.
- Remove `ObjectFormatID`
- Remove function `ObjectFormatFromID`.
- Use `Sha1ObjectFormat` directly but not a pointer because it's an
empty struct.
- Store `ObjectFormatName` in `repository` struct
Refactor Hash interfaces and centralize hash function. This will allow
easier introduction of different hash function later on.
This forms the "no-op" part of the SHA256 enablement patch.
This PR removed `unittest.MainTest` the second parameter
`TestOptions.GiteaRoot`. Now it detects the root directory by current
working directory.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Partially Fix#25041
This PR redefined the meaning of column `is_active` in table
`action_runner_token`.
Before this PR, `is_active` means whether it has been used by any
runner. If it's true, other runner cannot use it to register again.
In this PR, `is_active` means whether it's validated to be used to
register runner. And if it's true, then it can be used to register
runners until it become false. When creating a new `is_active` register
token, any previous tokens will be set `is_active` to false.
Part of #27065
This reduces the usage of `db.DefaultContext`. I think I've got enough
files for the first PR. When this is merged, I will continue working on
this.
Considering how many files this PR affect, I hope it won't take to long
to merge, so I don't end up in the merge conflict hell.
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
> ### Description
> If a new branch is pushed, and the repository has a rule that would
require signed commits for the new branch, the commit is rejected with a
500 error regardless of whether it's signed.
>
> When pushing a new branch, the "old" commit is the empty ID
(0000000000000000000000000000000000000000). verifyCommits has no
provision for this and passes an invalid commit range to git rev-list.
Prior to 1.19 this wasn't an issue because only pre-existing individual
branches could be protected.
>
> I was able to reproduce with
[try.gitea.io/CraigTest/test](https://try.gitea.io/CraigTest/test),
which is set up with a blanket rule to require commits on all branches.
Fix#25565
Very thanks to @Craig-Holmquist-NTI for reporting the bug and suggesting
an valid solution!
---------
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
To avoid deadlock problem, almost database related functions should be
have ctx as the first parameter.
This PR do a refactor for some of these functions.
This PR replaces all string refName as a type `git.RefName` to make the
code more maintainable.
Fix#15367
Replaces #23070
It also fixed a bug that tags are not sync because `git remote --prune
origin` will not remove local tags if remote removed.
We in fact should use `git fetch --prune --tags origin` but not `git
remote update origin` to do the sync.
Some answer from ChatGPT as ref.
> If the git fetch --prune --tags command is not working as expected,
there could be a few reasons why. Here are a few things to check:
>
>Make sure that you have the latest version of Git installed on your
system. You can check the version by running git --version in your
terminal. If you have an outdated version, try updating Git and see if
that resolves the issue.
>
>Check that your Git repository is properly configured to track the
remote repository's tags. You can check this by running git config
--get-all remote.origin.fetch and verifying that it includes
+refs/tags/*:refs/tags/*. If it does not, you can add it by running git
config --add remote.origin.fetch "+refs/tags/*:refs/tags/*".
>
>Verify that the tags you are trying to prune actually exist on the
remote repository. You can do this by running git ls-remote --tags
origin to list all the tags on the remote repository.
>
>Check if any local tags have been created that match the names of tags
on the remote repository. If so, these local tags may be preventing the
git fetch --prune --tags command from working properly. You can delete
local tags using the git tag -d command.
---------
Co-authored-by: delvh <dev.lh@web.de>
## ⚠️ Breaking
The `log.<mode>.<logger>` style config has been dropped. If you used it,
please check the new config manual & app.example.ini to make your
instance output logs as expected.
Although many legacy options still work, it's encouraged to upgrade to
the new options.
The SMTP logger is deleted because SMTP is not suitable to collect logs.
If you have manually configured Gitea log options, please confirm the
logger system works as expected after upgrading.
## Description
Close#12082 and maybe more log-related issues, resolve some related
FIXMEs in old code (which seems unfixable before)
Just like rewriting queue #24505 : make code maintainable, clear legacy
bugs, and add the ability to support more writers (eg: JSON, structured
log)
There is a new document (with examples): `logging-config.en-us.md`
This PR is safer than the queue rewriting, because it's just for
logging, it won't break other logic.
## The old problems
The logging system is quite old and difficult to maintain:
* Unclear concepts: Logger, NamedLogger, MultiChannelledLogger,
SubLogger, EventLogger, WriterLogger etc
* Some code is diffuclt to konw whether it is right:
`log.DelNamedLogger("console")` vs `log.DelNamedLogger(log.DEFAULT)` vs
`log.DelLogger("console")`
* The old system heavily depends on ini config system, it's difficult to
create new logger for different purpose, and it's very fragile.
* The "color" trick is difficult to use and read, many colors are
unnecessary, and in the future structured log could help
* It's difficult to add other log formats, eg: JSON format
* The log outputer doesn't have full control of its goroutine, it's
difficult to make outputer have advanced behaviors
* The logs could be lost in some cases: eg: no Fatal error when using
CLI.
* Config options are passed by JSON, which is quite fragile.
* INI package makes the KEY in `[log]` section visible in `[log.sub1]`
and `[log.sub1.subA]`, this behavior is quite fragile and would cause
more unclear problems, and there is no strong requirement to support
`log.<mode>.<logger>` syntax.
## The new design
See `logger.go` for documents.
## Screenshot
<details>
![image](https://github.com/go-gitea/gitea/assets/2114189/4462d713-ba39-41f5-bb08-de912e67e1ff)
![image](https://github.com/go-gitea/gitea/assets/2114189/b188035e-f691-428b-8b2d-ff7b2199b2f9)
![image](https://github.com/go-gitea/gitea/assets/2114189/132e9745-1c3b-4e00-9e0d-15eaea495dee)
</details>
## TODO
* [x] add some new tests
* [x] fix some tests
* [x] test some sub-commands (manually ....)
---------
Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: Giteabot <teabot@gitea.io>
The old code is unnecessarily complex, and has many misuses.
Old code "wraps" a lot, wrap wrap wrap, it's difficult to understand
which kind of handler is used.
The new code uses a general approach, we do not need to write all kinds
of handlers into the "wrapper", do not need to wrap them again and
again.
New code, there are only 2 concepts:
1. HandlerProvider: `func (h any) (handlerProvider func (next)
http.Handler)`, it can be used as middleware
2. Use HandlerProvider to get the final HandlerFunc, and use it for
`r.Get()`
And we can decouple the route package from context package (see the
TODO).
# FAQ
## Is `reflect` safe?
Yes, all handlers are checked during startup, see the `preCheckHandler`
comment. If any handler is wrong, developers could know it in the first
time.
## Does `reflect` affect performance?
No. https://github.com/go-gitea/gitea/pull/24080#discussion_r1164825901
1. This reflect code only runs for each web handler call, handler is far
more slower: 10ms-50ms
2. The reflect is pretty fast (comparing to other code): 0.000265ms
3. XORM has more reflect operations already
# Why this PR comes
At first, I'd like to help users like #23636 (there are a lot)
The unclear "Internal Server Error" is quite anonying, scare users,
frustrate contributors, nobody knows what happens.
So, it's always good to provide meaningful messages to end users (of
course, do not leak sensitive information).
When I started working on the "response message to end users", I found
that the related code has a lot of technical debt. A lot of copy&paste
code, unclear fields and usages.
So I think it's good to make everything clear.
# Tech Backgrounds
Gitea has many sub-commands, some are used by admins, some are used by
SSH servers or Git Hooks. Many sub-commands use "internal API" to
communicate with Gitea web server.
Before, Gitea server always use `StatusCode + Json "err" field` to
return messages.
* The CLI sub-commands: they expect to show all error related messages
to site admin
* The Serv/Hook sub-commands (for git clients): they could only show
safe messages to end users, the error log could only be recorded by
"SSHLog" to Gitea web server.
In the old design, it assumes that:
* If the StatusCode is 500 (in some functions), then the "err" field is
error log, shouldn't be exposed to git client.
* If the StatusCode is 40x, then the "err" field could be exposed. And
some functions always read the "err" no matter what the StatusCode is.
The old code is not strict, and it's difficult to distinguish the
messages clearly and then output them correctly.
# This PR
To help to remove duplicate code and make everything clear, this PR
introduces `ResponseExtra` and `requestJSONResp`.
* `ResponseExtra` is a struct which contains "extra" information of a
internal API response, including StatusCode, UserMsg, Error
* `requestJSONResp` is a generic function which can be used for all
cases to help to simplify the calls.
* Remove all `map["err"]`, always use `private.Response{Err}` to
construct error messages.
* User messages and error messages are separated clearly, the `fail` and
`handleCliResponseExtra` will output correct messages.
* Replace all `Internal Server Error` messages with meaningful (still
safe) messages.
This PR saves more than 300 lines, while makes the git client messages
more clear.
Many gitea-serv/git-hook related essential functions are covered by
tests.
---------
Co-authored-by: delvh <dev.lh@web.de>
Some bugs caused by less unit tests in fundamental packages. This PR
refactor `setting` package so that create a unit test will be easier
than before.
- All `LoadFromXXX` files has been splited as two functions, one is
`InitProviderFromXXX` and `LoadCommonSettings`. The first functions will
only include the code to create or new a ini file. The second function
will load common settings.
- It also renames all functions in setting from `newXXXService` to
`loadXXXSetting` or `loadXXXFrom` to make the function name less
confusing.
- Move `XORMLog` to `SQLLog` because it's a better name for that.
Maybe we should finally move these `loadXXXSetting` into the `XXXInit`
function? Any idea?
---------
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: delvh <dev.lh@web.de>
To avoid duplicated load of the same data in an HTTP request, we can set
a context cache to do that. i.e. Some pages may load a user from a
database with the same id in different areas on the same page. But the
code is hidden in two different deep logic. How should we share the
user? As a result of this PR, now if both entry functions accept
`context.Context` as the first parameter and we just need to refactor
`GetUserByID` to reuse the user from the context cache. Then it will not
be loaded twice on an HTTP request.
But of course, sometimes we would like to reload an object from the
database, that's why `RemoveContextData` is also exposed.
The core context cache is here. It defines a new context
```go
type cacheContext struct {
ctx context.Context
data map[any]map[any]any
lock sync.RWMutex
}
var cacheContextKey = struct{}{}
func WithCacheContext(ctx context.Context) context.Context {
return context.WithValue(ctx, cacheContextKey, &cacheContext{
ctx: ctx,
data: make(map[any]map[any]any),
})
}
```
Then you can use the below 4 methods to read/write/del the data within
the same context.
```go
func GetContextData(ctx context.Context, tp, key any) any
func SetContextData(ctx context.Context, tp, key, value any)
func RemoveContextData(ctx context.Context, tp, key any)
func GetWithContextCache[T any](ctx context.Context, cacheGroupKey string, cacheTargetID any, f func() (T, error)) (T, error)
```
Then let's take a look at how `system.GetString` implement it.
```go
func GetSetting(ctx context.Context, key string) (string, error) {
return cache.GetWithContextCache(ctx, contextCacheKey, key, func() (string, error) {
return cache.GetString(genSettingCacheKey(key), func() (string, error) {
res, err := GetSettingNoCache(ctx, key)
if err != nil {
return "", err
}
return res.SettingValue, nil
})
})
}
```
First, it will check if context data include the setting object with the
key. If not, it will query from the global cache which may be memory or
a Redis cache. If not, it will get the object from the database. In the
end, if the object gets from the global cache or database, it will be
set into the context cache.
An object stored in the context cache will only be destroyed after the
context disappeared.
And also the other way around, it would show an non-working URL in the
message when pull requests are disabled on the base repository but
enabled on the fork.
Change the mailer interface to prevent leaking of possible hidden email
addresses when sending to multiple recipients.
Co-authored-by: Gusted <williamzijl7@hotmail.com>
This PR introduce glob match for protected branch name. The separator is
`/` and you can use `*` matching non-separator chars and use `**` across
separator.
It also supports input an exist or non-exist branch name as matching
condition and branch name condition has high priority than glob rule.
Should fix#2529 and #15705
screenshots
<img width="1160" alt="image"
src="https://user-images.githubusercontent.com/81045/205651179-ebb5492a-4ade-4bb4-a13c-965e8c927063.png">
Co-authored-by: zeripath <art27@cantab.net>
After #22362, we can feel free to use transactions without
`db.DefaultContext`.
And there are still lots of models using `db.DefaultContext`, I think we
should refactor them carefully and one by one.
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
This PR adds a context parameter to a bunch of methods. Some helper
`xxxCtx()` methods got replaced with the normal name now.
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>