This actually was already intended before to eradict all future
path-traversal-style exploits and to fix issues with some
characters like akkoma#610 in 0b2ec0ccee. However, Dedupe and
AnonymizeFilename got mixed up. The latter only anonymises the name
in Content-Disposition headers GET parameters (with link_name),
_not_ the upload path.
Even without Dedupe, the upload path is prefixed by an UUID,
so it _should_ already be hard to guess for attackers. But now
we actually can be sure no path shenanigangs occur, uploads
reliably work and save some disk space.
While this makes the final path predictable, this prediction is
not exploitable. Insertion of a back-reference to the upload
itself requires pulling off a successfull preimage attack against
SHA-256, which is deemed infeasible for the foreseeable futures.
Dedupe was already included in the default list in config.exs
since 28cfb2c37a, but this will get overridde by whatever the
config generated by the "pleroma.instance gen" task chose.
Upload+delete tests running in parallel using Dedupe might be flaky, but
this was already true before and needs its own commit to fix eventually.
The lack thereof enables spoofing ActivityPub objects.
A malicious user could upload fake activities as attachments
and (if having access to remote search) trick local and remote
fedi instances into fetching and processing it as a valid object.
If uploads are hosted on the same domain as the instance itself,
it is possible for anyone with upload access to impersonate(!)
other users of the same instance.
If uploads are exclusively hosted on a different domain, even the most
basic check of domain of the object id and fetch url matching should
prevent impersonation. However, it may still be possible to trick
servers into accepting bogus users on the upload (sub)domain and bogus
notes attributed to such users.
Instances which later migrated to a different domain and have a
permissive redirect rule in place can still be vulnerable.
If — like Akkoma — the fetching server is overly permissive with
redirects, impersonation still works.
This was possible because Plug.Static also uses our custom
MIME type mappings used for actually authentic AP objects.
Provided external storage providers don’t somehow return ActivityStream
Content-Types on their own, instances using those are also safe against
their users being spoofed via uploads.
Akkoma instances using the OnlyMedia upload filter
cannot be exploited as a vector in this way — IF the
fetching server validates the Content-Type of
fetched objects (Akkoma itself does this already).
However, restricting uploads to only multimedia files may be a bit too
heavy-handed. Instead this commit will restrict the returned
Content-Type headers for user uploaded files to a safe subset, falling
back to generic 'application/octet-stream' for anything else.
This will also protect against non-AP payloads as e.g. used in
past frontend code injection attacks.
It’s a slight regression in user comfort, if say PDFs are uploaded,
but this trade-off seems fairly acceptable.
(Note, just excluding our own custom types would offer no protection
against non-AP payloads and bear a (perhaps small) risk of a silent
regression should MIME ever decide to add a canonical extension for
ActivityPub objects)
Now, one might expect there to be other defence mechanisms
besides Content-Type preventing counterfeits from being accepted,
like e.g. validation of the queried URL and AP ID matching.
Inserting a self-reference into our uploads is hard, but unfortunately
*oma does not verify the id in such a way and happily accepts _anything_
from the same domain (without even considering redirects).
E.g. Sharkey (and possibly other *keys) seem to attempt to guard
against this by immediately refetching the object from its ID, but
this is easily circumvented by just uploading two payloads with the
ID of one linking to the other.
Unfortunately *oma is thus _both_ a vector for spoofing and
vulnerable to those spoof payloads, resulting in an easy way
to impersonate our users.
Similar flaws exists for emoji and media proxy.
Subsequent commits will fix this by rigorously sanitising
content types in more areas, hardening our checks, improving
the default config and discouraging insecure config options.
Currently our own frontend doesn’t show backgrounds of other users, this
property is already publicly readable via REST API and likely was always
intended to be shown and federated.
Recently Sharkey added support for profile backgrounds and
immediately made them federate and be displayed to others.
We use the same AP field as Sharkey here which should make
it interoperable both ways out-of-the-box.
Ref.: 4e64397635
OTP builds to 1.15
Changelog entry
Ensure policies are fully loaded
Fix :warn
use main branch for linkify
Fix warn in tests
Migrations for phoenix 1.17
Revert "Migrations for phoenix 1.17"
This reverts commit 6a3b2f15b7.
Oban upgrade
Add default empty whitelist
mix format
limit test to amd64
OTP 26 tests for 1.15
use OTP_VERSION tag
baka
just 1.15
Massive deps update
Update locale, deps
Mix format
shell????
multiline???
?
max cases 1
use assert_recieve
don't put_env in async tests
don't async conn/fs tests
mix format
FIx some uploader issues
Fix tests
When no image description is filled in, Pleroma allowed fallbacks.
Those were (based on a setting) either the filename, or a fixed description.
Neither are good options for image descriptions imo, so here we remove this.
Note that there's two tests removed who supposedly tested something else.
But examining closer, they didn't seem to test what they claimed to test,
so I removed them rather than try to "fix" them.
This adds an option to the prune_objects mix task.
The original way deleted all non-local public posts older than a certain time frame.
Here we add a different query which you can call using the option --keep-threads.
We query from the activities table all context id's where
1. the newest activity with this context is still old
2. none of the activities with this context is is local
3. none of the activities with this context is bookmarked
and delete all objects with these contexts.
The idea is that posts with local activities (posts, replies, likes, repeats...) may be interesting to keep.
Besides that, a post lives in a certain context (the thread), so we keep the whole thread as well.
Caveats:
* ~~Quotes have a different context. Therefore, when someone quotes a post, it's possible the quoted post will still be deleted.~~ fixed in https://akkoma.dev/AkkomaGang/akkoma/pulls/379
* Although undocumented (in docs/docs/administration/CLI_tasks/database.md/#prune-old-remote-posts-from-the-database), the 'normal' delete action still kept old remote non-public posts. I added an option to keep this behaviour, but this also means that you now have to explicitly provide that option. **This could be considered a breaking change!**
* ~~Note that this removes from the objects table, but not from the activities.~~ See https://akkoma.dev/AkkomaGang/akkoma/pulls/427 for that.
Some statistics from explain analyse:
(cost=1402845.92..1933782.00 rows=3810907 width=62) (actual time=2562455.486..2562455.495 rows=0 loops=1)
Planning Time: 505.327 ms
Trigger for constraint chat_message_references_object_id_fkey: time=651939.797 calls=921740
Trigger for constraint deliveries_object_id_fkey: time=52036.009 calls=921740
Trigger for constraint hashtags_objects_object_id_fkey: time=20665.778 calls=921740
Execution Time: 3287933.902 ms
***
**TODO**
1. [x] **Question:** Is it OK to keep it like this in regard to quote posts? If not (ie post quoted by local users should also be kept), should we give quotes the same context as the post they are quoting? (If we don't want to give them the same context, I'll have to see how/if I can do it without being too costly)
* See https://akkoma.dev/AkkomaGang/akkoma/pulls/379
2. [x] **Question:** the "original" query only deletes public posts (this is undocumented, but you can check the code). This new one doesn't care for scope. From the docs I get that the idea is that posts can be refetched when needed. But I have from a trusted source that Pleroma can't refetch non-public posts. I assume that's the reason why they are kept here. I see different options to deal with this
1. ~~We keep it as currently implemented and just don't care about scope with this option~~
2. ~~We add logic to not delete non-public posts either (I'll have to see how costly that becomes)~~
3. We add an extra --keep-non-public parameter. This is technically speaking breakage (you didn't have to provide a param before for this, now you do), but I'm inclined to not care much because it wasn't documented nor tested in the first place.
3. [x] See if we can do the query using Elixir
4. [x] Test on a bigger DB to see that we don't run into a timeout
5. [x] Add docs
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/350
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
See https://akkoma.dev/AkkomaGang/akkoma/pulls/350#issuecomment-6109
When making quotes through Mast-API, they will now have the same context as the quoted post. This also results in them being showed when fetching the thread. I checked Misskey to see how it's there, and they show the quotes there as well, see e.g. <https://mk.toast.cafe/notes/98u1g0tulg>.
An example from Akkoma:
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/379
Reviewed-by: floatingghost <hannah@coffee-and-dreams.uk>
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
Argos Translate is a Python module for translation and can be used as a command line tool.
This is also the engine for LibreTranslate, for which we already have a module.
Here we can use the engine directly from our server without doing requests to a third party or having to install our own LibreTranslate webservice (obviously you do have to install Argos Translate).
One thing that's currently still missing from Argos Translate is auto-detection of languages (see <https://github.com/argosopentech/argos-translate/issues/9>). For now, when no source language is provided, we just return the text unchanged, supposedly translated from the target language. That way you get a near immediate response in pleroma-fe when clicking Translate, after which you can select the source language from a dropdown.
Argos Translate also doesn't seem to handle html very well. Therefore we give admins the option to strip the html before translating. I made this an option because I'm unsure if/how this will change in the future.
Co-authored-by: ilja <git@ilja.space>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/351
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>