Commit graph

15529 commits

Author SHA1 Message Date
floatingghost e2e4f53585 Merge pull request 'Use standard-compliant Accept header when fetching' (#740) from Oneric/akkoma:fetch_std-accept-hdr into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/740
2024-04-12 18:28:26 +00:00
floatingghost 4887df12d7 Merge pull request 'Allow for url to be a list' (#718) from helge/akkoma:develop into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/718
2024-04-12 17:39:38 +00:00
floatingghost e6ca2b4d2a Merge pull request 'Fix array-less EmojiReacts' (#739) from Oneric/akkoma:tag-arrayless into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/739
2024-04-12 17:26:07 +00:00
floatingghost 6ba80aaff5 Merge pull request 'Check if data is visible before embedding it in OG tags' (#741) from ograph-restrictions into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/741
2024-04-12 17:22:59 +00:00
floatingghost 8e60177466 Merge pull request 'MRF.InlineQuotePolicy: Add link to post URL, not ID' (#733) from erincandescent/akkoma:quote-url into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/733
2024-04-12 17:02:52 +00:00
Erin Shepherd 75d9e2b375 MRF.InlineQuotePolicy: Add link to post URL, not ID
"id" is used for the canonical link to the AS2 representation of an object.
"url" is typically used for the canonical link to the HTTP representation.
It is what we use, for example, when following the "external source" link
in the frontend. However, it's not the link we include in the post contents
for quote posts.

Using URL instead means we include a more user-friendly URL for Mastodon,
and a working (in the browser) URL for Threads
2024-04-12 13:23:50 +02:00
Floatingghost 05f8179d08 check if data is visible before embedding it in OG tags
previously we would uncritically take data and format it into
tags for static-fe and the like - however, instances can be
configured to disallow unauthenticated access to these resources.

this means that OG tags as a vector for information leakage.

_technically_ this should only occur if you have both
restrict_unauthenticated *AND* you run static-fe, which makes no
sense since static-fe is for unauthenticated people in particular,
but hey ho.
2024-04-12 05:16:47 +01:00
Oneric fae0a14ee8 Use standard-compliant Accept header when fetching
Spec says clients MUST use this header and servers MUST respond to it,
while servers merely SHOULD respond to the one we used before.
https://www.w3.org/TR/activitypub/#retrieving-objects

The old value is kept as a fallback since at least two years ago
not every implementation correctly dealt with the spec-compliant
variant, see: https://github.com/owncast/owncast/issues/1827

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/730
2024-04-12 00:22:37 +02:00
Floatingghost 1135935cbe Merge remote-tracking branch 'oneric/ipv6' into develop 2024-04-11 20:59:49 +01:00
floatingghost 090a77d1af Merge pull request 'static-fe: don’t squeeze non-square images' (#705) from Oneric/akkoma:staticfe-nonsquare-img into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/705
2024-04-11 18:43:03 +00:00
floatingghost 0e066bddae Merge pull request 'Drop base_url special casing in test env' (#737) from Oneric/akkoma:testenv_drop_baseurl_specialcase into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/737
2024-04-11 18:24:09 +00:00
Oneric 462225880a Accept EmojiReacts with non-array tag
JSON-LD compaction strips the array since it’s just one object

Fixes: https://akkoma.dev/AkkomaGang/akkoma/issues/720
2024-04-09 04:04:16 +02:00
Oneric debd686418 Add tests for our own custom emoji format 2024-04-09 03:52:22 +02:00
Oneric 9598137d32 Drop base_url special casing in test env
61621ebdbc already explicitly added
the uploader base url to config/test.exs and it reduces differences
from prod.
2024-04-07 00:20:12 +02:00
floatingghost b8393ad9ed Merge pull request 'context: add featured definition' (#717) from erincandescent/akkoma:context-featured into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/717
2024-04-03 10:22:09 +00:00
floatingghost 554f19a9ed Merge pull request 'Refresh Users much more aggressively when processing Move activities' (#714) from erincandescent/akkoma:move-bust-cache into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/714
2024-04-03 10:03:14 +00:00
FloatingGhost 9c53a3390e Ensure we have the emoji base path 2024-04-02 14:12:03 +01:00
FloatingGhost 795524daf1 bump version 2024-04-02 11:36:47 +01:00
FloatingGhost b5d97e7d85 Don't error out if we're not using the local uploader 2024-04-02 11:36:26 +01:00
FloatingGhost f592090206 Fix tests that relied on no base_url in the uploader 2024-04-02 11:23:57 +01:00
FloatingGhost 61621ebdbc Add tests for extra warnings about media subdomains 2024-04-02 10:54:53 +01:00
FloatingGhost 4cd299bd83 Add extra warnings if the uploader is on the same domain as the main application 2024-04-02 10:20:59 +01:00
Erin Shepherd 8fbd771d6e context: add featured & backgroundUrl definitions
These were missing from our context, which caused interoperability issues with
people who do context processing
2024-04-01 13:39:38 +02:00
FloatingGhost 2d439034ca Ensure that spoof-inserted does not time out 2024-03-30 12:55:22 +00:00
FloatingGhost 087d88f787 bump version 2024-03-30 11:45:07 +00:00
FloatingGhost 3650bb0370 Changelog entry 2024-03-30 11:44:34 +00:00
Oneric ee7d98b093 Update Changelog 2024-03-29 08:35:15 -01:00
Oneric 0648d9ebaa Add mix tasks to detect spoofed posts and users
At least as far as we can
2024-03-26 16:05:20 -01:00
Oneric d441101200 Add mix task to detect uploaded spoof payloads 2024-03-26 16:05:20 -01:00
Oneric 31f90bbb52 Register APNG MIME type
The newest git HEAD of MIME already knows about APNG, but this
hasn’t been released yet. Without this, APNG attachments from
remote posts won’t display as images in frontends.

Fixes: akkoma#657
2024-03-26 15:44:44 -01:00
Oneric 61ec592d66 Drop obsolete pixelfed workaround
This pixelfed issue was fixed in 2022-12 in
https://github.com/pixelfed/pixelfed/pull/3932

Co-authored-by: FloatingGhost <hannah@coffee-and-dreams.uk>
2024-03-26 15:11:06 -01:00
Oneric 8684964c5d Only allow exact id matches
This protects us from falling for obvious spoofs as from the current
upload exploit (unfortunately we can’t reasonably do anything about
spoofs with exact matches as was possible via emoji and proxy).

Such objects being invalid is supported by the spec, sepcifically
sections 3.1 and 3.2: https://www.w3.org/TR/activitypub/#obj-id

Anonymous objects are not relevant here (they can only exists within
parent objects iiuc) and neither is client-to-server or transient objects
(as those cannot be fetched in the first place).
This leaves us with the requirement for `id` to (a) exist and
(b) be a publicly dereferencable URI from the originating server.
This alone does not yet demand strict equivalence, but the spec then
further explains objects ought to be fetchable _via their ID_.
Meaning an object not retrievable via its ID, is invalid.

This reading is supported by the fact, e.g. GoToSocial (recently) and
Mastodon (for 6+ years) do already implement such strict ID checks,
additionally proving this doesn’t cause federation issues in practice.

However, apart from canonical IDs there can also be additional display
URLs. *omas first redirect those to their canonical location, but *keys
and Mastodon directly serve the AP representation without redirects.

Mastodon and GTS deal with this in two different ways,
but both constitute an effective countermeasure:
 - Mastodon:
   Unless it already is a known AP id, two fetches occur.
   The first fetch just reads the `id` property and then refetches from
   the id. The last fetch requires the returned id to exactly match the
   URL the content was fetched from. (This can be optimised by skipping
   the second fetch if it already matches)
   05eda8d193/app/helpers/jsonld_helper.rb (L168)
   63f0979799

 - GTS:
   Only does a single fetch and then checks if _either_ the id
   _or_ url property (which can be an object) match the original fetch
   URL. This relies on implementations always including their display URL
   as "url" if differing from the id. For actors this is true for all
   investigated implementations, for posts only Mastodon includes an
   "url", but it is also the only one with a differing display URL.
   2bafd7daf5 (diff-943bbb02c8ac74ac5dc5d20807e561dcdfaebdc3b62b10730f643a20ac23c24fR222)

Albeit Mastodon’s refetch offers higher compatibility with theoretical
implmentations using either multiple different display URL or not
denoting any of them as "url" at all, for now we chose to adopt a
GTS-like refetch-free approach to avoid additional implementation
concerns wrt to whether redirects should be allowed when fetching a
canonical AP id and potential for accidentally loosening some checks
(e.g. cross-domain refetches) for one of the fetches.
This may be reconsidered in the future.
2024-03-25 14:05:05 -01:00
Oneric 48b3a35793 Update user reference after fetch
Since we always followed redirects (and until recently allowed fuzzy id
matches), the ap_id of the received object might differ from the iniital
fetch url. This lead to us mistakenly trying to insert a new user with
the same nickname, ap_id, etc as an existing user (which will fail due
to uniqueness constraints) instead of updating the existing one.
2024-03-25 14:05:05 -01:00
Oneric 9061d148be Ensure object id doesn’t change on refetch 2024-03-25 14:05:05 -01:00
Oneric 3e134b07fa fetcher: return final URL after redirects from get_object
Since we reject cross-domain redirects, this doesn’t yet
make a difference, but it’s requried for stricter checking
subsequent commits will introduce.

To make sure (and in case we ever decide to reallow
cross-domain redirects) also use the final location
for containment and reachability checks.
2024-03-25 14:05:05 -01:00
Oneric f07eb4cb55 Sanity check fetched user data
In order to properly process incoming notes we need
to be able to map the key id back to an actor.
Also, check collections actually belong to the same server.

Key ids of Hubzilla and Bridgy samples were updated to what
modern versions of those output. If anything still uses the
old format, we would not be able to verify their posts anyway.
2024-03-25 14:05:05 -01:00
Oneric 59a142e0b0 Never fetch resource from ourselves
If it’s not already in the database,
it must be counterfeit (or just not exists at all)

Changed test URLs were only ever used from "local: false" users anyway.
2024-03-25 14:05:05 -01:00
Oneric fee57eb376 Move actor check into fetch_and_contain_remote_object_from_id
This brings it in line with its name and closes an,
in practice harmless, verification hole.

This was/is the only user of contain_origin making it
safe to change the behaviour on actor-less objects.

Until now refetched objects did not ensure the new actor matches the
domain of the object. We refetch polls occasionally to retrieve
up-to-date vote counts. A malicious AP server could have switched out
the poll after initial posting with a completely different post
attribute to an actor from another server.
While we indeed fell for this spoof before the commit,
it fortunately seems to have had no ill effect in practice,
since the asociated Create activity is not changed. When exposing the
actor via our REST API, we read this info from the activity not the
object.

This at first thought still keeps one avenue for exploit open though:
the updated actor can be from our own domain and a third server be
instructed to fetch the object from us. However this is foiled by an
id mismatch. By necessity of being fetchable and our longstanding
same-domain check, the id must still be from the attacker’s server.
Even the most barebone authenticity check is able to sus this out.
2024-03-25 14:05:05 -01:00
Oneric c4cf4d7f0b Reject cross-domain redirects when fetching AP objects
Such redirects on AP queries seem most likely to be a spoofing attempt.
If the object is legit, the id should match the final domain anyway and
users can directly use the canonical URL.

The lack of such a check (and use of the initially queried domain’s
authority instead of the final domain) was enabling the current exploit
to even affect instances which already migrated away from a same-domain
upload/proxy setup in the past, but retained a redirect to not break old
attachments.

(In theory this redirect could, with some effort, have been limited to
 only old files, but common guides employed a catch-all redirect, which
 allows even future uploads to be reachable via an initial query to the
 main domain)

Same-domain redirects are valid and also used by ourselves,
e.g. for redirecting /notice/XXX to /objects/YYY.
2024-03-25 14:05:05 -01:00
Oneric baaeffdebc Update spoofed activity test
Turns out we already had a test for activities spoofed via upload due
to an exploit several years. Back then *oma did not verify content-type
at all and doing so was the only adopted countermeasure.
Even the added test sample though suffered from a mismatching id, yet
nobody seems to have thought it a good idea to tighten id checks, huh

Since we will add stricter id checks later, make id and URL match
and also add a testcase for no content type at all. The new section
will be expanded in subsequent commits.
2024-03-25 14:05:05 -01:00
Oneric 2bcf633dc2 Document Pleroma.Object.Fetcher 2024-03-25 14:05:05 -01:00
Oneric 93ab6a018e mix: fix docs task 2024-03-18 22:40:43 -01:00
Oneric c806adbfdb Refactor Fetcher.get_object for readability
Apart from slightly different error reasons wrt content-type,
this does not change functionality in any way.
2024-03-18 22:40:43 -01:00
Oneric ddd79ff22d Proactively harden emoji pack against path traversal
No new path traversal attacks are known. But given the many entrypoints
and code flow complexity inside pack.ex, it unfortunately seems
possible a future refactor or addition might reintroduce one.
Furthermore, some old packs might still contain traversing path entries
which could trigger undesireable actions on rename or delete.

To ensure this can never happen, assert safety during path construction.

Path.safe_relative was introduced in Elixir 1.14, but
fortunately, we already require at least 1.14 anyway.
2024-03-18 22:33:10 -01:00
Oneric d6d838cbe8 StealEmoji: check remote size before downloading
To save on bandwith and avoid OOMs with large files.
Ofc, this relies on the remote server
 (a) sending a content-length header and
 (b) being honest about the size.

Common fedi servers seem to provide the header and (b) at least raises
the required privilege of an malicious actor to a server infrastructure
admin of an explicitly allowed host.

A more complete defense which still works when faced with
a malicious server requires changes in upstream Finch;
see https://github.com/sneako/finch/issues/224
2024-03-18 22:33:10 -01:00
Oneric 6d003e1acd test/steal_emoji: consolidate configuration setup 2024-03-18 22:33:10 -01:00
Oneric d1ce5fd911 test/steal_emoji: reduce code duplication with mock macro 2024-03-18 22:33:10 -01:00
Oneric a4fa2ec9af StealEmoji: make final paths infeasible to predict
Certain attacks rely on predictable paths for their payloads.
If we weren’t so overly lax in our (id, URL) check, the current
counterfeit activity exploit would be one of those.
It seems plausible for future attacks to hinge on
or being made easier by predictable paths too.

In general, letting remote actors place arbitrary data at
a path within our domain of their choosing (sans prefix)
just doesn’t seem like a good idea.

Using fully random filenames would have worked as well, but this
is less friendly for admins checking emoji dirs.
The generated suffix should still be more than enough;
an attacker needs on average 140 trillion attempts to
correctly guess the final path.
2024-03-18 22:33:10 -01:00
Oneric ee5ce87825 test: use pack functions to check for emoji
The hardocded path and filenames assumptions
will be broken with the next commit.
2024-03-18 22:33:10 -01:00
Oneric d1c4d07404 Convert StealEmoji to pack.json
This will decouple filenames from shortcodes and
allow more image formats to work instead of only
those included in the auto-load glob. (Albeit we
still saved other formats to disk, wasting space)

Furthermore, this will allow us to make
final URL paths infeasible to predict.
2024-03-18 22:33:10 -01:00