Commit graph

18 commits

Author SHA1 Message Date
floatingghost 59bfdf2ca4 Merge pull request 'Add limit CLI flags to prune jobs' (#655) from Oneric/akkoma:prune-batch into develop
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/655
2024-06-17 20:47:53 +00:00
Floatingghost 16bed0562d Fix tests 2024-06-09 18:28:00 +01:00
Oneric 1d4c212441 dbprune: shortcut array activity search
This brought down query costs from 7,953,740.90 to 47,600.97
2024-05-31 17:16:40 +02:00
Oneric 6e7cbf1885 Test both standalone and flag mode for pruning orphaned activities 2024-05-31 17:16:40 +02:00
Floatingghost 66b3248dd3 mix tests probably shouldn't be async 2024-05-27 04:03:13 +01:00
Oneric 8cf183cb42 Drop Chat tables
Chats were removed in 0f132b802d
2024-02-11 05:15:08 +01:00
ilja f49e9e6d4c Clean up bookmarks after prune_objects
When doing prune_objects, it's possible that bookmarked objects are deleted.
This gave problems when fetching the bookmark TL.
Here we clean up the bookmarks during pruning in the case were it's possible that bookmarked objects are deleted.
2023-05-21 13:02:28 +02:00
ilja c7fb78cc32 Move deadline and old_insert_date to setup
Several tests for prune_objetcs need a date older than the deadline for pruning, so I moved that to the setup
2023-05-21 12:01:54 +02:00
ilja 328b4d93b7 Changelog + remove some unneeded comments from the tests 2023-02-26 14:43:19 +01:00
ilja 57eef6d764 prune_objects can prune orphaned activities who reference an array of objects
E.g. Flag activities have an array of objects

We prune the activity when NONE of the objects can be found

Note that the cost of finding and deleting these is ~4x higher than finding and deleting the non-array ones

Only string:
Delete on activities  (cost=506573.48..506580.38 rows=0 width=0)

Only Array:
Delete on activities  (cost=3570359.68..4276365.34 rows=0 width=0)

(They are still executed separately, so the total cost is the sum of the two)
2023-02-26 14:41:50 +01:00
ilja a7ec6e039c prune_objects can prune orphaned activities
We add an option to also prune remote activities who don't have existing objects any more they reference.
Rn, we only check for activities who only reference one object, not an array or embeded object.
2023-02-26 14:41:50 +01:00
ilja 7695010268 Prune Objects --keep-threads option (#350)
This adds an option to the prune_objects mix task.
The original way deleted all non-local public posts older than a certain time frame.
Here we add a different query which you can call using the option --keep-threads.

We query from the activities table all context id's where
    1. the newest activity with this context is still old
    2. none of the activities with this context is is local
    3. none of the activities with this context is bookmarked
and delete all objects with these contexts.

The idea is that posts with local activities (posts, replies, likes, repeats...) may be interesting to keep.
Besides that, a post lives in a certain context (the thread), so we keep the whole thread as well.

Caveats:
* ~~Quotes have a different context. Therefore, when someone quotes a post, it's possible the quoted post will still be deleted.~~ fixed in https://akkoma.dev/AkkomaGang/akkoma/pulls/379
* Although undocumented (in docs/docs/administration/CLI_tasks/database.md/#prune-old-remote-posts-from-the-database), the 'normal' delete action still kept old remote non-public posts. I added an option to keep this behaviour, but this also means that you now have to explicitly provide that option. **This could be considered a breaking change!**
* ~~Note that this removes from the objects table, but not from the activities.~~ See https://akkoma.dev/AkkomaGang/akkoma/pulls/427 for that.

Some statistics from explain analyse:
(cost=1402845.92..1933782.00 rows=3810907 width=62) (actual time=2562455.486..2562455.495 rows=0 loops=1)
 Planning Time: 505.327 ms
 Trigger for constraint chat_message_references_object_id_fkey: time=651939.797 calls=921740
 Trigger for constraint deliveries_object_id_fkey: time=52036.009 calls=921740
 Trigger for constraint hashtags_objects_object_id_fkey: time=20665.778 calls=921740
 Execution Time: 3287933.902 ms

***
**TODO**
1. [x] **Question:** Is it OK to keep it like this in regard to quote posts? If not (ie post quoted by local users should also be kept), should we give quotes the same context as the post they are quoting? (If we don't want to give them the same context, I'll have to see how/if I can do it without being too costly)
    * See https://akkoma.dev/AkkomaGang/akkoma/pulls/379
2. [x] **Question:** the "original" query only deletes public posts (this is undocumented, but you can check the code). This new one doesn't care for scope. From the docs I get that the idea is that posts can be refetched when needed. But I have from a trusted source that Pleroma can't refetch non-public posts. I assume that's the reason why they are kept here. I see different options to deal with this
    1. ~~We keep it as currently implemented and just don't care about scope with this option~~
    2. ~~We add logic to not delete non-public posts either (I'll have to see how costly that becomes)~~
    3. We add an extra --keep-non-public parameter. This is technically speaking breakage (you didn't have to provide a param before for this, now you do), but I'm inclined to not care much because it wasn't documented nor tested in the first place.
3. [x] See if we can do the query using Elixir
4. [x] Test on a bigger DB to see that we don't run into a timeout
5. [x] Add docs

Co-authored-by: ilja <git@ilja.space>
Reviewed-on: https://akkoma.dev/AkkomaGang/akkoma/pulls/350
Co-authored-by: ilja <akkoma.dev@ilja.space>
Co-committed-by: ilja <akkoma.dev@ilja.space>
2023-01-09 22:15:41 +00:00
Haelwenn (lanodan) Monnier c4439c630f
Bump Copyright to 2021
grep -rl '# Copyright © .* Pleroma' * | xargs sed -i 's;Copyright © .* Pleroma .*;Copyright © 2017-2021 Pleroma Authors <https://pleroma.social/>;'
2021-01-13 07:49:50 +01:00
lain e4f1d8f48c Merge branch 'cachex-test' into 'develop'
Test framework overhaul (speed, reliability)

See merge request pleroma/pleroma!3209
2020-12-26 10:26:35 +00:00
lain 9ba60f70d2 Tests: Make as many tests as possible async.
In general, tests that match these criteria can be made async:

- Doesn't use real Cachex.
- Doesn't write to the Config / Application Environment.
- Uses Mock. Using Mox is fine.
- Uses the streamer.
2020-12-21 12:21:40 +01:00
Alexander Strizhakov cebe3c7def Fix for dropping posts/notifs in WS when mix task is executed
- start oban in mix tasks with empty queues, plugins and crontab
- fix for update_users_following_followers_counts
- fix for removed logo.png
- typo in resend confirmation emails mix task docs
- fix for uploads mix task (start Majic.Pool)
- fix for creating user mix task (start :fast_html app)
2020-12-14 11:02:32 -06:00
Egor Kislitsyn 35ba48494f
Stream follow updates 2020-12-02 00:18:58 +04:00
Alexander Strizhakov 6bf85440b3
mix tasks consistency 2020-10-13 16:33:24 +03:00
Renamed from test/tasks/database_test.exs (Browse further)