From 9105b0c95428e3bbecd6f8ad106508095eed5643 Mon Sep 17 00:00:00 2001
From: David Yip
Date: Sat, 10 Feb 2018 02:32:39 -0600
Subject: [PATCH] Introduce html2text for extracting plaintext from statuses. #236.

Unlike strip_tags, html2text will preserve text present in other nodes,
e.g. the href of an anchor tag:

    [1] pry(main)> str = '<a href="http://www.example.com">A link</a>'
    => "<a href=\"http://www.example.com\">A link</a>"
    [2] pry(main)> Html2Text.convert(str)
    => "[A link](http://www.example.com)"
    [3] pry(main)> include ActionView::Helpers::SanitizeHelper
    => Object
    [4] pry(main)> strip_tags(str)
    => "A link"

Preserving the href of an anchor allows keyword mutes to also match on
URLs, which is something that the frontend regex filter can currently
do.
---
 Gemfile      | 1 +
 Gemfile.lock | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/Gemfile b/Gemfile
index 1d128d6573..d2cd3b42db 100644
--- a/Gemfile
+++ b/Gemfile
@@ -42,6 +42,7 @@ gem 'fast_blank', '~> 1.0'
 gem 'goldfinger', '~> 2.1'
 gem 'hiredis', '~> 0.6'
 gem 'redis-namespace', '~> 1.5'
+gem 'html2text'
 gem 'htmlentities', '~> 4.3'
 gem 'http', '~> 3.0'
 gem 'http_accept_language', '~> 2.1'
diff --git a/Gemfile.lock b/Gemfile.lock
index 3a65f35a53..3400b1a0fd 100644
--- a/Gemfile.lock
+++ b/Gemfile.lock
@@ -205,6 +205,8 @@ GEM
     highline (1.7.10)
     hiredis (0.6.1)
     hkdf (0.3.0)
+    html2text (0.2.1)
+      nokogiri (~> 1.6)
     htmlentities (4.3.4)
     http (3.0.0)
       addressable (~> 2.3)
@@ -601,6 +603,7 @@ DEPENDENCIES
   goldfinger (~> 2.1)
   hamlit-rails (~> 0.2)
   hiredis (~> 0.6)
+  html2text
   htmlentities (~> 4.3)
   http (~> 3.0)
   http_accept_language (~> 2.1)