Commit graph

3 commits

Author SHA1 Message Date
Jason Song 77c89572e9
Fix isAllowed of escapeStreamer (#22814) (#22837)
Backport #22814.

The use of `sort.Search` is wrong: The slice should be sorted, and
`return >= 0` doen't mean it exists, see the
[manual](https://pkg.go.dev/sort#Search).

Could be fixed like this if we really need it:

```diff
diff --git a/modules/charset/escape_stream.go b/modules/charset/escape_stream.go
index 823b63513..fcf1ffbc1 100644
--- a/modules/charset/escape_stream.go
+++ b/modules/charset/escape_stream.go
@@ -20,6 +20,9 @@ import (
 var defaultWordRegexp = regexp.MustCompile(`(-?\d*\.\d\w*)|([^\` + "`" + `\~\!\@\#\$\%\^\&\*\(\)\-\=\+\[\{\]\}\\\|\;\:\'\"\,\.\<\>\/\?\s\x00-\x1f]+)`)

 func NewEscapeStreamer(locale translation.Locale, next HTMLStreamer, allowed ...rune) HTMLStreamer {
+       sort.Slice(allowed, func(i, j int) bool {
+               return allowed[i] < allowed[j]
+       })
        return &escapeStreamer{
                escaped:                 &EscapeStatus{},
                PassthroughHTMLStreamer: *NewPassthroughStreamer(next),
@@ -284,14 +287,8 @@ func (e *escapeStreamer) runeTypes(runes ...rune) (types []runeType, confusables
 }

 func (e *escapeStreamer) isAllowed(r rune) bool {
-       if len(e.allowed) == 0 {
-               return false
-       }
-       if len(e.allowed) == 1 {
-               return e.allowed[0] == r
-       }
-
-       return sort.Search(len(e.allowed), func(i int) bool {
+       i := sort.Search(len(e.allowed), func(i int) bool {
                return e.allowed[i] >= r
-       }) >= 0
+       })
+       return i < len(e.allowed) && e.allowed[i] == r
 }
```

But I don't think so, a map is better to do it.
2023-02-10 11:36:58 +08:00
Jason Song 15b189b570
Avoid frequent string2bytes conversions (#20940)
Fix #20939
2022-08-24 12:50:13 +01:00
zeripath 99efa02edf
Switch Unicode Escaping to a VSCode-like system (#19990)
This PR rewrites the invisible unicode detection algorithm to more
closely match that of the Monaco editor on the system. It provides a
technique for detecting ambiguous characters and relaxes the detection
of combining marks.

Control characters are in addition detected as invisible in this
implementation whereas they are not on monaco but this is related to
font issues.

Close #19913

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-08-13 19:32:34 +01:00