mirror of
https://github.com/discourse/discourse.git
synced 2025-09-05 08:59:27 +08:00
More entropy for foreign titles
* Treat strings with non-ASCII characters as having more entropy
parent 5217602ec3
commit bb77d2c38b
2 changed files with 12 additions and 2 deletions
@@ -21,8 +21,10 @@ class TextSentinel
   end

   # Entropy is a number of how many unique characters the string needs.
+  # Non-ASCII characters are weighted heavier since they contain more "information"
   def entropy
-    @entropy ||= @text.to_s.strip.split('').uniq.size
+    chars = @text.to_s.strip.split('')
+    @entropy ||= chars.pack('M*'*chars.size).gsub("\n",'').split('=').uniq.size
   end

   def valid?
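To illustrate what the new expression computes, here is a minimal standalone sketch. It assumes the two `entropy` bodies from the diff can be lifted out of `TextSentinel` into free functions; the names `weighted_entropy` and `plain_entropy` are hypothetical and exist only for this illustration. The `'M'` pack directive quoted-printable-encodes each character, so an ASCII character stays a single token while a multi-byte character is split into one `=XX` token per UTF-8 byte.

```ruby
# Hypothetical helper mirroring the post-commit expression from the diff;
# `weighted_entropy` is not a Discourse method name.
def weighted_entropy(text)
  chars = text.to_s.strip.split('')
  # Each 'M' directive quoted-printable-encodes one array element and ends it
  # with a soft line break ("=\n"): "a" becomes "a=\n", while a multi-byte
  # character becomes one "=XX" escape per byte, followed by "=\n".
  encoded = chars.pack('M*' * chars.size).gsub("\n", '')
  # Splitting on "=" yields one token per ASCII character and one token per
  # byte of every non-ASCII character, plus an empty token in front of each
  # escape run. Unique tokens are then counted.
  encoded.split('=').uniq.size
end

# Hypothetical helper mirroring the pre-commit expression, for comparison.
def plain_entropy(text)
  text.to_s.strip.split('').uniq.size
end

ascii = "hello"
cjk   = "\u65E5\u672C" # two CJK characters, three UTF-8 bytes each

puts plain_entropy(ascii)    # 4 (h, e, l, o)
puts weighted_entropy(ascii) # 4, unchanged for pure ASCII
puts plain_entropy(cjk)      # 2
puts weighted_entropy(cjk)   # 6 ("", E6, 97, A5, 9C, AC)
```

Under this scheme a short non-ASCII title scores higher than its character count alone would suggest, which matches the commit's stated intent. Note that the score reflects unique byte values rather than unique characters, so it is a rough heuristic rather than a true entropy measure.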