mirror of
https://gh.wpcy.net/https://github.com/discourse/discourse.git
synced 2026-06-18 18:00:27 +08:00
GitHub oneboxes and the discourse-github plugin talked to GitHub's REST
and
GraphQL API with no rate-limit awareness. On busy instances this
exhausted
GitHub's limits (60 requests/hour unauthenticated, 5000 authenticated),
and
because there was no backoff every render kept hitting GitHub and
re-failing
-- which GitHub's docs warn can get an integration banned. The recently
added PR-status onebox multiplied the number of calls and made it far
worse.
GitHub access was also fragmented: the core onebox engines used OpenURI,
the
discourse-github plugin used Octokit, and the discourse-ai bot tools
used
FinalDestination::HTTP -- three HTTP stacks, three tokens, and
inconsistent
(or entirely missing) error and rate-limit handling.
This introduces a single client, Discourse::GithubApi, that every GitHub
data-API request now flows through. It is built on Faraday with the
SSRF-safe
FinalDestination adapter and:
- authenticates per token (Bearer) and returns plain string-keyed Hashes
(get/post) or raw bodies (raw_get) -- one response shape, no
Octokit/Sawyer
- only ever sends the access token to api.github.com and
raw.githubusercontent.com, rejecting any other absolute URL, so a
user-derived path can never leak a token to an arbitrary host
- backs off on rate limits both reactively (403/429) and proactively
(when
X-RateLimit-Remaining hits 0), honouring Retry-After /
X-RateLimit-Reset,
via a shared Redis flag (GithubRateLimit) keyed per token so each
token's
budget and the shared unauthenticated/IP budget back off independently
- short-circuits while backing off without ever sleeping, so onebox
rendering
and post baking degrade to a plain link instead of blocking a request
- caches ETags and sends If-None-Match, so unchanged resources return
304s
that do not count against the rate limit
Every caller was moved onto it:
- the 6 core GitHub onebox engines, via a slimmed
Onebox::Mixins::GithubApi
adapter that keeps their public methods and translates client errors
back
to the OpenURI::HTTPError vocabulary they already rescue (engines
unchanged)
- the github_blob raw.githubusercontent.com fetch
- the discourse-github plugin (badges, linkback, permalinks, token
validator),
which no longer uses the octokit and sawyer gems (they stay in the
Gemfile for
the discourse-code-review official plugin, which still depends on them)
- the discourse-ai bot's GitHub tools (search code, diff, file content,
search files)
Also adds a GithubOneboxBackoff admin problem check that surfaces while
one of
the onebox token identities is backing off -- scoped to the tokens
resolved by
Onebox::GithubAccess (each configured github_onebox_access_tokens entry
plus the
unauthenticated client) so a backoff on the AI bot or linkback token is
not
misattributed to onebox. Its message points admins at the relevant
setting with
the {{setting:...}} link marker, which problem-check messages now expand
too.
Onebox token resolution is centralised in Onebox::GithubAccess, and the
onebox
cache TTL for transient GitHub failures is shortened so they recover
quickly.
GitHub OAuth login, theme git-clone, the inbound webhook, and the
Oneboxer
FinalDestination URL-resolution special-cases for github.com are
intentionally
out of scope -- they are different concerns, not the rate-limited data
API.
54 lines
1.3 KiB
Ruby
Vendored
54 lines
1.3 KiB
Ruby
Vendored
# frozen_string_literal: true
|
|
|
|
require "digest"
|
|
|
|
class GithubRateLimit
|
|
KEY_PREFIX = "onebox_github_backoff_"
|
|
MAX_SECONDS = 1.hour.to_i
|
|
|
|
class << self
|
|
def backing_off?(token = nil)
|
|
redis.get(key(token)).present?
|
|
end
|
|
|
|
def active?
|
|
redis.scan_each(match: "#{KEY_PREFIX}*").first.present?
|
|
end
|
|
|
|
def note_rate_limit(token: nil, retry_after: nil, remaining: nil, reset_at: nil)
|
|
seconds =
|
|
if retry_after.present?
|
|
retry_after.to_i
|
|
elsif remaining.to_s == "0" && reset_at.present?
|
|
reset_at.to_i - Time.now.to_i
|
|
end
|
|
return if seconds.nil?
|
|
|
|
seconds = seconds.clamp(1, MAX_SECONDS)
|
|
redis.setex(key(token), seconds, "1")
|
|
Rails.logger.warn("GitHub API rate limited; backing off for #{seconds}s")
|
|
seconds
|
|
end
|
|
|
|
def note_response_headers(headers, token: nil)
|
|
headers = headers.to_h.transform_keys { |k| k.to_s.downcase }
|
|
note_rate_limit(
|
|
token:,
|
|
retry_after: headers["retry-after"],
|
|
remaining: headers["x-ratelimit-remaining"],
|
|
reset_at: headers["x-ratelimit-reset"],
|
|
)
|
|
end
|
|
|
|
def key(token)
|
|
identity = token.present? ? Digest::SHA1.hexdigest(token) : "unauthenticated"
|
|
"#{KEY_PREFIX}#{identity}"
|
|
end
|
|
|
|
private
|
|
|
|
def redis
|
|
Discourse.redis.without_namespace
|
|
end
|
|
end
|
|
end
|