How to Audit Your GitHub Repo Access and Secrets After the 4K Repo Breach

What the report actually claims and what is still uncertain

The reported 4,000-repo theft and why that number matters

The report says GitHub confirmed a breach and that roughly 4,000 internal repositories were stolen. The number gets repeated fast because it sounds exact, but the practical issue is bigger than the headline. A repository is rarely just source code. It often includes CI config, deployment logic, environment names, service endpoints, and enough context to help someone find live secrets elsewhere.

When I hear “4,000 repos,” I do not think of 4,000 separate apps. I think of shared trust: shared workflows, shared tokens, and a lot of assumptions that “internal” means “safe.”

That is enough reason to run a repo and secrets audit even if your org was not named. The direct compromise may have happened elsewhere, but the failure mode is common: repository access is often broader than teams realize, and secrets tend to show up in more places than the secret manager.

Distinguishing confirmed facts from inferred risk

Based on the public material available here, the confirmed facts are limited:

GitHub was reported as confirming a breach.
The report says about 4,000 internal repositories were stolen.
The public summary does not include a full compromise chain, affected org names, or confirmation that secrets were accessed in addition to source code.

Everything past that needs to stay in the realm of risk analysis, not fact. For example:

If internal repos were copied, they may include hardcoded credentials.
If CI workflows were copied, they may expose token scopes and deployment paths.
If audit logs or workflow metadata were exposed, they may help map high-value accounts.

Those are plausible outcomes, not confirmed details. That distinction matters because incident response gets sloppy when teams chase the rumor and skip the boring work of checking their own exposure.

How to treat a news report as an incident trigger, not final evidence

My rule is straightforward: a credible breach report is enough to start three things right away.

Freeze the easy damage paths.
Inventory what you actually own.
Check whether any secrets or access paths overlap with the reported failure mode.

You do not need final attribution to reduce exposure. You also do not need to rotate every credential in the company in the first minute. A blanket rotation usually causes outages before it improves safety.

The right move is to treat the report as a signal that your trust boundaries need a fresh look. If the story turns out to be narrower than it first sounded, the review still paid off. If it is broader, you have already started containment.

First 30 minutes: freeze the easy damage paths

Identify the repos, orgs, and environments in scope

Start with a list, not a password reset.

I usually write down:

affected organization names
business-critical repositories
repos with deploy workflows
repos with environment secrets
repos that publish packages or container images
repos owned by service accounts or automation users

If the report names a vendor, subsidiary, or team, include any repo that shares account boundaries with that group. Internal incidents often spread across orgs through shared SSO, shared GitHub Apps, or reused machine users.

A simple inventory table helps keep the work grounded:

Scope item	Why it matters	Owner
Production app repo	May contain deployment secrets or release automation	App team
Infrastructure repo	May hold cloud access and Terraform state links	Platform team
Reusable workflow repo	Can fan out permissions across many repos	DevEx team
Package publishing repo	May contain registry tokens or signing material	Build team

Pause risky automation before changing credentials

Before rotating anything, pause the automation that depends on those credentials. Otherwise you create a loop: expired token, failed deployment, emergency rollback, more people copying secrets into temporary places.

The riskiest pieces are usually:

scheduled deploy jobs
self-hosted runners
release pipelines
webhook-driven bots
sync jobs that write back to GitHub

If you can pause them safely, do it first. If you cannot, at least note which jobs will fail so you do not misread the errors later.

A practical rule: stop write paths before revoking credentials. Read-only access is usually less dangerous and easier to keep alive while you assess scope.
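
As a rough sketch of "pause first," the functions below disable every active workflow in one repository and record what was paused so it can be re-enabled later. The function names and the log file name are hypothetical, and the sketch assumes an authenticated gh CLI with admin rights on the repo. It does not cover self-hosted runners or external bots.

```shell
# Disable every active workflow in one repo and remember which ones were paused.
# pause_workflows, resume_workflows, and the log file name are illustrative.
pause_workflows() {
  repo="$1"                               # e.g. my-org/my-repo
  log="${2:-paused-workflows.txt}"
  gh api "repos/$repo/actions/workflows" --paginate \
    --jq '.workflows[] | select(.state == "active") | .id' |
  while read -r id; do
    # Record first, then disable, so the list survives a partial failure.
    printf '%s\t%s\n' "$repo" "$id" >> "$log"
    gh workflow disable "$id" --repo "$repo" </dev/null
  done
}

# Re-enable exactly the workflows recorded in the log.
resume_workflows() {
  log="${1:-paused-workflows.txt}"
  while IFS="$(printf '\t')" read -r repo id; do
    gh workflow enable "$id" --repo "$repo" </dev/null
  done < "$log"
}
```

Keeping the log matters more than the disable call itself: it is the record that tells you later which failures were caused by the pause and which were not.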

Decide what must be rotated immediately versus later

Not every secret needs the same urgency. I rank them by blast radius and reuse potential:

cloud provider keys and session brokers
GitHub PATs with repo, workflow, or admin scopes
package registry tokens
deployment and webhook signing secrets
test and sandbox credentials

Start with the credentials that can create or widen access outside GitHub. Cloud keys are especially important because a leaked repo often leads to infrastructure access, not just source exposure.

For later rotations, hand the work to the team that owns the dependency. If a secret is shared across multiple repos or services, you need sequencing, not a scramble.

Build a repo inventory before you touch secrets

Enumerate public, private, archived, forked, and internal repositories

Do not assume you know what is in the org until you list it. I have seen incidents where the dangerous repo was archived months earlier and nobody remembered it still had active deploy keys.

You want to separate:

public repositories
private repositories
internal repositories
archived repositories
forks
template repos

Why? Archived repos and forks often keep old secrets, old branch rules, or old deploy keys. Templates can leak workflow patterns into new repos. Internal repos may be less visible, but they can still be highly privileged.

If you are using GitHub’s CLI, a quick inventory pass might look like this:

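# visibility separates PUBLIC, PRIVATE, and INTERNAL; raise --limit if the org has more repos
gh repo list ORG --limit 500 --json nameWithOwner,visibility,isArchived,isFork,isTemplate,pushedAt \
  --jq '.[] | [.nameWithOwner, .visibility, .isArchived, .isFork, .isTemplate, .pushedAt] | @tsv'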

This does not answer everything, but it gives you a list you can sort by age, visibility, and likely risk.

Map owners, teams, and service accounts to each repo

A repo inventory without ownership is just a dump. You need to know who can change it, who deploys from it, and who is allowed to approve changes.

Map each repo to:

owning team
primary maintainer
service account or bot user
production environment owner
on-call contact

If the same service account owns too many repos, that is already a finding. Shared machine users are convenient until one token turns into a skeleton key.

This is also where I look for orphaned repos. If nobody can name the owner, the repo is probably carrying stale access that nobody is watching.

Find dormant repos that still hold live credentials

Dormant repos worry me the most. They do not get attention, but they often keep old CI jobs alive. A repo that has not changed in a year may still:

run scheduled workflows
store deploy keys
publish packages
have collaborators who left the company
reference secrets that were never cleaned up

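A useful pattern is to sort by last push and then inspect anything older than your normal rotation window. On GitHub, the last-push timestamp of the repo, the last-updated timestamps on its Actions secrets, and the creation dates on its deploy keys give you both sides of that comparison.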

If a repo has no recent commits but still has Actions enabled, treat it as live until proven otherwise.
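
A minimal sketch of the "sort by last push" step, assuming you already have a tab-separated inventory of repo name and last-push time. The function name stale_repos and the cutoff value are illustrative choices, not a standard tool.

```shell
# Print repos whose last push is older than a cutoff, oldest first.
# Input on stdin: TSV lines of nameWithOwner and pushedAt (ISO 8601, UTC).
# ISO 8601 UTC timestamps sort lexicographically, so a string compare is enough.
stale_repos() {
  cutoff="$1"                             # e.g. 2024-01-01T00:00:00Z
  awk -F '\t' -v cutoff="$cutoff" '$2 < cutoff { print $1 "\t" $2 }' |
    sort -t "$(printf '\t')" -k2,2
}

# The input could come from something like:
#   gh repo list ORG --limit 1000 --json nameWithOwner,pushedAt \
#     --jq '.[] | [.nameWithOwner, .pushedAt] | @tsv' | stale_repos 2024-01-01T00:00:00Z
# For each hit, one way to check whether Actions is still enabled:
#   gh api repos/OWNER/REPO/actions/permissions --jq .enabled
```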

Audit repository access like an attacker would

Review organization roles, team membership, and outside collaborators

If I were trying to abuse repo access, I would start with the broadest privilege path. That means org owners, repo admins, write teams, and any outside collaborator still hanging around.

Check for:

too many org owners
teams with write access when they only need read
outside collaborators on sensitive repos
service accounts in human-maintained teams
stale user accounts with SSO access but no current assignment

The question is not “who can see the repo?” It is “who can change the repo or its automation?” That is where damage begins.
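
One way to pull those lists is through the REST API with gh. This is a sketch that assumes an authenticated gh CLI with enough rights to read org membership. The function names are hypothetical and ORG and OWNER/REPO are placeholders.

```shell
# Org owners (members with the admin role).
list_org_owners() {
  gh api "orgs/$1/members?role=admin" --paginate --jq '.[].login'
}

# Outside collaborators across the whole org.
list_outside_collaborators() {
  gh api "orgs/$1/outside_collaborators" --paginate --jq '.[].login'
}

# Everyone who can change one repo (admin or push), including access
# inherited through teams. Argument is OWNER/REPO.
list_repo_writers() {
  gh api "repos/$1/collaborators?affiliation=all" --paginate \
    --jq '.[] | select(.permissions.admin or .permissions.push) | [.login, .role_name] | @tsv'
}
```

Diffing the output of list_repo_writers against the team you expect to own the repo is usually the fastest way to spot leftovers.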

Check branch protection, required reviews, and bypass permissions

A protected branch is only useful if the bypass list is tight. I have seen repos with strong-looking branch protection that still allowed a broad admin group to push directly.

Verify:

direct pushes are blocked on default branches
force-pushes are blocked
required reviews are actually enforced
CODEOWNERS paths are current
status checks map to real deployment gates
bypass permissions are limited and documented

If a bot or admin can skip the checks, the branch rule is just a speed bump.
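
To see what a classic branch protection rule actually enforces, you can read it back from the API instead of trusting the settings page. The sketch below assumes classic branch protection is in use. If the repo relies on rulesets instead, this endpoint may return a "not protected" error and you would need to look at the rules endpoints. The function name is hypothetical.

```shell
# Summarize the classic branch protection settings that matter for bypass review.
show_branch_protection() {
  repo="$1"                               # OWNER/REPO
  branch="${2:-main}"
  gh api "repos/$repo/branches/$branch/protection" --jq '{
    admins_enforced: .enforce_admins.enabled,
    force_pushes_allowed: .allow_force_pushes.enabled,
    required_approvals: .required_pull_request_reviews.required_approving_review_count,
    bypass_pr_users: [.required_pull_request_reviews.bypass_pull_request_allowances.users[]?.login],
    status_checks: .required_status_checks.contexts
  }'
}
```

If admins_enforced is false, the rule does not apply to admins, which is exactly the broad bypass described above.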

Verify deploy keys, environment reviewers, and workflow write access

Deploy keys, environment approvals, and workflow permissions live in different layers, so they are easy to miss.

Check these separately:

deploy keys on repositories and servers
environment protection rules and reviewer lists
permissions: blocks in workflow files
reusable workflow call permissions
pull_request_target usage
self-hosted runner permissions

If a workflow has write access to contents or packages, ask why. If an environment secret is available to a workflow that does not need deployment rights, trim it. If a deploy key is old, replace it with a credential that is easier to track.

A small table helps during review:

Access path	Common mistake	Safer default
Branch write	Too many admins can bypass rules	Limit bypass to a small, named group
Environment secrets	Secrets exposed to all jobs	Require reviewers and narrow scopes
Deploy keys	Reused across servers	One key per system, rotated regularly
Workflow tokens	Broad default permissions	Explicit least-privilege permissions
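
For the workflow-token row, a crude local scan goes a long way. The sketch below runs against a checked-out repo and flags workflow files with no permissions: block, any use of pull_request_target, and any write-all grant. It is a text match, not a YAML parser, so treat hits as leads to read by hand. The function name and labels are made up for this example.

```shell
# Flag workflow files that deserve a manual look.
audit_workflow_files() {
  dir="${1:-.github/workflows}"
  for f in "$dir"/*.yml "$dir"/*.yaml; do
    [ -f "$f" ] || continue
    # No explicit permissions: block means the token gets the repo/org default.
    grep -qE '^[[:space:]]*permissions:' "$f" || echo "NO_PERMISSIONS_BLOCK $f"
    # pull_request_target runs with base-repo secrets on fork-submitted PRs.
    if grep -q 'pull_request_target' "$f"; then echo "PULL_REQUEST_TARGET $f"; fi
    if grep -qE 'permissions:[[:space:]]*write-all' "$f"; then echo "WRITE_ALL $f"; fi
  done
  return 0
}
```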

Trace where secrets can leak in GitHub workflows

Inspect Actions secrets, variables, and reusable workflow inputs

GitHub Actions is a common leak path because secrets often get used in several jobs with different trust levels. Look at:

repository secrets
environment secrets
organization secrets
Actions variables
reusable workflow inputs
secrets passed through workflow_call

The real issue is not whether a secret exists. It is whether a lower-trust job can reach it indirectly. A reusable workflow can look fine until the caller passes a sensitive value as an input and then echoes it into a shell script.

A simple review question helps: “Could a pull request author influence this job and see a side effect from a secret?”

If the answer is yes, keep digging.
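
A quick way to start that digging is to list which secret names each workflow file references. This is a plain text match over a local checkout, so it will not follow secrets passed through reusable workflows in other repos. The function name is illustrative.

```shell
# Print one "file:secrets.NAME" line per distinct secret reference.
list_workflow_secrets() {
  dir="${1:-.github/workflows}"
  grep -oHE 'secrets\.[A-Za-z_][A-Za-z0-9_]*' "$dir"/*.yml "$dir"/*.yaml 2>/dev/null | sort -u
  return 0
}
```

Any file that references a deploy or publish secret and also runs on pull request events is the first place I would read line by line.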

Review token scopes for PATs, GitHub App tokens, and machine users

Token scope is where many incidents quietly grow. A personal access token may have started with repo access and later picked up workflow, admin:org, or package privileges. A GitHub App token may be correctly scoped for one repo but reused elsewhere. A machine user may have access to multiple orgs because nobody wanted to rebuild the integration.

For each token, record:

who owns it
where it is stored
what scopes it has
which repos or orgs use it
whether it can write, approve, or publish

If you cannot explain why a token needs workflow or admin:org, assume it is too broad.
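
For classic PATs, the API reports the granted scopes in an X-OAuth-Scopes response header, so you can check a token without opening the settings UI. The sketch below parses response headers from stdin and prints only the scopes on a watchlist. The function name and the default watchlist are my own choices, and as far as I know fine-grained PATs and GitHub App tokens do not return this header, so an empty result is not proof of a narrow token.

```shell
# Read HTTP response headers on stdin, print scopes that are on the watchlist.
broad_scopes() {
  watch="${1:-repo workflow admin:org write:packages delete_repo}"
  tr -d '\r' | awk -v watch="$watch" '
    BEGIN { n = split(watch, w, " "); for (i = 1; i <= n; i++) risky[w[i]] = 1 }
    tolower($0) ~ /^x-oauth-scopes:/ {
      sub(/^[^:]*:[ \t]*/, "")            # drop the header name
      m = split($0, s, /,[ \t]*/)         # scopes are comma-separated
      for (i = 1; i <= m; i++) if (s[i] in risky) print s[i]
    }'
}

# Possible usage (TOKEN is whatever credential you are reviewing):
#   curl -sI -H "Authorization: Bearer $TOKEN" https://api.github.com/user | broad_scopes
```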

Look for secrets exposed through logs, artifacts, caches, and pull requests

Secrets do not only leak through committed files. They also leak through the surfaces around CI.

Check:

workflow logs
build artifacts
dependency caches
PR comments
job summaries
test failure output
debug mode output
release notes generated by automation

A lot of teams forget that a secret redacted in logs can still end up in an artifact, a cache key, or a downstream script output. Artifacts are often worse because they live longer and are easier to download.

If your workflows run on pull requests from forks, verify that secrets are not available to those jobs. If they are, treat that as a high-priority bug.
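
To check logs and artifacts rather than just source, download them and scan the directory. The sketch below prints only file names so matched values do not end up in your terminal history. The pattern list is a small illustrative set, not a complete detector, and the function and variable names are hypothetical.

```shell
# A few well-known credential shapes; extend for your own providers.
SECRET_RE='AKIA[0-9A-Z]{16}|gh[pousr]_[A-Za-z0-9]{36,}|github_pat_[A-Za-z0-9_]{20,}|-----BEGIN [A-Z ]*PRIVATE KEY-----'

# List files under a directory that contain something secret-shaped.
scan_ci_output() {
  dir="$1"
  # -r recurse, -I skip binary files, -l print file names only.
  grep -rIlE -e "$SECRET_RE" "$dir"
}

# Possible ways to fill the directory first (RUN_ID and OWNER/REPO are placeholders):
#   gh run download RUN_ID --repo OWNER/REPO --dir ci-out/artifacts
#   gh run view RUN_ID --repo OWNER/REPO --log > ci-out/run.log
#   scan_ci_output ci-out
```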

Search the codebase for hardcoded and reachable credentials

Grep for obvious secret patterns and config files

Start with the dumb search. It catches more than people expect.

Look for:

.env, .env.*
cloud credential filenames
private key markers
API key prefixes
token-like strings in config files
secrets.* references in code and workflow files

A quick grep pass can be safe and effective:

git grep -nIE -e '(AKIA|ASIA|ghp_|github_pat_|-----BEGIN ([A-Z]+ )?PRIVATE KEY-----|xox[baprs]-)' -- .

That will not find everything, but it is a good first sweep. Pair it with a secrets scanner if you have one. The goal is to find reachable credentials, not to convince yourself the repo is clean by eye.

Check history, tags, and release artifacts, not just the latest commit

One of the most common mistakes is scanning only HEAD. Secrets buried three commits back are still secrets. So are secrets in tags, release bundles, and generated artifacts.

Check:

full git history
annotated and lightweight tags
release attachments
vendored archives
compressed exports
old deployment manifests

If you have to choose, search history before you search the current tree. Attackers do not care whether a secret is still in the latest commit if they can clone the repository and walk back through its history.

Useful commands include:

git log -p --all -- . ':!package-lock.json'
git log -S 'SECRET_VALUE' --all
git rev-list --objects --all | grep -Ei '\.(zip|tar|gz|tgz|pem|key|p12|pfx)$'

Identify credentials that were copied into test fixtures or examples

This is the kind of issue teams dislike because it looks harmless. It is not harmless. I have seen real credentials copied into:

mock API responses
test fixtures
example .env files
README snippets
sample payloads
postmortem docs committed to the repo

The risk is not only accidental exposure. It is also normalization. Once a secret appears in a fixture, future reviewers stop noticing it.

If you find a secret in a test file, ask whether the fixture needs to exist at all. If the answer is yes, replace it with a synthetic value and add a check to keep the real pattern from coming back.
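
One lightweight form of that check is a pre-commit hook that inspects only the lines being added. This is a sketch with a deliberately small pattern list. The function name is hypothetical, and a dedicated secrets scanner will catch far more than this does.

```shell
# Return non-zero if the staged diff adds something that looks like a credential.
check_staged_for_secrets() {
  pattern='AKIA[0-9A-Z]{16}|gh[pousr]_[A-Za-z0-9]{36,}|-----BEGIN [A-Z ]*PRIVATE KEY-----'
  # Keep added lines only, and drop the "+++ b/file" headers.
  if git diff --cached -U0 --no-color | grep -E '^\+' | grep -vE '^\+\+\+ ' |
     grep -qE -e "$pattern"; then
    echo "Possible credential in staged changes; commit blocked." >&2
    return 1
  fi
  return 0
}

# In .git/hooks/pre-commit (executable), after defining the function:
#   check_staged_for_secrets || exit 1
```

Local hooks can be skipped, so this is a reminder for authors rather than an enforcement point. Server-side scanning still has to back it up.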

Decide what to rotate first and how to do it safely

Rank secrets by blast radius: cloud keys, CI tokens, package registry tokens, webhook signatures

Not all leaks are equal. I usually rotate in this order:

cloud provider credentials
CI/CD tokens with write access
package registry publishing tokens
webhook signing secrets
third-party API keys tied to production

The reason is simple: cloud keys and CI tokens can often mint more access, not just use existing access. A package token can publish a malicious release. A webhook secret can let an attacker impersonate trusted events.

If a secret is only used in a non-production sandbox, it still matters, but it should not block the highest-risk rotations.

Rotate in a dependency-safe order so one broken token does not stall the response

Rotation often fails because teams do it in the wrong sequence. For example, if a deploy job depends on both a GitHub token and a cloud token, revoking the cloud token first may break the job before the replacement exists.

A safer sequence is:

generate replacement secret
update the downstream consumer
verify the new secret is active
revoke the old secret
monitor for unexpected failures

If there are multiple downstream systems, rotate the most tightly coupled one first so you can test the pattern before scaling it.
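
The sequence can be sketched as a small wrapper that runs five commands in order and refuses to revoke the old secret unless the new one has been verified. Every hook here is something you supply; rotate_in_order and the hook names in the usage comment are hypothetical.

```shell
# Run generate -> update -> verify -> revoke -> monitor, stopping at the first failure.
rotate_in_order() {
  generate="$1"; update="$2"; verify="$3"; revoke="$4"; monitor="$5"
  $generate || { echo "generate failed; nothing changed" >&2; return 1; }
  $update   || { echo "update failed; old secret still active" >&2; return 1; }
  if ! $verify; then
    # The old secret is deliberately left alone so consumers keep working.
    echo "new secret not verified; old secret left in place" >&2
    return 1
  fi
  $revoke   || { echo "revoke failed; old secret may still work" >&2; return 1; }
  $monitor
}

# Example shape, with hooks you would write yourself:
#   update_consumer() { gh secret set CLOUD_KEY --repo OWNER/REPO < new-key.txt; }
#   rotate_in_order make_new_key update_consumer smoke_test revoke_old_key watch_errors
```

The design choice is that a failed verify leaves you with two valid secrets, which is a safer failure state than zero.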

Revoke old credentials, then verify the new ones are actually in use

Rotation is not done when the secret changes in the vault. It is done when the old secret no longer works and the new path is confirmed in production.

Verify by checking:

failed auth attempts for the old token
successful calls from the new token
CI job logs for the updated credential path
package publishing or deployment status
cloud audit logs for the new principal

If the old secret still works, something is wrong. Either it exists in another place, or you missed a consumer.

Validate branch and token permissions after the cleanup

Re-test least privilege on write, deploy, and admin paths

After cleanup, do not assume the permission model is fixed. Test it.

Ask these questions:

Can a normal developer still write where they should not?
Can a workflow publish artifacts without human review?
Can a service account modify settings it does not own?
Can an external collaborator reach any sensitive action?

This is where a small access matrix helps. Keep it boring and explicit.

Actor	Allowed	Not allowed
Developer	Open PRs, read docs	Push to protected branch
Release bot	Publish signed release	Change branch rules
Ops bot	Deploy approved build	Read unrelated repo secrets
Security reviewer	Audit logs, policy review	Use production deploy tokens

Confirm protected branches still block direct pushes and force-pushes

This seems basic, but it is one of the easiest controls to misconfigure during emergency cleanup. If you changed branch rules or recreated a repo environment, re-check the guardrails.

Test:

direct push rejection
force-push rejection
required review enforcement
status check enforcement
admin bypass behavior
signed commit requirements if enabled

A branch rule that passes policy review but fails in real GitHub behavior is not a control. It is documentation.

Verify workflows only receive the secrets they need

GitHub Actions tends to drift toward convenience. A job gets one secret, then another, then a whole environment because it was easier than splitting the workflow.

Review each job and confirm:

the secret is required
the scope is minimum
the secret is not inherited by unrelated steps
reusable workflows do not widen exposure
jobs triggered by forked PRs cannot reach write-capable credentials

If a workflow is doing build, test, and deploy in one job, split it. Security gets better when trust boundaries are visible in the YAML.
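
One specific thing worth searching for is secrets: inherit on calls to reusable workflows, because it hands the called workflow every secret the caller can see. A minimal sketch over a local checkout, with a hypothetical function name:

```shell
# Print file:line:content for each reusable-workflow call that inherits all secrets.
find_inherited_secrets() {
  dir="${1:-.github/workflows}"
  grep -nHE '^[[:space:]]*secrets:[[:space:]]*inherit[[:space:]]*$' \
    "$dir"/*.yml "$dir"/*.yaml 2>/dev/null
  return 0
}
```

Each hit is a candidate for replacing inherit with an explicit list of the secrets that the called workflow actually needs.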

Hunt for signs of exposure and misuse

Review audit logs, workflow runs, and repository access history

Once you know what could have been exposed, look for what actually happened.

Review:

GitHub audit logs
repository access logs
workflow run history
token creation and revocation events
branch protection changes
collaborator additions and removals

You are looking for access changes that cluster around the suspected breach window. New token creation, unusual runner activity, or a sudden collaborator change is worth a closer look.
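
If your plan includes the organization audit log API (to my knowledge this is limited to GitHub Enterprise Cloud organizations), a query along these lines can pull events for a window. The function name is hypothetical, and the search phrase in the usage comment is only an example of the audit log search syntax, so adjust it to the events you care about.

```shell
# Print timestamp, action, actor, and repo for audit events matching a search phrase.
audit_events() {
  org="$1"
  phrase="$2"
  gh api --method GET "orgs/$org/audit-log" --paginate \
    -f phrase="$phrase" -f include=all -f per_page=100 \
    --jq '.[] | [(."@timestamp" / 1000 | floor | todate), .action, .actor, (.repo // "-")] | @tsv'
}

# Example:
#   audit_events ORG 'action:repo.add_member created:>=2024-05-01'
```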

Check for unusual clone activity, new deploy keys, or unexpected collaborator changes

If an attacker had access to repos, they may have used normal-looking Git operations rather than noisy exploits. Search for:

clone bursts from unfamiliar IP ranges
new SSH deploy keys
repo transfers
sudden team permission changes
release tag creation outside normal release windows

If you do not have clone telemetry, that is a gap to document. Many GitHub teams can see what changed but not who pulled the code. That limits certainty, so you compensate by checking everything else.

Correlate secret rotation times with failed auth and suspicious API calls

This is the part that separates cleanup noise from real abuse. After rotation, some failed auth is expected. The question is whether the failures line up with unusual access attempts.

Correlate:

old token failures
retries from automation
API calls from unknown principals
login failures on connected services
package publish attempts

If you see a credential fail and then a different token or account succeed in the same window, that is worth escalation.

Harden the repo so the same mistake is harder to repeat

Replace long-lived tokens with GitHub App or OIDC-based access where possible

Long-lived tokens are convenient and painful to govern. Where possible, move toward short-lived, federated access:

GitHub Apps for repo-bound automation
OIDC for cloud access
ephemeral credentials for deploy jobs
short-lived session brokers for privileged operations

This does not eliminate the risk of compromise, but it shrinks the useful lifetime of a leaked token. That is a real defense, not just policy language.

Split sensitive automation into separate repos and environments

If one repo controls build, deploy, and secrets, you have created a single point of failure. Separate them.

A better pattern is:

application repo for code
workflow repo for shared automation
environment-specific deployment controls
dedicated repos for privileged scripts

That separation makes review clearer and keeps one compromise from automatically turning into full pipeline access.

Add secret scanning, commit signing, and branch rules that match reality

The controls need to match how your team actually works.

Use:

secret scanning on push and history
commit signing for trusted release paths
branch rules that enforce review
CODEOWNERS for sensitive paths
environment approvals for production
repository rules that block obvious policy escapes

A common mistake is adding a rule nobody can comply with, then disabling it later. Better to enforce a narrow rule that is real than a broad rule that looks good on paper.

A practical verification checklist for developers and security teams

What to test in a single repo

For one sensitive repo, verify:

owner and maintainer list is current
branch protection blocks direct changes
external collaborators are justified
workflow permissions are minimal
secrets are not in code, history, or artifacts
deploy keys are unique and rotated
environment reviewers are enabled where needed

If you only have an hour, start here. It gives you the highest-confidence signal fastest.

What to test across an organization

At org level, check:

all repos are inventoried
no orphaned repos retain live secrets
team membership matches job function
service accounts are limited and named
secret rotation has an owner
audit logs are retained and searchable
GitHub Apps and PATs are reviewed periodically

This is where you find the structural problems that made the breach report relevant in the first place.

What evidence to preserve for incident review

Do not destroy the trail while cleaning up. Preserve:

audit log exports
workflow run records
token creation and revocation timestamps
branch protection diffs
collaborator changes
secret scanning alerts
screenshots or exports of the original access state

You need the before-and-after view. Otherwise the post-incident review turns into a memory exercise, and those are usually wrong.
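
A simple way to make the preserved exports trustworthy later is to hash them as soon as they are collected. The sketch below writes a checksum manifest for everything in an evidence directory. The function name and the manifest file name are illustrative.

```shell
# Write SHA-256 checksums for every file in an evidence directory.
seal_evidence() {
  dir="$1"
  (
    cd "$dir" || exit 1
    # Prefer sha256sum; fall back to shasum where that is what the system ships.
    if command -v sha256sum >/dev/null 2>&1; then hash="sha256sum"; else hash="shasum -a 256"; fi
    # Hash every file except the manifest itself, in a stable order.
    find . -type f ! -name MANIFEST.sha256 | sort |
      while read -r f; do $hash "$f"; done > MANIFEST.sha256
  )
}

# Later, from inside the directory, the files can be re-checked with:
#   sha256sum -c MANIFEST.sha256
```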

Closing the loop: what a good post-incident review should produce

The concrete fixes to carry into policy and automation

A useful review should not end with “be more careful.” It should produce changes you can enforce:

tighter repo ownership
reduced org-owner counts
shorter-lived credentials
stricter workflow permissions
better branch rules
scheduled secret rotation
mandatory inventory of internal repos
a clear process for paused automation during incidents

If a fix cannot be automated, at least make it visible in one checklist that every team uses.

The metrics that show the repo is safer than before

I like metrics that reflect real reduction in blast radius:

number of repos with no owner
number of long-lived PATs
number of workflow jobs with write permissions
number of deploy keys older than the rotation window
number of repos with secret scanning enabled
number of privileged bypassers on protected branches
time to revoke and replace a production token

Those numbers tell you whether the cleanup changed the security posture or just created activity.

The report about 4,000 stolen repos is a reminder that repository access is itself an attack surface. If you audit nothing else, audit the places where code, credentials, and automation meet. That is usually where the breach becomes real.