
How to Audit Your GitHub Repo Access and Secrets After the 4K Repo Breach
What the report actually claims and what is still uncertain
The reported 4,000-repo theft and why that number matters
The report says GitHub confirmed a breach and that roughly 4,000 internal repositories were stolen. The number gets repeated fast because it sounds exact, but the practical issue is bigger than the headline. A repository is rarely just source code. It often includes CI config, deployment logic, environment names, service endpoints, and enough context to help someone find live secrets elsewhere.
When I hear “4,000 repos,” I do not think of 4,000 separate apps. I think of shared trust: shared workflows, shared tokens, and a lot of assumptions that “internal” means “safe.”
That is enough reason to run a repo and secrets audit even if your org was not named. The direct compromise may have happened elsewhere, but the failure mode is common: repository access is often broader than teams realize, and secrets tend to show up in more places than the secret manager.
Distinguishing confirmed facts from inferred risk
Based on the public material available here, the confirmed facts are limited:
- GitHub was reported as confirming a breach.
- The report says about 4,000 internal repositories were stolen.
- The public summary does not include a full compromise chain, affected org names, or confirmation that secrets were accessed in addition to source code.
Everything past that needs to stay in the realm of risk analysis, not fact. For example:
- If internal repos were copied, they may include hardcoded credentials.
- If CI workflows were copied, they may expose token scopes and deployment paths.
- If audit logs or workflow metadata were exposed, they may help map high-value accounts.
Those are plausible outcomes, not confirmed details. That distinction matters because incident response gets sloppy when teams chase the rumor and skip the boring work of checking their own exposure.
How to treat a news report as an incident trigger, not final evidence
My rule is straightforward: a credible breach report is enough to start three things right away.
- Freeze the easy damage paths.
- Inventory what you actually own.
- Check whether any secrets or access paths overlap with the reported failure mode.
You do not need final attribution to reduce exposure. You also do not need to rotate every credential in the company in the first minute. A blanket rotation usually causes outages before it improves safety.
The right move is to treat the report as a signal that your trust boundaries need a fresh look. If the story turns out to be narrower than it first sounded, the review still paid off. If it is broader, you have already started containment.
First 30 minutes: freeze the easy damage paths
Identify the repos, orgs, and environments in scope
Start with a list, not a password reset.
I usually write down:
- affected organization names
- business-critical repositories
- repos with deploy workflows
- repos with environment secrets
- repos that publish packages or container images
- repos owned by service accounts or automation users
If the report names a vendor, subsidiary, or team, include any repo that shares account boundaries with that group. Internal incidents often spread across orgs through shared SSO, shared GitHub Apps, or reused machine users.
A simple inventory table helps keep the work grounded:
| Scope item | Why it matters | Owner |
|---|---|---|
| Production app repo | May contain deployment secrets or release automation | App team |
| Infrastructure repo | May hold cloud access and Terraform state links | Platform team |
| Reusable workflow repo | Can fan out permissions across many repos | DevEx team |
| Package publishing repo | May contain registry tokens or signing material | Build team |
Pause risky automation before changing credentials
Before rotating anything, pause the automation that depends on those credentials. Otherwise you create a loop: expired token, failed deployment, emergency rollback, more people copying secrets into temporary places.
The riskiest pieces are usually:
- scheduled deploy jobs
- self-hosted runners
- release pipelines
- webhook-driven bots
- sync jobs that write back to GitHub
If you can pause them safely, do it first. If you cannot, at least note which jobs will fail so you do not misread the errors later.
A practical rule: stop write paths before revoking credentials. Read-only access is usually less dangerous and easier to keep alive while you assess scope.
Decide what must be rotated immediately versus later
Not every secret needs the same urgency. I rank them by blast radius and reuse potential:
- cloud provider keys and session brokers
- GitHub PATs with repo, workflow, or admin scopes
- package registry tokens
- deployment and webhook signing secrets
- test and sandbox credentials
Rotate first the credentials that can create or widen access outside GitHub. Cloud keys are especially important because a leaked repo often leads to infrastructure, not just source exposure.
For later rotation, hand the work to the team that owns the dependency. If a secret is shared across multiple repos or services, you need sequencing, not a scramble.
Build a repo inventory before you touch secrets
Enumerate public, private, archived, forked, and internal repositories
Do not assume you know what is in the org until you list it. I have seen incidents where the dangerous repo was archived months earlier and nobody remembered it still had active deploy keys.
You want to separate:
- public repositories
- private repositories
- internal repositories
- archived repositories
- forks
- template repos
Why? Archived repos and forks often keep old secrets, old branch rules, or old deploy keys. Templates can leak workflow patterns into new repos. Internal repos may be less visible, but they can still be highly privileged.
If you are using GitHub’s CLI, a quick inventory pass might look like this:
gh repo list ORG --limit 500 --json nameWithOwner,visibility,isArchived,isFork,isTemplate,pushedAt \
  --jq '.[] | [.nameWithOwner, .visibility, .isArchived, .isFork, .isTemplate, .pushedAt] | @tsv'
This does not answer everything, but it gives you a list you can sort by age, visibility, and likely risk.
Map owners, teams, and service accounts to each repo
A repo inventory without ownership is just a dump. You need to know who can change it, who deploys from it, and who is allowed to approve changes.
Map each repo to:
- owning team
- primary maintainer
- service account or bot user
- production environment owner
- on-call contact
If the same service account owns too many repos, that is already a finding. Shared machine users are convenient until one token turns into a skeleton key.
This is also where I look for orphaned repos. If nobody can name the owner, the repo is probably carrying stale access that nobody is watching.
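The "one service account owns too many repos" check can be sketched as a count over the inventory. This is a minimal sketch that assumes you have already exported a two-column TSV of repository and owning account. That file layout and the function name `repos_per_owner` are illustrative, not something GitHub produces directly.

```shell
# Count repositories per owner from TSV rows of "<repo>\t<owner>" on stdin.
# Rows with an empty owner column are grouped as "(no owner)" so orphaned
# repositories show up in the same report.
repos_per_owner() {
  awk -F '\t' '
    { owner = ($2 == "" ? "(no owner)" : $2); count[owner]++ }
    END { for (owner in count) printf "%d\t%s\n", count[owner], owner }
  ' | sort -rn
}

# Example (inventory.tsv is a hypothetical export):
#   repos_per_owner < inventory.tsv | head
```

The accounts at the top of the output are the candidates for splitting, and any "(no owner)" row is a list of repos to assign before the audit continues.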
Find dormant repos that still hold live credentials
Dormant repos worry me the most. They do not get attention, but they often keep old CI jobs alive. A repo that has not changed in a year may still:
- run scheduled workflows
- store deploy keys
- publish packages
- have collaborators who left the company
- reference secrets that were never cleaned up
A useful pattern is to sort by last push and then inspect anything older than your normal rotation window. On GitHub, that means pulling each repo’s last-push timestamp and comparing it to the age of the secrets and deploy keys the repo still holds.
If a repo has no recent commits but still has Actions enabled, treat it as live until proven otherwise.
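One way to shortlist dormant repos is to filter the inventory by last push. The sketch below assumes TSV input of repository name and `pushedAt` timestamp; the function name `stale_repos` and the cutoff date are placeholders.

```shell
# Print repos whose last push is older than a cutoff date.
# Input (stdin): TSV rows of "<owner/repo>\t<pushedAt as ISO-8601 UTC>".
# ISO-8601 UTC timestamps sort lexicographically, so a string comparison works.
# Rows with an empty timestamp are reported as stale.
stale_repos() {
  local cutoff="$1"   # e.g. 2024-01-01
  awk -F '\t' -v cutoff="$cutoff" '($2 "") < (cutoff "") { print $1 }'
}

# Example wiring (ORG is a placeholder, as in the inventory command above):
#   gh repo list ORG --limit 500 --json nameWithOwner,pushedAt \
#     --jq '.[] | [.nameWithOwner, .pushedAt] | @tsv' | stale_repos 2024-01-01
```

For each repo this prints, `gh api repos/OWNER/REPO/actions/permissions --jq .enabled` should report whether Actions is still switched on, provided your token is allowed to read repository settings.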
Audit repository access like an attacker would
Review organization roles, team membership, and outside collaborators
If I were trying to abuse repo access, I would start with the broadest privilege path. That means org owners, repo admins, write teams, and any outside collaborator still hanging around.
Check for:
- too many org owners
- teams with write access when they only need read
- outside collaborators on sensitive repos
- service accounts in human-maintained teams
- stale user accounts with SSO access but no current assignment
The question is not “who can see the repo?” It is “who can change the repo or its automation?” That is where damage begins.
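A simple way to review the broadest privilege paths is to compare the live list against a list you have approved. The sketch below assumes one login per line in both places; the function name and the allowlist file names are hypothetical.

```shell
# Print logins from stdin that are NOT on an approved list.
# $1: path to an allowlist file with one login per line.
# Matching is whole-line and case-insensitive, since GitHub logins are
# case-insensitive.
unexpected_logins() {
  local allowlist="$1"
  grep -vxFi -f "$allowlist" || true   # no output is a clean result, not an error
}

# Example wiring (ORG and the .txt files are placeholders):
#   gh api "orgs/ORG/members?role=admin" --paginate --jq '.[].login' \
#     | unexpected_logins approved-owners.txt
#   gh api orgs/ORG/outside_collaborators --paginate --jq '.[].login' \
#     | unexpected_logins approved-outside.txt
```

Anything this prints is either a missing allowlist entry or access nobody signed off on. Both are worth writing down.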
Check branch protection, required reviews, and bypass permissions
A protected branch is only useful if the bypass list is tight. I have seen repos with strong-looking branch protection that still allowed a broad admin group to push directly.
Verify:
- direct pushes are blocked on default branches
- force-pushes are blocked
- required reviews are actually enforced
- CODEOWNERS paths are current
- status checks map to real deployment gates
- bypass permissions are limited and documented
If a bot or admin can skip the checks, the branch rule is just a speed bump.
Verify deploy keys, environment reviewers, and workflow write access
Deploy keys, environment approvals, and workflow permissions live in different layers, so they are easy to miss.
Check these separately:
- deploy keys on repositories and servers
- environment protection rules and reviewer lists
- workflow `permissions:` blocks
- reusable workflow call permissions
- `pull_request_target` usage
- self-hosted runner permissions
If a workflow has write access to contents or packages, ask why. If an environment secret is available to a workflow that does not need deployment rights, trim it. If a deploy key is old, replace it with a credential that is easier to track.
A small table helps during review:
| Access path | Common mistake | Safer default |
|---|---|---|
| Branch write | Too many admins can bypass rules | Limit bypass to a small, named group |
| Environment secrets | Secrets exposed to all jobs | Require reviewers and narrow scopes |
| Deploy keys | Reused across servers | One key per system, rotated regularly |
| Workflow tokens | Broad default permissions | Explicit least-privilege permissions |
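To find workflows that rely on broad default token permissions, a text scan is a reasonable first pass. This sketch only checks whether a file declares a `permissions:` key anywhere; it does not parse YAML or judge whether the declared permissions are tight. The function name is illustrative.

```shell
# List workflow files that never declare a `permissions:` key, so their
# GITHUB_TOKEN falls back to the repository or organization default.
# $1: workflow directory (defaults to .github/workflows).
workflows_without_permissions() {
  local dir="${1:-.github/workflows}" f
  for f in "$dir"/*.yml "$dir"/*.yaml; do
    [ -f "$f" ] || continue   # skip unmatched glob patterns
    grep -qE '^[[:space:]]*permissions:' "$f" || printf '%s\n' "$f"
  done
}

# Example, run from the root of a checked-out repository:
#   workflows_without_permissions
```

Files on this list are where to add an explicit least-privilege block first.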
Trace where secrets can leak in GitHub workflows
Inspect Actions secrets, variables, and reusable workflow inputs
GitHub Actions is a common leak path because secrets often get used in several jobs with different trust levels. Look at:
- repository secrets
- environment secrets
- organization secrets
- Actions variables
- reusable workflow inputs
- secrets passed through `workflow_call`
The real issue is not whether a secret exists. It is whether a lower-trust job can reach it indirectly. A reusable workflow can look fine until the caller passes a sensitive value as an input and then echoes it into a shell script.
A simple review question helps: “Could a pull request author influence this job and see a side effect from a secret?”
If the answer is yes, keep digging.
Review token scopes for PATs, GitHub App tokens, and machine users
Token scope is where many incidents quietly grow. A personal access token may have started with repo access and later picked up workflow, admin:org, or package privileges. A GitHub App token may be correctly scoped for one repo but reused elsewhere. A machine user may have access to multiple orgs because nobody wanted to rebuild the integration.
For each token, record:
- who owns it
- where it is stored
- what scopes it has
- which repos or orgs use it
- whether it can write, approve, or publish
If you cannot explain why a token needs the `workflow` or `admin:org` scope, assume it is too broad.
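For classic PATs, the granted scopes come back in the `X-OAuth-Scopes` response header, which `gh api -i user` should display for the token it is using. The sketch below flags scopes from that comma-separated list. The watch list is an assumption for illustration, not an official severity ranking, so adjust it to your own policy.

```shell
# Print the scopes from a comma-separated list that usually deserve a written
# justification. $1: scope list, e.g. "repo, workflow, read:org".
flag_broad_scopes() {
  local scopes="$1" scope
  printf '%s\n' "$scopes" | tr ',' '\n' | tr -d ' \r' | while IFS= read -r scope; do
    case "$scope" in
      # Illustrative watch list of classic PAT scopes; tune as needed.
      workflow|admin:org|admin:enterprise|admin:repo_hook|delete_repo|write:packages)
        printf '%s\n' "$scope" ;;
    esac
  done
}

# Example:
#   flag_broad_scopes "repo, workflow, read:org, admin:org"
```

Fine-grained PATs and GitHub App tokens do not report scopes this way, so review their permissions in the token or app settings instead.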
Look for secrets exposed through logs, artifacts, caches, and pull requests
Secrets do not only leak through committed files. They also leak through the surfaces around CI.
Check:
- workflow logs
- build artifacts
- dependency caches
- PR comments
- job summaries
- test failure output
- debug mode output
- release notes generated by automation
A lot of teams forget that a secret redacted in logs can still end up in an artifact, a cache key, or a downstream script output. Artifacts are often worse because they live longer and are easier to download.
If your workflows run on pull requests from forks, verify that secrets are not available to those jobs. If they are, treat that as a high-priority bug.
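By default, workflows triggered by `pull_request` from a fork do not receive repository secrets, while `pull_request_target` runs in the context of the base repository and can. A rough way to find the workflows that deserve a manual read is to look for files that use that trigger and also touch the secrets context. This is a text heuristic under that assumption, and the function name is illustrative.

```shell
# Print workflow files that mention pull_request_target AND reference secrets.
# A match is a prompt for manual review, not proof of a leak.
# $1: workflow directory (defaults to .github/workflows).
risky_pr_target_workflows() {
  local dir="${1:-.github/workflows}" f
  for f in "$dir"/*.yml "$dir"/*.yaml; do
    [ -f "$f" ] || continue   # skip unmatched glob patterns
    if grep -q 'pull_request_target' "$f" &&
       grep -qE 'secrets\.|secrets:[[:space:]]*inherit' "$f"; then
      printf '%s\n' "$f"
    fi
  done
}

# Example, run from the root of a checked-out repository:
#   risky_pr_target_workflows
```

When reading a flagged file, the key question is whether it checks out or executes code from the pull request head while those secrets are in scope.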
Search the codebase for hardcoded and reachable credentials
Grep for obvious secret patterns and config files
Start with the dumb search. It catches more than people expect.
Look for:
- `.env`, `.env.*`
- cloud credential filenames
- private key markers
- API key prefixes
- token-like strings in config files
- `secrets.` references in code and workflow files
A quick grep pass can be safe and effective:
git grep -nE -e '(AKIA|ASIA|ghp_|github_pat_|-----BEGIN ([A-Z]+ )?PRIVATE KEY-----|xox[baprs]-)' -- .
That will not find everything, but it is a good first sweep. Pair it with a secrets scanner if you have one. The goal is to find reachable credentials, not to convince yourself the repo is clean by eye.
Check history, tags, and release artifacts, not just the latest commit
One of the most common mistakes is scanning only HEAD. Secrets buried three commits back are still secrets. So are secrets in tags, release bundles, and generated artifacts.
Check:
- full git history
- annotated and lightweight tags
- release attachments
- vendored archives
- compressed exports
- old deployment manifests
If you have to choose, search history before you search the current tree. Attackers do not care whether a secret is still in the latest commit if they can clone the repository and walk back through its history.
Useful commands include:
git log -p --all -- . ':!package-lock.json'
git log -S 'SECRET_VALUE' --all
git rev-list --objects --all | grep -Ei '\.(zip|tar|tgz|pem|key)$'
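`git log -S` needs the exact secret value. When you only know the shape of a credential, `git log -G` takes a regular expression and lists commits whose diff adds or removes a matching line. The wrapper below is a minimal sketch; the function name and the example pattern are illustrative.

```shell
# List commits on any ref whose diff adds or removes a line matching a regex.
# $1: regular expression describing the credential shape.
commits_touching_pattern() {
  local pattern="$1"
  git log --all --oneline -G "$pattern"
}

# Example, run inside a clone:
#   commits_touching_pattern 'ghp_[A-Za-z0-9]'
```

A commit that removes a secret shows up here too, which is useful: it tells you the value lived in history and still needs rotation.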
Identify credentials that were copied into test fixtures or examples
This is the kind of issue teams dislike because it looks harmless. It is not harmless. I have seen real credentials copied into:
- mock API responses
- test fixtures
- example `.env` files
- README snippets
- sample payloads
- postmortem docs committed to the repo
The risk is not only accidental exposure. It is also normalization. Once a secret appears in a fixture, future reviewers stop noticing it.
If you find a secret in a test file, ask whether the fixture needs to exist at all. If the answer is yes, replace it with a synthetic value and add a check to keep the real pattern from coming back.
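The check that keeps real patterns from coming back can be as small as a grep that fails the commit. The sketch below is meant for a pre-commit hook or CI step. The patterns are a starting point based on well-known token prefixes, not a complete list, and the function name is hypothetical.

```shell
# Return 1 if any given file contains a credential-shaped string, else 0.
# Usage: check_no_credentials FILE...
check_no_credentials() {
  local pattern='AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{20,}|-----BEGIN [A-Z ]*PRIVATE KEY-----'
  [ "$#" -gt 0 ] || return 0   # nothing to scan
  if grep -nE -e "$pattern" -- "$@"; then
    echo "credential-shaped string found; use a synthetic value instead" >&2
    return 1
  fi
  return 0
}

# Example, scanning staged fixture files from a pre-commit hook
# (the tests/fixtures path is a placeholder):
#   check_no_credentials $(git diff --cached --name-only -- tests/fixtures)
```

Synthetic fixture values should be chosen so they do not match the pattern, for example a short, obviously fake token, so the check stays quiet on legitimate test data.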
Decide what to rotate first and how to do it safely
Rank secrets by blast radius: cloud keys, CI tokens, package registry tokens, webhook signatures
Not all leaks are equal. I usually rotate in this order:
- cloud provider credentials
- CI/CD tokens with write access
- package registry publishing tokens
- webhook signing secrets
- third-party API keys tied to production
The reason is simple: cloud keys and CI tokens can often mint more access, not just use existing access. A package token can publish a malicious release. A webhook secret can let an attacker impersonate trusted events.
If a secret is only used in a non-production sandbox, it still matters, but it should not block the highest-risk rotations.
Rotate in a dependency-safe order so one broken token does not stall the response
Rotation often fails because teams do it in the wrong sequence. For example, if a deploy job depends on both a GitHub token and a cloud token, revoking the cloud token first may break the job before the replacement exists.
A safer sequence is:
- generate replacement secret
- update the downstream consumer
- verify the new secret is active
- revoke the old secret
- monitor for unexpected failures
If there are multiple downstream systems, rotate the most tightly coupled one first so you can test the pattern before scaling it.
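The sequence above can be sketched as a small driver that refuses to revoke until verification passes. The four step functions are hypothetical hooks with placeholder bodies; in practice each would call your own secret manager, deployment tooling, or provider API.

```shell
# Hypothetical hooks: replace the bodies with real calls. Each receives the
# secret name and must return non-zero on failure.
generate_replacement() { echo "generate: $1"; }
update_consumer()      { echo "update consumer: $1"; }
verify_new()           { echo "verify new: $1"; }
revoke_old()           { echo "revoke old: $1"; }

# Run the rotation steps in order and stop at the first failure, so the old
# credential is never revoked before the new one is verified.
rotate_secret() {
  local name="$1" step
  for step in generate_replacement update_consumer verify_new revoke_old; do
    if ! "$step" "$name"; then
      echo "rotation of $name stopped at step: $step" >&2
      return 1
    fi
  done
  echo "rotated $name; keep monitoring for unexpected failures"
}

# Example:
#   rotate_secret DEPLOY_TOKEN
```

The point of the design is the ordering guarantee: a failed verification leaves the old credential working, which keeps the response moving instead of turning it into an outage.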
Revoke old credentials, then verify the new ones are actually in use
Rotation is not done when the secret changes in the vault. It is done when the old secret no longer works and the new path is confirmed in production.
Verify by checking:
- failed auth attempts for the old token
- successful calls from the new token
- CI job logs for the updated credential path
- package publishing or deployment status
- cloud audit logs for the new principal
If the old secret still works, something is wrong. Either it exists in another place, or you missed a consumer.
Validate branch and token permissions after the cleanup
Re-test least privilege on write, deploy, and admin paths
After cleanup, do not assume the permission model is fixed. Test it.
Ask these questions:
- Can a normal developer still write where they should not?
- Can a workflow publish artifacts without human review?
- Can a service account modify settings it does not own?
- Can an external collaborator reach any sensitive action?
This is where a small access matrix helps. Keep it boring and explicit.
| Actor | Allowed | Not allowed |
|---|---|---|
| Developer | Open PRs, read docs | Push to protected branch |
| Release bot | Publish signed release | Change branch rules |
| Ops bot | Deploy approved build | Read unrelated repo secrets |
| Security reviewer | Audit logs, policy review | Use production deploy tokens |
Confirm protected branches still block direct pushes and force-pushes
This seems basic, but it is one of the easiest controls to misconfigure during emergency cleanup. If you changed branch rules or recreated a repo environment, re-check the guardrails.
Test:
- direct push rejection
- force-push rejection
- required review enforcement
- status check enforcement
- admin bypass behavior
- signed commit requirements if enabled
A branch rule that passes policy review but fails in real GitHub behavior is not a control. It is documentation.
Verify workflows only receive the secrets they need
GitHub Actions tends to drift toward convenience. A job gets one secret, then another, then a whole environment because it was easier than splitting the workflow.
Review each job and confirm:
- the secret is required
- the scope is minimum
- the secret is not inherited by unrelated steps
- reusable workflows do not widen exposure
- jobs triggered by pull requests from forks cannot access write-scoped credentials
If a workflow is doing build, test, and deploy in one job, split it. Security gets better when trust boundaries are visible in the YAML.
Hunt for signs of exposure and misuse
Review audit logs, workflow runs, and repository access history
Once you know what could have been exposed, look for what actually happened.
Review:
- GitHub audit logs
- repository access logs
- workflow run history
- token creation and revocation events
- branch protection changes
- collaborator additions and removals
You are looking for access changes that cluster around the suspected breach window. New token creation, unusual runner activity, or a sudden collaborator change is worth a closer look.
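To narrow an audit log export to the suspected window, a timestamp filter is usually enough to start. The sketch below assumes TSV rows whose first column is an epoch-millisecond timestamp, followed by action and actor. That layout is an assumption about how you export the data, and the organization audit log API is only available on some GitHub plans.

```shell
# Print audit events whose timestamp falls inside [start, end], inclusive.
# Input (stdin): TSV rows of "<epoch_ms>\t<action>\t<actor>".
# $1: window start in epoch milliseconds; $2: window end in epoch milliseconds.
events_in_window() {
  local start="$1" end="$2"
  awk -F '\t' -v start="$start" -v end="$end" \
    '($1 + 0) >= (start + 0) && ($1 + 0) <= (end + 0)'
}

# Example wiring (ORG and the window bounds are placeholders):
#   gh api "orgs/ORG/audit-log?per_page=100" --paginate \
#     --jq '.[] | [."@timestamp", .action, .actor] | @tsv' \
#     | events_in_window 1700000000000 1700600000000
```

Sorting the filtered rows by action then makes clusters of token, collaborator, or branch protection events easier to spot.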
Check for unusual clone activity, new deploy keys, or unexpected collaborator changes
If an attacker had access to repos, they may have used normal-looking Git operations rather than noisy exploits. Search for:
- clone bursts from unfamiliar IP ranges
- new SSH deploy keys
- repo transfers
- sudden team permission changes
- release tag creation outside normal release windows
If you do not have clone telemetry, that is a gap to document. Many GitHub teams can see what changed but not who pulled the code. That limits certainty, so you compensate by checking everything else.
Correlate secret rotation times with failed auth and suspicious API calls
This is the part that separates cleanup noise from real abuse. After rotation, some failed auth is expected. The question is whether the failures line up with unusual access attempts.
Correlate:
- old token failures
- retries from automation
- API calls from unknown principals
- login failures on connected services
- package publish attempts
If you see a credential fail and then a different token or account succeed in the same window, that is worth escalation.
Harden the repo so the same mistake is harder to repeat
Replace long-lived tokens with GitHub App or OIDC-based access where possible
Long-lived tokens are convenient and painful to govern. Where possible, move toward short-lived, federated access:
- GitHub Apps for repo-bound automation
- OIDC for cloud access
- ephemeral credentials for deploy jobs
- short-lived session brokers for privileged operations
This does not eliminate compromise, but it shrinks the useful lifetime of a leaked token. That is a real defense, not just policy language.
Split sensitive automation into separate repos and environments
If one repo controls build, deploy, and secrets, you have created a single point of failure. Separate them.
A better pattern is:
- application repo for code
- workflow repo for shared automation
- environment-specific deployment controls
- dedicated repos for privileged scripts
That separation makes review clearer and keeps one compromise from automatically turning into full pipeline access.
Add secret scanning, commit signing, and branch rules that match reality
The controls need to match how your team actually works.
Use:
- secret scanning on push and history
- commit signing for trusted release paths
- branch rules that enforce review
- CODEOWNERS for sensitive paths
- environment approvals for production
- repository rules that block obvious policy escapes
A common mistake is adding a rule nobody can comply with, then disabling it later. Better to enforce a narrow rule that is real than a broad rule that looks good on paper.
A practical verification checklist for developers and security teams
What to test in a single repo
For one sensitive repo, verify:
- owner and maintainer list is current
- branch protection blocks direct changes
- external collaborators are justified
- workflow permissions are minimal
- secrets are not in code, history, or artifacts
- deploy keys are unique and rotated
- environment reviewers are enabled where needed
If you only have an hour, start here. It gives you the highest-confidence signal fastest.
What to test across an organization
At org level, check:
- all repos are inventoried
- no orphaned repos retain live secrets
- team membership matches job function
- service accounts are limited and named
- secret rotation has an owner
- audit logs are retained and searchable
- GitHub Apps and PATs are reviewed periodically
This is where you find the structural problems that made the breach report relevant in the first place.
What evidence to preserve for incident review
Do not destroy the trail while cleaning up. Preserve:
- audit log exports
- workflow run records
- token creation and revocation timestamps
- branch protection diffs
- collaborator changes
- secret scanning alerts
- screenshots or exports of the original access state
You need the before-and-after view. Otherwise the post-incident review turns into a memory exercise, and those are usually wrong.
Closing the loop: what a good post-incident review should produce
The concrete fixes to carry into policy and automation
A useful review should not end with “be more careful.” It should produce changes you can enforce:
- tighter repo ownership
- reduced org-owner counts
- shorter-lived credentials
- stricter workflow permissions
- better branch rules
- scheduled secret rotation
- mandatory inventory of internal repos
- a clear process for paused automation during incidents
If a fix cannot be automated, at least make it visible in one checklist that every team uses.
The metrics that show the repo is safer than before
I like metrics that reflect real reduction in blast radius:
- number of repos with no owner
- number of long-lived PATs
- number of workflow jobs with write permissions
- number of deploy keys older than the rotation window
- number of repos with secret scanning enabled
- number of privileged bypassers on protected branches
- time to revoke and replace a production token
Those numbers tell you whether the cleanup changed the security posture or just created activity.
The report about 4,000 stolen repos is a reminder that repository access is itself an attack surface. If you audit nothing else, audit the places where code, credentials, and automation meet. That is usually where the breach becomes real.


