GitHub Scraper API

Turn any GitHub repository, topic, or trending page into structured JSON: stars, forks, open issues, language breakdown, latest release, contributors, and topics.

App listings · Ratings & reviews · Charts & rankings · Developer data
▸ Overview

GitHub is the center of open-source software, where a project's stars, release cadence, and trending placement signal its momentum better than any store rank. The GitHub Scraper API resolves repository pages, release lists, topic feeds, and the daily trending board into validated JSON keyed to owner and repo.
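Because results are keyed to owner and repo, it can help to derive that key from a repository URL before storing anything. The following is a minimal sketch using only the Python standard library; `repo_key` is a hypothetical helper name, not part of the Crawlzo API, and it assumes the URL points at a repository rather than a topic or trending page.

```python
from urllib.parse import urlparse


def repo_key(url: str) -> tuple[str, str]:
    """Return (owner, repo) for a GitHub repository URL.

    Hypothetical helper for illustration. It only inspects the URL and
    does not check that the repository exists.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in {"github.com", "www.github.com"}:
        raise ValueError(f"not a GitHub URL: {url!r}")

    # Drop empty segments produced by leading or trailing slashes.
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"no owner/repo in path: {url!r}")

    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]  # clone URLs carry a .git suffix
    return owner, repo


print(repo_key("https://github.com/python/cpython/releases"))
# ('python', 'cpython')
```

Non-repository paths such as `/trending/python` would be misread as an owner and repo pair, so a caller would need to filter those out first.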

Developer-tooling vendors, VCs scouting open source, and OSS maintainers use it to watch star growth, catch new releases, and discover rising projects by language. Star and fork counts are read at request time, and the trending board is captured per language and time window.
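Capturing the trending board per language and time window amounts to enumerating one target per combination. The sketch below builds those targets from GitHub's public trending URL pattern (`/trending/<language>?since=<window>`). Whether the API accepts these URLs as-is in the `url` field is an assumption, and `trending_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Time windows offered on GitHub's public trending page.
WINDOWS = ("daily", "weekly", "monthly")


def trending_url(language: str = "", since: str = "daily") -> str:
    """Build a GitHub trending-page URL for a language and time window."""
    if since not in WINDOWS:
        raise ValueError(f"since must be one of {WINDOWS}, got {since!r}")
    url = "https://github.com/trending"
    if language:
        # Percent-encode so names such as "c++" survive as a path segment.
        url += "/" + quote(language.lower(), safe="")
    return f"{url}?since={since}"


# One target per (language, window) pair.
targets = [
    trending_url(lang, window)
    for lang in ("python", "rust")
    for window in ("daily", "weekly")
]
for target in targets:
    print(target)
```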

GitHub Scraper API · request
# POST a target, get validated JSON back
curl https://api.crawlzo.com/v4/scrape \
  -H "Authorization: Bearer $CRAWLZO_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://www.github.com/",
  "type": "app",
  "geo": "us"
  }'

// ← response
{
  "status": "ok",
  "data": {
    "title": "...",
    "developer": "...",
    "rating": 4.7,
    "ratings_count": 184220,
    "price": "Free",
    "category": "Productivity",
    "rank": 12
  }
}
▸ What you can extract

Every public field, structured for you.

GitHub data parsed into clean, validated JSON. Pull any group below on its own, or combine them in a single request.

Listing details

  • Title, developer, description, icon
  • Category, price, in-app purchases
  • Version, size, update date
  • Screenshots and preview media

Ratings & reviews

  • Aggregate rating and ratings count
  • Review text, rating, author, version
  • Rating distribution and developer replies
  • Full review pagination

Charts & rankings

  • Top free / paid / grossing rank
  • Category and country charts
  • Rank history over time

Developer details

  • Developer name and other titles
  • Website and support links
  • Privacy and data-use labels

Search results

  • Ranked apps per keyword + country
  • Ad vs. organic placement
  • Keyword visibility signals
▸ Built on the Crawlzo engine

The hard parts, already solved.

▸ What teams build with it

Common use cases.

[ 01 ]

Open-source star and fork growth tracking

[ 02 ]

Release and tag monitoring for dependencies

[ 03 ]

Trending-repo discovery by language

[ 04 ]

Developer-tool and ecosystem market research
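Since star and fork counts are read at request time, growth tracking means storing snapshots and differencing them on your side. A minimal sketch of that bookkeeping follows. The `Snapshot` fields and the sample numbers are made up for illustration and do not reflect the API's actual response schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Snapshot:
    """One stored reading for a repository (hypothetical shape)."""
    day: date
    stars: int
    forks: int


def daily_growth(snapshots):
    """Return (day, stars_per_day, forks_per_day) between consecutive snapshots."""
    ordered = sorted(snapshots, key=lambda s: s.day)
    rates = []
    for prev, cur in zip(ordered, ordered[1:]):
        days = (cur.day - prev.day).days
        if days <= 0:
            continue  # skip duplicate same-day readings
        rates.append((
            cur.day,
            (cur.stars - prev.stars) / days,
            (cur.forks - prev.forks) / days,
        ))
    return rates


# Illustrative data only, not real repository numbers.
history = [
    Snapshot(date(2024, 1, 1), 1000, 100),
    Snapshot(date(2024, 1, 3), 1100, 104),
    Snapshot(date(2024, 1, 4), 1130, 104),
]
for day, stars_per_day, forks_per_day in daily_growth(history):
    print(day, stars_per_day, forks_per_day)
```

Dividing by the gap in days keeps the rates comparable when snapshots are taken at irregular intervals.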

▸ FAQ

GitHub scraping, answered.

You get structured JSON straight from the API, or pushed natively to your stack: S3, BigQuery, Snowflake, Postgres, Kafka, or any HTTPS webhook. Call it from Python, Node, Go, Rust, or any HTTP client. The data lands where your pipeline already lives.
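As a rough illustration of calling the API from Python, the sketch below mirrors the curl request shown earlier using only the standard library. The endpoint, the `url`, `type`, and `geo` fields, and the `status` and `data` keys are taken from that sample. The `Content-Type` header, the 30-second timeout, and the error handling are assumptions, and the function names are hypothetical.

```python
import json
import urllib.request

ENDPOINT = "https://api.crawlzo.com/v4/scrape"  # from the curl sample


def build_request(target_url: str, api_key: str, geo: str = "us") -> urllib.request.Request:
    """Build the POST request without sending it."""
    # "type": "app" is copied from the curl sample as-is.
    payload = {"url": target_url, "type": "app", "geo": geo}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed to be required
        },
        method="POST",
    )


def extract_data(body: bytes) -> dict:
    """Return the "data" object, raising if status is not "ok"."""
    doc = json.loads(body)
    if doc.get("status") != "ok":
        raise RuntimeError(f"scrape failed with status {doc.get('status')!r}")
    return doc["data"]


def scrape(target_url: str, api_key: str) -> dict:
    """Send the request and return the parsed data (performs a network call)."""
    request = build_request(target_url, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_data(response.read())


# Usage, with the key read from the environment:
#   import os
#   data = scrape("https://www.github.com/", os.environ["CRAWLZO_KEY"])
```

Keeping request construction and response parsing separate from the network call makes both halves easy to check without spending a billable request.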

Failed requests are not billed. You pay for valid, schema-passing rows only. Retries, blocks, CAPTCHAs, and 5xx responses are on us. If a run doesn't return data that conforms to the schema, it isn't billed.

Every request routes through the same engine behind our Web Unblocker API: compliant residential IPs, real browser fingerprints, TLS-level evasion, behaviour modelling, and built-in CAPTCHA solving. Hard targets become routine.

Compliance is built in. We respect robots policies, rate budgets, and ToS-aware allow/deny lists. We deliver and move on, with no row-level retention beyond your replay window. A GDPR DPA, PII redaction, and custom data residency are available on request.

GITHUB DATA · ON TAP

Start pulling GitHub data this week.

Tell us the GitHub surface you need and the shape you want it in. We'll come back in 24 hours with a sampled output, a scoped plan, and a price. Pilot in week one.

Pay only for data delivered · 99.99% uptime SLA · 99% success rate · 100M+ proxies