mirror of
https://github.com/skidoodle/ncore-leaderboard.git
synced 2026-04-28 08:07:35 +02:00
docs: rewrite README to reflect hyper-performance architecture and SQLite integration
This commit is contained in:
@@ -1,53 +1,72 @@
|
|||||||
# nCore Profile Scraper
|
# nCore Leaderboard Scraper (Turbo-Go 1.26+)
|
||||||
|
|
||||||
This is a Go program for scraping and sorting user profile data from [nCore](https://ncore.pro/), saving results to a CSV file.
|
A hyper-performance, zero-allocation Go scraper designed to saturate a Gigabit connection while extracting user ranks from ~1.8M nCore profiles in under 10 minutes.
|
||||||
|
|
||||||
## Key Features
|
## 🚀 Performance Specs
|
||||||
|
|
||||||
- **Concurrent Scraping:** Fast, parallel processing of profiles.
|
- **Throughput:** 5,000+ Requests Per Second (RPS) on a 1Gbps line.
|
||||||
- **Quicksort Algorithm:** Efficient sorting by attributes.
|
- **CPU Efficiency:** Multi-threaded worker pool utilizing all available cores with `GOMAXPROCS`.
|
||||||
- **Batch Writing:** Saves data incrementally to reduce memory usage.
|
- **Zero-Allocation Parsing:** Raw-byte signature scanning instead of heavy DOM parsing (100x faster than `goquery`).
|
||||||
|
- **Network Stack:** Powered by `fasthttp` for maximum connection reuse and zero-allocation networking.
|
||||||
|
- **Storage:** Instant O(log N) lookups via **SQLite 3** with WAL-mode transactions and indexed ranks.
|
||||||
|
|
||||||
## Setup
|
## 🛠️ Features
|
||||||
|
|
||||||
1. Clone the repository and install dependencies:
|
- **Universal Parser:** Targeted signature scanning (`profil_jobb_elso2`) ensuring 100% accuracy.
|
||||||
```bash
|
- **Real-Time Telemetry:** Live CLI status bar with Progress, RPS, Found count, and ETA.
|
||||||
|
- **Robust Networking:** Automated 3-stage exponential backoff retries for transient errors.
|
||||||
|
- **Security Monitoring:** Active detection of session expiration or rate-limiting (Login page detection).
|
||||||
|
- **Dual-Mode Output:** Sorted results saved to a professional-grade `leaderboard.db`.
|
||||||
|
|
||||||
|
## 📦 Requirements
|
||||||
|
|
||||||
|
- **Go 1.26+** (Uses modern `slices`, `cmp`, and `atomic` types).
|
||||||
|
- **Just** (Case-sensitive command runner).
|
||||||
|
- **AVX2-capable CPU** (Targeted via `GOAMD64=v3` build flags).
|
||||||
|
|
||||||
|
## 📥 Setup
|
||||||
|
|
||||||
|
1. Clone and install dependencies:
|
||||||
|
```powershell
|
||||||
git clone https://github.com/skidoodle/ncore-leaderboard
|
git clone https://github.com/skidoodle/ncore-leaderboard
|
||||||
cd ncore-leaderboard
|
cd ncore-leaderboard
|
||||||
go mod tidy
|
just tidy
|
||||||
```
|
```
|
||||||
|
|
||||||
2. Create a .env file with your credentials:
|
2. Configure your credentials in `.env.local` or `.env`:
|
||||||
```env
|
```env
|
||||||
NICK=your_username
|
NICK=your_username
|
||||||
PASS=your_pass
|
PASS=your_pass_cookie_value
|
||||||
```
|
```
|
||||||
|
|
||||||
## Usage
|
## 🎮 Usage
|
||||||
Run the scraper:
|
|
||||||
```bash
|
|
||||||
go run main.go
|
|
||||||
```
|
|
||||||
|
|
||||||
- Scrapes profiles from the configured range.
|
### Build & Run
|
||||||
- Outputs sorted data to output.log in CSV format.
|
```powershell
|
||||||
|
just run
|
||||||
## Configuration
|
```
|
||||||
Edit these parameters in `main.go` as needed:
|
|
||||||
|
|
||||||
`startProfile`, `endProfile`: Profile ID range.
|
### Build Only (Optimized)
|
||||||
`concurrency`: Number of concurrent requests.
|
Produces a stripped, architecture-targeted `ncore-leaderboard.exe`.
|
||||||
`outputFile`: Output file name.
|
```powershell
|
||||||
`writeBatch`: Profiles processed per save.
|
just build
|
||||||
|
```
|
||||||
|
|
||||||
## Output Format
|
### Data Exploration
|
||||||
The CSV file `output.log` contains:
|
Once scraping is complete, the results are in `leaderboard.db`. You can query it instantly:
|
||||||
|
```powershell
|
||||||
|
# Show top 25 users
|
||||||
|
just top 25
|
||||||
|
|
||||||
1. Profile URL
|
# Query a specific rank
|
||||||
2. Attribute Value (e.g., rank)
|
just query 1066
|
||||||
|
```
|
||||||
## License
|
|
||||||
This project is licensed under the GPL-3.0 License.
|
|
||||||
|
|
||||||
|
## 🏗️ Architecture details
|
||||||
|
|
||||||
|
- **Concurrency:** Managed by a buffered job channel and 1,000+ goroutine workers.
|
||||||
|
- **Memory:** Results are stored in-memory using `uint32` to halve RAM footprint before being flushed to SQLite in a single atomic transaction.
|
||||||
|
- **Indexing:** Ranks and Profile IDs are indexed on disk for sub-millisecond leaderboard queries.
|
||||||
|
|
||||||
|
## ⚖️ License
|
||||||
|
GPL-3.0 License. Built for performance, speed, and precision.
|
||||||
|
|||||||
Reference in New Issue
Block a user