An end-to-end quantitative research system: data acquisition from 111+ exchanges, vectorized backtesting with real execution costs, 8 strategy families, interactive Plotly dashboard and advanced robustness analysis — all in Python.
Most quant systems have a large gap between idea and execution: insufficient data, backtests without realistic costs, strategies without market regime awareness. This platform closes that gap with a fully reproducible end-to-end pipeline.
The pipeline runs from incremental Binance data download using monthly bulk archives, through vectorized backtesting with execution delay and real costs (10 bps fee + 2 bps slippage), to an interactive Plotly dashboard, all implemented in Python with type annotations throughout.
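To make the cost model concrete, here is a minimal sketch of a vectorized Long/Cash backtest with a one-bar execution delay and the 10 bps fee plus 2 bps slippage quoted above. The function name `backtest_long_cash`, the one-bar default delay and the choice to charge costs on every position change are assumptions for illustration, not the project's actual code.

```python
import pandas as pd

FEE_BPS = 10.0       # fee quoted in the text
SLIPPAGE_BPS = 2.0   # slippage quoted in the text


def backtest_long_cash(close: pd.Series, signal: pd.Series, delay: int = 1) -> pd.Series:
    """Return net per-bar returns for a 0/1 (cash/long) signal. Hypothetical helper."""
    asset_ret = close.pct_change().fillna(0.0)
    # Execution delay: a signal computed on bar t is only held from bar t + delay,
    # so the return earned on a bar never depends on that bar's own signal.
    position = signal.shift(delay).fillna(0.0)
    # Assumption: costs are charged once per position change (entry or exit).
    turnover = position.diff().abs().fillna(position.abs())
    cost = turnover * (FEE_BPS + SLIPPAGE_BPS) / 10_000.0
    return position * asset_ret - cost


close = pd.Series([100.0, 101.0, 102.0, 101.0, 103.0])
signal = pd.Series([1, 1, 1, 0, 0])
net = backtest_long_cash(close, signal)
equity = (1.0 + net).cumprod()
```

In this toy series the entry on the second bar earns 1% gross and 0.88% net of the 12 bps cost, and the exit on the last bar costs another 12 bps while earning nothing.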
Architecture: FastAPI + SSE web layer (src/web/app.py), src/ modules, data/processed/*.parquet storage.
The research dashboard is split into 5 independent sections, each covering one phase of the quantitative research cycle.
Select exchange, symbol, date range and multiple timeframes. Smart download: Binance monthly bulk archives first, CCXT fallback. Real-time progress via SSE.
Visual grid of all downloaded datasets with full metadata: exchange, symbol, timeframe, row count, file size and quality status.
Multi-select datasets and strategies, configure capital/fees/slippage, run vectorized backtests with live progress tracking.
Plotly charts: equity curve with market regime shading, drawdown panel, monthly return heatmap, rolling 30-bar Sharpe and a color-coded sortable metrics table.
90-day rolling analysis across all datasets with regime detection, strategy ranking and a momentum gauge.
Automatic checks across 6 categories: timestamp gaps, duplicates, OHLCV consistency, zero-volume bars, outliers and overall quality score.
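Several of the quality checks listed above can be expressed in a few lines of pandas. The sketch below covers four of the six categories (timestamp gaps, duplicates, OHLCV consistency, zero-volume bars); the function name `quality_report`, the column names and the returned keys are assumptions, and outlier detection and the overall score are left out.

```python
import pandas as pd


def quality_report(df: pd.DataFrame, step: pd.Timedelta) -> dict[str, int]:
    """Count basic issues in an OHLCV frame indexed by timestamp. Hypothetical helper."""
    idx = df.index
    # Every timestamp that should exist between the first and last bar.
    expected = pd.date_range(idx.min(), idx.max(), freq=step)
    # A bar is inconsistent if high is not the maximum or low is not the minimum.
    bad_ohlc = (
        (df["high"] < df[["open", "close", "low"]].max(axis=1))
        | (df["low"] > df[["open", "close", "high"]].min(axis=1))
    )
    return {
        "missing_bars": len(expected.difference(idx)),
        "duplicate_timestamps": int(idx.duplicated().sum()),
        "inconsistent_ohlc": int(bad_ohlc.sum()),
        "zero_volume_bars": int((df["volume"] == 0).sum()),
    }


index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 01:00", "2024-01-01 03:00"]
)
df = pd.DataFrame(
    {
        "open": [100.0, 100.0, 100.0, 101.0],
        "high": [101.0, 102.0, 102.0, 100.0],   # last bar: high below open
        "low": [99.0, 99.0, 99.0, 99.0],
        "close": [100.0, 101.0, 101.0, 100.5],
        "volume": [10.0, 0.0, 5.0, 7.0],
    },
    index=index,
)
report = quality_report(df, pd.Timedelta(hours=1))
```

The sample frame is built to trip each check once: the 02:00 bar is missing, 01:00 appears twice, one bar has zero volume and the last bar has a high below its open.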
Each strategy is implemented as a frozen dataclass — immutable parameters, fully reproducible results. Signals are binary: Long or Cash. No leverage or short selling.
| Strategy | Type | Entry Logic | Market Sensitivity |
|---|---|---|---|
| EMA Trend | Trend | Fast EMA(20) > Slow EMA(100) | Best in trending markets; poor in choppy conditions |
| RSI Mean Reversion | Mean-Rev | RSI < 30 entry, RSI > 50 exit | Range-bound and sideways markets |
| Bollinger Band | Mean-Rev | z-score < −2 entry (price more than 2σ below the moving average), mean reversion exit | Moderate volatility, mean-reverting regimes |
| Donchian Breakout | Trend | 55-period channel breakout (Turtle Trading) | Trending markets with clear breakouts |
| ATR Breakout | Trend | Close > MA + 1.5×ATR(20) | High volatility with adaptive breakout threshold |
| MACD Crossover | Trend | MACD line crosses above signal line | Trending markets with momentum |
| Stochastic Rev. | Mean-Rev | %K at oversold/overbought levels | Ranging markets with defined oscillation |
| ML Signal | ML | Gradient Boosting on 6 technical features | Chronological 65/35 train/test split |
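As an illustration of the frozen-dataclass pattern, the EMA Trend row of the table could look roughly like this. The class name `EmaTrend`, the `signal` method and the use of `ewm(..., adjust=False)` are assumptions; only the 20/100 spans and the binary Long/Cash output come from the text.

```python
from dataclasses import FrozenInstanceError, dataclass

import pandas as pd


@dataclass(frozen=True)
class EmaTrend:
    """Long (1) when the fast EMA is above the slow EMA, otherwise cash (0)."""

    fast: int = 20
    slow: int = 100

    def signal(self, close: pd.Series) -> pd.Series:
        fast_ema = close.ewm(span=self.fast, adjust=False).mean()
        slow_ema = close.ewm(span=self.slow, adjust=False).mean()
        return (fast_ema > slow_ema).astype(int)


strategy = EmaTrend()
rising = pd.Series(range(1, 301), dtype=float)
sig = strategy.signal(rising)

# Frozen dataclasses reject attribute assignment after construction.
try:
    strategy.fast = 5  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the instance cannot be mutated, a parameter sweep has to create a new object per combination (for example with `dataclasses.replace`), which keeps each run tied to one explicit parameter set.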
The platform includes tools that eliminate look-ahead bias and measure genuine out-of-sample robustness.
Rolling out-of-sample validation with configurable train/test windows. Prevents overfitting to historical data.
Bootstrap resampling of return series to estimate worst-case drawdown distributions and strategy robustness.
Identifies 4 market regimes: Trending Up/Down, Ranging, High/Low-Vol. Recommends best strategy per regime.
Information Coefficient (IC), Rank IC, factor decay across 1–24 bar horizons, cross-factor correlation matrix.
Cross-dataset correlation, portfolio-level Sharpe aggregation, drawdown analysis across multiple assets.
Automatic testing of nearby parameter variations — prevents curve-fitting over narrow parameter ranges.
Random Forest for return and volatility prediction. Strictly chronological splits, no shuffling — zero data leakage.
40+ technical and statistical features: Trend, Momentum, Volatility, Volume, Structure and Statistical categories.
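The rolling out-of-sample validation mentioned above comes down to generating train/test index windows in which every test window lies strictly after its training window. A minimal sketch, assuming fixed-size windows that roll forward by one test length (the function name `walk_forward_splits` and this stepping rule are illustrative choices, not the project's API):

```python
from typing import Iterator


def walk_forward_splits(
    n_bars: int, train: int, test: int
) -> Iterator[tuple[range, range]]:
    """Yield (train_idx, test_idx) pairs; each test window follows its train window."""
    start = 0
    while start + train + test <= n_bars:
        train_idx = range(start, start + train)
        test_idx = range(start + train, start + train + test)
        yield train_idx, test_idx
        start += test  # roll forward by one test window, so test windows never overlap


splits = list(walk_forward_splits(n_bars=1000, train=500, test=100))
for train_idx, test_idx in splits:
    print(f"train {train_idx.start}-{train_idx.stop - 1}, "
          f"test {test_idx.start}-{test_idx.stop - 1}")
```

With 1000 bars, a 500-bar training window and a 100-bar test window, this yields five folds whose test windows together cover bars 500 through 999 exactly once.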
Every component was chosen for production standards: no unnecessary dependencies, full type annotations throughout.
Given identical Parquet inputs, all backtests and reports are fully deterministic — no hidden randomness.
Chronological train/test splits, never shuffled — look-ahead bias is eliminated by design.
Immutable strategy parameter containers — parameter sweep experiments are fully traceable.
Thread-based background jobs with SSE for real-time progress — no Redis or Celery required.
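The thread-plus-SSE approach can be sketched with the standard library alone: a worker thread pushes progress updates onto a queue, and a generator formats them as Server-Sent Events frames. The names `run_job` and `sse_stream` and the payload shape are hypothetical; in a FastAPI app such a generator would typically be returned through `StreamingResponse` with `media_type="text/event-stream"`.

```python
import json
import queue
import threading
from typing import Iterator


def run_job(steps: int, events: "queue.Queue[dict | None]") -> None:
    """Stand-in for a download or backtest; reports progress after each step."""
    for done in range(1, steps + 1):
        events.put({"done": done, "total": steps})
    events.put(None)  # sentinel: the job has finished


def sse_stream(events: "queue.Queue[dict | None]") -> Iterator[str]:
    """Turn queued progress updates into SSE frames (each ends with a blank line)."""
    while True:
        item = events.get()  # blocks until the worker reports something
        if item is None:
            yield "event: done\ndata: {}\n\n"
            return
        yield f"data: {json.dumps(item)}\n\n"


events: "queue.Queue[dict | None]" = queue.Queue()
threading.Thread(target=run_job, args=(3, events), daemon=True).start()
frames = list(sse_stream(events))
```

The queue is the only shared state between the worker and the HTTP response, which is why no external broker is needed for a single-process deployment.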
This project is publicly available on GitHub. Full documentation, modular architecture and a CLI for batch processing — ready to use and extend.
git clone https://github.com/0xh0551/Quant_research
Most quant projects get lost in scattered scripts. This platform demonstrates that when the pipeline from data to report is designed with a clear architecture, trading decisions are grounded in evidence rather than intuition.
Walk-forward validation and Monte Carlo analysis answer the critical question: does this strategy actually work, or is it just curve-fitted to historical data?
If you need a data pipeline, reliable backtesting, risk management or a research dashboard, start with an architecture diagnosis session.