Quant Research Platform

An end-to-end quantitative research system: data acquisition from 111+ exchanges, vectorized backtesting with real execution costs, 8 strategy families, interactive Plotly dashboard and advanced robustness analysis — all in Python.

Python 3.12+ | FastAPI | CCXT · 111 Exchanges | Parquet · Vectorized | scikit-learn · LightGBM | Plotly Interactive
Project Stats
111+ Exchanges via CCXT
8 Strategy families implemented
40+ Technical & statistical features
9 Timeframes: 1m through 1d
Why this was built

From hypothesis to reliable execution

Most quant systems have a large gap between idea and execution: insufficient data, backtests without realistic costs, strategies without market regime awareness. This platform closes that gap with a fully reproducible end-to-end pipeline.

The pipeline runs from incremental Binance data download using monthly bulk archives, through vectorized backtesting with a 1-bar execution delay and explicit costs (10 bps fee + 2 bps slippage per side), to an interactive Plotly dashboard. Everything is implemented in Python with type annotations throughout.
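As a rough illustration of what "execution delay and real costs" can mean in a vectorized backtest, here is a minimal pandas sketch. The function name `backtest_long_cash` and the choice to charge costs on the bar where the position changes are assumptions made for this example, not the platform's actual code.

```python
import pandas as pd

FEE_BPS = 10.0       # per-side fee quoted in the text
SLIPPAGE_BPS = 2.0   # per-side slippage quoted in the text


def backtest_long_cash(close: pd.Series, signal: pd.Series) -> pd.Series:
    """Net per-bar returns for a long-or-cash signal (1 = long, 0 = cash)."""
    # Delay execution by one bar: a signal seen at bar t is held from t+1.
    position = signal.shift(1).fillna(0.0)
    gross = position * close.pct_change().fillna(0.0)
    # Each change in position (entry or exit) pays fee + slippage once.
    turnover = position.diff().abs().fillna(0.0)
    cost = turnover * (FEE_BPS + SLIPPAGE_BPS) / 10_000.0
    return gross - cost


close = pd.Series([100.0, 101.0, 103.0, 102.0, 104.0])
signal = pd.Series([1, 1, 1, 0, 0])
net = backtest_long_cash(close, signal)
equity = (1.0 + net).cumprod()  # compounded equity curve, starting capital = 1
```

Because the position is the signal shifted by one bar, the first bar never earns a return, and the exit cost lands on the bar after the signal turns to cash.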

Three-Layer Architecture
Browser Dashboard (FastAPI + SSE)
FastAPI Web Server (src/web/app.py)
Research Library (src/ modules)
Parquet Data Store (data/processed/*.parquet)
Binance Bulk Archives | CCXT Fallback | Nobitex UDF | Incremental Merge
Core Capabilities

5 dashboard sections

The research dashboard is split into 5 independent sections, each covering one phase of the quantitative research cycle.

Data Download

Select exchange, symbol, date range and multiple timeframes. Smart download: Binance monthly bulk archives first, CCXT fallback. Real-time progress via SSE.
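The archive-first, fallback-second order described above could look roughly like the sketch below. Both fetcher callables are hypothetical placeholders (a real version would read the Binance monthly archive and call CCXT respectively), and treating a missing month as `FileNotFoundError` is an assumption for illustration.

```python
from collections.abc import Callable, Iterator
from datetime import date

Candles = list[list[float]]  # rows of [timestamp_ms, open, high, low, close, volume]


def month_starts(start: date, end: date) -> Iterator[date]:
    """Yield the first day of every month touched by [start, end]."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def download(
    start: date,
    end: date,
    fetch_archive: Callable[[date], Candles],   # hypothetical bulk-archive reader
    fetch_fallback: Callable[[date], Candles],  # hypothetical CCXT-backed reader
) -> tuple[Candles, dict[str, str]]:
    """Try the bulk archive for each month and fall back when it is missing."""
    rows: Candles = []
    source: dict[str, str] = {}
    for month in month_starts(start, end):
        try:
            chunk = fetch_archive(month)
            source[month.isoformat()] = "archive"
        except FileNotFoundError:
            chunk = fetch_fallback(month)
            source[month.isoformat()] = "fallback"
        rows.extend(chunk)
    return rows, source
```

Recording which source served each month makes it easy to show that information later in the data inventory.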

Data Inventory

Visual grid of all downloaded datasets with full metadata: exchange, symbol, timeframe, row count, file size and quality status.

Research & Backtesting

Multi-select datasets and strategies, configure capital/fees/slippage, run vectorized backtests with live progress tracking.

Interactive Report

Plotly charts: equity curve with market regime shading, drawdown panel, monthly return heatmap, rolling 30-bar Sharpe and a color-coded sortable metrics table.

Automatic Insights

90-day rolling analysis across all datasets with regime detection, strategy ranking and a momentum gauge.

Data Validation

Automatic checks across 6 categories: timestamp gaps, duplicates, OHLCV consistency, zero-volume bars, outliers and overall quality score.
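Four of these checks are simple enough to sketch with pandas. The column names, the function name `validate_ohlcv` and the returned dictionary are assumptions for this example; outlier detection and the overall quality score are left out because the text does not say how they are computed.

```python
import pandas as pd


def validate_ohlcv(df: pd.DataFrame, bar: pd.Timedelta) -> dict[str, int]:
    """Count basic data-quality problems in a timestamp-indexed OHLCV frame."""
    step = df.index.to_series().diff().dropna()
    # High must be the largest and low the smallest of the four prices.
    bad_ohlc = (
        (df["high"] < df[["open", "close", "low"]].max(axis=1))
        | (df["low"] > df[["open", "close", "high"]].min(axis=1))
    )
    return {
        "timestamp_gaps": int((step > bar).sum()),
        "duplicates": int(df.index.duplicated().sum()),
        "ohlc_inconsistent": int(bad_ohlc.sum()),
        "zero_volume": int((df["volume"] == 0).sum()),
    }


index = pd.to_datetime(
    ["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 01:00", "2024-01-01 04:00"]
)
frame = pd.DataFrame(
    {
        "open": [10.0, 11.0, 11.0, 12.0],
        "high": [11.0, 12.0, 12.0, 12.5],   # last bar: high below close
        "low": [9.0, 10.0, 10.0, 11.0],
        "close": [10.5, 11.5, 11.5, 13.0],
        "volume": [5.0, 0.0, 3.0, 4.0],
    },
    index=index,
)
report = validate_ohlcv(frame, pd.Timedelta(hours=1))
```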

Strategy Library

8 implemented strategy families

Each strategy is implemented as a frozen dataclass — immutable parameters, fully reproducible results. Signals are binary: Long or Cash. No leverage or short selling.
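A minimal sketch of the frozen-dataclass idea, using the EMA trend rule as the example. The class name `EmaTrend`, its method names and the plain-Python EMA are assumptions for illustration; the project's own classes may be structured differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EmaTrend:
    """Long when the fast EMA is above the slow EMA, otherwise cash."""

    fast: int = 20
    slow: int = 100

    @staticmethod
    def _ema(prices: list[float], span: int) -> list[float]:
        # Standard recursive EMA with smoothing factor 2 / (span + 1).
        alpha = 2.0 / (span + 1.0)
        out = [prices[0]]
        for price in prices[1:]:
            out.append(alpha * price + (1.0 - alpha) * out[-1])
        return out

    def signals(self, prices: list[float]) -> list[int]:
        fast = self._ema(prices, self.fast)
        slow = self._ema(prices, self.slow)
        return [1 if f > s else 0 for f, s in zip(fast, slow)]


base = EmaTrend()
variant = replace(base, fast=10)  # a sweep creates new objects instead of mutating
```

Assigning to `base.fast` raises `dataclasses.FrozenInstanceError`, so every parameter set used in an experiment stays exactly as it was created.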

Strategy | Type | Entry Logic | Market Sensitivity
EMA Trend | Trend | Fast EMA(20) > Slow EMA(100) | Best in trending markets; poor in choppy conditions
RSI Mean Reversion | Mean-Rev | RSI < 30 entry, RSI > 50 exit | Range-bound and sideways markets
Bollinger Band | Mean-Rev | z-score < −2σ entry, mean reversion exit | Moderate volatility, mean-reverting regimes
Donchian Breakout | Trend | 55-period channel breakout (Turtle Trading) | Trending markets with clear breakouts
ATR Breakout | Trend | Close > MA + 1.5×ATR(20) | High volatility with adaptive breakout threshold
MACD Crossover | Trend | Fast EMA crosses signal line | Trending markets with momentum
Stochastic Rev. | Mean-Rev | %K at oversold/overbought levels | Ranging markets with defined oscillation
ML Signal | ML | Gradient Boosting on 6 technical features | Chronological 65/35 train/test split
Advanced Analysis

Beyond simple backtesting

The platform includes tools that eliminate look-ahead bias and measure genuine out-of-sample robustness.

Walk-Forward Validation

Rolling out-of-sample validation with configurable train/test windows. Exposes strategies that are overfitted to historical data.
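One way to picture the rolling windows is as pairs of index ranges, each test window starting where its train window ends. The sketch below assumes a fixed-length train window that advances by one test window per step; the function name and that stepping rule are illustrative choices, not taken from the project.

```python
from collections.abc import Iterator


def walk_forward_windows(
    n_bars: int, train: int, test: int
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges that roll forward by one test window."""
    start = 0
    while start + train + test <= n_bars:
        yield (
            range(start, start + train),
            range(start + train, start + train + test),
        )
        start += test


windows = list(walk_forward_windows(n_bars=10, train=4, test=2))
# windows[0] is (range(0, 4), range(4, 6)); later windows shift right by 2 bars
```

Parameters are fitted only on each train range and scored only on the test range that follows it, so every reported result is out of sample.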

Monte Carlo Analysis

Bootstrap resampling of return series to estimate worst-case drawdown distributions and strategy robustness.
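A minimal NumPy sketch of bootstrapping a drawdown distribution follows. It resamples individual returns independently, which is an assumption of this example: it ignores autocorrelation, and a block bootstrap would be the usual alternative when that matters. The function names and sample returns are hypothetical.

```python
import numpy as np


def max_drawdown(returns: np.ndarray) -> float:
    """Largest peak-to-trough loss of the compounded equity curve (<= 0)."""
    equity = np.cumprod(1.0 + returns)
    # Include the starting capital of 1.0 as a possible peak.
    peak = np.maximum.accumulate(np.maximum(equity, 1.0))
    return float(np.min(equity / peak - 1.0))


def bootstrap_drawdowns(
    returns: np.ndarray, n_paths: int = 1000, seed: int = 0
) -> np.ndarray:
    """Resample returns with replacement and record each path's max drawdown."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(returns), size=(n_paths, len(returns)))
    return np.array([max_drawdown(path) for path in returns[idx]])


returns = np.array([0.02, -0.01, 0.015, -0.03, 0.01, 0.005, -0.02, 0.025])
dd = bootstrap_drawdowns(returns)
worst_5pct = float(np.percentile(dd, 5))  # 5th percentile = pessimistic tail
```

Fixing the seed keeps the resampling reproducible, which fits the platform's stated goal of deterministic reports.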

Regime Detection

Identifies 4 market regimes: Trending Up/Down, Ranging, High/Low-Vol. Recommends best strategy per regime.

Factor Research

Information Coefficient (IC), Rank IC, factor decay across 1–24 bar horizons, cross-factor correlation matrix.
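For a single asset, IC and Rank IC can be read as the Pearson and Spearman correlations between a factor and the forward return at a given horizon. The sketch below takes that reading as an assumption; the function name and the toy reversal factor are hypothetical.

```python
import pandas as pd


def information_coefficients(
    factor: pd.Series, close: pd.Series, horizon: int
) -> tuple[float, float]:
    """Return (IC, Rank IC) of a factor against the forward return at `horizon` bars."""
    forward = close.shift(-horizon) / close - 1.0
    frame = pd.DataFrame({"factor": factor, "forward": forward}).dropna()
    ic = frame["factor"].corr(frame["forward"])  # Pearson correlation
    # Spearman correlation computed as Pearson correlation of the ranks.
    rank_ic = frame["factor"].rank().corr(frame["forward"].rank())
    return float(ic), float(rank_ic)


close = pd.Series([100.0, 102.0, 101.0, 105.0, 104.0, 108.0, 107.0, 111.0])
factor = -close.pct_change()  # hypothetical factor: negative of the last 1-bar return
# Factor decay: Rank IC at increasing horizons (the text uses 1 to 24 bars).
decay = {h: information_coefficients(factor, close, h)[1] for h in (1, 2, 3)}
```

A factor whose Rank IC fades quickly as the horizon grows only carries short-lived information, which is what the decay curve is meant to reveal.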

Portfolio Analysis

Cross-dataset correlation, portfolio-level Sharpe aggregation, drawdown analysis across multiple assets.

Parameter Stability

Automatic testing of nearby parameter variations, which flags results that are curve-fitted to a narrow parameter range.
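A stability check of this kind can be sketched as "score the neighbours, compare with the chosen value". The ±10% and ±20% grid, the ratio used as the stability measure and both function names are assumptions for this example, and the score is assumed to be positive at the chosen value.

```python
from collections.abc import Callable


def neighbour_grid(
    value: int, rel_steps: tuple[float, ...] = (-0.2, -0.1, 0.1, 0.2)
) -> list[int]:
    """Integer parameter values around `value`, excluding `value` itself."""
    candidates = {max(1, round(value * (1.0 + step))) for step in rel_steps}
    return sorted(candidates - {value})


def stability_ratio(score: Callable[[int], float], value: int) -> float:
    """Worst neighbouring score divided by the score at the chosen value."""
    base = score(value)
    return min(score(v) for v in neighbour_grid(value)) / base


def toy_score(period: int) -> float:
    # Hypothetical backtest score that peaks at period = 20.
    return 1.0 - abs(period - 20) / 40.0


ratio = stability_ratio(toy_score, 20)
```

A ratio near 1 suggests a broad plateau around the chosen parameter; a ratio near 0 suggests a narrow spike that is unlikely to hold out of sample.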

ML Research Baseline

Random Forest for return and volatility prediction. Strictly chronological splits, no shuffling — zero data leakage.

Feature Engineering

40+ technical and statistical features: Trend, Momentum, Volatility, Volume, Structure and Statistical categories.

Performance Metrics

Computed backtest metrics

CAGR: Compound Annual Growth Rate
Sharpe: Risk-adjusted return
Sortino: Sharpe using downside deviation only
Max DD: Maximum drawdown depth
Calmar: CAGR divided by Max Drawdown
Win Rate: Fraction of profitable trades
Profit Factor: Gross profit / gross loss
1-bar Delay: Execution lag for realism
10 bp + 2 bp: Fee + slippage per side
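The return-based metrics above can be computed from a per-bar net return series in a few lines. This sketch assumes a zero risk-free rate, annualization by the square root of bars per year, and a series with at least one losing bar; win rate and profit factor are omitted because they need a trade list. The function name is hypothetical.

```python
import numpy as np


def performance_metrics(returns: np.ndarray, bars_per_year: int) -> dict[str, float]:
    """Summary metrics from per-bar net returns (risk-free rate assumed to be 0)."""
    equity = np.cumprod(1.0 + returns)
    years = len(returns) / bars_per_year
    cagr = equity[-1] ** (1.0 / years) - 1.0
    ann = np.sqrt(bars_per_year)
    sharpe = ann * returns.mean() / returns.std(ddof=1)
    # Downside deviation: root mean square of the negative returns only.
    downside = np.sqrt(np.mean(np.minimum(returns, 0.0) ** 2))
    sortino = ann * returns.mean() / downside
    peak = np.maximum.accumulate(np.maximum(equity, 1.0))
    max_dd = float(np.min(equity / peak - 1.0))
    return {
        "cagr": float(cagr),
        "sharpe": float(sharpe),
        "sortino": float(sortino),
        "max_dd": max_dd,
        "calmar": float(cagr / abs(max_dd)),
    }


metrics = performance_metrics(np.array([0.01, -0.02, 0.03, -0.01]), bars_per_year=4)
```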
Technology Stack

Production-grade dependencies

Every component was chosen for production standards: no unnecessary dependencies, full type annotations throughout.

Python 3.12+ | FastAPI 0.136 | pandas 2.2+ | numpy 1.26+ | CCXT 4.2+ | pyarrow (Parquet) | Plotly 5.19+ | scikit-learn 1.4+ | LightGBM 4.3+ | XGBoost 2.0+ | scipy 1.12+ | Typer CLI
Design Principles

Architecture decisions

Full Reproducibility

Given identical Parquet inputs, all backtests and reports are fully deterministic — no hidden randomness.

Zero Data Leakage

Chronological train/test splits, never shuffled — look-ahead bias is eliminated by design.
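In code, a chronological split is just a cut at a fixed position in time-ordered data. The sketch below uses the 65/35 ratio quoted for the ML Signal strategy; the function name is hypothetical and the rows are assumed to be sorted by time already.

```python
from collections.abc import Sequence
from typing import TypeVar

T = TypeVar("T")


def chronological_split(
    rows: Sequence[T], train_frac: float = 0.65
) -> tuple[Sequence[T], Sequence[T]]:
    """Split time-ordered rows without shuffling: earliest part trains, rest tests."""
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]


bars = list(range(20))  # stand-in for 20 time-ordered bars
train, test = chronological_split(bars)
```

Every training row precedes every test row, so the model cannot be fitted on information from the period it is evaluated on.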

Frozen Dataclasses

Immutable strategy parameter containers — parameter sweep experiments are fully traceable.

Async Job Management

Thread-based background jobs with SSE for real-time progress — no Redis or Celery required.

Open source — MIT License

This project is publicly available on GitHub. Full documentation, modular architecture and a CLI for batch processing — ready to use and extend.

MIT License | Python Type Annotations | pytest test suite | ruff + mypy | CLI + Web API | demo mode (synthetic data)
GitHub: Quant_research
git clone https://github.com/0xh0551/Quant_research
Project Lesson

A quant system starts with a hypothesis, not code

Most quant projects get lost in scattered scripts. This platform demonstrates that when the pipeline from data to report is designed with a clear architecture, trading decisions are grounded in evidence rather than intuition.

Walk-forward validation and Monte Carlo analysis answer the critical question: does this strategy actually work, or is it just curve-fitted to historical data?

Three core principles
Data quality first — validated across 6 check categories before any analysis
Real execution costs — fees + slippage + 1-bar delay
Mandatory robustness — walk-forward and Monte Carlo before any deployment decision

Want to build a quant system for your work?

If you need a data pipeline, reliable backtesting, risk management or a research dashboard, start with an architecture diagnosis session.