Model Training
SlinkyLayer turns raw exchange data into a deployable trading policy through a repeatable, five–step workflow. Each run is archived as a JSON spec plus content hashes, so any compliant node can reproduce the result exactly.
Wizard Overview
1) Market
exchange, symbol, timeframe
Three bar sizes (5 m, 15 m, 60 m).
2) Split
train_start, test_start
Presets: Short, Medium, Long; Custom allowed.
3) Reward
reward_name, params
Default return_minus_cost
; alternatives Sharpe-scaled, log-return, trade-count, win-loss.
4) Budget
total_timesteps, n_envs
Auto = 10 × bars-in-train, clamped 2e5–2e6; presets Quick / Standard / Thorough.
5) Hyper-params
algo, net_arch, lr,
n_steps …
Hidden in Beginner mode; fully editable in Professional mode.
Client-side checks enforce safe ranges, e.g. for PPO (nsteps×nenvs) mod batch_size =0
Supported Algorithms
PPO
on-policy
rollout only
General workhorse; stable on parallel workers.
A2C
on-policy
rollout only
Minimal compute; rapid feedback.
SAC
off-policy
replay buffer
Sample-efficient; handles noisy returns.
TD3
off-policy
replay buffer
Deterministic, twin critics, action noise.
DDPG
off-policy
replay buffer
Legacy deterministic baseline.
All share an MLP policy; layer sizes (net_arch
) and activation (Tanh
or ReLU
) are user-selectable.
Training Pipeline
rollout_pods → learner_GPU → checkpoint
↑ | |
feature_table metrics stream back-test job
Rollout pods run
n_envs
vectorised environments, each steppingn_steps
with the chosen fee, slippage, and short settings.Learner pod computes gradients on GPU, applies Adam updates, and checkpoints every k updates.
Back-test job replays the frozen policy on the unseen test split; outputs equity curve, Sharpe, Sortino, max drawdown.
Reproducibility and Forward Audit
Artefacts stored per run
config.json
Full wizard output.
data_hash.txt
SHA-256 of training candles.
feature_hash.txt
CID of indicator table.
checkpoint.pt
Final weights.
metrics.json
Risk metrics on test split.
Audit path
Auditor nodes subscribe to the live signal stream only.
Verify arrival ≤ 5 s after candle close.
Recompute one-step reward using canonical price feed.
Sign segment hash; quorum of three identical receipts marks the segment verified on chain.
Auditors never receive private hyper-parameter sweeps, keeping intellectual property safe while giving traders cryptographic proof that signals are fresh and honest.
Last updated