Model Training

SlinkyLayer turns raw exchange data into a deployable trading policy through a repeatable, five–step workflow. Each run is archived as a JSON spec plus content hashes, so any compliant node can reproduce the result exactly.


Wizard Overview

Step
Required fields
Notes

1) Market

exchange, symbol, timeframe

Three bar sizes (5 m, 15 m, 60 m).

2) Split

train_start, test_start

Presets: Short, Medium, Long; Custom allowed.

3) Reward

reward_name, params

Default return_minus_cost; alternatives Sharpe-scaled, log-return, trade-count, win-loss.

4) Budget

total_timesteps, n_envs

Auto = 10 × bars-in-train, clamped 2e5–2e6; presets Quick / Standard / Thorough.

5) Hyper-params

algo, net_arch, lr,

n_steps …

Hidden in Beginner mode; fully editable in Professional mode.

Client-side checks enforce safe ranges, e.g. for PPO (nsteps×nenvs) mod  batch_size =0


Supported Algorithms

Algo
Style
Memory
Typical use

PPO

on-policy

rollout only

General workhorse; stable on parallel workers.

A2C

on-policy

rollout only

Minimal compute; rapid feedback.

SAC

off-policy

replay buffer

Sample-efficient; handles noisy returns.

TD3

off-policy

replay buffer

Deterministic, twin critics, action noise.

DDPG

off-policy

replay buffer

Legacy deterministic baseline.

All share an MLP policy; layer sizes (net_arch) and activation (Tanh or ReLU) are user-selectable.


Training Pipeline

rollout_pods  →  learner_GPU  →  checkpoint
      ↑               |               |
feature_table   metrics stream   back-test job
  1. Rollout pods run n_envs vectorised environments, each stepping n_steps with the chosen fee, slippage, and short settings.

  2. Learner pod computes gradients on GPU, applies Adam updates, and checkpoints every k updates.

  3. Back-test job replays the frozen policy on the unseen test split; outputs equity curve, Sharpe, Sortino, max drawdown.


Reproducibility and Forward Audit

Artefacts stored per run

File
Purpose

config.json

Full wizard output.

data_hash.txt

SHA-256 of training candles.

feature_hash.txt

CID of indicator table.

checkpoint.pt

Final weights.

metrics.json

Risk metrics on test split.

Audit path

Auditor nodes subscribe to the live signal stream only.

  1. Verify arrival ≤ 5 s after candle close.

  2. Recompute one-step reward using canonical price feed.

  3. Sign segment hash; quorum of three identical receipts marks the segment verified on chain.

Auditors never receive private hyper-parameter sweeps, keeping intellectual property safe while giving traders cryptographic proof that signals are fresh and honest.

Last updated