🇪🇺 🇺🇸 Euro Macromechanica (EMM) Backtest — Overview & Methodology

Euro Macromechanica — Macro-structural Mechanics of EUR/USD

Backtest Period Map (EUR/USD, 2001–2025-08)

Euro Macromechanica (EMM) Backtest Period

Note: “quant” is short for quantitative.

A full guide for verifying the transparency of the processes and the input/output files used in the backtest is provided in AUDIT.md; the link is also repeated at the end of this document.

The project results provide robust empirical evidence of invariant quantitative factors (signals of market inefficiency) in EUR/USD price dynamics over the entire historical backtest period. This is corroborated by consistent outcomes over nearly the entire history of the euro as the EU’s single currency (since euro cash was introduced in January 2002), including the major systemic crises of the past two decades.

This project should not be regarded as a “full-fledged institutional-level micro-backtest”. Rather, it serves as an indirect demonstration of the sustainable effectiveness of a single research model, EMM, based on the results of the base core on M5 under a conservative execution framework. The base core represents the minimal necessary sample of trades for the EMM research, derived from a limited set of quantitative factors. Its purpose is to illustrate performance potential rather than to provide high-precision metrics, which are inherently unattainable at a non-institutional level, particularly given the OTC-specific context.

The backtest project includes, in particular:

The backtest was conducted in UTC (UTC+0).






Input Data for the M5 EMM Backtest

Source Minute Data, Data Quality Policy, and the Decision to Split Backtest Periods

The euro was introduced as the EU’s single currency in non-cash form in January 1999; retail access to EUR/USD trading emerged around 2001. Cash euros were introduced in January 2002.

The primary objective is to run the backtest over the longest feasible horizon, ideally starting from the point when retail access via FX brokers became available.

Special thanks to HistData for free access to minute data (and tick samples) for currency pairs with broad historical coverage; it is one of the few free sources of this scope.

A decision was made to conduct the backtest on minute data from HistData rather than on ticks. The rationale for choosing minute data and the HistData source (versus alternatives such as Dukascopy) is described in the section “Rationale for choosing HistData minute data for the M5 EMM backtest.”

Unfortunately, the quality of the minute data did not allow for a robust test of 2001–2002: the number of 5–15 minute gaps was exceptionally high, which is critical for M5 logic (all quant factors are implemented on the M5 timeframe, and a large number of gaps in that range materially distorts the computation of the required levels). Therefore, 2001–2002 are used as a stress period to illustrate strategy behavior under degraded data quality. Accordingly, metrics are computed only for the baseline periods.

Data quality for 2003–2007 is also lower than in the core period, so this span is labeled Extended Baseline to keep it separate from the higher-quality Core Baseline (2008–2025-08). To illustrate results over the longest horizon, a composite track on mixed-quality data is also provided in compounding mode (the balance carries forward from year to year: Ending Balance → next year’s Initial Balance): the Composite Baseline (2003–2025-08 = extended + core).

Gaps exceeding 15 minutes are also material, but their impact is lower relative to 5–15 minute intervals. Consequently, the 5–15 minute range is adopted as the key band for classification. In an ideal setting there would be no gaps at all; the work proceeds with the data that were available at the time.

Backtest periods:

Metrics are published only for the baseline periods. (Extended metrics are provided only for the Core Baseline.)

According to HistData, the raw minute data are timestamped in fixed EST (UTC−5) (see the F.A.Q.: http://www.histdata.com/f-a-q/). For the backtest, they are normalized to UTC (UTC+0).
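Because the source offset is fixed, the conversion is a constant shift. A minimal sketch, assuming timestamps in a `YYYYMMDD HHMMSS` layout (an assumption about the export format; pass a different `fmt` if the files differ). The helper name `est_to_utc` is illustrative and is not taken from the project's toolkit:

```python
from datetime import datetime, timedelta, timezone

# Fixed EST offset (UTC-5) with no daylight-saving changes, per the HistData F.A.Q.
FIXED_EST = timezone(timedelta(hours=-5), name="EST")


def est_to_utc(stamp: str, fmt: str = "%Y%m%d %H%M%S") -> datetime:
    """Parse a fixed-EST timestamp string and return the same instant in UTC."""
    naive = datetime.strptime(stamp, fmt)
    return naive.replace(tzinfo=FIXED_EST).astimezone(timezone.utc)


# The same five-hour shift applies in winter and in summer.
print(est_to_utc("20080102 170000").isoformat())  # 2008-01-02T22:00:00+00:00
print(est_to_utc("20080702 170000").isoformat())  # 2008-07-02T22:00:00+00:00
```

Only the timestamp changes in this step; price fields are passed through untouched.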

Minute-data normalization includes only:

The minute data values were neither modified nor removed.

All original minute-data files and text reports on minute-data quality were obtained from HistData. The gap analysis can be reproduced with any tool. A simple visual inspection of the frequency of 5–15-minute gaps confirms that the chosen Data Quality Policy is appropriate and addresses skeptical concerns (including assertions about possible OpenTimestamps (OTS) “timeline anchoring”). The raw minute data sourced from HistData are likewise fully reproducible. If HistData requests data removal, the raw minute files used in the backtest will be made available for verification upon request.


Economic Calendars

The calendars were compiled manually (in parts) with the assistance of ChatGPT’s advanced modes: official sources throttle bulk HTTP requests, and there are no free consolidated calendars covering such a long period. The engine required a narrow set of countries and only key macro releases, specifically filtered by the author, which are typically published within fixed time windows. Accordingly, the risk of potential timing inaccuracies in the collected releases is low.

All events are sourced from official websites: national central banks, national statistical agencies, etc.

In the implemented logic, the calendar serves as an entry filter, not a signal source: the strategy avoids opening positions during release windows. If a release time is incorrect, a trade will typically be closed via stop-loss rather than generate “false” profit. Consequently, even in the unlikely presence of timing errors, the results are generally more likely to be understated than overstated.

List of high-importance events and countries specifically filtered and compiled for the M5 EMM engine.

United States:

Euro Area:

Reasons for excluding flash HICP, Unemployment Rate, etc. for the Euro Area

Details of the implemented logic in the M5 EMM backtest are provided in the section Implemented M5 EMM Strategy Logic.

Unscheduled high-importance events (e.g., ad-hoc FOMC decisions/statements, inter-meeting actions) were recorded separately as a precaution, but they were not included in the scheduled calendars and were not used in the backtest, as they are not scheduled publications.

Economic calendars are available at data-hub/economic_calendars/.

The directory data-hub/economic_calendars/raw/ contains the collected calendars with events in each source’s local release time, split into parts for greater transparency. The directory data-hub/economic_calendars/prepared/ contains the normalized calendars — converted to UTC+0 — that were used as input files in the backtest.

The calendars do not include actual release values—only metadata such as time, title, importance, country/region, etc.

Calendar normalization consists only of converting publication times to UTC+0 and sorting the records. The UTC+0 conversion uses the IANA time-zone database, ensuring correct handling of local rules and Daylight Saving Time (DST) transitions. There are no DST-related inconsistencies: the key macro releases listed above are not published during clock-change hours.
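The conversion described above can be sketched with Python's standard `zoneinfo` module, which reads the IANA database. This is an illustration only, not the project's normalizer script: the function names, the `YYYY-MM-DD HH:MM` input layout, and the sample events are assumptions.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_release_to_utc(local_time: str, tz_name: str) -> datetime:
    """Interpret a local wall-clock release time in an IANA zone and convert it to UTC."""
    naive = datetime.strptime(local_time, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def normalize_calendar(events):
    """events: iterable of (local_time, tz_name, title). Returns rows sorted by UTC time."""
    rows = [(local_release_to_utc(t, tz), title) for t, tz, title in events]
    return sorted(rows)


# Hypothetical sample rows: the same 08:30 New York slot in winter and in summer.
sample = [
    ("2024-07-05 08:30", "America/New_York", "sample US release (summer)"),
    ("2024-01-05 08:30", "America/New_York", "sample US release (winter)"),
]
for utc_time, title in normalize_calendar(sample):
    print(utc_time.isoformat(), title)
```

The winter row maps to 13:30 UTC and the summer row to 12:30 UTC, which is the DST handling the IANA rules provide.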

Calendar normalization script — data-preparation-toolkit/economic_calendar_normalizer

All calendar data used in the backtest are fully transparent and reproducible.



Public M5 EMM Engine Logic

Input Normalization and Determinism

The engine ingests one-minute market data and calendar events timestamped in UTC (ISO-8601). Before simulation, all input series are brought to a single, reproducible form so that the outcome depends only on the data themselves, not on row order, duplicates, or source artifacts. One-minute records are ordered by time; if multiple rows share the same timestamp, the last one is retained and the preceding ones are treated as superseded; malformed or incomplete price values are discarded. No synthetic fills or interpolations are applied—only the actual content of the data is used in calculations.
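A minimal sketch of this normalization step, under stated assumptions: rows are `(timestamp, open, high, low, close)` tuples, "malformed" means a missing, non-numeric, non-finite, or non-positive price, and de-duplication runs before validation. The engine's actual validity rules and ordering are not published here, and `normalize_minutes` is an illustrative name.

```python
import math
from datetime import datetime


def normalize_minutes(rows):
    """Return time-ordered one-minute rows; the last duplicate wins, malformed rows are dropped."""
    latest = {}
    for ts, *prices in rows:
        latest[ts] = prices  # a later row supersedes earlier rows with the same timestamp

    clean = []
    for ts in sorted(latest):
        prices = latest[ts]
        valid = len(prices) == 4 and all(
            isinstance(p, (int, float)) and math.isfinite(p) and p > 0 for p in prices
        )
        if valid:
            clean.append((ts, *prices))  # no fills, no interpolation
    return clean


rows = [
    (datetime(2008, 1, 2, 10, 1), 1.4710, 1.4712, 1.4709, 1.4711),
    (datetime(2008, 1, 2, 10, 0), 1.4708, 1.4711, 1.4707, 1.4710),
    (datetime(2008, 1, 2, 10, 1), 1.4711, 1.4713, 1.4710, 1.4712),  # supersedes the first row
    (datetime(2008, 1, 2, 10, 2), 1.4712, float("nan"), 1.4711, 1.4712),  # malformed
]
print(len(normalize_minutes(rows)))  # 2
```

Missing minutes are never created, so real gaps remain visible to the later gap logic.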

Minutes are then mapped unambiguously to five-minute windows by interval membership: empty windows are dropped; partially filled windows are preserved without fills/interpolations. Normalization does not create missing minutes, so real gaps are preserved and handled by the gap logic.

Calendar data (hourly events and trading schedules) are normalized to a common format and de-duplicated; their application is described in the section Implemented M5 EMM Strategy Logic.

Normalization and aggregation procedures are deterministic: given identical inputs, repeated runs yield identical results regardless of the original row order, the presence of duplicates, or heterogeneity of the export. At the end of the period, any remaining positions are closed at the last available one-minute price, which fixes the terminal state unambiguously.

Determinism of results.

Simulation outcomes are deterministic: with identical inputs, repeated runs reproduce the same set of trades, timestamps, and prices. This is achieved through stable sorting and de-duplication, a fixed grid and rules for mapping minutes to windows, an unambiguous choice of the “earliest” minute where applicable, a single global epsilon, a fixed SL-over-TP priority in conflict cases, strict quant-logic rules, and the absence of randomness. Partially filled bars are preserved without interpolation; real gaps are not filled and are handled by the gap logic. For exact reproducibility, the M5 EMM engine’s dependency versions are pinned.

Determinism of output files.

Simulation artifacts are produced deterministically: with unchanged inputs, parameters, and dependency versions, each rerun produces the same set of files with identical content. This is ensured by a uniform time base and fixed M5 grid, prior normalization of one-minute records (stable sorting, de-duplication without interpolation), a predictable, staged simulation pipeline (a two-pass scheme that forms trades and equity snapshots, aligns equity to entry times, and a final pass without random components), and stable serialization rules (constant delimiter, fixed numeric precision, a single timestamp format, and time-ordered rows). The composition and naming of outputs remain constant via templates with deterministic year resolution; the SVG visualization is generated from the already-deterministic equity series and is therefore reproducible as well. For external verification of immutability, an artifacts_YYYY.sha256 file is produced listing the SHA-256 checksums of all input and output artifacts; its lines are sorted so the checksum file itself is byte-stable. With identical inputs/parameters and pinned library versions, the contents of trades_YYYY.csv*, balance_YYYY.csv* (equity series), summary_YYYY.csv*, balance_YYYY.svg, and artifacts_YYYY.sha256 are byte-for-byte identical across runs.
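The byte-stable checksum file can be illustrated with a short sketch. The line layout (`<hex digest>  <file name>`, as produced by common `sha256sum` tools) and the function name `build_manifest` are assumptions, and artifacts are passed in memory to keep the example self-contained:

```python
import hashlib


def build_manifest(artifacts: dict[str, bytes]) -> str:
    """Return checksum lines in sorted order so the manifest text is identical across runs."""
    lines = [
        f"{hashlib.sha256(data).hexdigest()}  {name}"
        for name, data in artifacts.items()
    ]
    return "\n".join(sorted(lines)) + "\n"


# Insertion order of the artifacts does not change the resulting manifest.
first = build_manifest({"trades_2008.csv": b"a", "summary_2008.csv": b"b"})
second = build_manifest({"summary_2008.csv": b"b", "trades_2008.csv": b"a"})
print(first == second)  # True
```

Sorting the lines removes any dependence on directory listing order, which is what makes the manifest itself suitable for hashing and signing.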


Aggregating Minutes into M5 Bars

Minute quotes are mapped onto a fixed five-minute grid: a window starts at time t and covers the half-open interval [t, t + 5 minutes), with each minute belonging to exactly one window. One M5 bar is produced per window: Open comes from the earliest valid minute in the window, High/Low are the maximum/minimum across the minutes in the window, and Close comes from the latest valid minute in the window. If a window contains no valid minutes, no bar is created; if it contains fewer than five minutes, the bar is treated as partially filled and is preserved without padding or interpolation. The bar’s timestamp is the window start time (t).
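The aggregation rule can be sketched as follows. The sketch assumes the input is already normalized and time-ordered, and that the grid is aligned to wall-clock multiples of five minutes (:00, :05, ...); the names `window_start` and `aggregate_m5` are illustrative.

```python
from datetime import datetime


def window_start(ts: datetime) -> datetime:
    """Start t of the half-open window [t, t + 5 min) that contains ts."""
    return ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)


def aggregate_m5(minutes):
    """minutes: time-ordered (ts, open, high, low, close) rows. Returns one bar per non-empty window."""
    bars = {}
    for ts, o, h, l, c in minutes:
        start = window_start(ts)
        bar = bars.get(start)
        if bar is None:
            # First (earliest) minute in the window sets Open.
            bars[start] = {"time": start, "open": o, "high": h, "low": l, "close": c, "minutes": 1}
        else:
            bar["high"] = max(bar["high"], h)
            bar["low"] = min(bar["low"], l)
            bar["close"] = c  # latest minute seen so far sets Close
            bar["minutes"] += 1
    return [bars[start] for start in sorted(bars)]


minutes = [
    (datetime(2008, 1, 2, 10, 0), 1.0, 1.2, 0.9, 1.1),
    (datetime(2008, 1, 2, 10, 1), 1.1, 1.3, 1.0, 1.2),
    (datetime(2008, 1, 2, 10, 4), 1.2, 1.25, 0.8, 1.0),
    (datetime(2008, 1, 2, 10, 10), 1.0, 1.05, 0.95, 1.02),
]
for bar in aggregate_m5(minutes):
    print(bar["time"], bar["minutes"])
```

In this sample the 10:00 bar is partially filled (three minutes), no bar exists for the empty 10:05 window, and the 10:10 bar holds a single minute.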

Aggregation is performed after minute-level normalization and deduplication.

The procedure is deterministic: the order of input rows and duplicate occurrences of the same minute timestamp do not affect the result (when timestamps collide, the last record is used). Partially filled bars are retained; their eligibility for entry searches is governed by data-quality rules and the trading schedule. When a gap is detected, only new entries are temporarily blocked, while management of already open positions continues.


Order Execution Logic

The implemented logic uses market orders. For details of the market-order cost model, see M5 EMM Cost Model (EUR/USD) Methodology.

For the rationale behind computing Objective (TP) and Guard (SL) from fully formed M5 bars (rather than incrementally within the current bar; i.e., look-ahead considerations), see Rationale for a Look-Ahead-Free Implementation of the Mathematical Computation of Objective (TP) and Guard (SL) Levels on Fully Formed 5-Minute Bars in the M5 EMM Engine


Order Execution Logic in the Presence of Gaps

After entry, the position is tracked on M1 within the current M5 bar. If a gap occurs (no minutes inside one or more M5 bars), the close check resumes on the first available minute after the gap. The close is executed at the computed level, not at the opening price of the first minute after the gap.


Gap-Detection Logic

The detector evaluates each M5 bar by comparing:

  1. Time gap between
    • the last minute before the start of the current timeframe (TF) window, and
    • the first minute inside the current TF window.
    If this pause is strictly greater than the configured threshold (in minutes), a gap is recorded.

  2. Price gap between
    • the close of the last minute of the previous window, and
    • the open of the first minute of the current window.
    If the absolute difference is ≥ the configured threshold (in pips), a gap is recorded. (Threshold set to 25 pips.)

The checks apply only if both the “previous minute” and the “first minute of the current window” are available. If either is missing, no gap is detected on that bar.

If the pause before the start of the current window exceeds a separate “long” reopen threshold of 1440 minutes, then—even if the above conditions are met—the gap does not trigger a lockout (it is treated as a normal reopen after a long break such as weekends/holidays). If at least one type of gap (time or price) is detected and this is not a reopen, the lockout window starts at the first minute of the current TF window where the gap was detected. The lockout duration is configured in minutes. (Set to 30 minutes.)
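The boundary checks above can be sketched as a single function. The 25-pip, 1440-minute, and 30-minute values come from the text; the time-gap threshold is configurable and its value is not stated, so the `5` used in the example is purely hypothetical. The names `GapConfig` and `lockout_end` and the `(ts, open, close)` tuple layout are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

PIP = 0.0001  # EUR/USD pip size


@dataclass
class GapConfig:
    time_gap_min: float            # configured time threshold (value not given in the text)
    price_gap_pips: float = 25.0   # price-gap threshold
    reopen_min: float = 1440.0     # "long" reopen threshold
    lockout_min: float = 30.0      # lockout duration


def lockout_end(prev_minute, first_minute, cfg: GapConfig) -> Optional[datetime]:
    """Return the end of the entry lockout, or None if no lockout starts on this bar.

    prev_minute: last M1 before the window; first_minute: first M1 inside it.
    Each is a (ts, open, close) tuple or None.
    """
    if prev_minute is None or first_minute is None:
        return None  # both sides are required for the check
    prev_ts, _, prev_close = prev_minute
    first_ts, first_open, _ = first_minute

    pause_min = (first_ts - prev_ts).total_seconds() / 60
    if pause_min > cfg.reopen_min:
        return None  # normal reopen after a long break (weekend/holiday)

    time_gap = pause_min > cfg.time_gap_min  # strict inequality
    gap_pips = round(abs(first_open - prev_close) / PIP, 1)  # rounded to one point
    price_gap = gap_pips >= cfg.price_gap_pips
    if time_gap or price_gap:
        return first_ts + timedelta(minutes=cfg.lockout_min)
    return None


cfg = GapConfig(time_gap_min=5)  # hypothetical threshold for illustration only
prev = (datetime(2008, 1, 2, 9, 40), 1.4700, 1.4700)
first = (datetime(2008, 1, 2, 10, 5), 1.4701, 1.4702)
print(lockout_end(prev, first, cfg))  # 2008-01-02 10:35:00
```

The returned time only blocks new entries; management of open positions is outside the scope of this sketch.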

While the current M5 bar falls within the lockout window, only new entries are disallowed. Management and exits of already open positions proceed as usual. If an M5 bar has no minutes at all, the bar is skipped entirely, and management/exits are not evaluated on it. When the current TF time reaches (or is past) the end of the lockout, the lock is removed and normal processing resumes.

Gap checking is performed inside the entry-eligibility branch (after external quant factors) and before testing Trigger touches. A lockout window is set if the bar is eligible for entries and a gap is detected, regardless of whether any Trigger touch occurs. If the bar is not eligible or this is a reopen, no lockout window is set—by design, for conservatism, to avoid additional filtering where entries are not possible anyway.

Implementation Notes

The gap detector evaluates only the M5 window boundary—the interval between the last M1 before the window and the first M1 inside the window. Missing minutes inside the window are not treated as a gap, and no lockout window is started because of them.

Two data-quality rules apply:

Result: a lockout is triggered only by a boundary gap (time or price). Gaps inside an M5 window are neither detected nor penalized.

Due to the gap-protection logic, some potential trades may be skipped; however, given the 2008–2025-08 data quality, such cases are rare. Detecting gaps specifically at M5 window boundaries further filters discontinuities to increase result objectivity. The rationale for boundary-based lockouts is that Trigger / Guard / Objective levels are fixed at the close of each M5 bar (from bar data) and are not recalculated inside the bar. A new level set appears only at a window change, hence the natural place to check for gaps is the boundary (last M1 before the window vs. first M1 inside). Gaps inside an M5 bar do not start a lockout: levels are static within the bar, so the bar-eligibility filter suffices without extra blocking. Introducing lockouts for intra-bar gaps would be overly strict; for such gaps, the bar’s entry-eligibility filter is applied without starting a lockout.

Gap detection is applied exclusively within the entry-eligibility branch (and before testing Trigger touches); outside this context it is not activated. This design avoids over-filtering rare minute-data artifacts and prevents unnecessary noise in the backtest. Empirically, the approach is supported by the stability of long-horizon backtest results under the data-quality period classification.


Cost Accounting, Trade Size Calculation, and Rounding

The engine is configured with fixed cost parameters for a single annual run (within the selected profile/mode and calendar year). Costs comprise commission, spread, and slippage. All three components are applied to every trade; the published results already include their impact on PnL.

Point vs pip. For five-decimal quotes, 1 pip = 10 points. All thresholds/costs are specified in pips and converted to price.

For methodology details, see the section M5 EMM Cost Model (EUR/USD) Methodology.

Bottom line: commission is accounted for separately from risk, while spread/slippage are included in risk and position-size calculations; all costs are applied to every trade, so run results and summary metrics already reflect their impact.

Per-trade position sizing. The position size for each trade is a fixed fraction of the current balance at entry (the risk fraction is set for the entire annual run). Risk is taken from the actual balance right before opening the trade, not from a pre-fixed notional.

If two Trigger levels are hit within the same minute of an M5 bar, trades are opened from both levels; the default per-trade risk is taken for each trade from the balance at entry (e.g., with a $100,000 balance and 1% risk per trade, both trades open at 1% risk each — the risk is not split between them).

Risk-based lot rounding. First, a “raw” position size is computed so that the potential loss at the stop equals the target risk; the stop distance is taken including spread/slippage. The size is then always rounded down to the nearest permissible increment, so the realized stop-risk does not exceed the target. Exceedance is only possible in edge cases:

Spread and slippage are explicit inputs and are included at the lot-sizing stage: the entry-to-SL distance is measured with these costs already applied.
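A worked sketch of the sizing arithmetic, under stated assumptions: the spread/slippage allowance is added to the stop distance in pips, 1 standard lot is worth $10 per pip, and the permissible increment is 0.01 lot. The increment, the exact way costs enter the stop distance, and the function name are illustrative rather than the engine's confirmed parameters. `Decimal` is used so that rounding down is exact.

```python
from decimal import Decimal, ROUND_DOWN

PIP_VALUE_PER_LOT = Decimal("10")  # USD per pip per standard lot (EUR/USD)


def position_size_lots(balance, risk_fraction, stop_pips, spread_slippage_pips, lot_step="0.01"):
    """Largest lot size (rounded down to lot_step) whose stop-loss risk stays within the target."""
    risk_usd = Decimal(str(balance)) * Decimal(str(risk_fraction))
    effective_stop = Decimal(str(stop_pips)) + Decimal(str(spread_slippage_pips))
    raw_lots = risk_usd / (effective_stop * PIP_VALUE_PER_LOT)
    step = Decimal(lot_step)
    return (raw_lots / step).to_integral_value(rounding=ROUND_DOWN) * step


# $100,000 balance, 1% risk, 10-pip stop plus a 0.65-pip cost allowance.
lots = position_size_lots(100_000, 0.01, 10, 0.65)
print(lots)                      # 9.38
print(lots * Decimal("106.5"))   # 998.970 (realized stop risk, below the $1,000 target)
```

Rounding down rather than to nearest is what keeps the realized stop risk at or below the target in the ordinary case.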


Epsilon (ε)

The model uses a single comparison tolerance ε equal to one minimum tick increment of the instrument. For EURUSD (5-digit quotes): ε = 0.00001 (1 point = 0.1 pip).

Scope.
ε is used only for the binary check of whether a computed level is “touched” by the minute bar’s extremum (High/Low) within an M5 window. ε does not affect the computation of the levels themselves.

Order relative to costs.
First, the “touch / no-touch” event is determined with ε; then the execution price is formed with the configured costs (spread / slippage / commission). ε does not change the magnitude of the costs.

Exceptions (where ε is not applied).
Gap detection and bar-eligibility checks are evaluated strictly by the specified inequalities (> / ≥) without ε.

Rationale.
A value of 0.1 pip is minimally sufficient to offset boundary discretization and rounding effects on M1. It does not “widen” levels into zones and does not create a methodological advantage; any shifts are smaller than the tick size and far below transactional noise (spread / slippage / commission).

Determinism.
A fixed ε ensures reproducible results given identical inputs.

Related execution rules.
In intra-minute conflicts, Guard (SL) has priority over Objective (TP) when both are reached simultaneously.
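The touch test and the SL-over-TP priority can be sketched together. The sketch assumes ε is applied as a symmetric one-point tolerance around the minute's Low/High range, which is one plausible reading of the rule and not a confirmed detail of the engine. Prices are converted to integer points so that the comparison is not affected by floating-point representation; all names are illustrative.

```python
EPS_POINTS = 1  # ε = one tick = 0.00001 for 5-digit EUR/USD


def to_points(price: float) -> int:
    """Convert a 5-digit price to integer points (1 point = 0.00001)."""
    return round(price * 100_000)


def touched(level: float, high: float, low: float) -> bool:
    """True if the level lies within [low - ε, high + ε] of the minute bar."""
    lvl = to_points(level)
    return to_points(low) - EPS_POINTS <= lvl <= to_points(high) + EPS_POINTS


def exit_reason(sl: float, tp: float, high: float, low: float):
    """Guard (SL) is checked first, so it wins when both levels are reached in one minute."""
    if touched(sl, high, low):
        return "SL"
    if touched(tp, high, low):
        return "TP"
    return None


print(touched(1.10051, high=1.10050, low=1.10000))  # True: within one point of High
print(touched(1.10052, high=1.10050, low=1.10000))  # False: two points above High
print(exit_reason(sl=1.09900, tp=1.10100, high=1.10120, low=1.09880))  # SL
```

ε only decides the binary touch event here; the execution price and costs would be applied afterwards.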

Note (OTC specifics).
Minute OHLC series from different providers (e.g., HistData vs. Dukascopy) can differ slightly due to time zone, price type, and aggregation. Formalizing ε standardizes the “touch” criterion and reduces the impact of such differences.



Implemented M5 EMM Strategy Logic

This M5 EMM implementation covers the core (approximately 10–15% of the full strategy), which represents a minimally sufficient set of trades to demonstrate effectiveness. In this design, all operations (trading decisions, computations, and feature handling) are performed exclusively on the 5-minute timeframe. The M5 module is not complete: owing to the approach’s complexity and resource constraints, the full suite of quantitative factors has not been implemented even within a single M5 model. Moreover, the current version does not include the principal logic for trading in post-release windows for major macroeconomic data, which is one of the key drivers of performance. The full EMM strategy is a composition of signals across the M1, M5, M15, M30, and H1 timeframes, incorporating macro-structural dynamics of the U.S. Dollar Index under a proprietary methodology; nonetheless, this M5 core is sufficient to demonstrate effectiveness.


Time Filters of the Implemented M5 EMM Logic

The strategy trades macro-structural patterns on M5 bars and opens positions only during calm intervals of the trading day. This is not “Asia-only”: “calm” refers to windows outside the volatile top-of-hour intervals in the London–NY sessions, the rollover period, and scheduled data-release windows; filters are also applied within active London–NY segments. All prohibitions are specified as fixed UTC windows and do not depend on DST (the HistData source’s fixed EST baseline is accounted for; see the FAQ: http://www.histdata.com/f-a-q/).

Within the EMM logic, 10-minute blocks are implemented as a filter for top-of-hour volatility in the active London–NY segments, where spreads/slippage statistically increase. Additional fixed filters prohibit or tighten entries at specific times based on EMM logic. The rollover and the CME Globex maintenance window are covered by a fixed block that spans both summer and winter.

FX rollovers (UTC):

CME Globex daily maintenance break (UTC):

A full no-trade window is enforced from 19:00 to 00:10 UTC, which covers the FX rollovers and the CME Globex break regardless of DST. Accordingly, by the time these windows begin there are no open positions, and no exits are executed during them. This is corroborated by trade metrics on average trade duration.
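Because the window wraps past midnight, the membership test is an "or" rather than a simple range check. A minimal sketch, assuming the start is inclusive and the end exclusive (the text does not state how the endpoints are treated) and using an illustrative function name:

```python
from datetime import datetime, time

NO_TRADE_START = time(19, 0)  # 19:00 UTC
NO_TRADE_END = time(0, 10)    # 00:10 UTC on the following day


def in_no_trade_window(ts_utc: datetime) -> bool:
    """True if a UTC timestamp falls inside the fixed 19:00-00:10 no-trade window."""
    t = ts_utc.time()
    return t >= NO_TRADE_START or t < NO_TRADE_END


print(in_no_trade_window(datetime(2008, 1, 2, 18, 55)))  # False
print(in_no_trade_window(datetime(2008, 1, 2, 23, 30)))  # True
print(in_no_trade_window(datetime(2008, 1, 3, 0, 5)))    # True
print(in_no_trade_window(datetime(2008, 1, 3, 0, 10)))   # False
```

The check uses only the UTC time of day, so it gives the same answer in summer and in winter.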


Calendar Filters of the Implemented M5 EMM Logic

Economic calendars are compiled only for pre-defined Euro Area and U.S. scheduled releases classified as high-importance. All timestamps are provided in UTC and are DST-agnostic.

All filters, their values, and all quant logic are strictly fixed and uniform (no changes and no re-optimization) across the entire backtest horizon (profiles, risk modes, capitalization modes, etc.).



M5 EMM Cost Model (EUR/USD) Methodology

All components of the cost model—commission, spread, and slippage—are applied to every trade. Spread and slippage are incorporated into the orders’ effective execution price.

See Cost Accounting, Trade Size Calculation, and Rounding.

Commission Profiles


Institutional

All institutional modes apply a fixed commission of $5.5 per round-turn per 1 standard lot (100,000 EUR) as a separate expense line, regardless of flow type (ECN/PoP raw-spread, single-dealer, “zero-commission,” etc.).
Equivalents: ≈ 0.55 pips RT (≈ $2.75 per side; $55 per $1M RT).

Spread and slippage are handled separately per the EUR/USD execution model from m5_emm_cost_model_v1.0.csv and do not include the commission. The total transactional cost equals (spread + slippage) plus the $5.5 RT/lot commission.

Rationale

Market levels (reference)

A fixed value of $5.5 RT/lot is used in calculations; the table below is for comparison with market practice.

| Flow / Venue | Charging method | Typical level (RT/lot) | Pips RT (equiv.) | $ per $1M RT | Notes |
|---|---|---|---|---|---|
| Top-tier ECN / PoP (raw-spread) | Line-item commission | $4–6 | 0.40–0.60 | $40–60 | Benchmark for mid/large participants; $5.5 sits near mid. |
| Prime-broker direct, high volume | Line-item commission | $2–4 | 0.20–0.40 | $20–40 | Requires high volumes and tighter tiers. |
| Smaller PoP / low frequency | Line-item commission | $6–7 | 0.60–0.70 | $60–70 | Lower volume discounts; less regular trading. |
| Single-dealer (all-in) | Cost embedded in spread | | +~0.3–0.7 pips | | In the model, commission is separate (fixed $5.5). |
| “Zero-commission” ECN | $0/lot, wider spread | | ~0.5–0.8 pips | | All-in equivalent; commission entered separately. |

Conversions

Sensitivity (indicative)

Formulas

Commission (institutional conditions, EUR/USD). By default, $5.5 per round-turn per 1 standard lot (100,000 EUR) is applied as a separate expense line, equivalent to ≈ 0.55 pips RT (≈ $2.75 per side; $55 per $1M RT). Spread and slippage are accounted for separately per the execution model. The market range for ECN/PoP is $4–6 RT/lot; for all-in flows, the cost is embedded in the spread, yet a single explicit commission is used here for comparability.
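The stated equivalents follow from two conversions: 1 standard lot is worth $10 per pip, and 1M of notional corresponds to 10 standard lots. A small sketch with an illustrative function name, using `Decimal` to keep the figures exact:

```python
from decimal import Decimal

PIP_VALUE_PER_LOT = Decimal("10")  # USD per pip per standard lot (EUR/USD)
LOTS_PER_MILLION = Decimal("10")   # 1M notional = 10 standard lots of 100,000


def commission_equivalents(rt_per_lot: str) -> dict:
    """Express a round-turn commission per lot in pips, per side, and per 1M notional."""
    rt = Decimal(rt_per_lot)
    return {
        "pips_rt": rt / PIP_VALUE_PER_LOT,
        "usd_per_side": rt / 2,
        "usd_per_million_rt": rt * LOTS_PER_MILLION,
    }


print(commission_equivalents("5.5"))
```

For $5.5 RT/lot this gives 0.55 pips RT, $2.75 per side, and $55 per 1M RT, matching the figures quoted above.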


Retail Rebate

In the Retail Rebate profile, an average commission of $5.00 per round-turn per standard lot (100,000 EUR) is used on raw ECN/PoP accounts, after cashback. Equivalents: ≈ 0.50 pips RT, ≈ $2.50 per side, $50 per $1M RT. The commission is booked as a separate line item and is not included in spread/slippage.

This profile assumes retail ECN/PoP (raw-spread) with IB/cashback on the commission. The commission is charged and shown as a separate line item; standard all-in accounts are not used. In calculations, the commission is not included in spread/slippage. Instrument: EUR/USD; this value applies to all trades under the retail profile (subject to operational volume limits up to 50 lots).

Interpretation in the model.

The commission is expressed in $ and pips using standard EUR/USD conversions (1 lot = $10 per pip):

Market context (reference).

Without cashback, retail-ECN typical levels are $6.5–7.0 RT/lot (≈ 0.65–0.70 pips). With cashback, the range is often $4.5–5.5 RT/lot; the chosen $5.0 RT/lot aligns with the central tendency and serves as a neutral average.

Indicative sensitivity.

The Retail Rebate profile targets retail brokers on raw-spread pricing. For all-in tariffs the commission is nominally absent (cost embedded in the spread); within this profile, an explicit $5.00 RT/lot commission is used, while spread/slippage are modeled separately.


Retail Standard

In the Retail Standard profile, an average commission of $7.00 per round-turn per standard lot (100,000 EUR) is used on raw ECN/PoP accounts, without cashback/rebates. Equivalents: ≈ 0.70 pips RT, ≈ $3.50 per side, $70 per $1M RT. The commission is booked as a separate line item and is not included in spread/slippage.

This profile assumes retail ECN/PoP (raw-spread) without IB/cashback. The commission is charged and shown as a separate line item; standard all-in accounts are not used. In calculations, the commission is not included in spread/slippage. Instrument: EUR/USD; this value applies to all trades under the retail profile (subject to operational volume limits up to 50 lots).

Interpretation in the model.
The commission is expressed in $ and pips using standard EUR/USD conversions (1 lot = $10 per pip):

Market context (reference).
For retail raw-ECN without cashback, a typical corridor is $6.5–7.0 RT/lot (≈ 0.65–0.70 pips). The chosen $7.0 RT/lot reflects the upper bound and serves as a conservative base.

Indicative sensitivity.

The Retail Standard profile targets retail brokers on raw-spread pricing without cashback; spread/slippage are modeled separately and do not include the commission.


Dynamic Cost Model (Spread/Slippage) Based on Average Notional Trade Volumes (EUR) of the M5 EMM Strategy

Cost table of average values by Notional Volume (EUR/USD, top-tier ECN/PoP): eurusd_market_order_costs_ecn_round-turn_pips_v1.0.csv

The principles for constructing and using the table eurusd_market_order_costs_ecn_round-turn_pips_v1.0.csv — which reflects market-order execution costs for EUR/USD — are outlined below. Parameters are calibrated for top-tier ECN / Prime-of-Prime (raw-spread) with multi-LP aggregation. All figures are expressed in pips per round-turn (RT), i.e., entry + exit combined.

Temporal assumptions (aligned with the M5 EMM strategy logic).
Calculations are made for “quiet intervals”:

Realism and limitations.

Transparency and reproducibility.
The table reflects industry-calibrated averages for EUR/USD execution costs on top-tier ECN/PoP (raw-spread) in quiet hours; commission ($/lot) is accounted for separately. Any third party can independently benchmark these levels by executing market orders in quiet intervals and measuring RT cost = spread (RT) + slippage (RT). For smaller sizes, deviations are typically minimal; a tolerance of ±0.10 pips RT is operationally acceptable. For larger sizes, variability is higher but should remain within the indicated ranges. The table is a transparent reference point and does not constitute a promise of specific terms with any particular counterparty.


Table of cost values used in the M5 EMM backtest: m5_emm_cost_model_v1.0.csv

The spread/slippage costs are fixed for each annual run based on the strategy’s average trade sizes, which are derived from average stop distances, the fixed risk setting, and the balance path under the implemented M5 EMM engine logic. As the balance changes year to year, the cost model is adjusted according to the table of average values — m5_emm_cost_model_v1.0.csv.

The file m5_emm_cost_model_v1.0.csv lists the values used in backtesting for every profile / risk mode / capitalization mode and calendar year. The spread/slippage values are treated as averages across notional-volume (EUR) ranges, calibrated from eurusd_market_order_costs_ecn_round-turn_pips_v1.0.csv. (For the Institutional profile, the Notional 15M–30M band uses deliberately conservative—elevated—values as a stress yardstick.)

Example profile (Institutional)

Notional 1M–5M; initial balance — $1,000,000; reset to $1,000,000 when the year-end balance ≥ $1,500,000.
Range of spread + slippage (round-turn): 0.4–0.75 pips (source: eurusd_market_order_costs_ecn_round-turn_pips_v1.0.csv).
The backtest uses a conservative 0.65 pips from m5_emm_cost_model_v1.0.csv as the fixed average for the Notional 1M–5M band.
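For this example profile, the all-in round-turn cost of a trade is the spread/slippage figure plus the commission. A minimal sketch with an illustrative function name; the defaults are the 0.65 pips and $5.5 RT/lot quoted in the text, and $10 per pip per lot is the standard EUR/USD conversion:

```python
from decimal import Decimal


def round_turn_cost_usd(lots, spread_slippage_pips="0.65", commission_per_lot="5.5", pip_value="10"):
    """All-in round-turn cost in USD: (spread + slippage) in pips plus the line-item commission."""
    size = Decimal(str(lots))
    execution = Decimal(spread_slippage_pips) * Decimal(pip_value) * size
    commission = Decimal(commission_per_lot) * size
    return execution + commission


cost = round_turn_cost_usd(10)  # 10 lots = 1M EUR notional
print(cost)                     # 120.00 USD, i.e. 1.20 pips RT all-in
```

At 10 lots the split is $65 for spread/slippage and $55 for commission.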

A rough threshold: at the shortest stop distance and fixed risk, notional 5M corresponds to a balance of about $1,500,000. Therefore, when that indicative threshold is reached at year-end, the balance is reset to the approximate median so that average trade sizes remain within the 1M–5M band.
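The reset rule for this capitalization mode can be written as a one-line function. The sketch assumes that a year-end balance below the threshold simply carries forward into the next year; the text states the reset condition but not the other branch, so that part is an assumption, as is the function name.

```python
def next_initial_balance(year_end_balance: float,
                         initial: float = 1_000_000.0,
                         reset_threshold: float = 1_500_000.0) -> float:
    """Initial balance for the next annual run under the reset rule."""
    if year_end_balance >= reset_threshold:
        return initial           # reset keeps average trade sizes inside the 1M-5M band
    return year_end_balance      # assumed: otherwise the balance carries forward


print(next_initial_balance(1_620_000.0))  # 1000000.0
print(next_initial_balance(1_240_000.0))  # 1240000.0
```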

Other capitalization modes follow the same logic:

The Retail profiles use identical accounting. Per-year cost levels are set from the year’s initial balance. This applies both to annual resets and to compounding over the full period; the logic does not change. Under compounding, if the year-end balance reaches the point where, at the shortest stop distance, the trade notional sits at or near the boundary of the next cost band, the next year’s costs are adjusted upward to that band. The same applies when a subsequent year clearly crosses that boundary.

Both tables carry SHA-256 hashes, GPG signatures, and OTS anchors for verification and reproducibility. All cost parameters (commission, spread, slippage) used in backtesting are fixed in the SHA-256 manifest of the CSV table m5_emm_cost_model_v1.0.csv; the manifest is GPG-signed and time-anchored via OpenTimestamps. This guarantees that parameters correspond to the specific profiles/modes and calendar years of the backtest and confirms that published annual run results and metrics were computed under exactly those values.



Rationale for a Look-Ahead-Free Implementation of the Mathematical Computation of Objective (TP) and Guard (SL) Levels on Fully Formed 5-Minute Bars in the M5 EMM Engine

From the Order Execution Logic it follows that Objective (TP) and Guard (SL) levels are computed with respect to the current M5 bar in the simulation using look-ahead: the levels are derived from a fully formed M5 bar, rather than incrementally (as minutes accumulate). This design was implemented deliberately to produce the most objective results feasible. The logical justification is set out below.

Justification of minute-based backtest accuracy and tick-level approximation

Given minute data, the most objective approximation to tick-level reality is the current approach: levels are computed on an already formed M5 bar (look-ahead), while execution is checked on M1 without interpolation and without any “peek ahead.” The reason is straightforward: in live trading, auxiliary levels are re-evaluated on every tick in real time.

Indicative tick-rate ranges (top-tier retail ECN FX):

Allowing for variability, a conservative baseline is ≈ 5 ticks/s (≈ 300 ticks/min), i.e., roughly 300 level re-evaluations per minute in real time, bearing in mind the strategy’s session filters (London–New York) and the exclusion of major macro releases.

Attempting to backtest on minutes while re-computing levels within a 5-minute window yields at most 5 re-computations per bar (one per minute), versus ~1,500 steps at tick granularity (e.g., 5 min × ~300 ticks/min). Such a sparse update cadence can introduce material error unless a look-ahead policy is used. On a 4-hour timeframe with minute data, there are ~240 minutes inside the bar—~240 steps—which is closer to acceptable, yet still below the requirements of the M5 EMM logic (given here as intuitive contrast). For M5, reducing the update frequency to 5 steps makes level estimation on a “not-yet-completed” bar methodologically vulnerable to distortions, especially for intraday approaches with modest targets.
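The step counts quoted above follow from simple arithmetic. The snippet below only restates them, using the 5 ticks/s baseline from the text; `steps_per_bar` is a hypothetical helper name.

```python
TICKS_PER_SECOND = 5  # conservative baseline quoted in the text

def steps_per_bar(bar_minutes, ticks_per_second=None):
    """Level re-evaluations inside one bar: one per minute on M1 data,
    one per tick when a tick rate is supplied."""
    if ticks_per_second is None:
        return bar_minutes
    return bar_minutes * 60 * ticks_per_second

m5_on_minutes = steps_per_bar(5)                    # M5 bar, minute data
m5_on_ticks = steps_per_bar(5, TICKS_PER_SECOND)    # M5 bar, tick data
h4_on_minutes = steps_per_bar(240)                  # 4-hour bar, minute data
sparsity = m5_on_ticks / m5_on_minutes              # how much coarser M1 updates are
```

On these numbers, minute-level updates inside an M5 bar are about 300 times sparser than tick-level updates.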

Conclusion. Computing levels from fully formed M5 bars while checking execution on M1 provides a tick-proximate approximation when using minute data and does not produce any artificial improvement in execution quality.

Rationale for minimal expected divergence:

Practical accuracy. In the strategy’s typical regimes (outside news windows and within permitted session intervals), differences between M5 look-ahead and a tick-level reconstruction within the bar are usually small and do not create a persistent bias in results.

Accordingly, applying look-ahead only at the level-calculation stage on a fully formed M5 bar does not create a modeling advantage.

Scope of look-ahead.
Look-ahead is applied only when computing Objective (TP) and Guard (SL) from a fully formed M5 bar.
The signal Trigger and all execution (TP/SL touches) are handled without look-ahead on M1; all other quant logic in the M5 EMM engine also operates without look-ahead.

Summary: chronology and boundaries of look-ahead

  • Definition. Objective (TP) and Guard (SL) levels are computed from the fully formed M5 bar t (intentional look-ahead).
  • Order of application. Levels computed on t are immediately checked within the same M5 window t on minute data (M1) using the extremum of each minute (retrospective check within window t).
  • Look-ahead scope. Look-ahead is used exclusively at the level-computation stage. All touch checks and execution on M1 are performed without look-ahead: fixed comparison rules (≤ / ≥) with ε = 1 tick, SL priority on conflict with TP, and no interpolation; all other quant logic in the M5 EMM engine likewise runs without look-ahead.
  • Justification. This procedure approximates tick-by-tick re-evaluation of levels on minute series without synthetic interpolation and without peeking ahead at the M1 level.
  • No unwarranted advantage. Symmetric rules for long/short, a strict cost model (spread/slippage/commission), plus session/calendar constraints preclude systematic bias; any residual effect is materially smaller than transaction costs.
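The touch-check rules summarized above (M1 extrema, SL priority on conflict, no interpolation) can be sketched as follows. This is an illustration under stated assumptions, not the sealed engine code: `M1Bar`, `first_touch`, and the 0.00001 tick size are hypothetical, and the way ε enters the comparison is not specified in this section, so the sketch assumes a level counts as touched once the minute extremum reaches it to within ε.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class M1Bar:
    high: float
    low: float

TICK = 0.00001  # assumed tick size (5-digit EUR/USD quote)

def first_touch(bars, side, tp, sl, eps=TICK):
    """Scan M1 bars in chronological order and return (index, "SL" | "TP"),
    or None if neither level is touched. Only each minute's high/low is used."""
    for i, bar in enumerate(bars):
        if side == "long":
            hit_sl = bar.low <= sl + eps
            hit_tp = bar.high >= tp - eps
        else:  # short
            hit_sl = bar.high >= sl - eps
            hit_tp = bar.low <= tp + eps
        if hit_sl:       # SL wins when both levels fall inside the same minute
            return i, "SL"
        if hit_tp:
            return i, "TP"
    return None
```

Because each minute is evaluated only against its own extremum, the check never uses information from later minutes.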


Rationale for Choosing HistData Minute Data for the M5 EMM Backtest

The backtest was initially planned to use Dukascopy tick data to maximize accuracy and avoid relying on look-ahead logic for computing Objective (TP) and Guard (SL) levels from fully formed 5-minute bars. However, the decision was made to use HistData minute data for several reasons.

Backtest objectives:


Transparency and Reproducibility

HistData series can be downloaded directly from the official website, and the minute and tick series are accompanied by gap reports, which makes the inputs verifiable and repeatable for anyone auditing the pipeline.

Using Dukascopy to cover the full backtest horizon (decades) requires third-party scripts and libraries to fetch the data, because the source interface does not support bulk downloads over long periods; all data analysis must then be performed independently. This degrades the clarity of procedures and the replicability of the input data for the backtest.

For tick data from any source, full normalization, deduplication, etc. are also required, which reduces transparency and reproducibility and substantially complicates the pipeline without providing additional practical benefit for the aims of this backtest. The quality of Dukascopy ticks is objectively unknown without a complete audit over the entire study period. A fair assessment requires exporting all ticks over the entire backtest horizon (tens of gigabytes compressed and up to hundreds of gigabytes in CSV/Parquet), and an independent party is unlikely to verify normalization at that scale.


Publicly Available Tick Data

Rationale for using minute data rather than tick data

Under the implemented M5 EMM logic, using public-access tick data is very unlikely to produce a statistically significant change in backtest results versus minute data, while it would complicate the pipeline and reduce reproducibility.

Bottom line: for this backtest, it is more rational to rely on minute data + deterministic execution rules rather than public tick data.


OTC Specifics and M1 Data Discrepancies

The spot-FX market has OTC characteristics (there is no single consolidated quote feed). Consequently, one-minute bars (M1) from different providers are not required to match frame-by-frame. Sources of divergence include: LP-basket composition and routing, price type (Bid/Ask/Mid/Last), minute stamping (start vs end-of-minute), day boundaries and DST handling, tick-aggregation/clean-up rules, and treatment of shortened sessions.

Implication. For any providers (including HistData and Dukascopy), differences in OHLC for individual minutes will be observed. Once brought to a single specification, these differences are operationally immaterial for the implemented strategy.

Applicability boundary. Material sensitivity arises for configurations with very tight stops of 2–5 pips, triggers at minute extremes, and tick-accurate fill emulation without normalization.


Rationale for the Operational Immateriality of M1 Data-Source Differences for M5 EMM

Canonical source. The backtest uses HistData M1; the provider’s minute-bar specification is accepted as is and defines the working standard.

Normalization. The only operation is converting fixed EST to UTC (timestamp conversion).
Not done: shifting minute labels, rebuilding OHLC, interpolating gaps, adjusting prices/volumes, etc.
Methodologically: the M1→M5 step is aggregation for calculation, not normalization of the source series.
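The two steps described here, the fixed EST to UTC timestamp shift and the M1→M5 aggregation, can be sketched with the standard library alone. The helper names are hypothetical, and labeling each M5 bar by the start of its 5-minute window is an assumption. Missing minutes are simply absent from the aggregate, since nothing is interpolated.

```python
from datetime import datetime, timedelta, timezone

EST_OFFSET = timedelta(hours=5)  # fixed EST = UTC-5, no daylight saving

def est_to_utc(ts_est):
    """Convert a naive fixed-EST timestamp to a timezone-aware UTC timestamp."""
    return (ts_est + EST_OFFSET).replace(tzinfo=timezone.utc)

def aggregate_m5(m1_rows):
    """m1_rows: (utc_datetime, open, high, low, close) tuples sorted by time.
    Returns {window_start: [open, high, low, close]} with prices left untouched."""
    bars = {}
    for ts, o, hi, lo, c in m1_rows:
        start = ts.replace(minute=ts.minute - ts.minute % 5, second=0, microsecond=0)
        if start not in bars:
            bars[start] = [o, hi, lo, c]      # first minute sets the open
        else:
            bar = bars[start]
            bar[1] = max(bar[1], hi)          # running high
            bar[2] = min(bar[2], lo)          # running low
            bar[3] = c                        # last minute sets the close
    return bars
```

For example, `est_to_utc(datetime(2020, 1, 2, 19, 58))` falls on 3 January 2020, 00:58 UTC.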

Consequence. All computations rely on raw HistData M1 in UTC; this series serves as the canonical minute specification within the backtest. Alternative sources are considered comparable when aligned to the same aggregation rules as the HistData source.

Factors that reduce the impact of inter-provider deltas (assuming the canonical minute spec is followed):

Bottom line. Small M1 differences across providers are a normal feature of OTC FX and, once sources are harmonized to the canonical minute specification, do not affect the overall effectiveness picture of M5 EMM. Maintaining a “parallel track” on another provider is optional and would reduce transparency due to the heavy normalization required; a truly detailed micro-backtest is feasible only under institutional conditions.

Caveat. The conclusion on operational equivalence applies to this methodology and horizon and is not universal for all strategies and timeframes.


Conclusion on Data Choice

Complicating the pipeline with scripts and libraries for bulk downloading minute/tick series from Dukascopy, followed by detailed normalization and suitability checks, is not warranted for the stated objectives: it reduces the auditability and reproducibility of results. The rational choice is HistData minute data combined with deterministic intrabar execution rules and an explicitly specified cost model.

Replicating the backtest on an alternative minute-data source is unlikely to provide practical added value, while it would reduce transparency due to the need to analyze and conform the series to the HistData canonical specification (normalization, minute-mark alignment, gap control).

The aim of this backtest is not extreme precision of individual metrics, but a clear demonstration of overall dynamics and properties of the strategy over a long historical horizon under a conservative execution model.

Note. A truly detailed micro-backtest — with true queueing, partial fills, realistic liquidity and spread micro-dynamics, and exact event ordering — is feasible only on institutional-grade data and infrastructure (EBS / Cboe FX / Refinitiv, etc.), where L2/L3 data, order-book events (add/modify/cancel/trade), microsecond/millisecond timestamps, and lossless capture are available.



Results: Data Quality Policy v1.0 Periods, Profiles, Risk Modes, Capitalization Modes, and Determinism

Periods per Data Quality Policy v1.0

Based on the Data Quality Policy v1.0 (see Input Data for the M5 EMM Backtest), the backtest periods are classified as follows:


Profiles

Profiles are classified by commission level and structure:


Risk and Capitalization Modes

The Institutional profile uses capitalization modes aligned to notional-volume ranges at 1.0% risk:


Determinism of Results

The M5 EMM engine is configured deterministically (see Public Logic of the M5 EMM Engine).
Accordingly, results reproduce bit-for-bit given identical input data and fixed versions of Python, libraries, and dependencies used by the backtest engine.

Metrics are computed from non-public result files — detailed trade reports (with timestamps and prices) and time-aligned equity/drawdown series. Separate SHA/GPG/OTS artifacts are not provided for the metrics, because the metrics are reproducible from these underlying files; for verification, it is sufficient to validate the integrity of the result files themselves (via the yearly manifests generated by the M5 EMM engine) and recompute the metrics with the open script.



Evidence of Strategy Existence and Invariance

Cryptographic Bundle — Proof of Existence and Integrity of the Sealed Strategy Archive

The deterministic archive of the backtest strategy engine is protected with a combination of SHA-256, GPG signature, and OpenTimestamps (OTS) anchor to verify the existence of the underlying logic implemented in code. Whenever authenticity needs to be demonstrated, the original M5 EMM strategy engine used for the entire backtest can be easily verified through this cryptographic combination.

Environment snapshot

Runtime

Python packages (locked)

Note: packages are version-locked, which ensures reproducibility.


Live-Run Videos

Live runs were executed on randomly selected years across different tracks to confirm the invariance of the strategy logic and the reproducibility of results over the full backtest horizon when using the cost values from m5_emm_cost_model_v1.0.csv. The same sealed build of the strategy archive is used for all periods; only the cost parameters, initial balance, and per-trade risk vary per yearly run.

Each video shows that:

Live runs are recorded as continuous, unedited screen captures (single-take; no splices, cropping, speed-ups, or other post-processing). The original artifacts are published together with SHA-256, GPG signatures, and OpenTimestamps (OTS) anchors for independent integrity checks.

Watch links:

Download links (video + integrity files):



Metrics

Metric-Calculation Methodology (essentials)

Calmar — denominator guard. An ε-guard is applied to |MaxDD| to prevent division by a near-zero denominator.
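A denominator guard of this kind can be sketched in one line. The ε value below is a placeholder (the value actually used is defined in the metrics methodology files), and both inputs are assumed to be expressed as fractions.

```python
def calmar(annual_return, max_drawdown, eps=1e-9):
    """Calmar ratio with a floor on |MaxDD| so the division stays finite.
    eps is a placeholder, not the value used in the published metrics."""
    return annual_return / max(abs(max_drawdown), eps)
```

With a 12% annual return and a 6% maximum drawdown the guard is inactive and the ratio is 2; with a zero drawdown the result is large but finite instead of raising a division error.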

Intramonth vs EoM (brief):


Conditional Sortino Method (strict, loss-only)

Definition. The reports use a strict (conservative) Sortino variant tailored for precise risk assessment of trading strategies. The downside deviation is computed only over losing months (loss-only) relative to the monthly target \(T = 0\%\), using the sample standard deviation (\(\mathrm{ddof}=1\)). Annualization multiplies by \(\sqrt{12}\).

\[\mathrm{Sortino} = \frac{\bar r - T}{\sigma_{\text{down, loss-only}}}\cdot\thinspace\sqrt{12}\]

\[\sigma_{\text{down, loss-only}} = \sqrt{\frac{\sum\limits_{i\thinspace\colon\thinspace r_i < T} (r_i - T)^2}{M - 1}}\]

where \(M\) is the number of losing months in the window. If \(M<2\), the metric is not reported (insufficient negatives).
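The definition above translates directly into a few lines of Python. This is a sketch of the formula as written here, not the published calculator script; the function name is hypothetical and monthly returns are assumed to be given as fractions.

```python
import math

def sortino_loss_only(monthly_returns, target=0.0):
    """Strict loss-only Sortino, annualized by sqrt(12).
    Returns None when fewer than two losing months are available."""
    losses = [r - target for r in monthly_returns if r < target]
    m = len(losses)
    if m < 2:
        return None                                   # insufficient negatives
    downside = math.sqrt(sum(x * x for x in losses) / (m - 1))   # ddof = 1
    mean_excess = sum(monthly_returns) / len(monthly_returns) - target
    return mean_excess / downside * math.sqrt(12)
```

Only strictly negative excess returns enter the deviation, so flat months affect the mean but not the denominator.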

Differences from the “standard” Sortino. Industry practice often uses an LPM(2)-based variant with all observations \(N\) in the denominator (positive and zero months contribute zero), and commonly \(\mathrm{ddof}=0\):

\[\sigma_{\text{down, std}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\big(\min(r_i - T, 0)\big)^2}\]

That approach “dilutes” risk with non‑negative months and typically yields higher Sortino values versus the strict loss‑only variant.

Rationale. Loss-only with \(\mathrm{ddof}=1\) avoids diluting risk with flat/positive months and makes the metric more conservative and more sensitive to losing episodes, which is critical for intramonth/intraday profiles and risk management.

Comparability with industry. For a rough conversion to the “standard” LPM variant, the following approximation can be used:

\[\mathrm{Sortino}_{\text{std}} \approx \mathrm{Sortino}_{\text{loss-only}}\cdot\thinspace\sqrt{\frac{N}{M-1}}\]

where \(N\) is the window size (e.g., 12 or 36 months) and \(M\) the number of losing months. With small \(M\), the standard Sortino will be substantially higher.
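The conversion factor can be checked numerically. The sketch below compares the two downside deviations on an illustrative, made-up 12-month sample with three losing months; the function names are hypothetical. Since both Sortino variants share the same numerator, their ratio equals the inverse ratio of the deviations.

```python
import math

def downside_loss_only(returns, target=0.0):
    """Loss-only downside deviation with ddof = 1 (needs at least 2 losing months)."""
    losses = [r - target for r in returns if r < target]
    return math.sqrt(sum(x * x for x in losses) / (len(losses) - 1))

def downside_standard(returns, target=0.0):
    """LPM(2)-style downside deviation over all N observations (ddof = 0)."""
    return math.sqrt(sum(min(r - target, 0.0) ** 2 for r in returns) / len(returns))

def conversion_factor(n_months, n_losing):
    """sqrt(N / (M - 1)) from the formula above."""
    return math.sqrt(n_months / (n_losing - 1))

# Illustrative monthly returns (not backtest data): N = 12, M = 3
sample = [0.02, -0.01, 0.03, 0.01, -0.02, 0.00, 0.04, 0.01, -0.03, 0.02, 0.01, 0.02]
ratio = downside_loss_only(sample) / downside_standard(sample)
factor = conversion_factor(12, 3)
```

When both variants use the same target and the same window, the two deviations share the same sum of squared shortfalls, so `ratio` and `factor` coincide up to floating-point error (about 2.45 for this sample).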

Note on rolling windows. On short windows (12m), when the number of losing months (\(M\)) is small, the strict method can yield Sortino < Sharpe, which is an expected property of the formula, not a calculation error. On longer horizons the effect remains but is weaker.

Computation parameters.


OOS / Walk‑Forward / Multiple Testing & Selection Bias

Time-based OOS / Walk-Forward are not applied, as a single fixed M5 EMM logic is used without re-optimization or parameter tuning. The purpose of time-based OOS is to detect overfitting after parameter adjustments, which is not relevant here.

Multiple testing / selection bias are absent, since only one hypothesis / one parameter set was tested, with no configuration sweeps and no cherry-picking.

Cross-asset test on GBPUSD serves as an independent out-of-sample (OOS) by asset: parameters optimized on EURUSD were applied without any changes to GBPUSD. Post-Brexit, the EURUSD–GBPUSD correlation decreased significantly, so successful application of the model on GBPUSD confirms:

Strategy robustness is further supported by:

Audit. Upon request, a holdout period can be prepared without any retraining or parameter changes for independent verification.


Monte Carlo

Note: final Monte Carlo figures are aggregated by medians across block configurations.
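One way to read this note is as a block bootstrap run under several block lengths, with the per-configuration medians then combined by a further median. The sketch below illustrates that reading only: the block lengths, the number of paths, the circular resampling scheme, and the choice of maximum drawdown as the statistic are all assumptions, and the function names are hypothetical. The actual configurations are defined in the metrics methodology files.

```python
import random
import statistics

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity path (<= 0)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1.0)
    return worst

def block_bootstrap_path(returns, block_len, rng):
    """Resample a path of the same length from contiguous (circular) blocks."""
    n = len(returns)
    path = []
    while len(path) < n:
        start = rng.randrange(n)
        path.extend(returns[(start + k) % n] for k in range(block_len))
    return path[:n]

def mc_median_maxdd(returns, block_lens=(3, 6, 12), n_paths=500, seed=0):
    """Median max drawdown per block length, then the median across block lengths."""
    rng = random.Random(seed)
    per_config = []
    for block_len in block_lens:
        draws = [max_drawdown(block_bootstrap_path(returns, block_len, rng))
                 for _ in range(n_paths)]
        per_config.append(statistics.median(draws))
    return statistics.median(per_config)
```

Fixing the seed keeps the simulation repeatable, in line with the determinism requirement stated elsewhere in the document.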

Confidence Intervals (BCa)

Trade Bootstrap (EDR & Losing Streaks)


Benchmark

In the EUR/USD pair, there is no stable long-run directional premium; therefore comparisons with external benchmarks (equity indices) or directional strategies (buy-and-hold, long-only/short-only in EUR/USD) are methodologically inappropriate. This is evident from the behavior of the long-horizon price series over the entire backtest window. The project deliberately uses no external benchmark: quality is assessed via internal risk metrics (Sharpe, strict Sortino (loss-only), Calmar), the stability of 12/36-month rolling windows, BCa bootstrapped confidence intervals, and a stress cost model. For risk-metric calculations, the assumption R_f = 0 is used (USD-denominated, intraday M5). This approach ensures comparability of results within the stated risk profile.


Full Methodology and Metric Definitions

See docs/metrics_methodology/metrics_schema.json and docs/metrics_methodology/metrics_schema.md.

For transparency, the methodology files and the calculator script are also published in metrics-toolkit.


Metric Files

Metric CSVs are located under each capitalization‑mode track in the metrics folder.

Base metric set:

metrics/
  monthly_returns.csv
  full_period_summary.csv
  rebasing_applied.csv
  yearly_summary.csv
  trades_full_period_summary.csv

Extended metric set:

metrics/
  confidence_intervals.csv
  dd_quantiles_full_period.csv
  monthly_returns.csv
  monte_carlo_summary.csv
  full_period_summary.csv
  rebasing_applied.csv
  rolling_12m.csv
  rolling_36m.csv
  trades_full_period_summary.csv
  yearly_summary.csv

rebasing_applied.csv — present for modes with balance rebasing.



Brief Results Overview

Extended Baseline

Retail Standard

Balance Curve — Fixed Start 100k & Compounding EoY-SoY Base 100k Modes (Risk 1%, $7 round-turn per standard lot, M5 EMM cost model v1.0) 2003–2007

M5 EMM — Retail Standard — Extended Baseline 7 USD/lot — Balance Curve (Fixed Start) M5 EMM — Retail Standard — Extended Baseline 7 USD/lot — Balance Curve (Compounding)

Retail Rebate

Balance Curve — Fixed Start 100k & Compounding EoY-SoY Base 100k Modes (Risk 1%, $5 round-turn per standard lot, M5 EMM cost model v1.0) 2003–2007

M5 EMM — Retail Rebate — Extended Baseline 5 USD/lot — Balance Curve M5 EMM — Retail Rebate — Extended Baseline 5 USD/lot — Balance Curve

Core Baseline

Retail Standard

Balance Curve — Fixed Start 100k Mode (Risk 1%, $7 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Retail Standard — Core Baseline 7 USD/lot — Balance Curve — Fixed Start


Balance Curve — Compounding EoY-SoY Base 100k Mode (Risk 1%, $7 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Retail Standard — Core Baseline 7 USD/lot — Balance Curve


Retail Rebate

Balance Curve — Fixed Start 100k Mode (Risk 1%, $5 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Retail Rebate — Core Baseline 5 USD/lot — Balance Curve


Balance Curve — Compounding EoY-SoY Base 100k Mode (Risk 1%, $5 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Retail Rebate — Core Baseline 5 USD/lot — Balance Curve


Institutional

Balance Curve — Notional 1M-5M Mode (Risk 1%, $5.5 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Institutional — Core Baseline 5.5 USD/lot — Balance Curve


Balance Curve — Notional 5M-15M Mode (Risk 1%, $5.5 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Institutional — Core Baseline 5.5 USD/lot — Balance Curve


Balance Curve — Notional 15M-30M Mode (Risk 1%, $5.5 round-turn per standard lot, M5 EMM cost model v1.0) 2008–2025-08

M5 EMM — Institutional — Core Baseline 5.5 USD/lot — Balance Curve


Composite Baseline

Retail Rebate

Balance Curve — Compounding EoY-SoY Base 100k Mode (Risk 1.5%, $5 round-turn per standard lot, M5 EMM cost model v1.0) 2003–2025-08

M5 EMM — Retail Rebate — Composite Baseline 5 USD/lot — Balance Curve


Robustness of Results

Robustness is demonstrated through performance profiles and capitalization regimes based on a dynamic cost model over a long historical horizon, as well as through a GBPUSD cross-asset test conducted before and after Brexit (2016).

Extended Baseline & Core Baseline

Retail Standard:

Retail Rebate:

Composite Baseline

Retail Rebate:

As a detailed robustness example, Institutional profiles fix costs by notional volume bands; when the end-of-year threshold is reached, the base balance is reset:

Summary: Across all profiles, modes, and Baseline tracks, the equity path remains sustainably positive with no catastrophic drawdowns; the effect persists under all modeled cost levels, as confirmed by the closed-equity curves.

GBPUSD Cross Asset Test

GBPUSD Cross Asset Test – Balance Curve — Compounding EoY-SoY Base 100k Mode (Risk 1%, no cost model) 2008–2025-08

EMM M5 — GBPUSD Cross Asset Test — Core Baseline — Balance Curve

To demonstrate model robustness and the absence of overfitting, a cross-asset test was performed: parameters optimized on EURUSD were applied to GBPUSD without any adjustments for UK-specific economic data or volatility patterns.

Historically, EURUSD and GBPUSD were highly correlated, but after Brexit (2016) this correlation dropped significantly. This makes GBPUSD an independent out-of-sample test: the asset was not involved in optimization and has a materially different post-Brexit market structure.

Stable performance on GBPUSD confirms:

Even on degraded data (Stress period), the strategy did not exhibit catastrophic capital loss: the maximum balance decline in 2001 was ≈15%.

Full per-track reviews and source CSVs are available in the corresponding tracks.



Conclusions

Summarizing the material across all sections, the backtest results provide robust empirical evidence of invariant quantitative factors (signs of market inefficiency) in EUR/USD price dynamics, and they validate the correctness of the approach taken (transparency, reproducibility, absence of artificial “execution enhancements”) under the stated assumptions and a conservative execution model.

Additionally, the body of evidence presented in the methodology objectively supports a detailed and conservative approach to backtesting the core subset of the author’s model without attempts to overstate results:

The invariance of the logic, the absence of re-training/“cherry-picking”/multiple testing and selection bias, and the objectivity of the results are further supported by:



Limitations and scope of applicability




Contact

GitHub: rleydev (thelaziestcat) · Email: thelazyazzcat@gmail.com, thelazziestcat@proton.me


Licenses & Attributions


Disclaimer — Not Investment Advice

This material is provided for informational and research purposes only and does not constitute investment advice, an offer, or a solicitation. Backtested results are hypothetical and subject to limitations; past performance is not indicative of future results. Assumptions (cost models, risk, etc.) are illustrative and may not align with the reader’s objectives or constraints.