The previous section examined financial time series models for the South Korean won spot exchange rate against the USD. Building on that, this section shifts the focus to deep learning methods and analyzes two key macro-financial channels: the government bond yield spread between South Korea and the United States, and South Korea’s exports to the United States. Together, these variables capture how Korea’s economy co-moves with U.S. financial conditions and external demand, providing a richer perspective than exchange rates alone.
The first part of the analysis applies recurrent neural networks to the 3-year and 10-year U.S.–Korea government bond yield spreads. These spreads summarize differences in monetary policy stance, growth expectations, and risk sentiment across the two countries, and thus serve as a compact indicator of Korea’s financial conditions relative to the U.S.
The second part extends the framework to a multivariate setting, modeling South Korea’s exports to the U.S. (in thousand dollars) jointly with several key drivers: U.S. industrial production, U.S. equity market risk (VIX), and the KRW/USD exchange rate. This multivariate specification treats exports and their main external determinants as an interconnected system rather than as separate series, allowing the model to reflect how shifts in U.S. demand, global risk sentiment, and exchange rates translate into movements in South Korean exports to the U.S.
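As a concrete illustration of how such a system can be fed to a recurrent network, the sketch below slices a multivariate array into overlapping lookback windows. `make_windows` is a hypothetical helper, not the report's actual preprocessing code; the shapes are chosen to match the `X shape: (361, 31, 4)` printed in the training logs below (a lookback of 30 plus the current row, across four input series), and the target is assumed to sit in column 0.

```python
import numpy as np

def make_windows(values, lookback):
    """Slice a (T, n_features) array into overlapping training samples.

    Each sample stacks lookback+1 consecutive rows of all features; the
    label is the next value of the target series (assumed here to be
    column 0). Mirrors the shapes printed in the logs, e.g. for
    lookback=30 and 4 features: X (n, 31, 4), y (n, 1).
    """
    X, y = [], []
    for t in range(lookback + 1, len(values)):
        X.append(values[t - lookback - 1:t])   # window of lookback+1 rows
        y.append(values[t, 0])                 # next target value
    return np.array(X), np.array(y).reshape(-1, 1)
```

With 392 rows of 4 series and a lookback of 30, this yields exactly the 361 samples reported in the logs.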
==================================================
Training RNN...
Parameters: 32,129
==================================================
✓ Training completed in 346.32s
Epochs trained: 111
Best epoch: 86
Final train loss: 0.000872
Final val loss: 0.000819
==================================================
Training LSTM...
Parameters: 128,417
==================================================
✓ Training completed in 369.71s
Epochs trained: 41
Best epoch: 16
Final train loss: 0.004304
Final val loss: 0.006624
==================================================
Training GRU...
Parameters: 96,993
==================================================
✓ Training completed in 501.19s
Epochs trained: 60
Best epoch: 35
Final train loss: 0.006739
Final val loss: 0.002946
==================================================
Training RNN...
Parameters: 32,129
==================================================
✓ Training completed in 315.03s
Epochs trained: 104
Best epoch: 79
Final train loss: 0.000791
Final val loss: 0.000528
==================================================
Training LSTM...
Parameters: 128,417
==================================================
✓ Training completed in 1222.75s
Epochs trained: 134
Best epoch: 109
Final train loss: 0.002378
Final val loss: 0.001982
==================================================
Training GRU...
Parameters: 96,993
==================================================
✓ Training completed in 1028.63s
Epochs trained: 130
Best epoch: 105
Final train loss: 0.001668
Final val loss: 0.001350
Across both the 3-year and 10-year yield spreads, the simple RNN achieves the lowest RMSE among the three deep learning models, with the GRU second and the LSTM worst. However, directional accuracy is only around 40–50% for all models, essentially no better than a coin flip, so none of them qualifies as a strong short-term forecasting tool. This suggests that while the RNN and GRU track the overall level of the spread reasonably well, they struggle to capture short-horizon fluctuations. The LSTM, despite being the most complex architecture, delivers the highest RMSE and only marginally better directional accuracy for the 3-year spread.
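The directional-accuracy figures quoted above correspond to the standard sign-agreement metric, which can be computed with a small helper. `directional_accuracy` is a hypothetical name for illustration, not a function from the report's codebase:

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Share of periods where the predicted change has the same sign
    as the realized change; a value near 0.5 is no better than chance."""
    true_dir = np.sign(np.diff(np.asarray(y_true, dtype=float)))
    pred_dir = np.sign(np.diff(np.asarray(y_pred, dtype=float)))
    return float(np.mean(true_dir == pred_dir))
```

A model can track the level of a series closely (low RMSE) while scoring poorly on this metric, which is exactly the pattern observed for the spread models.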
Effectiveness of Regularization
All three architectures are trained under the same regularization scheme: dropout of 0.3 after each layer, early stopping on validation loss with a patience of 25 epochs, and a ReduceLROnPlateau scheduler that lowers the learning rate when validation loss stops improving. The training logs indicate that this setup keeps optimization stable and mitigates overfitting. For the RNNs, final training losses of roughly 0.0008–0.0009 sit close to final validation losses of 0.0005–0.0008, with early stopping triggered at epoch 86 of 111 in one run and epoch 79 of 104 in the other. The GRUs behave similarly (e.g., train 0.0017 vs. val 0.0014 in the better run), and in several cases the validation loss is actually lower than the training loss, which is consistent with dropout noise and early stopping rather than severe overfitting. The training and validation loss plots support this interpretation. Thus, regularization is effective in maintaining training stability and preventing runaway overfitting, and the gap between training and validation losses is modest for all models. Nevertheless, a clear performance gap remains between architectures: the LSTM still generalizes worse than the RNN and GRU, suggesting that its complexity is excessive relative to the information contained in a single univariate spread.
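A pattern worth noting in the logs is that "Epochs trained" always equals "Best epoch" plus 25, which is exactly what a patience-25 early-stopping rule produces. The interaction of early stopping with plateau-based learning-rate reduction can be replayed on a validation-loss curve with a small sketch; `simulate_early_stopping` is illustrative only, and the `lr_patience=10` and `factor=0.5` values are assumptions, since the scheduler's actual settings are not shown in the report:

```python
def simulate_early_stopping(val_losses, patience=25, lr=1e-3,
                            lr_patience=10, factor=0.5):
    """Replay a validation-loss curve through early stopping plus a
    ReduceLROnPlateau-style rule. Returns (best_epoch, last_epoch, lr)."""
    best, best_epoch = float("inf"), -1
    wait = lr_wait = 0
    last_epoch = -1
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        if loss < best:                  # improvement: reset both counters
            best, best_epoch = loss, epoch
            wait = lr_wait = 0
        else:
            wait += 1
            lr_wait += 1
            if lr_wait >= lr_patience:   # plateau: cut the learning rate
                lr *= factor
                lr_wait = 0
            if wait >= patience:         # no improvement for `patience` epochs
                break
    return best_epoch, last_epoch, lr
```

Feeding in a curve that improves for ten epochs and then flattens stops training exactly 25 epochs after the best epoch, matching the offsets seen in every training log above.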
Comparison with Traditional Models
The summary table that compares ARIMA to the deep models makes this even clearer. For both maturities, ARIMA dramatically outperforms all three deep-learning architectures in terms of RMSE. This indicates that the dynamics of the yield spread are still well captured by relatively low-order linear structure. The additional flexibility of RNNs, LSTMs, and GRUs does not translate into better predictions; instead, it increases variance and complicates model tuning. ARIMA also has advantages in interpretability and stability: its AR and MA coefficients can be directly inspected, the propagation of shocks is transparent, and mean-reversion behavior is easy to characterize. Overall, this comparison shows that deep learning models are not always superior to traditional time series models. However, there may still be room to improve deep learning performance through more careful hyperparameter tuning, richer input features, or alternative architectures. Ultimately, selecting an appropriate model is crucial: more complex methods do not necessarily yield better forecasts when the underlying data-generating process and modeling conditions do not justify that complexity.
Best model (10Y): RNN
==================================================
Forecasting with RNN
==================================================
Forecast horizon and behavior
For both the 3Y and 10Y spreads, the RNN is used to generate 30-step-ahead forecasts with a 95% confidence band built from test residuals. In the first several steps after the forecast origin, the model essentially continues the recent pattern of the test data: the predicted spread stays close to the last observed values and lies comfortably within the confidence interval, suggesting that the one-step dynamics are captured reasonably well over a short horizon. As we move further out, however, the behavior changes: the forecast path becomes noticeably smoother than the historical series, sharp day-to-day fluctuations are damped, and the uncertainty band fans out. This is typical of a one-step-ahead model applied recursively—the forecasts are most reliable over roughly the first 10–15 days, after which accumulated errors and the model’s implicit tendency toward a smoothed mean-reverting trajectory dominate. In practice, this means the univariate RNN forecasts are informative for very short-term spread scenarios, while longer-horizon projections should be treated more as indicative trend scenarios with wide uncertainty rather than precise point predictions.
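The recursive scheme described above can be sketched as follows. `recursive_forecast` is an illustrative stand-in, not the report's `generate_forecast`; in particular, the sqrt-horizon widening of the 95% band from test residuals is an assumption about how the empirical band is built:

```python
import numpy as np

def recursive_forecast(step_fn, window, n_steps, residuals, z=1.96):
    """Apply a one-step-ahead predictor recursively and wrap the path
    in an empirical 95% band based on held-out one-step residuals."""
    history = list(window)
    preds = []
    for _ in range(n_steps):
        nxt = float(step_fn(np.array(history)))
        preds.append(nxt)
        history = history[1:] + [nxt]      # feed the prediction back in
    preds = np.array(preds)
    s = np.std(residuals, ddof=1)
    # band widens with sqrt(horizon) as recursive errors accumulate
    width = z * s * np.sqrt(np.arange(1, n_steps + 1))
    return preds, preds - width, preds + width
```

With a mean-reverting toy rule such as `step_fn = lambda h: 0.9 * h[-1]`, the path decays smoothly toward zero while the band fans out, mirroring the damped, increasingly uncertain look of the RNN forecasts at longer horizons.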
LOOKBACK = 30
results_exports_rnn = train_single_model(df4, 'RNN', 'South korea Exports to USA(thousand dollars)', LOOKBACK)
############################################################
# South korea Exports to USA(thousand dollars) - RNN
############################################################
Total samples: 361
X shape: (361, 31, 4), y shape: (361, 1)
Train: 252, Val: 72, Test: 37
==================================================
Training RNN...
Parameters: 32,513
==================================================
Training completed in 4.85s
Epochs trained: 29
Best epoch: 4
Final train loss: 0.210741
Final val loss: 0.032729
results_exports_lstm = train_single_model(df4, 'LSTM', 'South Korea to USA(thousand dollars)', LOOKBACK)
############################################################
# South Korea to USA(thousand dollars) - LSTM
############################################################
Total samples: 361
X shape: (361, 31, 4), y shape: (361, 1)
Train: 252, Val: 72, Test: 37
==================================================
Training LSTM...
Parameters: 129,953
==================================================
Training completed in 15.46s
Epochs trained: 47
Best epoch: 22
Final train loss: 0.003109
Final val loss: 0.006217
results_exports_gru = train_single_model(df4, 'GRU', 'South Korea export to USA(thousand dollars)', LOOKBACK)
############################################################
# South Korea export to USA(thousand dollars) - GRU
############################################################
Total samples: 361
X shape: (361, 31, 4), y shape: (361, 1)
Train: 252, Val: 72, Test: 37
==================================================
Training GRU...
Parameters: 98,145
==================================================
Training completed in 9.02s
Epochs trained: 26
Best epoch: 1
Final train loss: 0.004761
Final val loss: 0.017573
In the multivariate setting with South Korea's exports to the U.S., the final comparison table tells a different story from the univariate case: the LSTM is clearly the strongest of the three deep models (test RMSE of about 1.49 million), while the RNN (about 2.75 million) and the GRU (about 2.98 million) trail far behind. The LSTM's forecast line tracks the held-out test data more closely than the others, with relatively tight residuals and no obvious drift, capturing both the upward trend and much of the variation around turning points. The RNN and GRU, by contrast, barely train at all: the RNN's best epoch arrives at epoch 4 with a final training loss of 0.21, and the GRU stops improving after its first epoch (validation loss 0.0176), suggesting that neither manages to summarize the information in the multivariate input window. Overall, for this dataset the LSTM's gating structure seems genuinely useful, allowing it to exploit cross-series information and longer-range dependencies that the plain RNN cannot handle, while the GRU's simpler gating does not compensate for its early stall.
Effectiveness of Regularization
All three multivariate models share the same regularization scheme: dropout on the recurrent stacks, early stopping on validation loss, and adaptive learning-rate reduction. This setup prevents catastrophic overfitting even for the relatively large LSTM and GRU: training and validation losses stay reasonably close to each other, and early stopping consistently triggers well before the maximum epoch budget is used, which suggests that the models are not simply memorizing the training portion of the export series. Still, the performance gap between the LSTM and the other two shows that regularization alone cannot rescue a poorly matched fit: the RNN and GRU stop improving within their first few epochs and remain underpowered for the multivariate dynamics. In short, regularization stabilizes learning and improves generalization somewhat, but choosing an appropriate architecture (here, the LSTM) matters more than further tuning dropout or patience settings.
Comparison with SARIMAX (Traditional Multivariate Model)
Here, too, the traditional model has the edge: SARIMAX achieves a substantially lower test RMSE than any of the recurrent networks, while also producing stable, interpretable dynamics. This indicates that much of the structure in the export series (trend, seasonality, and relatively simple lag relationships) is well captured by linear time-series methods once appropriate differencing and seasonal terms are allowed for. The LSTM comes closest but still cannot match SARIMAX's accuracy, and it requires more tuning effort while offering far less transparency about how shocks propagate through time. Taken together, the evidence suggests that for this particular multivariate macro dataset, SARIMAX is the most effective workhorse model, with the LSTM as a reasonable nonlinear alternative; the RNN and GRU appear dominated by these two in both performance and practicality.
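The SARIMAX estimates themselves are not reproduced here, but the kind of low-order linear structure the comparison points to can be illustrated with a toy least-squares fit on lagged and seasonally lagged values. `fit_seasonal_ar` is a stand-in sketch of that linear backbone, not the report's SARIMAX specification:

```python
import numpy as np

def fit_seasonal_ar(y, lag=1, season=12):
    """OLS fit of y_t on y_{t-lag}, y_{t-season}, and a constant --
    a toy stand-in for the linear core of a SARIMAX model."""
    t0 = max(lag, season)
    Y = y[t0:]
    X = np.column_stack([
        np.ones(len(Y)),
        y[t0 - lag:len(y) - lag],          # short-lag autoregressive term
        y[t0 - season:len(y) - season],    # seasonal autoregressive term
    ])
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ coef
    rmse = float(np.sqrt(np.mean((Y - fitted) ** 2)))
    return coef, rmse
```

On a series that truly follows such a linear seasonal recursion, this three-parameter regression fits essentially perfectly, which is the sense in which "relatively simple lag relationships" leave little for a flexible nonlinear model to add.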
Forecasting & Final model summary
Code
comparison_exports = pd.DataFrame({
    'Model': ['RNN', 'LSTM', 'GRU'],
    'Test RMSE': [results_exports_rnn['test_rmse'], results_exports_lstm['test_rmse'], results_exports_gru['test_rmse']],
    'Train Time (s)': [results_exports_rnn['train_time'], results_exports_lstm['train_time'], results_exports_gru['train_time']]
})
best_exports = comparison_exports.loc[comparison_exports['Test RMSE'].idxmin(), 'Model']
print(f"\nBest model (South Korea Exports): {best_exports}")

if best_exports == 'RNN':
    forecast_exports = generate_forecast(
        results_exports_rnn['model'], results_exports_rnn['X_test'],
        results_exports_rnn['scalers'], n_steps=30
    )
    best_results = results_exports_rnn
elif best_exports == 'LSTM':
    forecast_exports = generate_forecast(
        results_exports_lstm['model'], results_exports_lstm['X_test'],
        results_exports_lstm['scalers'], n_steps=30
    )
    best_results = results_exports_lstm
else:
    forecast_exports = generate_forecast(
        results_exports_gru['model'], results_exports_gru['X_test'],
        results_exports_gru['scalers'], n_steps=30
    )
    best_results = results_exports_gru

print("\n" + "=" * 60)
print("South Korea EXPORTS TO USA - FINAL COMPARISON")
print("=" * 60)
print(comparison_exports.to_string(index=False))

fig = plot_forecast_exports_plotly(
    results_dict=best_results,
    best_model_name=best_exports,
    forecast_orig=forecast_exports,
    df4=df4,
    lookback=LOOKBACK,
    confidence=0.95
)
Best model (South Korea Exports): LSTM
Generating 30-step forecast...
============================================================
South Korea EXPORTS TO USA - FINAL COMPARISON
============================================================
Model Test RMSE Train Time (s)
RNN 2.745355e+06 4.853161
LSTM 1.493633e+06 15.459108
GRU 2.976324e+06 9.018401
Forecast horizon and behavior
For the exports example, the best model (the LSTM) is rolled forward to produce a 30-step-ahead forecast, which reveals how these networks behave as the horizon extends. The LSTM's forecast starts near the last observed value and then follows a smooth, almost monotonic upward path, with an empirically based 95% confidence band that gradually widens the further you move from the forecast origin. Over the first few steps, the predictions look plausible as short-term continuations of the recent trend, but toward the end of the forecast window the path becomes increasingly dominated by an extrapolated trend rather than detailed seasonal or cyclical structure; local fluctuations that appear in the historical data are largely smoothed away. This pattern suggests that the multivariate model is most reliable over short-to-medium horizons, where it is still anchored by recent observations; deeper into the horizon, its output is better interpreted as a trend scenario with quantified uncertainty than as a precise point prediction.