DataMine Machine Learning Service

The DataMine Machine Learning Service is an Automated Machine Learning platform that uses CME Group data and other public data to build machine learning models for financial instruments such as equities, indices, commodities, currencies, and cryptoassets. The goal of the DataMine Machine Learning Service is to enable anyone to derive meaningful data-driven insights for procurement, trading, and risk management. The platform comes pre-loaded with a set of example models, and users can also build their own models; no coding or data science expertise is required.

Automated Machine Learning

Automated Machine Learning is a technology that streamlines and accelerates the processes and workflow typically required to use machine learning. This involves automating data pre-processing, model creation, parameter adaptation, and model updates. Historically, this process has required a team of data scientists and taken weeks or even months to implement. With Machine Learning Service, non-technical users can build and leverage machine learning models.

DataMine Machine Learning

DataMine Machine Learning Service is a machine learning tool that allows users to build their own machine learning models. Once a model is built, data is ingested automatically every day to update the model and generate daily signals for the timeframes specified by the user. Model signals are converted into strategies, which are then backtested and evaluated across a variety of performance metrics, allowing users to evaluate and compare different data, models, and strategies.

DataMine Machine Learning Service can be used to build machine learning models for the following CME Group Products: 

1.1 Energy

Index | Ticker | Underlying Future (Globex Code) | Start Date | Contracts Used
WTI Rolling Futures Index | CRFCL | CL | 03.Jan.2011 | Monthly Contracts (12 months)
Natural Gas Rolling Futures Index | CRFNG | NG | 03.Jan.2011 | Monthly Contracts (12 months)
NY Harbor ULSD Rolling Future | CRFHO | HO | 03.Jan.2011 | Monthly Contracts (12 months)
RBOB Rolling Future | CRFRB | RB | 03.Jan.2011 | Monthly Contracts (12 months)

1.2 Metals

Index | Ticker | Underlying Future (Globex Code) | Start Date | Contracts Used
Gold Rolling Future | CRFGC | GC | 03.Jan.2011 | Monthly Contracts (12 months)
Silver Rolling Future | CRFSI | SI | 03.Jan.2011 | Monthly Contracts (12 months)
Copper Rolling Future | CRFHG | HG | 03.Jan.2011 | Monthly Contracts (12 months)
Platinum Rolling Future | CRFPL | PL | 03.Jan.2011 | Monthly Contracts (Jan, Apr, Jul, Oct)

1.3 Interest Rates & Treasury

Index | Ticker | Underlying Future (Globex Code) | Start Date | Contracts Used
2Y Treasury Note Rolling Future | CRFZT | ZT | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
5Y Treasury Note Rolling Future | CRFZF | ZF | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
10Y Treasury Note Rolling Future | CRFZN | ZN | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
30Y Treasury Bond Rolling Future | CRFZB | ZB | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
Ultra Treasury Bond Rolling Future | CRFUB | UB | 01.Mar.2010 | Quarterly Contracts (Mar, Jun, Sep, Dec)
Fed Funds Rolling Future | CRFZQ | ZQ | 02.Jan.2004 | Monthly Contracts (12 months)
1M SOFR Rolling Future | CRFSR1 | SR1 | 01.Jun.2018 | Monthly Contracts (12 months)
3M SOFR Rolling Future | CRFSR3 | SR3 | 01.Jun.2018 | Quarterly Contracts (Mar, Jun, Sep, Dec)

1.4 Agriculture

Index | Ticker | Underlying Future (Globex Code) | Start Date | Contracts Used
Chicago Wheat Rolling Future | CRFZW | ZW | 02.Jan.2004 | Monthly Contracts (Mar, May, Jul, Sep, Dec)
Corn Rolling Future | CRFZC | ZC | 02.Jan.2004 | Monthly Contracts (Mar, May, Jul, Sep, Dec)
Soybean Rolling Future | CRFZS | ZS | 02.Jan.2004 | Monthly Contracts (Jan, Mar, May, Jul, Aug, Sep, Nov)
Soybean Meal Rolling Future | CRFZM | ZM | 02.Jan.2004 | Monthly Contracts (Jan, Mar, May, Jul, Aug, Sep, Oct, Dec)
Soybean Oil Rolling Future | CRFZL | ZL | 02.Jan.2004 | Monthly Contracts (Jan, Mar, May, Jul, Aug, Sep, Oct, Dec)
Lean Hogs Rolling Future | CRFHE | HE | 02.Jan.2004 | Monthly Contracts (Feb, Apr, May, Jun, Jul, Aug, Oct, Dec)
Live Cattle Rolling Future | CRFLE | LE | 02.Jan.2004 | Monthly Contracts (Feb, Apr, Jun, Aug, Oct, Dec)
Feeder Cattle Rolling Future | CRFGF | GF | 02.Jan.2004 | Monthly Contracts (Jan, Mar, Apr, May, Aug, Sep, Oct, Nov)

1.5 Foreign Exchange

Index | Ticker | Underlying Future (Globex Code) | Start Date | Contracts Used
EUR/USD Rolling Future | CRF6E | 6E | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
GBP/USD Rolling Future | CRF6B | 6B | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
JPY/USD Rolling Future | CRF6J | 6J | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)
AUD/USD Rolling Future | CRF6A | 6A | 02.Jan.2004 | Quarterly Contracts (Mar, Jun, Sep, Dec)

FAQ

The following steps and items are part of the onboarding process:

  1. Create a CME Group ID following these instructions.

  2. Log in to DataMine.

  3. Navigate to the DataMine Machine Learning Service here.

    1. Here you will be able to subscribe.

  4. Go to your cart and check out.

  5. You will now be redirected to the licensing workflow. To proceed, you will need to provide the details below:

    1. Name of Individual or Company

    2. Individual/Company primary address

    3. Individual/Company billing address

    4. Contact information for at least two people

    5. Usage details: Internal (using the data inside your company only), Redistribution (sending the data outside your company), Derived (unique data created by making material calculations and/or changes to the licensed data, so that the licensed data cannot be substituted by, reverse engineered or otherwise identified from the unique data).

      1. If Redistribution or Derived usage is required, additional licensing steps will be necessary.

  6. Upon completion of step five, you will be required to sign the DataMine agreement via DocuSign or click-through.

    1. Please note, we do not allow this agreement to be modified.

  7. After signature, the order can take up to 48 hours to be approved.

  8. Once the order is approved, you can access the service here.

Who can use DataMine Machine Learning Service?

DataMine Machine Learning Service is designed to make machine learning accessible to non-technical users. With Machine Learning Service you don’t have to be a Data Scientist to utilize machine learning.

Is DataMine Machine Learning Service also helpful for data scientists and other more advanced users?

Yes. DataMine Machine Learning Service streamlines and automates many of the steps in machine learning workflows, freeing up more advanced users to work on higher-value tasks such as signal processing and strategy implementation.

How far into the future can DataMine Machine Learning Service predict?

Machine Learning Service can build models for any timeframe. Different timeframes generally benefit from different data inputs.

How often are the models updated?

Models are updated every day. New data is ingested overnight, and the models update first thing in the morning so that they are available at the US market open.

How many models can I build using the DataMine Machine Learning Service?

There is no limit to the number of models a user can build.

How is model performance measured?

The DataMine Machine Learning Service is a tool. All models are measured via performance metrics and backtesting in order for users to make their own evaluation of the model performance and establish confidence in the tool.

 What is the Markets Page? 

The Markets Page is a high-level view of all available forecasts across categories and sectors. The primary goal of the Markets Page is to give users a directional indication of different markets. Additionally, users can toggle on or off labels for Signals and Accuracy, to see the respective signal strength and accuracy of forecasts at different time intervals.

What is the Forecast Page? 

The Forecast Page is a medium-level view of forecasts for specific target variables. The purpose of this page is to provide users with a more detailed illustration of the forecast shape (Prediction Curve), behavior and performance at different time intervals, as well as a historical record of the forecast backtesting over time.

What is the Strategy Simulator Page?

The Strategy Simulator page is a tool that uses signal processing to turn model signals of various timeframes into actionable insights. It does this by combining signals from multiple models using different parameters and filters, so only the best model signals are used. Users are provided with default strategy parameters, but can further optimize strategies to their own individual goals, strategy preferences, and risk profile. Examples of strategy preferences:

  • Select which model signals are used in the strategy. 

  • Optimize for different performance metrics such as NetPosition (Equity Curve), ROI, Sharpe Ratio, Information Ratio, Max Drawdown, etc. 

  • Filter out model signals below certain performance thresholds. 

  • Trade Long/Short, Long-Cash, or Short-Cash.

  • Trade only signals above or below certain thresholds. 
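As a rough illustration of the preferences listed above, the sketch below maps model signals to positions using a threshold and a trading mode, then derives an equity curve and a maximum drawdown. It is not the platform's implementation: the function names, the symmetric threshold, the +1/0/-1 position sizing, and the assumption that each signal is paired with the return realised after it are all hypothetical choices made for this example.

```python
def signal_to_position(signal: float, threshold: float = 0.0, mode: str = "long_short") -> int:
    """Map a model signal to a position: +1 long, -1 short, 0 cash (hypothetical sizing)."""
    if mode not in {"long_short", "long_cash", "short_cash"}:
        raise ValueError(f"unknown mode: {mode}")
    # Only trade signals whose magnitude exceeds the threshold.
    if signal > threshold and mode in {"long_short", "long_cash"}:
        return 1
    if signal < -threshold and mode in {"long_short", "short_cash"}:
        return -1
    return 0


def equity_curve(signals, returns, threshold=0.0, mode="long_short"):
    """Growth of 1 unit of capital; each signal is paired with the return realised after it."""
    equity, curve = 1.0, []
    for signal, period_return in zip(signals, returns):
        equity *= 1 + signal_to_position(signal, threshold, mode) * period_return
        curve.append(equity)
    return curve


def max_drawdown(curve):
    """Largest peak-to-trough decline of the curve, as a fraction of the peak."""
    peak, worst = curve[0], 0.0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


# Made-up signals and returns, for illustration only.
signals = [0.6, -0.4, 0.1, -0.7]
returns = [0.02, -0.01, 0.03, -0.02]
for mode in ("long_short", "long_cash", "short_cash"):
    curve = equity_curve(signals, returns, threshold=0.3, mode=mode)
    print(mode, round(curve[-1], 4), round(max_drawdown(curve), 4))
```

Comparing the three modes on the same signals shows how a preference such as Long-Cash simply ignores the short signals rather than reversing them.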

What is the Data Explorer Page?

The Data Explorer is a tool designed to aid users in exploring and navigating CME Group data available for modeling. The dashboard provides a classification Search function as well as helpful visual elements to help the user find what they are looking for, and see the structure and categorization of the available data.

What is Autoseries?

Autoseries is a model building method that builds many machine learning models in a challenger-champion framework, in which different models compete for the top spot. This framework helps the tool adapt to changing market conditions. Every month the models are reevaluated, and a new champion (the better model) is selected. The outputs from the champion models are stitched together to give the impression of one continuous model. Building more models is beneficial because it allows the models to explore a larger algorithm parameter space and converge on a better output.
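The monthly reevaluation and stitching described above can be sketched as follows. This is a minimal illustration, not the Autoseries implementation: the selection metric (mean absolute error over the previous month), the data layout, and the function names are assumptions made for the example.

```python
from statistics import mean


def mean_abs_error(predictions, actuals):
    """Average absolute difference between predictions and realised values."""
    return mean(abs(p - a) for p, a in zip(predictions, actuals))


def stitch_champions(monthly_predictions, monthly_actuals):
    """Pick a champion each month and stitch the champions' outputs together.

    monthly_predictions: one dict per month, {model_name: [predictions]}
    monthly_actuals:     one list of realised values per month
    Returns (champions, stitched), both starting at the second month, because
    the first month is only used to choose the first champion.
    """
    champions, stitched = [], []
    for month in range(1, len(monthly_predictions)):
        previous = monthly_predictions[month - 1]
        previous_actuals = monthly_actuals[month - 1]
        # Champion = model with the lowest error over the previous month (assumed metric).
        champion = min(previous, key=lambda name: mean_abs_error(previous[name], previous_actuals))
        champions.append(champion)
        stitched.extend(monthly_predictions[month][champion])
    return champions, stitched


# Made-up predictions from two models over three months.
predictions = [
    {"GBR": [1.0, 2.0], "RID": [1.5, 2.5]},
    {"GBR": [3.0, 4.0], "RID": [3.4, 4.4]},
    {"GBR": [5.0, 6.0], "RID": [5.2, 6.2]},
]
actuals = [[1.1, 2.1], [3.5, 4.5], [5.1, 6.1]]

champions, stitched = stitch_champions(predictions, actuals)
print(champions)  # ['GBR', 'RID']
print(stitched)   # [3.0, 4.0, 5.2, 6.2]
```

Selecting the champion only from past performance keeps the stitched series free of look-ahead: each month's output comes from the model that was best before that month began.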

How is DataMine Machine Learning Service hosted?

The DataMine Machine Learning Service is hosted as a managed cloud solution.

How much does DataMine Machine Learning Service cost?

The DataMine Machine Learning Service is charged on a per-model basis. Each model costs $20 to create, plus a fee of $2 for every day it runs. For example, if you created two models and each ran for 30 days, you would pay: ($20/model * 2 models) + ($2/day * 30 days * 2 models) = $160.
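The pricing rule above can be expressed as a small calculation. The sketch below uses the two fees stated in this answer; the function and constant names are illustrative, and it assumes every model runs for the same number of days.

```python
CREATION_FEE_USD = 20   # one-time fee per model, as stated above
DAILY_RUN_FEE_USD = 2   # fee per model for each day it runs, as stated above


def estimate_cost(num_models: int, days_running: int) -> int:
    """Estimated charge in USD for num_models models that each run for days_running days."""
    if num_models < 0 or days_running < 0:
        raise ValueError("num_models and days_running must be non-negative")
    creation = CREATION_FEE_USD * num_models
    running = DAILY_RUN_FEE_USD * days_running * num_models
    return creation + running


print(estimate_cost(num_models=2, days_running=30))  # 160, matching the worked example
```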

Can I get trial access to the DataMine Machine Learning Service?

Yes, there is a one-week trial available for all new users. You must opt out of the trial to avoid being charged. Contact a sales agent at cmedatasales for more details.

How many users can DataMine Machine Learning Service support?

The DataMine Machine Learning Service can support any number of users. Licenses are sold on an individual basis.

How many machine learning algorithms does DataMine Machine Learning Service use?

DataMine Machine Learning Service currently offers a library of algorithms as options when building a machine learning model:

Algorithm Name | Code | Type
Gradient Boosting Regressor | GBR | Decision Tree
XGB Regressor | XGB | Decision Tree
LGBM Regressor | LGBM | Decision Tree
Cat Boost Regressor | CAT | Decision Tree
Extra Trees Regressor | XTR | Decision Tree
Random Forest Regressor | RFR | Decision Tree
AdaBoost Regressor | ADA | Decision Tree
K Neighbors Regressor | KNN | K nearest neighbor
SVR | SVM | Linear
Linear SVR | SVR | Linear
Linear Regression | LIN | Linear
Ridge | RID | Linear
Lasso | LAS | Linear
Huber Regressor | HUB | Linear
MLP Regressor | MLP | Neural Network



Is the input data included in the license cost?

Yes, input data is included in the license cost. Users are only charged for building models. 

What data is available through The Machine Learning Service?

Source | Description
CME Group Volatility Indexes (CVOL) | Derived from the world’s most actively traded options on futures benchmarks spanning six asset classes, the CME Group Volatility Index (CVOL) delivers the first-ever, cross-asset class family of implied volatility indexes.
CME Group Liquidity Tool Data | Daily reconstruction of the electronic limit order book by performing calculations on CME Group Globex trading engine messages.
CME Group Term SOFR | CME Group Term SOFR Reference Rates provide an indication of the forward-looking measurement of overnight SOFR, based on market expectations implied from leading derivatives markets.
CME Group Rolling Futures Indices | CME Group Rolling Futures Indices are designed to provide market participants with an indicator for investment performance in a single CME Group Market.
Public Energy Information | Energy information from public sources.
News Headlines | News headlines from a variety of different news sources.
Public Federal Reserve Data | Economic time series data provided by the Federal Reserve.
USDA Data | Commodity related data such as production, etc.
USDA Statistics | Commodity related statistical data such as production, etc.
Macroeconomic Public World Bank Information | Country macroeconomic information from the World Bank.
International Monetary Fund Information | Information from the International Monetary Fund.
Technical Analysis Indicators | Method of analyzing past market data based on price and volume.

How are the CME Group Rolling Futures Indices calculated?

The CME Group Rolling Futures Indices are designed to provide market participants with an indicator for investment performance in a single CME Group Market.

CME Group Rolling Futures Indices are calculated on 28 futures products across five different asset classes. The indices are calculated for each Business Day, in accordance with the CME Group Globex Trading Schedule, and follow a next-month rolling schedule.

To maintain continuity of pricing and account for contract expiry and movements in liquidity, a roll period is required. The roll period is designed to gradually transfer the weighting of the inputs from the Near Futures Contract to the Second Futures Contract prior to expiry. The weighting of the Near Futures Contract is 100% up until seven days to expiry. After this point, 20% of the weighting is transferred to the Second Futures Contract each day. When the Near Futures Contract is two days from expiry, it has a weight of 0% and the Second Futures Contract has a weight of 100%.
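The roll schedule described above can be written as a simple weight function. The sketch below is only an illustration of the stated percentages: the function name is hypothetical, and whether "days to expiry" counts business days or calendar days is not specified here, so the input is taken as given.

```python
def roll_weights(days_to_expiry: int) -> tuple[int, int]:
    """Return (near_weight_pct, second_weight_pct) for the Near Futures Contract's days to expiry."""
    if days_to_expiry < 0:
        raise ValueError("days_to_expiry must be non-negative")
    # 100% at seven or more days, 20 points less for each day closer, 0% at two days or fewer.
    near = min(100, max(0, (days_to_expiry - 2) * 20))
    return near, 100 - near


for days in range(8, 0, -1):
    near, second = roll_weights(days)
    print(f"{days} days to expiry: near {near}%, second {second}%")
```

Under these assumptions the near weight steps through 100, 80, 60, 40, 20, and 0 percent as the contract moves from seven to two days before expiry.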

CME Group Rolling Futures Indices are calculated for a list of CME Group products.

What frequency of data can/do the models use? 

Machine Learning Service is capable of using data of different frequencies. While the models themselves update daily, the input data frequency ranges from daily to weekly, monthly, quarterly, or annual resolution.

Can I use my own data?

Currently the solution does not support users ingesting their own data.



Can I build my own models?

Yes, users can build and maintain their own machine learning models. 

How are models trained?

Models are trained on 5 years of historical data preceding the model Start Date (assuming the data is available). For example, if a model is built from January 1, 2022, the model will be trained on historical data from January 1, 2017. If historical data is not available going back that far, the model will train on what it is provided. 
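The training window rule above can be sketched with the standard library. The function and parameter names are hypothetical, and the handling of a 29 February start date is an assumption, since this answer does not cover that case.

```python
from datetime import date
from typing import Optional

TRAINING_YEARS = 5  # length of the training history, as stated above


def training_window_start(model_start: date, earliest_available: Optional[date] = None) -> date:
    """First date of training data for a model with the given Start Date."""
    try:
        start = model_start.replace(year=model_start.year - TRAINING_YEARS)
    except ValueError:
        # model_start is 29 February and the target year is not a leap year (assumed fallback).
        start = model_start.replace(year=model_start.year - TRAINING_YEARS, day=28)
    # If the history does not go back far enough, train on what is available.
    if earliest_available is not None and earliest_available > start:
        return earliest_available
    return start


print(training_window_start(date(2022, 1, 1)))                    # 2017-01-01
print(training_window_start(date(2022, 1, 1), date(2018, 6, 1)))  # 2018-06-01
```

The second call shows the fallback case: a series that only begins on 1 June 2018 cannot supply a full five years of history for a model starting on 1 January 2022.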

How are the models back-tested?

Model metrics are displayed in the dashboard to facilitate straightforward analysis and evaluation of model performance. 

How do I assess model performance?

The Machine Learning Service provides a list of performance metrics that are updated daily as a part of the model building/update process.

Are my models saved?

With Machine Learning Service, all user-created models are stored in the cloud. Models are backed up to hard storage periodically in case of system failure.







Copyright © 2024 CME Group Inc. All rights reserved.