DeepSig has created a small corpus of standard datasets which can be used for original and reproducible research, experimentation, measurement and comparison by fellow scientists and engineers.

These datasets let machine learning researchers with new ideas dive directly into an important technical area without collecting or generating new datasets, and allow direct comparison against the efficacy of prior work.

License Notice

All datasets provided by DeepSig Inc. are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License (CC BY-NC-SA 4.0). If an alternative license is needed, please contact us at

Please reference this page or our relevant academic papers when using these datasets.

DeepSig Dataset: RadioML 2016.10A

A synthetic dataset, generated with GNU Radio, consisting of 11 modulations (8 digital and 3 analog) at varying signal-to-noise ratios. This dataset was first released at the 6th Annual GNU Radio Conference.

This is a cleaner and more normalized version of the 2016.04C dataset, which it supersedes. The file is formatted as a "pickle" file, which can be opened in Python using, for example, cPickle.load(...) (pickle.load(...) in Python 3).
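As a minimal sketch of loading the file in Python 3, the snippet below uses the standard pickle module. The helper names load_radioml_pickle and summarize are hypothetical, and the assumption that the unpickled object is a dictionary keyed by (modulation, SNR) tuples is not stated on this page, so verify it against the file you download.

```python
import pickle


def load_radioml_pickle(path):
    """Load a pickle file and return the unpickled object.

    encoding="latin1" lets Python 3 read pickles that were written
    by Python 2, which is commonly needed for older dataset files.
    """
    with open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")


def summarize(data):
    """Return (modulations, snrs) as sorted lists.

    ASSUMPTION: `data` is a dict whose keys are (modulation, snr) tuples.
    """
    mods = sorted({key[0] for key in data})
    snrs = sorted({key[1] for key in data})
    return mods, snrs
```

A call such as data = load_radioml_pickle("RML2016.10a_dict.pkl") followed by summarize(data) would then list the modulations and SNR values present; the filename here is an assumption and should be replaced with the path of the downloaded file. Only unpickle files from a source you trust, since unpickling can execute arbitrary code.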

DeepSig Dataset: RadioML 2016.04C

A synthetic dataset, generated with GNU Radio, consisting of 11 modulations. This is a variable-SNR dataset with moderate LO drift, light fading, and numerous different labeled SNR increments for use in measuring performance across different signal and noise power scenarios.

This dataset was used for the "Convolutional Radio Modulation Recognition Networks" and "Unsupervised Representation Learning of Structured Radio Communications Signals" papers, found on our Publications Page.

There are three variations within this dataset with the following characteristics and labeling: