Models

ChordSeqAI uses machine learning models to generate chord progressions. This page provides an overview of the models, their architecture, performance, and how they are used in the app.

Overview

The models in ChordSeqAI generate chord progressions from a given input: a single chord or a sequence of chords, optionally conditioned on a genre and decade. They learn the patterns and relationships between chords from a dataset of chord progressions and use this knowledge to predict the next chord in a progression.

There are several types of models available in ChordSeqAI:

  • Recurrent Network (RNN) - a simple recurrent neural network, fast and lightweight but less accurate.
  • Transformer S, M, and L - a more capable architecture built on self-attention, available in small, medium, and large sizes that trade speed for accuracy.
  • Conditional Transformer S, M, and L - a variant of the transformer architecture that takes a genre and decade as additional input to generate genre-specific chord progressions.

Architecture

The models used in ChordSeqAI are based on deep learning architectures, specifically recurrent neural networks (RNNs) and transformers. These models are trained on a large dataset of chord progressions to learn the patterns and relationships between chords.

The RNN model is a simple GRU-based architecture that takes a sequence of chords as input and predicts the next chord in the progression. The transformer is a more complex architecture that uses self-attention mechanisms to capture long-range dependencies in the chord sequences.
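To make this concrete, a minimal GRU-based next-chord model could look like the sketch below. The layer sizes and names are illustrative assumptions, not the exact ChordSeqAI configuration.

```python
import torch
import torch.nn as nn

class ChordRNN(nn.Module):
    """Minimal GRU next-chord model (illustrative sizes, not ChordSeqAI's exact config)."""

    def __init__(self, vocab_size: int, d_embed: int = 64, d_hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_embed)
        self.gru = nn.GRU(d_embed, d_hidden, batch_first=True)
        self.head = nn.Linear(d_hidden, vocab_size)

    def forward(self, chord_ids: torch.Tensor) -> torch.Tensor:
        # chord_ids: (batch, seq_len) integer chord tokens
        hidden, _ = self.gru(self.embed(chord_ids))
        # Logits over the chord vocabulary at every position;
        # the last position scores candidates for the next chord.
        return self.head(hidden)
```

At generation time, a softmax over the logits at the final position yields a probability for each candidate next chord.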

The Conditional Transformer models are a variant of the transformer architecture that takes the genre and decade of the chord progression as additional input. This allows the model to generate chord progressions specific to a particular genre or time period. In particular, the current architecture uses head-wise gain adaptive layer normalization to condition the model on the genre and decade.

The details of the current architectures are explained in depth on Patreon.
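As a loose illustration only, head-wise gain adaptive layer normalization can be read as a LayerNorm whose gain is predicted separately for each attention head from an embedding of the genre and decade. Everything in this sketch (names, shapes, and placement within the network) is an assumption, not the actual formulation.

```python
import torch
import torch.nn as nn

class HeadwiseGainAdaLN(nn.Module):
    """One plausible reading of head-wise gain adaptive layer normalization:
    a LayerNorm with a per-attention-head gain derived from a conditioning
    vector (e.g., an embedding of the genre and decade). Illustrative only."""

    def __init__(self, d_model: int, n_heads: int, d_cond: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.head_dim = d_model // n_heads
        # Plain LayerNorm without its own learnable affine parameters.
        self.norm = nn.LayerNorm(d_model, elementwise_affine=False)
        # One scalar gain per attention head, predicted from the condition.
        self.to_gain = nn.Linear(d_cond, n_heads)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); cond: (batch, d_cond)
        gain = self.to_gain(cond)                          # (batch, n_heads)
        gain = gain.repeat_interleave(self.head_dim, -1)   # (batch, d_model)
        return self.norm(x) * (1.0 + gain.unsqueeze(1))
```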

Performance

The models have been evaluated on a test set of chord progressions. Accuracy measures how often the model's top prediction matches the true next chord in a progression (as a percentage), while perplexity measures how well the model captures the full distribution of possible next chords (lower is better). The parameter count indicates the size and complexity of each model.

Name                        Accuracy   Perplexity   Parameters
Recurrent Network            60.21%      4.159         377,260
Transformer S                72.80%      2.496         529,803
Transformer M                75.87%      2.506       1,536,075
Transformer L                76.89%      2.417       3,426,059
Conditional Transformer S    74.98%      2.618         536,155
Conditional Transformer M    76.36%      2.468       1,553,859
Conditional Transformer L    76.66%      2.437       3,465,131
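
Both metrics follow standard definitions: accuracy is the share of top-1 predictions that match the true next chord, and perplexity is the exponential of the mean cross-entropy. The sketch below shows the usual computation; the exact evaluation protocol behind the table may differ in details such as averaging.

```python
import torch
import torch.nn.functional as F

def next_chord_metrics(logits: torch.Tensor, targets: torch.Tensor):
    """Top-1 accuracy and perplexity for next-chord prediction.

    logits: (N, vocab_size) model outputs for N evaluation positions;
    targets: (N,) indices of the true next chords.
    """
    accuracy = (logits.argmax(dim=-1) == targets).float().mean().item()
    # Perplexity is the exponential of the mean cross-entropy (in nats).
    perplexity = torch.exp(F.cross_entropy(logits, targets)).item()
    return 100.0 * accuracy, perplexity
```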

Usage

The models were trained with PyTorch, a popular deep learning framework, and exported to the ONNX format for use in the ChordSeqAI app. They are stored in the public/models directory and loaded dynamically when needed. The app communicates with a web worker, which uses ONNX Runtime to run the models and suggest the next chord in a progression.
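
Inside the app, this inference happens in the browser through the web worker. For a quick offline sanity check, an exported model can also be run with Python's onnxruntime package; the file name, tensor layout, and chord encoding below are assumptions for illustration, not the app's actual I/O contract.

```python
import numpy as np
import onnxruntime as ort

# File name and I/O layout are illustrative assumptions.
session = ort.InferenceSession("public/models/transformer_s.onnx")

chord_ids = np.array([[3, 17, 42]], dtype=np.int64)  # an encoded progression
input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: chord_ids})

# Assume the first output holds logits of shape (batch, seq, vocab);
# the last position scores candidates for the next chord.
last = outputs[0][0, -1]
probs = np.exp(last - last.max())
probs /= probs.sum()
print("Most likely next chord id:", int(probs.argmax()))
```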