From Beats to Data: Applying AI to predict hits
April 1, 2025 · Anjali Nair

In the rapidly evolving intersection of music and technology, AI prediction and machine learning are redefining how we analyze and predict musical success. This article presents a detailed technical overview of our approach to transforming raw audio into quantifiable insights.
By leveraging convolutional neural networks to extract features like danceability, acousticness, valence, instrumentalness, energy, and tempo, we develop a data-driven AI prediction framework to assess a song's hit potential. We discuss our methods, from data analysis and preprocessing to predictive modeling and evaluation, offering insights into the challenges and opportunities at the forefront of music analytics.
Background & Motivation
Although the way we appreciate music has remained fundamentally unchanged, there is an underlying quantifiability in its creation and reception. Large music labels have long relied on a qualitative understanding of what makes a hit: a formula that hints at measurable patterns and characteristics inherent in popular music. We leverage AI to delve deeper into this quantifiability, aiming to uncover insights that extend beyond current understanding. Our approach aspires to democratize access to advanced musical analysis, offering both major labels and independent artists a common tool to evaluate and enhance their creations. This not only levels the playing field but also opens up new avenues for creative experimentation and innovation in music.
Overview of the Application
The tool we engineered provides a streamlined environment where both labels and independent artists can create profiles, upload their tracks, and access comprehensive analytics. Once a track is uploaded, our system evaluates its musical attributes to deliver insights on hit potential and offers targeted feedback to help improve its appeal. Users can view detailed analytics on various aspects of their music, gauge their track’s success probability, and even participate in competitions for prizes. Beyond analytics, our vision is to offer a one-stop destination where creators can showcase their work, gain recognition, and engage with a vibrant community, while also serving as a hub for music lovers to discover fresh tunes.
Tunes to Numbers
Our approach focuses on converting raw musical audio into actionable numerical insights. The AI prediction tool transforms the audio into a numerical representation that reveals its underlying patterns and structures, setting the stage for applying deep learning techniques.
Convolutional Neural Networks (CNNs) are particularly well-suited for this task. They excel at processing data with grid-like structures—such as spectrograms derived from audio signals—allowing them to effectively capture local patterns and hierarchical features critical for understanding musical characteristics.
We experimented with CNNs of varying sizes, trained on well-known open source datasets, to predict the audio features we considered necessary for our application, chosen on the basis of previous research and an initial round of analysis. Having learned from the large collection of songs used in training, the models recognize patterns in segments of music and express them as numbers, so that each track can be represented numerically. Processed appropriately, this numerical representation quantifies the musical piece in terms of the audio features described below.
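While the trained models themselves are beyond the scope of this post, the overall pipeline is easy to illustrate. Below is a minimal sketch, assuming librosa for audio processing and PyTorch for the network; the architecture, the `FeatureRegressor` class, and the file name are illustrative stand-ins rather than our production setup:

```python
import librosa
import numpy as np
import torch
import torch.nn as nn

class FeatureRegressor(nn.Module):
    """Small CNN that maps a mel spectrogram to one audio feature in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),  # collapse time/frequency to one value per channel
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32, 1), nn.Sigmoid())

    def forward(self, x):               # x: (batch, 1, n_mels, frames)
        return self.head(self.conv(x))  # (batch, 1), value in [0, 1]

def to_mel_spectrogram(path, sr=22050, n_mels=128):
    """Load an audio file and convert it to a log-scaled mel spectrogram."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# In practice the model is trained on labeled open source data first;
# the forward pass below just shows the shape of the pipeline.
spec = to_mel_spectrogram("track.mp3")                   # hypothetical file
x = torch.tensor(spec, dtype=torch.float32)[None, None]  # add batch/channel dims
danceability = FeatureRegressor()(x).item()
```

The sigmoid keeps the output on the [0, 1] scale shared by features like danceability; one such model per feature, each trained on its own dataset, mirrors the setup described in this section.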
Audio Features Extracted
Building on the numerical representations provided by our CNN models, we further quantify a musical piece through a set of well-defined audio features. In this context, “audio features” are measurable attributes that capture the rhythmic, timbral, and emotional dimensions of a song. These quantifications transform subjective musical qualities into objective data, enabling us to compare tracks and predict their potential success.

Based on our research and initial analysis, we identified six core features that not only encapsulate the essence of a musical composition but also serve as reliable predictors in our AI prediction model:
- Danceability: Quantifies the rhythmic stability and groove, indicating how well a track can motivate movement.
- Acousticness: Measures the likelihood that the track is acoustic, distinguishing between electronically produced sounds and those recorded with traditional instruments.
- Valence: Represents the musical positiveness, capturing the overall emotional tone conveyed by the track.
- Instrumentalness: Assesses the presence or absence of vocals, determining whether a track is more instrumental in nature.
- Energy: Quantifies the intensity and dynamic range of a track, reflecting its overall power and drive.
- Tempo: Measures the pace or speed of the track, expressed in beats per minute (BPM), which significantly influences its rhythmic character and feel.
These features provide a comprehensive framework for analyzing music, turning tunes into numbers that reveal underlying patterns and potential appeal.
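In code, each analyzed track therefore collapses to a fixed-length vector. A minimal sketch of such a representation (the `AudioFeatures` class and its field order are hypothetical, shown only to make the data structure concrete):

```python
from dataclasses import dataclass

@dataclass
class AudioFeatures:
    """One track reduced to the six features described above.
    All values except tempo are on a [0, 1] scale; tempo is in BPM."""
    danceability: float
    acousticness: float
    valence: float
    instrumentalness: float
    energy: float
    tempo: float

    def as_vector(self) -> list[float]:
        """Fixed ordering, so downstream models always see the same layout."""
        return [self.danceability, self.acousticness, self.valence,
                self.instrumentalness, self.energy, self.tempo]
```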
Model Evaluation for AI Prediction
Ensuring that our audio feature predictions are reliable is critical to our system's success. To evaluate our models, we used an open source dataset from Spotify that provides the same audio features, including danceability, acousticness, valence, and instrumentalness, for tracks trending in the charts over the last year. This dataset serves as a robust benchmark because it represents music that has already proven popular.
Our evaluation process involves comparing the predicted feature values from our models with those in the Spotify dataset. When our predictions closely match the Spotify values, it indicates that our models are accurately capturing the intended musical attributes. Additionally, since we have multiple models, each trained on a different open source dataset to extract an individual feature, we also compare their outputs against one another. The chart below shows a subset of the root mean square errors (RMSE) for these evaluations. Overall, the generally low RMSE values indicate a strong alignment between our models and the benchmark data, validating our approach.
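For reference, the metric itself is simple. A minimal sketch, assuming NumPy; the sample values are hypothetical and not taken from our evaluation:

```python
import numpy as np

def rmse(predicted, benchmark):
    """Root mean square error between our model's feature values and the
    Spotify benchmark values for the same tracks (lower is better)."""
    predicted = np.asarray(predicted, dtype=float)
    benchmark = np.asarray(benchmark, dtype=float)
    return float(np.sqrt(np.mean((predicted - benchmark) ** 2)))

# Hypothetical danceability predictions for five benchmark tracks:
print(rmse([0.71, 0.43, 0.88, 0.55, 0.62],
           [0.68, 0.47, 0.85, 0.60, 0.59]))  # ~0.037
```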

Hit Probability Prediction Using Machine Learning
One of the main challenges in hit prediction is acquiring current, legally compliant data. Since tracks from today's hit charts are not open source, we rely on a carefully curated, limited selection of songs that we have purchased to ensure legal use. Our training data for hit prediction modeling is curated genre by genre, as our analysis showed that the factors driving a hit can vary significantly between genres. For instance, while danceability is a key indicator of success in electronic music, pop music might lean more on valence.
Although some previous works on hit prediction have reported accuracies as high as 90%, these models often incorporated the creator’s name and past success as predictors. This introduces a bias, meaning that songs by already successful artists were more likely to be predicted as hits regardless of the music itself. In contrast, our model deliberately excludes any information about the artist’s track record, focusing solely on the musical content to determine hit probability.
Given the limited data available, we opted for a simpler, more interpretable model rather than more complex deep learning approaches, which generally require large datasets. We performed extensive data analysis and preprocessing to make the underlying relationships more apparent to the model. For example, we found that incorporating features like entropy—which quantifies how a song’s composition evolves over time—significantly enhanced our predictions. This strategy allows us to capture the essential musical factors that drive success while mitigating the risks of data scarcity and inherent bias. With extensive preprocessing, a simple regression model performed sufficiently well.
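Entropy here can be read as the Shannon entropy of how a per-segment feature is distributed across a track; the sketch below shows one plausible computation under that reading. It assumes per-segment values normalized to [0, 1], and the binning and the `composition_entropy` helper are illustrative choices, not our exact preprocessing:

```python
import numpy as np

def composition_entropy(segment_values, bins=10):
    """Shannon entropy of a per-segment feature (e.g. energy) over a track.
    A song that stays flat scores near 0; one that keeps evolving scores
    closer to the maximum of log2(bins)."""
    hist, _ = np.histogram(segment_values, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins; 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))

print(composition_entropy([0.5] * 40))                           # 0.0 (flat track)
print(composition_entropy(np.random.default_rng(0).random(40)))  # ~log2(10) ≈ 3.3
```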
The chart above compares our results with previous methods. When artist popularity is included, accuracy reaches 90% but skews heavily toward established creators. Removing popularity altogether drops accuracy to 60%, reflecting the challenge of predicting hits purely from basic audio data. Our method also excludes popularity but applies advanced feature engineering and careful threshold tuning, yielding a more balanced 75% accuracy that avoids bias while still delivering reliable predictions.
Evaluation and Results
Our approach utilizes regression models that output a continuous probability indicating the likelihood of a song being a hit. To convert these probabilities into a clear binary decision—hit or not hit—we conducted a ROC curve analysis. This analysis helped us identify an optimal threshold: songs with a predicted probability below 0.6 are classified as “not hit,” while those above 0.6 are considered potential hits.
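A minimal sketch of how such a cutoff can be chosen from a ROC analysis, assuming scikit-learn; Youden's J statistic is one common selection criterion, and the toy labels and probabilities below are hypothetical:

```python
import numpy as np
from sklearn.metrics import roc_curve

def pick_threshold(y_true, y_prob):
    """Pick the cutoff maximizing Youden's J (TPR minus FPR) on the ROC curve."""
    fpr, tpr, thresholds = roc_curve(y_true, y_prob)
    return thresholds[np.argmax(tpr - fpr)]

# y_true: 1 = hit, 0 = not hit; y_prob: the regression model's output.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_prob = np.array([0.20, 0.40, 0.55, 0.50, 0.65, 0.80, 0.35, 0.70])
cutoff = pick_threshold(y_true, y_prob)  # 0.55 for this toy data
is_hit = y_prob >= cutoff                # binary hit / not-hit decision
```

On the real data, this kind of analysis is what produced the 0.6 threshold described above.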
Conclusion
Our work demonstrates how AI-driven analytics and AI prediction can transform raw audio into meaningful insights, providing both labels and independent artists with actionable feedback on the potential success of their music. By focusing on the intrinsic qualities of a track—rather than the creator’s popularity—we strive to level the playing field and foster a more equitable environment for all musicians.
Looking ahead, our approach can be extended to various genres and refined with larger, more diverse datasets to further boost predictive accuracy. Techniques such as transfer learning, domain adaptation, and real-time analysis could enhance the system’s adaptability and precision. As AI becomes more pervasive in the music industry, it will not only aid in discovering new talent but also empower artists to make data-informed creative choices.
Ultimately, we envision a future where advanced analytics and machine learning serve as catalysts for artistic growth, helping to bridge the gap between raw creativity and market success. By providing fair and transparent evaluations, AI can nurture a more vibrant and inclusive musical landscape—one in which every artist has the tools to shine based on the merits of their work.


Vidar Daniels, Digital Director