Week 7: Annotating Teentaal in Vocal Music
April 18, 2025
With the beat classification system in place and performing reliably on isolated tabla recordings, this week I began expanding the project’s scope into more complex and musically rich territory: full vocal performances. The goal was to begin building a dataset that could train a model to recognize sam and khali positions in actual songs, not just percussion recordings.
Dataset Expansion
The first step was collecting a new set of recordings—this time focused on vocal compositions performed in Teentaal. I curated a selection of bandishes and khayal pieces where the rhythmic cycle was clear and consistent. These recordings came from a combination of teaching sessions, practice material, and well-recorded live performances.
Once I had a working dataset, I began manually labeling each taal cycle. For every composition, I annotated the timestamp of the sam—the point of rhythmic resolution—and the khali, which falls on beat 9 of Teentaal's 16-beat cycle and often marks a lighter or unaccented point. This process was more nuanced than working with tabla alone: in vocal music, sam is often indicated by melodic cadence or return rather than a distinct percussive hit, and khali can be implied through phrasing or a drop in vocal emphasis.
The result was a set of precisely labeled vocal tracks that now mirror the structure of the tabla recordings I worked with earlier, but with the added complexity of musical phrasing, ornamentation, and tempo variation.
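As a rough sketch of how labels like these could be stored and read back, here is a minimal loader assuming a hypothetical one-CSV-per-recording layout; the `cycle`, `sam_time`, and `khali_time` column names are my own illustrative convention, not a fixed schema:

```python
import csv

def load_annotations(csv_path):
    """Read one recording's cycle annotations into a list of dicts."""
    with open(csv_path, newline="") as f:
        return [
            {
                "cycle": int(row["cycle"]),              # cycle index within the piece
                "sam_time": float(row["sam_time"]),      # rhythmic resolution, in seconds
                "khali_time": float(row["khali_time"]),  # unaccented point, in seconds
            }
            for row in csv.DictReader(f)
        ]
```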
Preparing for Feature Extraction
In parallel with annotation, I started planning how to extract relevant features from the vocal audio. These will differ slightly from percussion-focused features. For instance (a concrete extraction sketch follows the list):

- RMS energy and spectral centroid still help identify moments of intensity or return.
- Pitch contour and delta features can trace melodic phrasing across a cycle.
- MFCCs remain valuable for capturing tonal texture, especially in expressive singing.
- Onset envelope and tempo estimation will support alignment with taal cycles.
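To make this concrete, below is a minimal extraction pass sketched with librosa. The library choice, hop length, sample rate, and pYIN pitch range are all assumptions for illustration rather than the project's final settings:

```python
import librosa
import numpy as np

HOP = 512  # hop length in samples (~23 ms per frame at 22.05 kHz)

def extract_features(path):
    """Return (n_frames, n_features) matrix, frame times, and a tempo estimate."""
    y, sr = librosa.load(path, sr=22050)

    rms = librosa.feature.rms(y=y, hop_length=HOP)                        # intensity
    cent = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=HOP)  # brightness
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=HOP)    # tonal texture
    onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=HOP)      # onset envelope

    # Pitch contour via pYIN; unvoiced frames come back as NaN, so zero-fill
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C6"),
                            sr=sr, hop_length=HOP)
    f0 = np.nan_to_num(f0)
    f0_delta = librosa.feature.delta(f0)  # frame-to-frame pitch movement

    # Global tempo estimate; newer librosa versions may return a 1-element array
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr, hop_length=HOP)

    # Trim to a common frame count and stack into one frame-aligned matrix
    n = min(rms.shape[1], cent.shape[1], mfcc.shape[1], len(onset), len(f0))
    feats = np.vstack([rms[:, :n], cent[:, :n], mfcc[:, :n],
                       onset[None, :n], f0[None, :n], f0_delta[None, :n]])
    times = librosa.frames_to_time(np.arange(n), sr=sr, hop_length=HOP)
    return feats.T, times, float(np.atleast_1d(tempo)[0])
```

Keeping every feature on the same hop grid means the whole matrix shares one time axis, which makes pairing frames with annotated timestamps straightforward.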
These features, when paired with my manually labeled sam and khali points, will form the training set for a model that can eventually predict rhythmic structure from vocal performance alone.
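One possible way to do that pairing, reusing the hypothetical helpers above, is to turn each annotated timestamp into a frame-level class within a small tolerance window (the window size and file names here are arbitrary illustrations):

```python
import numpy as np

TOL = 0.10  # labeling tolerance in seconds; an illustrative value

def frame_labels(times, annotations):
    """Per-frame targets: 0 = other, 1 = sam, 2 = khali."""
    labels = np.zeros(len(times), dtype=int)
    for ann in annotations:
        labels[np.abs(times - ann["sam_time"]) <= TOL] = 1
        labels[np.abs(times - ann["khali_time"]) <= TOL] = 2
    return labels

# Usage for one track, pairing features with targets:
# X, times, tempo = extract_features("bandish_01.wav")
# y = frame_labels(times, load_annotations("bandish_01_labels.csv"))
```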
Why This Matters
This week’s work is a foundational step toward building an AI that can accompany—not just mimic—Indian classical music. It pushes the project toward context-aware analysis, where the system begins to understand how rhythm is expressed through melody, not just percussion. It also opens up exciting future possibilities:
- Auto-aligning tabla with a solo vocal recording
- Assisting students in finding the sam in a performance
- Using vocal phrasing to guide generative tabla improvisation
The complexity is increasing—but so is the depth. The system is starting to listen like a musician.