Week 7: Annotating Teentaal in Vocal Music
April 18, 2025
With the beat classification system in place and performing reliably on isolated tabla recordings, this week I began expanding the project’s scope into more complex and musically rich territory: full vocal performances. The goal was to begin building a dataset that could train a model to recognize sam and khali positions in actual songs, not just percussion recordings.
Dataset Expansion
The first step was collecting a new set of recordings—this time focused on vocal compositions performed in Teentaal. I curated a selection of bandishes and khayal pieces where the rhythmic cycle was clear and consistent. These recordings came from a combination of teaching sessions, practice material, and well-recorded live performances.
Once I had a working dataset, I began manually labeling each taal cycle. For every composition, I annotated the timestamp of the sam—the point of rhythmic resolution—and the khali, which falls on beat 9 of Teentaal's 16-beat cycle and often marks a lighter or unaccented point. This process was more nuanced than working with tabla alone: in vocal music, sam is often indicated by melodic cadence or return rather than a distinct percussive hit, and khali can be implied through phrasing or a drop in vocal emphasis.
The result was a set of precisely labeled vocal tracks that now mirror the structure of the tabla recordings I worked with earlier, but with the added complexity of musical phrasing, ornamentation, and tempo variation.
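As a rough sketch of how labels like these could be stored and read back, here is a minimal loader assuming a hypothetical one-CSV-per-recording layout; the `cycle`, `sam_time`, and `khali_time` column names are my own illustrative convention, not a fixed schema:

```python
import csv

def load_annotations(csv_path):
    """Read one recording's cycle annotations into a list of dicts."""
    with open(csv_path, newline="") as f:
        return [
            {
                "cycle": int(row["cycle"]),              # cycle index within the piece
                "sam_time": float(row["sam_time"]),      # rhythmic resolution, in seconds
                "khali_time": float(row["khali_time"]),  # unaccented point, in seconds
            }
            for row in csv.DictReader(f)
        ]
```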
Preparing for Feature Extraction
In parallel with annotation, I started planning how to extract relevant features from the vocal audio. These will differ slightly from percussion-focused features. For instance (a concrete extraction sketch follows the list):

- RMS energy and spectral centroid still help identify moments of intensity or return.
- Pitch contour and delta features can trace melodic phrasing across a cycle.
- MFCCs remain valuable for capturing tonal texture, especially in expressive singing.
- Onset envelope and tempo estimation will support alignment with taal cycles.
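To make this concrete, below is a minimal extraction pass sketched with librosa. The library choice, hop length, sample rate, and pYIN pitch range are all assumptions for illustration rather than the project's final settings:

```python
import librosa
import numpy as np

HOP = 512  # hop length in samples (~23 ms per frame at 22.05 kHz)

def extract_features(path):
    """Return (n_frames, n_features) matrix, frame times, and a tempo estimate."""
    y, sr = librosa.load(path, sr=22050)

    rms = librosa.feature.rms(y=y, hop_length=HOP)                        # intensity
    cent = librosa.feature.spectral_centroid(y=y, sr=sr, hop_length=HOP)  # brightness
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13, hop_length=HOP)    # tonal texture
    onset = librosa.onset.onset_strength(y=y, sr=sr, hop_length=HOP)      # onset envelope

    # Pitch contour via pYIN; unvoiced frames come back as NaN, so zero-fill
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C6"),
                            sr=sr, hop_length=HOP)
    f0 = np.nan_to_num(f0)
    f0_delta = librosa.feature.delta(f0)  # frame-to-frame pitch movement

    # Global tempo estimate; newer librosa versions may return a 1-element array
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr, hop_length=HOP)

    # Trim to a common frame count and stack into one frame-aligned matrix
    n = min(rms.shape[1], cent.shape[1], mfcc.shape[1], len(onset), len(f0))
    feats = np.vstack([rms[:, :n], cent[:, :n], mfcc[:, :n],
                       onset[None, :n], f0[None, :n], f0_delta[None, :n]])
    times = librosa.frames_to_time(np.arange(n), sr=sr, hop_length=HOP)
    return feats.T, times, float(np.atleast_1d(tempo)[0])
```

Keeping every feature on the same hop grid means the whole matrix shares one time axis, which makes pairing frames with annotated timestamps straightforward.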
These features, when paired with my manually labeled sam and khali points, will form the training set for a model that can eventually predict rhythmic structure from vocal performance alone.
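One possible way to do that pairing, reusing the hypothetical helpers above, is to turn each annotated timestamp into a frame-level class within a small tolerance window (the window size and file names here are arbitrary illustrations):

```python
import numpy as np

TOL = 0.10  # labeling tolerance in seconds; an illustrative value

def frame_labels(times, annotations):
    """Per-frame targets: 0 = other, 1 = sam, 2 = khali."""
    labels = np.zeros(len(times), dtype=int)
    for ann in annotations:
        labels[np.abs(times - ann["sam_time"]) <= TOL] = 1
        labels[np.abs(times - ann["khali_time"]) <= TOL] = 2
    return labels

# Usage for one track, pairing features with targets:
# X, times, tempo = extract_features("bandish_01.wav")
# y = frame_labels(times, load_annotations("bandish_01_labels.csv"))
```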
Why This Matters
This week’s work is a foundational step toward building an AI that can accompany—not just mimic—Indian classical music. It pushes the project toward context-aware analysis, where the system begins to understand how rhythm is expressed through melody, not just percussion. It also opens up exciting future possibilities:
- Auto-aligning tabla with a solo vocal recording
- Assisting students in finding the sam in a performance
- Using vocal phrasing to guide generative tabla improvisation
The complexity is increasing—but so is the depth. The system is starting to listen like a musician.