Dynamically Creating Sound Bite Sequences with SMIL & Drupal

VICTORY! I am now able to dynamically generate audio metadata and have QuickTime recognize it as an edited sound bite sequence! This is a HUGE breakthrough for my collaborative editing schema. Here is a demo of a sequence of three sound bites that have been excerpted from longer audio files and strung together.

I've found a way for Drupal to automatically edit sound bite sequences without having to generate any text files or generate muxed audio files that need to be written to the server.

UPDATE 3/29/06 This URL has the dynamically generated SMIL code (i.e. take a look at the source code for the page to see the SMILtext). And then here is an embedded version of this SMIL metadata:

More details below...

In other words, I can just upload twelve 5-minute MP3 files to my website as source material (i.e. representing a one-hour interview), and then have users edit a large number of different sequences without having to create any hard text files or render the actual audio into separate files that need to be saved to the server.

Drupal can dynamically create SMIL code, and the browser-embedded QuickTime plug-in automatically recognizes it as if it were a real audio file. Again, no new audio files need to be generated, since the SMIL is just timecode metadata that QuickTime treats as a new audio file (SMIL and Drupal code examples are listed below).

This breakthrough will allow massively scalable collaborative editing, because dynamic SMIL generation doesn't require creating a lot of large audio files or writing and rewriting many small text files.

Users will be able to place a number of sound bites into a playlist sequence and very quickly hear what it sounds like.

And if the timecode is laid over the audio clip as shown here and described here, then users will be able to tweak the IN and OUT times of individual clips. That is critical for letting each user determine what constitutes a "sound bite," and it gives the volunteer editors a lot more creative freedom. Otherwise, the collaborative editing schema would depend upon the hierarchical control of one person (i.e. me) determining the start time and end time of each particular sound bite.

I also hope that this breakthrough will help me find a PHP/Drupal developer who can alter the Drupal playlist module to do this type of sound bite sequencing.

The following URL will show you what a dynamically edited sound bite sequence looks like:
http://www.echochamberproject.com/sequence/2/play

It takes the following three sound bites from our Bill Plante interview, each of which is assigned a unique number and a unique URL: 340, 463, and 476.

Here is the text of the three sound bites that you should hear:
UPDATE 4-6-06: Here's a screenshot of the playlist user interface. The user will be able to reorder these sound bites by dragging and dropping them, and also be able to listen to the audio of these edits.

[Screenshot: playlist_ui]

Here is the SMIL code that is being generated by Drupal:

SMILtext<?xml version="1.0" encoding="UTF-8"?>
<smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:immediate-instantiation="true" qt:autoplay="true" qt:time-slider="true" qt:chapter-mode="clip">
<head>
<meta name="author" content="EchoChamberProject.com"/>
<meta name="information" content="Written by Kent Bye"/>
<layout>
<root-layout height="240" width="320" background-color="#000000"/>
<region id="main" height="240" width="320" fit="hidden"/>
</layout>
</head>
<body>

<seq>
<audio src="http://www.echochamberproject.com/files/plante01.mp3" region="main" qt:chapter="plante01_340" clip-begin="npt=85.852519185853s" clipBegin="npt=85.852519185853s" clip-end="npt=103.8038038038s" clipEnd = "npt=103.8038038038s"/>

<audio src="http://www.echochamberproject.com/files/plante07.mp3" region="main" qt:chapter="plante07_463" clip-begin="npt=233.36670003337s" clipBegin="npt=233.36670003337s" clip-end="npt=251.31798465132s" clipEnd = "npt=251.31798465132s"/>

<audio src="http://www.echochamberproject.com/files/plante08.mp3" region="main" qt:chapter="plante08_476" clip-begin="npt=123.99065732399s" clipBegin="npt=123.99065732399s" clip-end="npt=139.60627293961s" clipEnd = "npt=139.60627293961s"/>
</seq>
</body>
</smil>

Here is the Drupal code that is generating the SMIL code above. Again, the timecode metadata for sound bite numbers 340, 463, and 476 originated in Final Cut Pro, was exported via Final Cut Pro XML, manually parsed, and then uploaded into Drupal as "flexinodes." This allows each sound bite to be referred to by a number, which in turn allows the following Drupal code to fetch the associated timecode metadata from the backend MySQL database.

WARNING: The following code is a hack that is certainly not optimized. Any feedback for more efficient coding would be much appreciated. Thanks.

function soundbite_play() {
  // Fixed SMIL preamble: QuickTime namespace and options, metadata, and layout.
  $quicktimeSMIL = 'SMILtext<?xml version="1.0" encoding="UTF-8"?><smil xmlns:qt="http://www.apple.com/quicktime/resources/smilextensions" qt:immediate-instantiation="true" qt:autoplay="true" qt:time-slider="true" qt:chapter-mode="clip"><head><meta name="author" content="EchoChamberProject.com"/><meta name="information" content="Written by Kent Bye"/><layout><root-layout height="240" width="320" background-color="#000000"/><region id="main" height="240" width="320" fit="hidden"/></layout></head><body><seq>';

  // Sound bite node IDs, in playback order (hard-coded for this demo).
  $sequence_array = array(340, 463, 476);

  // The stored IN/OUT values are divided by the 29.97 fps NTSC frame rate
  // to get seconds for the SMIL npt (normal play time) clip attributes.
  $fps = 29.97;

  // flexinode field IDs as used here: 5 = IN point, 6 = OUT point, 7 = source file name.
  $sql = 'SELECT `textual_data` FROM `flexinode_data` WHERE `nid` = %d AND `field_id` = %d';

  foreach ($sequence_array as $nid) {
    $ClipItem_In = db_result(db_query($sql, $nid, 5));
    $ClipItem_Out = db_result(db_query($sql, $nid, 6));
    $ClipItem_Name = db_result(db_query($sql, $nid, 7));

    $clip_begin = 'npt=' . ($ClipItem_In / $fps) . 's';
    $clip_end = 'npt=' . ($ClipItem_Out / $fps) . 's';

    // Each time is written in both the SMIL 1.0 (clip-begin) and SMIL 2.0 (clipBegin) spellings.
    $quicktimeSMIL .= '<audio src="http://www.echochamberproject.com/files/' . $ClipItem_Name . '.mp3" region="main" qt:chapter="' . $ClipItem_Name . '_' . $nid . '" clip-begin="' . $clip_begin . '" clipBegin="' . $clip_begin . '" clip-end="' . $clip_end . '" clipEnd = "' . $clip_end . '"/>';
  }

  $quicktimeSMIL .= '</seq></body></smil>';

  // Serve the SMIL with QuickTime's MIME type so the plug-in handles it,
  // then stop before Drupal renders the normal page.
  header('Content-Type: video/quicktime');
  print($quicktimeSMIL);
  exit;
}

Thanks to Morbus Iff for the help with figuring out the "header('Content-Type: video/quicktime')" statement, which was the ultimate trick for getting the Drupal output treated as something QuickTime should read.
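
To make the timecode arithmetic easier to check outside of Drupal, here is a minimal Python sketch (not the Drupal code itself) of how one `<audio>` element is assembled. The frame counts 2573 and 3111 are not stated anywhere in this post; they are back-computed from the npt values of sound bite 340 in the SMIL listing by multiplying by 29.97. The function names are hypothetical, and formatting to 14 significant digits is an assumption chosen because it reproduces the values in the listing.

```python
from xml.sax.saxutils import quoteattr

FPS = 29.97  # divisor used in the Drupal code (NTSC frame rate)


def npt(frames):
    """Convert a frame count into a SMIL normal-play-time value."""
    # Assumption: 14 significant digits, which matches the listing above.
    return "npt=" + format(frames / FPS, ".14g") + "s"


def audio_element(clip_name, nid, in_frames, out_frames):
    """Build one <audio> element that plays an excerpt of a longer MP3."""
    src = "http://www.echochamberproject.com/files/" + clip_name + ".mp3"
    begin, end = npt(in_frames), npt(out_frames)
    attrs = [
        ("src", src),
        ("region", "main"),
        ("qt:chapter", clip_name + "_" + str(nid)),
        ("clip-begin", begin),  # SMIL 1.0 spelling
        ("clipBegin", begin),   # SMIL 2.0 spelling
        ("clip-end", end),
        ("clipEnd", end),
    ]
    # quoteattr escapes XML-special characters and adds the quotes.
    return "<audio " + " ".join(k + "=" + quoteattr(v) for k, v in attrs) + "/>"


# Frame counts back-computed from the listing (an assumption, see above).
print(audio_element("plante01", 340, 2573, 3111))
```

Escaping the attribute values is a small precaution the original hack does not take; it only matters if a clip name ever contains a quote or an ampersand.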

SAMPLING FREQUENCY MISMATCH CAUSES TIMECODE SLIPPAGE
I was being slowed down by a technical roadblock that I finally debugged this morning. I was failing to replicate my original collaborative editing SMIL demo because I was exporting my MP3 files at the wrong sampling frequency.

It turns out that you will get timecode slippage if you export MP3 files at 44.1 kHz when the original miniDV tape was recorded at 48 kHz. My intention is for the MP3 files to serve as dummy files that maintain timecode continuity with the audio from the original video clips of the interviews. The MP3 files are much smaller, which saves bandwidth and makes for quicker load times within SMIL.

I had inadvertently exported the MP3 dummy files at 44.1 kHz, and I was getting very imprecise starting and ending times for the edited sound bites. The beginning and end of a sound bite would be off by as much as 1 second, so the sound bite would begin on the previous statement and get clipped off at the end.

LESSON: MP3 files that are intended to maintain film timecode continuity must be exported at 48 kHz if they were recorded at 48 kHz.

I haven't played around with it enough to know whether 44.1 kHz MP3 files will work if the source audio was captured at 44.1 kHz.

ANOTHER LESSON: The export bitrates do not seem to cause any timecode slippage.

My Canon GL-1 captures audio on the miniDV tape at a bitrate of 1536 kbits/second, and there was no timecode slippage in MP3 files exported at 40 kbits/sec, 64 kbits/sec or 128 kbits/sec as long as the exported sampling frequency was 48 kHz.
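
The 1536 kbits/second figure is what uncompressed audio works out to at 48 kHz if one assumes 16-bit samples and two channels. That sample format is an assumption (it is not stated above), and the function name below is just for illustration. A quick Python check of the arithmetic:

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bitrate in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed format: 48 kHz, 16-bit, stereo.
print(pcm_bitrate_kbps(48_000, 16, 2))
```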

Since the bitrate does not affect timecode slippage, a smaller bitrate simply means a smaller file size and a quicker pre-load of the source audio by QuickTime.

I used an export bitrate of 40 kbits/sec, which produced MP3 file sizes of around 1.5 MB per five minutes of audio. The low bitrate causes some loss in voice quality, but the smaller file sizes let edited sequences load faster, which makes for a better user experience when previewing edits, because SMIL seems to pre-load an entire audio file before playing an excerpt.
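
The file sizes follow directly from the bitrate. Here is a rough Python estimate, assuming a constant bitrate and ignoring any tags or headers in the MP3 file (the function name is hypothetical):

```python
def mp3_size_mb(bitrate_kbps, duration_s):
    """Approximate size in MB (10^6 bytes) of a constant-bitrate stream."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000


# The three export bitrates tested above, for a five-minute source file.
for kbps in (40, 64, 128):
    print(kbps, "kbit/s:", mp3_size_mb(kbps, 5 * 60), "MB per five minutes")
```

At 40 kbits/sec this gives 1.5 MB per five-minute file, in line with the sizes reported above.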

