
Tutorial #8: Video Scissors

Published on June 5, 2008
The content that follows was originally published on the Don Havey website at http://donhavey.com/blog/tutorials/tutorial-8-video-scissors/

[Image: Angel with a beard]

Alright, I’m back. Still got a couple of other projects going on that are sucking me away from the tutorials, but I figured that I should get at least one tutorial up this week to let everyone know that I’m still alive. As I mentioned, I’m going to dive into the topics of video capture and computer vision in Processing.

Today I’ll present three variations on a theme (it’s like This American Life!). Today’s theme: Mapping video onto irregular polygons. Or, as I like to put it, cutting out pieces of video as if you’re using scissors and paper.

Since these applets will all require a webcam, I’m not going to post them inline on the page. I’ll scatter some images about the page and point you towards the required classes (they’re after the jump today).

You’ll have to paste the thing into Processing to see live results. Note that you may need to configure the line that finds your webcam, although this should work for most situations:

//at the top of your sketch
import processing.video.*;

//initialize camera (inside setup(); w, h, and fr are your sketch's
//width, height, and frame rate variables, defined elsewhere)
String[] cams = Capture.list();
println(cams);
//you may need to change which camera in the list it chooses below
cam2 = new Capture(this,w/2,h/2,cams[0],fr);

What you’ll learn

  • Filling a polygon with an image instead of a color.
  • Drawing polygons and removing them with a right-click.
  • Using an image’s pixels[] array.
  • A few other tips and tricks for working with video capture in Processing… like using two video streams simultaneously.

Ready?

Here we go.

Video capture in Processing

[Image: Me punching myself so hard that all of my teeth flew out, obviously]

The Capture class in Processing is an extension of the PImage class. It’s updated according to the frame rate you specify when you instantiate the class (that’s what makes it, um, video), but it contains all of the properties and methods available to the PImage class, such as the pixels[] array, and it can be used exactly the same way that you use an image. For example, to display a video stream in our applet, we use the image() function.

The methods that the Capture class adds are pretty basic. We’ll be using the available() and read() functions. The available() function, as you’ve guessed, returns true when the Capture should be read and false when it shouldn’t. It helps smooth out discrepancies between your applet’s frame rate and the effective frame rate of the video by telling you when a new video frame is ready to be displayed.

The read() method can be thought of as an update function. It gets the current video frame. You’ll need to call it before rendering the video (using image() or any other method), and before accessing the pixels[] array for the video.

So a basic call to our capture device looks like this:

if(cam.available()){
  cam.read();
}
image(cam,0,0,width,height);
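
And here’s that same call in the context of a minimal, complete sketch, just for reference (this is my own sketch, not one of the applets below; the camera index and frame rate are assumptions you may need to adjust):

import processing.video.*;

Capture cam;

void setup(){
  size(640,480);
  //use the first camera in the list, at half the applet size, 30fps
  cam = new Capture(this,320,240,Capture.list()[0],30);
  //note: newer versions of the video library also require cam.start(); here
}

void draw(){
  if(cam.available()){
    cam.read();
  }
  //scale the half-size stream up to fill the applet
  image(cam,0,0,width,height);
}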

Video performance

[Image: Me with parts of Angel's face]

As you know by now, I don’t come from a background in computer science. As such, I don’t know much (nor do I care to know much) about video drivers and the like. But I can tell you from experience that Processing is not the most efficient environment for video manipulation. No big surprise there, but specifically, I’ve found that it’s often faster to use two video capture devices – one set to hi-res for display and one at lo-res for computer vision calculations – trained on the same subject, than it is to try to work with a single hi-res source. I’ll show you the specific application I’m referring to in an upcoming applet, but until then, my rule of thumb for you is this: Avoid video pixel manipulation/calculation in Processing at all costs. Put the burden on your drivers wherever possible.

And feel free to refute me on this.
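
To make that concrete, here’s a rough sketch of the two-device setup I’m describing. It assumes two physical webcams pointed at the same subject (cams[0] and cams[1]); the resolutions and frame rate are placeholders:

import processing.video.*;

Capture camHi; //hi-res stream, used only for display
Capture camLo; //lo-res stream, used only for pixel calculations

void setup(){
  size(640,480);
  String[] cams = Capture.list();
  camHi = new Capture(this,640,480,cams[0],30);
  camLo = new Capture(this,80,60,cams[1],30);
}

void draw(){
  if(camHi.available()){ camHi.read(); }
  if(camLo.available()){ camLo.read(); }
  //display the pretty stream...
  image(camHi,0,0,width,height);
  //...and do your computer vision math on the tiny one
  camLo.loadPixels();
  //camLo.pixels[] now holds only 80*60 = 4800 pixels to crunch
}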

One modification to our Polygon class

[Image: Memphis can haz time delayed fragments?]

You’ll notice one change to the Polygon class that I introduced back in the Building Blocks tutorial. For this applet, I added another render() method that accepts an image as its only parameter, allowing us to texture map Polygons using Processing’s texture() method.

Since we’re using the texture() method, our vertex() calls now require two additional parameters that control which part of the image is mapped onto our Polygon. Pretty simple, since we’re just mapping the image at 1:1 (after scaling it to fit the screen).

If you want to displace or scale the mapped image, you could go ahead and choose whatever mapping coordinates you wanted, so long as they fall within the bounds of the image you’re mapping.

//note: texture() only works with the P3D or OPENGL renderer
void render(PImage im){
  beginShape();
  texture(im);
  for(int i=0;i<npoints;i++){
    //note that we're scaling the mapping coordinates by 1/2
    //because our video resolution is only half of our applet size
    //(to displace or scale the mapping, change these last two coordinates)
    vertex(points[i].x,points[i].y,0,points[i].x/2,points[i].y/2);
  }
  endShape(CLOSE);
}
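
Calling it from draw() looks something like this (my sketch, not the applet’s exact draw() loop; it assumes the cam, polygons[], and npolygons variables used throughout this tutorial):

void draw(){
  if(cam.available()){
    cam.read();
  }
  background(0);
  //map the current video frame onto every polygon the user has cut out
  for(int i=0;i<npolygons;i++){
    if(polygons[i]!=null){
      polygons[i].render(cam);
    }
  }
}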

Drawing Polygons

[Image: If I had 6 eyes I'd always be smiling all arrogantly like this]

So the driving element of this applet is the generation of Polygons from user input. We’ll just listen for a mousePressed() event, temporarily store points in an array, then create (and simplify) a Polygon based on those points once the mouse button is released. Of course the user will require feedback during the drawing process, so we’ll give them a temporary Polyline that shows the partial path they’ve drawn.

If a complex Polygon is created (one which overlaps itself), the applet will simplify it into a convex hull to avoid mapping weirdness. We’ll alert the user with a flash of red if their shape was complex. Something like this:

void mousePressed(){
  if(mouseButton==LEFT){
    drawing = true;
    op = new Point(mouseX,mouseY,0);
  }
}

void mouseDragged(){
  if(mouseButton==LEFT&&drawing){
    //draw a segment between the last point and the current point
    p = new Point(mouseX,mouseY,0);
    Segment s = new Segment(op,p);
    //draw our polyline to give the user feedback while they're drawing
    if(polylines[apolyline]!=null){
      polylines[apolyline].segments = (Segment[]) append(polylines[apolyline].segments,s);
      polylines[apolyline].nsegments++;
    }else{
      Segment[] ss = {s};
      polylines[apolyline] = new Polyline(ss);
      npolylines++;
    }
    op = p;
  }
}

void mouseReleased(){
  if(mouseButton==LEFT){
    drawing = false;
    if(polylines[apolyline]!=null){
      Point[] points = new Point[10000];
      int npoints = 0;
      //get all of the current polyline's points
      //they will define our polygon
      for(int i=0;i<polylines[apolyline].nsegments;i++){
        points[npoints] = polylines[apolyline].segments[i].p1;
        npoints++;
      }
      points[npoints] = polylines[apolyline].segments[polylines[apolyline].nsegments-1].p2;
      npoints++;
      points = (Point[]) subset(points,0,npoints);
      //generate the polygon from those points
      polygons[npolygons] = new Polygon(points);
      //there's going to be a lot of points, so simplify the thing
      //this will help speed things up and avoid mapping bugs
      polygons[npolygons].simplify();
      //if the resulting polygon is complex, make it into a convex hull
      if(polygons[npolygons].is_complex()){
        polygons[npolygons] = polygons[npolygons].get_convex_hull();
        //sound the alarm!
        complex = 20;
      }
      npolygons++;
      apolyline++;
    }
  }
}

Got it? That will all be rendered on a frame-by-frame basis in our render() function. The “complex” variable is just a countdown value that temporarily colors the user’s Polygon red when it has been transformed into a convex hull.
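
The render code itself isn’t shown here, but one way to drive that flash would be to decrement “complex” once per frame and tint the newest Polygon while it’s nonzero; something along these lines (a hypothetical sketch, not the applet’s exact code):

//run once per frame, inside draw()
if(complex>0){
  tint(255,100,100);                //texture pixels get washed with red
  polygons[npolygons-1].render(cam);
  noTint();                         //don't tint anything else
  complex--;                        //fades back to normal after 20 frames
}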

Now don’t forget that we’ll want to clear the screen somehow and delete mistake Polygons. Here our hit testing comes in handy:

void mousePressed(){
  if(mouseButton==LEFT){
    drawing = true;
    op = new Point(mouseX,mouseY,0);
  }else{
    //remove a polygon if it is clicked with the right mouse button
    for(int i=0;i<npolygons;i++){
      if(polygons[i]!=null&&polygons[i].is_coord_inside(mouseX,mouseY,true)){
        polygons[i] = null;
        complex = 0;
        //we're breaking out of the loop for efficiency's sake
        //but this means that if you click an overlapping polygon area
        //only one will be deleted... and it might not be the one you expect
        break;
      }
    }
  }
}

void keyPressed(){
  if(mousePressed){ return; }
  if(npolygons==0){ return; }
  if(keyCode==BACKSPACE){
    //delete the most recent polygon
    npolylines--;
    npolygons--;
    apolyline--;
    polylines[npolylines] = null;
    polygons[npolygons] = null; //clear the polygon slot too, so it can't linger
    complex = 0;
  }else if(keyCode==ENTER||keyCode==RETURN){
    //clear the stage
    npolylines = 0;
    npolygons = 0;
    apolyline = 0;
    polylines = new Polyline[1000];
    polygons = new Polygon[1000];
    complex = 0;
  }
}

There we go. Now you can draw Polygons and delete them with a key press or a right click.

The variations

As promised, three variations on the video scissors theme:

  1. Mix two video capture streams together by cutting holes in one that let you see into the other. This was the original idea behind the video scissors, intended to be used in a semi-public setting, where users in different locations would be able to create dynamic “face collages” using live video of themselves and the other user. Kind of like those video booths that combine two faces to see what a couple’s “baby” would look like. Requires two webcams. Source code.
  2. Mix a time-delayed video stream with a live capture. Parts of the scene are shown in real time, and others are perpetually trying to catch up. We do this by copying each frame’s pixels[] array into a larger array, and using that array to create a delayed image once per frame (see the sketch after this list). Extremely memory intensive, because it requires temporarily storing many, many frames of video in very large pixel arrays. A 100-frame delay in my tests required around 300 MB of memory. Source code.
  3. Enclosing part of a video capture literally “captures” it, turning it into a static image. Moderately memory intensive (requires storing every “captured” frame in memory), but only once you’ve grabbed many static pieces. Very fun to play around with. Source code.
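
For variation 2, here’s a minimal sketch of the frame-delay buffer idea. It assumes a fixed 100-frame delay, and for simplicity it stores PImage copies via get() rather than one giant pixel array, so it’s a simplification of what the applet actually does:

import processing.video.*;

int DELAY = 100;                     //frames of delay
PImage[] buffer = new PImage[DELAY]; //ring buffer of past frames
int head = 0;
Capture cam;

void setup(){
  size(320,240);
  cam = new Capture(this,320,240,Capture.list()[0],30);
}

void draw(){
  if(cam.available()){
    cam.read();
    buffer[head] = cam.get();        //store a copy of this frame
    head = (head+1)%DELAY;           //advance the ring buffer
  }
  //head now points at the oldest stored frame
  if(buffer[head]!=null){
    image(buffer[head],0,0);         //perpetually DELAY frames behind
  }else{
    image(cam,0,0);                  //buffer still filling up
  }
}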

Is this a dead end?

[Image: Angel with a beard]

I think it is. Angel likes it a lot, but I can’t really think of a good way to make something interesting (beyond just another video toy) out of these experiments… if you can, let me know. There are definitely some improvements to be made, like blending the edge pixels of the Polygons to integrate them somewhat seamlessly (suggestion: create 2 or 3 nested Polygons per gesture; assign a low alpha value to the outer Polygons), and polishing up the UI.

Better tutorials to come. Until then, have fun playing around with this!

Categories: Processing / Tutorials

