31 July 2008

Many to Many Join with Propel

In my symfony application I have many images. Each of those images has lots of tags. Each tag might be associated with many different images. Yes, the classic many to many relationship. Now, using propel I would like to get all of my images, with their associated tags. Seems simple enough... except that "there is no magical support for many-to-many relationships in Propel." Propel is not very good at many to many relationships.

The propel documentation gives an example of how to retrieve objects for this type of situation. There are two major problems with the example. First, it only works if every image has a tag because the joins are inner joins instead of left joins. Second, it runs 1+n querys against the database which is terrible if you are retrieving a lot of images at once. For a lot of cases the example may be sufficient, but it did not work in my case.

The first step to making this work the way I needed was to construct a query using left joins retrieving all the needed data. I did this by creating a new method inside my ImagePeer class. I chose this place because my real goal is to get a list of images, I just want to do it efficiently.
public static function doSelectWithTags(Criteria $c, $con = null){
  $c = clone $c;
  $c->setDbName(self::DATABASE_NAME);
  
  $c->addJoin(ImagePeer::ID, ImageTagPeer::IMAGE_ID, Criteria::LEFT_JOIN);
  $c->addJoin(ImageTagPeer::TAG_ID, TagPeer::ID, Criteria::LEFT_JOIN);
  
  ImagePeer::addSelectColumns($c);  
  ImageTagPeer::addSelectColumns($c);
  TagPeer::addSelectColumns($c);
  

  $first_image_tag_col = (ImagePeer::NUM_COLUMNS - ImagePeer::NUM_LAZY_LOAD_COLUMNS) + 1;
  $first_tag_col = $first_image_tag_col + (ImageTagPeer::NUM_COLUMNS - ImageTagPeer::NUM_LAZY_LOAD_COLUMNS);
  
  $rs = BasePeer::doSelect($c, $con);
The next step was to hydrate all of the objects. This is fairly complex, but much of the code is taken straight from other methods in the ImageTagPeer class.
$images = array();
  $tags = array();
  
  while($rs->next()) {
    $image_id = $rs->getInt(1);

    if(array_key_exists($image_id, $images)){
      $image = $images[$image_id];
    }else{
      $omClass = ImagePeer::getOMClass();
      $cls = Propel::import($omClass);
      $image = new $cls();
      $image->hydrate($rs);
      $images[$image_id] = $image;
      $image->initImageTags();   
    }

    $tag_id = $rs->getInt($first_tag_col);
    if($tag_id > 0){
      $omClass = ImageTagPeer::getOMClass();
      $cls = Propel::import($omClass);
      $image_tag = new $cls();
      $image_tag->hydrate($rs, $first_image_tag_col);
    
      if(array_key_exists($tag_id, $tags)){
        $tag = $tags[$tag_id];
      }else{
        $omClass = TagPeer::getOMClass();
        $cls = Propel::import($omClass);
        $tag = new $cls();
        $tag->hydrate($rs, $first_tag_col);
        $tags[$tag_id] = $tag;
        $tag->initImageTags();
      }
      $image->addImageTag($image_tag);
      $tag->addImageTag($image_tag);
    }
  }
  return array_values($images);  
}
The last thing that I had to figure out was how to work around propel's query caching system. I never found a very elegant way to solve this problem, but I did find a way. Even though I set up all the object connections between images and tags, propel doesn't understand how it was done and will try to rebuild the connections. When the call getImageTags is made the empty criteria passed in does not match the left join criteria that was used to find the ImageTag objects. Therefore, propel refetches the ImageTags to make sure it has all the right ones. This is usually the desired functionality, but there is no nice way to get around it. In my Image class I added the method getImageTags to override the parent functionality. It can be passed a flag that says "just use what ever you have."
public function getImageTags($criteria = null, $con = null, $check_cleanliness = true){
  if($check_cleanliness){
    return parent::getImageTags($criteria, $con);
  }else{
    return $this->collImageTags ? $this->collImageTags : array();
  }
}
Not the most pretty code, but very functional. I can now get a list of images with all of the associated tags in a single query across three database tables.

No comments: