Debugging sampling

Sampling algorithms are generally hard to debug. Often mistakes are hidden by the stochastic nature of the underlying MCMC method. Further the final algorithm combined using several building blocks becomes fast very complex.

Where to start?

Some starting points where to look for problems are:

Where are pitfalls?

While the former points seem to be easy to tackle there are a few possible pitfalls:

A further consequence of such a peaked posterior distribution is that the ar-ratio will drop drastically during a sampling run. So observing a dropping ar-ratio may also indicate that we have to rethink our likelihood model, make the proposal distribution to cover multiple scales or adapt the proposal distribution over time.

A possible way to change the image likelihood is to not model the pixel differences independently but to model the average of all pixel differences. You can use the class scalismo.faces.sampling.face.evaluators.CollectiveLikelihoodEvaluator from our framework.

Simple example to start with

We provide you with a small example how to start debugging. For this we change the sample type to not only contain the nescessary values for fitting but extends it to an additional tag, which is used to store where the sample originate from. The changed sample type then forces us to adapt also different parts of the algorithm.

Create a file called TaggedSample.scala and put in the following code. Note that the IDE should be able to fix the imports.

case class TaggedSample[A](tag: String, sample: A)

case class TaggedProposalGenerator[A](
  name: String,
  generator: ProposalGenerator[A] with TransitionProbability[A]
  ) extends ProposalGenerator[TaggedSample[A]] with TransitionProbability[TaggedSample[A]] {

  override def propose(current: TaggedSample[A]): TaggedSample[A] = {
    TaggedSample(name,generator.propose(current.sample))
  }

  override def logTransitionProbability(from: TaggedSample[A], to: TaggedSample[A]): Double = {
    generator.logTransitionProbability(from.sample,to.sample)
  }
}

case class TaggedSampleVerboseLogger[A](ostream: OutputStream) extends AcceptRejectLogger[TaggedSample[A]] {

  var index = 0
  val writer = new PrintWriter(ostream)

  override def accept(current: TaggedSample[A], sample: TaggedSample[A], generator: ProposalGenerator[TaggedSample[A]], evaluator: DistributionEvaluator[TaggedSample[A]]): Unit = {
    log("A",current,sample,generator,evaluator)
    index += 1
  }

  override def reject(current: TaggedSample[A], sample: TaggedSample[A], generator: ProposalGenerator[TaggedSample[A]], evaluator: DistributionEvaluator[TaggedSample[A]]): Unit = {
    log("R",current,sample,generator,evaluator)
    index += 1
  }

  def log(tag: String, current: TaggedSample[A], sample: TaggedSample[A], generator: ProposalGenerator[TaggedSample[A]], evaluator: DistributionEvaluator[TaggedSample[A]]): Unit = {
    writer.println(s"${"%06d".format(index)} $tag ${sample.tag} ${evaluator.logValue(sample)}")
    writer.flush()
  }

  def stop(): Unit = {
    writer.flush()
    writer.close()
  }
}

object TaggedSample {
  
  object implicits {
    
    implicit class ProposalGeneratorTagger[A](val generator: ProposalGenerator[A] with TransitionProbability[A]) {
      def withName(name: String) = TaggedProposalGenerator(name,generator)
    }

    implicit class TaggedDistributionEvaluator[A](val evaluator: DistributionEvaluator[A]) extends DistributionEvaluator[TaggedSample[A]]{
      override def logValue(namedSample: TaggedSample[A]): Double = {
        evaluator.logValue(namedSample.sample)
      }
    }

    implicit class TaggedAcceptRejectLogger[A](val logger: AcceptRejectLogger[A]) extends AcceptRejectLogger[TaggedSample[A]] {
      override def accept(current: TaggedSample[A], sample: TaggedSample[A], generator: ProposalGenerator[TaggedSample[A]], evaluator: DistributionEvaluator[TaggedSample[A]]): Unit = {
        logger.accept(current.sample,sample.sample,generator.asInstanceOf[TaggedProposalGenerator[A]].generator,evaluator.asInstanceOf[TaggedDistributionEvaluator[A]].evaluator)
      }

      override def reject(current: TaggedSample[A], sample: TaggedSample[A], generator: ProposalGenerator[TaggedSample[A]], evaluator: DistributionEvaluator[TaggedSample[A]]): Unit = {
        logger.reject(current.sample,sample.sample,generator.asInstanceOf[TaggedProposalGenerator[A]].generator,evaluator.asInstanceOf[TaggedDistributionEvaluator[A]].evaluator)
      }
    }

    implicit class TaggedChainStateLogger[A](val logger: ChainStateLogger[A]) extends ChainStateLogger[TaggedSample[A]] {
      override def logState(sample: TaggedSample[A]): Unit = {
        logger.logState(sample.sample)
      }
    }

  }

}

Now we can import the functionality into our sampling script.

import TaggedSample
import TaggedSample.implicits._
import TaggedSampleVerboseLogger

We now transform proposals to ProposalGenerator[RenderParameter] and tag them to create a named version.

val namedShapeProposal = GaussianMoMoShapeProposal(0.05).toParameterProposal.withName("shape")

Next we can create a logger for those samples.

val file = new File("/tmp/taggedVerboseLogger.txt")
val os = new FileOutputStream(file)
val taggedVerboseLogger = TaggedSampleVerboseLogger[RenderParameter](os)

As new initial state we also need a tagged sample.

val taggedInit = TaggedSample("init",initial)

Now we can simply use the new proposal, the new init, the new logger together with old, untagged loggers and evaluators. They should be transformed implicitly when you made the imports.

To look at the information we write to the file, you can use the following application. The application will print and plot the ar-ratio for each proposal over the sampling run and plot the posterior values for the accepted and for the rejected samples.

import java.io.File
import breeze.plot._
import scala.io.Source

object Debugging extends App {

  // read lines and split at spaces
  val source = Source.fromFile(new File("/tmp/taggedVerboseLogger.txt"))
  val allLines = source.getLines().toIndexedSeq
  val infos = allLines.map(line => line.split(' '))

  // plot ar-ratios per proposal
  {
    val proposalNames = infos.map(_(2)).distinct // find name of logged proposals

    // evaluate the ar-ratio batch wise
    val ratioHistory = infos.grouped(1000).toIndexedSeq.map { lines =>

      // evalute the ar-ratio of each proposal
      val ratios = for (name <- proposalNames) yield {

        val selectedProposals = lines.filter(info => info(2) == name)
        val statusSeq = selectedProposals.map(_(1))

        val nA = statusSeq.count(_=="A")
        val nR = statusSeq.count(_=="R")
        val N = nA + nR
        val ratio = nA.toDouble / N

        // pretty print info
        val d = (ratio * 100).toInt
        print(s"${name} ${"%3d".format(d)}% / (${"%3d".format(N)}) \t|\t ")
        ratio
      }
      println()
      ratios
    }


    // make the ar-plot
    val f = Figure("AR-Plot")
    val p = f.subplot(0)
    val linspace = (0 until ratioHistory.size).map(_.toDouble)
    for( (selector,idx) <- proposalNames.zipWithIndex ) {
      val y = ratioHistory.map(seq => seq(idx))
      p += plot(linspace, y, name=selector)
    }
    p.legend = true
    p.ylim = (-0.05,1.05)
    f.refresh()
  }



  // plot posterior values for accepted and rejected samples separately
  {

    // get ((status, posterior-value) index) of samples
    val statusPValueIndex = infos.map(i => (i(1),i(3).toDouble)).zipWithIndex

    // divide accepted and rejected samples apart → (index,pvalue)
    val values = for (status <- Seq("A", "R")) yield {
      val setOfSamples = statusPValueIndex.filter(smp => smp._1._1 == status)
      setOfSamples.map{ case (info,index) =>
        (index,info._2)
      }
    }

    // plot values for the two sequences
    val f = Figure("neg. log. posterior-values")
    val p = f.subplot(0)
    for ( valueSequence <- values ) {
      val (x,y) = valueSequence.unzip
      p += plot(x.map(_.toDouble),y,'.')
    }
    f.refresh()

  }
}