Small Simplicity

Understanding Intelligence from Computational Perspective

Feb 01, 2020

Multimodal Distribution in Image or Text domain

Q: What does "multimodal distribution" mean in computer vision literature (eg. image-to-image translation)?

While reading papers on conditional image generation using generative modeling (eg. "Toward Multimodal Image-to-Image Translation" by Zhu et al (NIPS 2017)), I wasn't clear what was meant by "one-to-many mapping" between input image domain and output image domain, "multimodal distribution" in the output image domain, or "multi-modal outputs" (eg. Quora).

Definition

In statistics, a multimodal distribution is a continuous probability distribution with two or more modes (distinct peaks; local maxima) - wikipedia

(single-variable) bimodal distribution bivariate multimodal distribution
bimodal bivariate-multimodal

In high-dimensional space (such as an Image domain: \(P(X)\) where X lives in \(d\)-dim space where \(d\) is the number of pixels, eg. \(32x32=1024\). If each pixel \(X_i\) takes a binary value (0 or 1), the size of this image domain is \(2^{1024}\). If each pixel takes an integer value \(\in [0,255]\), then the size of this image domain is \(256^{1024}\). This, by the way, is too big to compute for Mac's spotlight:

too-big

What it means by saying "the distribution of (output) image is multimodal" is to say, there are multiple images (ie. realization of the random variable (vector) X) with the (local) maxima value of the probability. In Figure below, the green dots represent the local maxima, ie. modes of the distribution. The configurations (ie. specific values/realizations) that achieves the (local) maximum probability density are the "probable/likely" images.

gan-multimodal-outputs multimodal-annot
The green dots representing modes of the distribution over the image domain (which is abstracted into a 2Dim space for visualization, in this case)

So, given one input image, if the distribution of the output image random variable is multi-modal, the standard problem of

Find \(x\) s.t. \(\underset{x \in \mathcal{X}}{\arg\max} P(X)\) (\(\mathcal{X}\) is the image space)
has multiple solutions. According to the paper (Toward Multimodal Image-to-Image Translation), many previous works have produced a "single" output image as "the" argmax of the output image distribution. But this is not accurate if the output image distribution is multi-modal. We would like to see/generate as many of those argmax configurations/output images. One way to do so, is by sampling from the output image distribution. This is the paper's approach.


Multimodal distribution as the distribution over the space of target domains [Domain adaption/transfer leraning]

So far, I viewed the multimodal distribution as a distribution over a specific domain (eg. Image domain), and the random variable corresponded to a realization, eg. an observed/sampled/output image instance. However,

Jan 20, 2020

Stochastic Thinking: Predictive non-determinism

MIT 6.0002 Lec4: Stochastic Thinking - YOUTUBE - predictive-nondeterminism

Often confusing categorization of a mathematical model: - SE - NB: in CS, people often use "deterministic" to mean non-randomized. This causes confusion: > "Determinism" means non-randomized. But, "Non-determinism" does not mean "randomized". - Determinism vs. Non-Determinism - ...? vs. stochastic/random - a stochastic (or random) process means,

<p hidden>![stochastic-process](images/stochastic-process.png)</p>

I think better way to put this confusion into words is: "Nondeterinistc vs. Probabilistic models" - lavalle 2006

Jan 16, 2020

How to read a paper

Before you start

Q: "why are you reading this?"

  • Write it down where you can see it while reading the paper
    • Your purpose/goal of reading may change later. You will have a different experience then.
  • Is there a clear answer for this question? If not, you probably should not go on reading the paper

Warm-up (1 hr)

Think of it like going on a date with a new person. It's a new relationship, so don't try/expect to understand it in one go -- this is rude:)

  • Go to a quiet place for a few hours. Take your coffee with you

  • Start by reading the title and abstract

    • Goal: gain a high level overview of the paper
    • What are the main goals of the authors?
    • What are the high level results?
    • What is the problem the paper is solving?
  • Skim the paper (~15min)

    • Look at the figures
    • Jot down any keywords to look out for when reading
    • Goal: get a sense for the layout of the paper; get keywords to look out for
  • Go to introduction, especially if you feel unfamiliar with the field/paper. Okay to do it often.

    • Goal: get other references to fill in the gap in your understanding
  • Carefully step through each figure

    • why?: each figure contain key points of the paper. Authors spend a lot of time creating them and try to condense important information that supports their experiments/hypothesis. Pay particular attention to them.
    • Goal: Gain feel for what the authors think is most important; Write down what to look out for when reading the paper in detail (which will follow soon)
  • Take a break. Walk a bit.

First ~pass~ date (1.5hr)

Start taking high level notes. Expect new words, unfamiliar ideas. Mark those (you don't yet need to understand every single word), move on.

This is your first date with the paper. You are not going to learn all gory details about it, but you will ask good questions, understand what motivated the paper, and what it's going to be about.

  • Begin again with the abstract, skim through the introduction*

  • Diligent pass through the methods section

    • Goal: Draw down the overall setup
  • Read the results and discussion

    • Goal: write down the key findings and how they were determined
  • Take a break. Do jumping jacks. Sing a song.

Let's continue.

  • Revisit the figures: by now, you should be able to get into nitty gritties of the figures (having read the methods, results, and discussion section)
    • Goal: find more gems from the figures.
    • Spend about 30min ~ 1hr

Second full pass (1-2hrs)

Goal:

  • Focus on shoring up what you didn't understand previously,
  • Gain a command of the methods section
    • Test if you can write a pseudocode
  • Being a critical reader of the discussion section

Details:

  • Pay particular attention to the areas you marked as being difficult to understand. This is why you read a new paper. Don't play safe. Okay to feel uncomfortable. Okay to do it the following day (but don't push it back too much).

  • Leave no word undefined, unclear. Make sure you understand every sentence.

  • Skim through areas you feel confident in (eg. abstract, intro, results)

Guiding Questions

  • from Quora

  • What previous research and ideas were cited that this paper is building off of? (usually introduction)

  • Was there reasoning for performing this research, if so what was it? (introduction)
  • Clearly list out the objectives of the study
  • Did you write down 3 on your note?
  • Was any equipment/software used? (methods)
  • What variables were measured during experimentation? (methods)
  • Were any statistical tests used? What were their results? (methods/results)
  • What are the main findings? (results)
  • How do these results fit into the context of other research and their 'field'? (discussion)
  • Explain each figure and discuss their significance.
  • Did you write down 9 on your note?
  • Can the results be reproduced and is there any code available?
  • Name the authors, year, and title of the paper!
  • Are any of the authors familiar, do you know their previous work?
  • What key terms and concepts do I not know and need to look up in a dictionary, textbook, or ask someone?
  • What are your thoughts on the results? Do they seem valid?

Apply the technique

Most importantly, apply this guideline to your reading. Modify it to suit your personality.


Write a reading report

This is the end product of your reading. Without it, you didn't do your job.
^Really.

To check out ## check out: - Jason Eisner (JHU): [how to read a paper](https://www.cs.jhu.edu/~jason/advice/how-to-read-a-paper.html) - Prof.Murat at Buffalo: - [how to lead a reading group](https://tinyurl.com/rbree4d) - [how he reads a paper](http://muratbuffalo.blogspot.com/2013/07/how-i-read-research-paper.html) - how Prof. Nancy Lynch works: cool! - Cathy Wu, MIT: [how to lead a reading group](https://tinyurl.com/rbree4d)

Jan 10, 2020

Let's blog with Pelican

Makefile

  1. make html: generates output html files from files in content folder using development config file

    • make regenerate: do make html with "listening" to new changes
    • vs. make publish: similar to make html except it uses settings in pulishconf.py
  2. make serve: (re)starts a http server in the output folder. Default port is 8000

  3. Go to localhost:<PORT> to see the output website
  4. ghp-import -b <local-gh-branch> <outputdir>: imports content in to the local branch local-gh-branch

Workflow

Key points:

  • Do every work in dev branch.
  • Do not touch blog-build or master.
  • blog-build will be indirectly modified by ghp-import (or make publish-to-github)
  • and master is the branch that Github will access to show my website.
  • So, manage the source (and outputs) only in dev branch.

Local dev

  1. Activate the right conda env with pelican library
conda activate my-blog 

cd ~/Workspace/Blog
  1. Make sure you are on dev branch git checkout dev

  2. (this step is for github syncing) Add new files under content git add my-article.md

  3. Generate the content with pelican make html # or, make regenerate

  4. Start a local server make server

  5. Open a browser and go to localhost:8000 Or whatever the port number is, set in Makefile (under the variable name of PORT)

Global dev

  1. Use make publish instead of make html
  2. Update the blog-build branch with contents in output folder
  3. Push the blog-build branch's content to origin/master

These three steps can be done in one line:

make publish-to-github

Version control the source

Important: Write new contents only on the dev branch

git add <files> # make sure not to push the output folder

git cm "<commit message>"

git push origin dev #origin/dev is the remote branch that keeps track of blog sources

Jan 01, 2020

Linear Regression

Todo:

  • [ ] Set up the problem
  • [ ] Simple housing problem
  • [ ] Frequentist view - minimize the squared-loss
  • [ ] Probablistic view - Discriminative classifier to model conditional distribution
  • [ ] Interactive example using holoviews or ipywwidgets
← Previous Next → Page 2 of 3