Saturday, November 3, 2018

The brave new world of computational photography, or why cell phones are killing SLRs

Until very recently photography, for decades, has not endured much change. Most changes to film emulsions, lens optics, and mechanics were more quantitive than qualitative.  Even the first switch to digital which amounted to slapping a digital sensor instead of the film was not revolutionary.

The art of photography is of translating the image of the real world to a still picture on an arbitrary media (paper, screen, slide). It was never about fidelity as the precise replication of that is seen by the human eye is physically impossible. The "art" part is how the picture is transformed. The arsenal of effects of traditional photography was limited: depth of field, focal distance, movement blur, color curves, vignetting, and various optical distortions like chromatic aberrations and results of using optical filters, to name a few. The post-processing steps add a few more, but let us put these aside for now.

As a photographer, I've built a mental model of the camera system, and learned to estimate how changing its parameters will affect the image I am taking. Again, fundamentally they are very few: focus, sensor ISO sensitivity, focal length (zoom), aperture, and exposure. For more specialized cases you may want to know about the type of shutter (running or circular) and your optical filters properties. Most cameras, from $30 point-and-shoot to $3,000 SLR just help you to manage these very same parameters.

The early generations of mobile phone cameras worked exactly like that. On the top of the line (at the time of this writing) Samsung phone, Galaxy S9, you can switch to "Pro" mode and tweak focus, exposure, aperture, sensitivity as some now long dead photographer did on his TLR camera 130 years ago!

But unlike cameras, modern phones have formidable computational resources. They can process data from camera sensor in real-time using their fast CPUs and GPUs. They can do it while taking a picture. Suddenly techniques like using imperceptible hand motion to get several slightly offset images from which a higher-resolution image could be reconstructed [1] become possible. Another burst technique, HDR, used to achieve the better dynamic range become so standard that it is enabled by default on Galaxy S9 and Google Pixel 3. The camera sensors are also evolving. Companies like Lytro developed sensors which capture information about the direction that the light rays are traveling in space. Apple first pioneered phones with dual cameras which opens a world of possibilities. Now you can take simultaneously two pictures at different focal lengths, with different parameters and combine information from them to construct a composite image. This could also be combined with burst techniques. Finally, as the two cameras a physically apart and giving two distinct vantage points, one can try to reconstruct some of the 3D features of the scene.

Comparing digital cameras become more difficult. Before you look at sensor resolution (megapixels), dynamic range, color response curves, focal length, and range of apertures and exposures as the starting point to compare 2 cameras. This is no longer true. Your phones now have multiple cameras, taking multiple sensor readings for each picture, controlled by complex AI-driven proprietary algorithms, which are sometimes as smart as detecting the types of multiple objects in your viewfinder and choosing the best way to render them in a shot.

My Google Pixel 3 camera does not have a "Pro" mode. And this is not because they chose to "dumb it down" for the consumer. The reason is that the set of controls we used to is no longer applicable. The world has changed and many of the skills I've learned over years as a photographer no longer apply. Finally, the real disruption to the camera industry has arrived.

Thursday, March 9, 2017

Writer comonad in Coq

In the last post, we reviewed Writer comonad in Haskell. In this post we will implement it in Coq. Since monads are not the part of standard Coq library, and we will use ExtLib which provide basic monad definitions.

First, we need to define Writer monad. ExtLib only provides WriterT monad transformer, from which we can trivially create a Writer monad:

Variable s t : Type.
Variable Monoid_s : Monoid s.

Definition writer := writerT Monoid_s ident.
Definition runWriter x := 
   unIdent (@runWriterT s Monoid_s ident t x).
Definition execWriter x:= psnd (runWriter x).
Definition evalWriter x:= pfst (runWriter x).

Here, we define `writer` type constructor and `runWriter` method.  Additionally, we define a couple of convenience methods on top of `runWriter` to extract state and the value from the results of unwrapping the monad.

ExtLib has CoMonad class defined as:

Class CoMonad (m : Type -> Type) : Type :=
{ coret : forall {A}, m A -> A
; cobind : forall {A B}, m A -> (m A -> B) -> m B
}.

The first thing you notice is that the names are different from the ones we've seen in Haskell. They reflect category theory duality between monads and comonads. Essentially 'coret' is 'extract,' and 'cobind' is 'extend' with slightly different argument order. For now, let us stick to these definitions. We define a Writer comonad as:

Variable w: Type.
Variable m: Monoid w.

Global Instance WriterCoMonad:  CoMonad (@writer w m) := {
  coret A x := evalWriter x ;
  cobind A B wa f := tell (execWriter wa) ;; ret (f wa)
}.

The definition is pretty much the same as we wrote in Haskell except for the adjustments for Coq/ExtLib monad syntax.

Since we are in Coq, in addition to defining the comonad we can formalize comonad laws and prove that our implementation complies with them. Unfortunately, `cobind` argument's order make the formulation of comonad rules little clumsy, so we will first define wrappers reverting it to familiar Haskell's name and argument order:

Variable m : Type -> Type.
Variable C : CoMonad m.

Definition extract {A:Type}: (m A) -> A := @coret m C A.
Definition extend {A B:Type} (f: m A -> B): m A -> m B  :=
    fun x => @cobind m C A B x f.

Now we can define CoMonadLaws class which specifies the rules:

Class CoMonadLaws : Type := {
  extend_extract: forall (A B:Type),   extend (B:=A) extract = id ;
  extract_extend: forall (A B:Type) {f},   extract  (@extend A B) f = f;
  extend_extend: forall (A B:Type) {f g}, (@extend B A) f  (@extend A B) g = extend (f  extend g)
}.

Finally, we will prove that our WriterComonad satisfies these rules by instantiating this class and proving the rules. To do this, we need to assume that the proof is provided that underlying monid instance used in our Writer complies to MonoidLaws.

Variable ml: MonoidLaws m.

Global Instance WriterCoMonadLaws:
    CoMonadLaws (@WriterCoMonad).
  Proof.
    split.
    -
      intros A B.
      unfold extract, extend.
      extensionality x.
      unfold coret, cobind, WriterCoMonad, id.
      unfold evalWriter, execWriter, runWriter.
      destruct x, runWriterT, unIdent.
      simpl.
      repeat f_equal.
      apply ml.
    -
      reflexivity.
    -
      intros A B f g.
      unfold extract, extend, compose.
      unfold coret, cobind, WriterCoMonad.
      extensionality x.
      repeat f_equal.
      apply ml.
  Qed.

For clarity in the code snippets above, I have omitted some auxiliary code. You can look at the full example at WriterComonad.v.

Wednesday, March 8, 2017

Writer comonad in Haskell

Writer monad is readily available in Haskell. But what about Writer comonad? Let us start with the basics. Comonad class is defined as:

Comonad w where
    extract :: w a -> a
    duplicate :: w a -> w (w a)
    extend :: (w a -> b) -> w a -> w b

The minimal implementation requires you to implement 'extract' and 'extend.'  Additionally, it must satisfy the following comonad laws:

1
2
3
extend extract      = id
extract . extend f  = f
extend f . extend g = extend (f . extend g)

The 'extract' is pretty straightforward -- it allows safely extract value from the monad:

extract x = fst $ runWriter x

Our first intuition for 'extend' would be something like this:

extend f wa = return (f wa)

Which is incorrect, as it losses the state accumulated so far and replaces it with 'mempty' and thus do not satisfy the 1st law above (extend extract = id).

Without further ado, the writer comonad definition below correctly preserves the state and meets all three laws:

instance (Monoid w) => Comonad (Writer w) where
    extract x = fst $ runWriter x
    extend f wa = (tell $ execWriter wa) >> return (f wa)

Links:

  1. Haskell comonad library
  2. Category Theoretical view of comonads
  3. Another implementation of Writer and Reader comonads in Haskell

Friday, January 13, 2017

Switching between 3 languages in Ubuntu

I use three languages: English, Russian, and Ukrainian.



Switching between them is a bit tricky. Usually, you type in one of them, occasionally switching to a second one. It is almost never you need to randomly switch between all 3. MacOS keyboard switching mechanism reflects that, with default keyboard language switch shortcut toggling between the two most recently used languages. If you want to switch to the 3rd one, you can use toolbar menu or a separate keyboard shortcut:



When I switched to Ubuntu, the first thing I've noticed is that input switch shortcut cycles through all 3 languages. This is very inconvenient. Of course, I wanted to replicate MacOS behavior. With a little help from Stack Exchange, I was able to do so. Save this script as `set_lang.py` somewhere and make sure it executable:

Then, go to "System Settings" > "Keyboard" > "Shortcuts" > "Custom Shortcuts". Click the "+" and add a custom shortcut to it. I use Alt-SPACE. It will cycle between last 2 languages.


P.S. Perhaps I should try to supply a patch or at least open a Gnome bug for this behavior. If anybody familiar with Gnome infrastructure could point me in a right direction I can work on this.

Wednesday, November 2, 2016

BitBucket Git checkout over SSH with multiple accounts

If you have multiple user accounts at BitBucket (e.g. personal and work), you may run into a problem. The first think you will notice that you could not use the same ssh key. Adding it to the second account will give you "the key is already used by someone else" error message. Now if you generate a new key and add it to your ssh-agent (via ssh-add), you will not be able to check out from one of your accounts.

The problem is caused by ButBucket identifying a user by first SSH key he or she supplies. There are several workarounds for this problem, but the one I prefer is to use ssh Host alias. In your ~/.ssh/config file add section like this:
Host bitbucket-cm
    HostName git.bitbucket.com
    User git
    IdentityFile ~/.ssh/id_rsa.1
    IdentitiesOnly yes
This would basically create a SSH host alias bitbucket-cm for git.bitbucket.org associated with id_rsa.1 key. Now, instead of usual checkout URL, say, 
git clone git@bitbucket.org:org/repo.git 
use:
git clone git@bitbucket-cm:org/repo.git


Wednesday, July 13, 2016

Pokemon Go circa 2012

Looking at current Pokemon Go success (It added whopping $9B to company's capitalization) I could not help help looking back at another project we did few years ago.

Back in 2012 we, at Codeminders, were working on Computer Vision algorithms for mobile phones. At some point we realized that there is a potential for a business there and developed a demo and even pitched to some angel investors. The investors told us that the gaming aspect is not interesting, and we should focus on some real-life problems like real estate, etc. Eventually we just published a technology demo in app store and let it go.

To demonstrate location-based augmented reality gaming we have chosen Pikachu avatar which you can place at random locations, where your friends can find it. The unique part was that it worked indoors, not relying on GPS alone.

Here is an image from our investor's pitch:


You can see our presentation here (I've removed some boring slides related to financials and planning). Also you can try yourself our demo (Android version, iOS version). In the demo we replaced Pikachu with Squirrel to avoid infringing on Nintendo's trademarks.

Saturday, February 6, 2016

State Monad in Coq Tutorial

In this post we will review usage of StateMonad in Coq using ExtLib on an adapted example from  a Haskell State Monad tutorial.

We will implement a simple game function which takes a string from dictionary {a,b,c} and produces a score number. The game could have two modes: on and off. It consumes an input string character by character and updates the mode and the score at each step. Initially the mode is off. The character 'c' toggles the on/off mode flag. In off mode, consuming characters does not affect the score. In on mode, each occurrence of the character 'a' increases the score by 1, and each occurrence of the character 'b' decreases it by 1. So for example, the input sequence "ab" will result in a score of 0, "ca" a score of 1, "cabca" also a 0, and "abcaaacbbcabbab" will result in a score of 2.

Let us look the implementation of this game in Coq using State Monad:

1 Require Import Coq.ZArith.ZArith_base Coq.Strings.String.
2 Require Import ExtLib.Data.Monads.StateMonad ExtLib.Structures.Monads.
3 
4 Section StateGame.
5   Import MonadNotation.
6   Local Open Scope Z_scope.
7   Local Open Scope char_scope.
8   Local Open Scope monad_scope.
9 
10   Definition GameValue : Type := Z.
11   Definition GameState : Type := (prod bool Z).
12 
13   Variable m : Type → Type.
14   Context {Monad_m: Monad m}.
15   Context {State_m: MonadState GameState m}.
16   
17   Fixpoint playGame (s: string): m GameValue :=
18     match s with
19     |  EmptyString =>
20        v <- get ;;
21          let '(on, score) := v in ret score
22     |  String x xs =>
23        v <- get ;;
24          let '(on, score) := v in
25          match x, on with
26          | "a", true =>  put (on, score + 1)
27          | "b", true => put (on, score - 1)
28          | "c", _ =>   put (negb on, score)
29          |  _, _  =>    put (on, score)
30          end ;; playGame xs
31     end.
32 
33   Definition startState: GameState := (false, 0).
34 End StateGame.
35 
36 Definition main : GameValue :=
37   (evalState  
38            (playGame (state GameState) "abcaaacbbcabbab") startState).

I will omit description of the Coq nuts and bolts of Require, Import, and Open Scope instructions in lines 1-8.

In lines 10 and 11, we declare type aliases for our game score value as a signed integer (Z), and game state as a product of the boolean game mode and and an integer game score.

Variable 'm' in line 13 identifies our Monad by it's type constructor. Line 14 states that 'm' must be an instance of type class Monad. If we look at ExtLib's definition of this type class we will see that it just postulates the presence of the standard Monad bind and return (called ret here) functions:

Class Monad (m : Type → Type) : Type := { 
    ret : ∀ {t}, t → m t ;
    bind : ∀ {t u}, m t → (t → m u) → m u
}.

The next class, mentioned in line 15 brings additional get and put definitions, specific for State Monad:

Class MonadState (T : Type) (m : Type → Type) : Type := {
    get : m T ;
    put : T → m unit
}.

The actual game function definition begins on line 17. It is a recursive function consuming an input string and returning a monadic value as a result. The string is an inductive type and we match its two constructors on lines 19 and 22, respectively. If the string is empty, we use syntax x <- c1 ;; c2 which is a Coq's rendering of well a known do-notation. Informally: we extract a score element from the current state tuple and return it as a monadic value using ret. While in Haskell, we can write a much terser expression (_, score) <- get, in Coq we have to use an intermediate value 'v' followed by destructive let, which unfortunately does not allow us to use a wildcard for the first element (mode), which we do not need anyway.

In the case of the second constructor on line 22 we again use do-notation and destructing let to break down the current state into on and score components and then analyze their combinations via match. Again, we use slightly different syntax from Haskell's version and, instead of matching on x with guards, we match on both x and on. Depending on their combination, we update the score and on/off flag according to our game rules using the state monad's put function. Then, we proceed with the next recursive iteration chained by c1 ;; c2 notation which is sometimes called then operator (>>) in Haskell, and used to to chain monadic computations where c2 does not depend on the outcome of c1. This could be expanded into our familiar do-notation of:  _ <- c1 ;; c2.

On line 33, we define our initial game state (game off, score=0) as startState.

On line 36, we define our 'main' function, which runs a game for a specific string and returns an integer score. It is a bit difficult to parse, so for clarity let us split it into several definitions, specifying return types and making all arguments explicit via '@':

Definition m : Type → Type := state GameState.
Definition pg :  state GameState GameValue := @playGame m _  _ "abcaaacbbcabbab".
Definition main : GameValue := @evalState GameState GameValue pg startState.

The first definition 'm' uses the state record type to create our State Monad's constructor. It is defined as:

state: Type → Type → Type
Record state (S t : Type) : Type := mkState { runState : S → t * S }

We partially specialize state with GameState, which gives us a new type constructor (Type → Type), which instantiates a type which holds a function with a type GameState -> t * GameState.

Th next definition 'pg' holds a monadic value returned as a result of running our game. You may notice that its type is state GameState GameValue which is the same as m GameValue, the type of  playGame. Please note that playGame was defined inside a section. Consequently, when seen outside one, it is additionally universally quantified with all section variables. In our case playGame now takes 4 arguments. The first is a type constructor for the monad, the second is the type class instance Monad m that carries the monad functions, the third is the type class instance MonadState GameState m that carries the monad state functions, and the fourth is the string input to the function. We explicitly provide 'm' but use wildcards for type class parameters, letting the type class resolution mechanism find instances for parameters marked as _ . Coq will look for a type class instance of Monad (state GameState) and MonadState (state GameState). Both of these are defined as Global Instances in the ExtLib.Data.Monads.StateMonad library so type class resolution finds the definitions. Thus, after class resolution, the 'pg' definition will expand to:

(@playGame (state GameState) (Monad_state GameState)
     (MonadState_state GameState) "abcaaacbbcabbab"

Finally in the 'main' definition, we use the evalState function to get our result. It is defined as:

Definition evalState {t} (c : state t) (s : S) : t :=
    fst (runState c s).

The semantics are pretty straightforward; we apply State Monad to the initial state (run state transition function), and project out the first element of the returned tuple (the score).

To test our implementation we can use Compute main instruction to get the expected result (2).

P.S. I would like to thank Gregory Malecha for the explanations, comments, and suggestions he provided.