Variational Inference (VI) is a method for solving the most common Bayesian problem: given observed data, find the probability distributions that govern its generation. While the problem and its solution now appear commonplace, when VI first appeared it was considered extremely innovative, since it was the very first solution…

Quantum computation (QC) has become a hot topic. Leading research institutes, corporations, and dedicated startups invest massive resources in studying this technology and its actual performance. As with nearly every innovative technology, we may ask both abstract questions, such as what it is or why it is trendy, as well…

Everyone who has dug into the DL world has probably heard, believed, or been the target of attempts to convince them that this is the era of **Transformers**. Since their very first appearance, **Transformers** have been the subject of massive study in several directions:

- Researchers searched for architecture improvements.
- People study…

This work was born as an outcome of a discussion with one of my savvy acquaintances. …

The golden ratio and the Fibonacci sequence are well-known “entities”. Whether you are a mathematician, an artist, or just a curious person, you have probably met them somewhere. In this post I will present a well-known property that ties them together.

I guess everyone knows what the Fibonacci sequence is…

In this post I aim to summarize a pretty “old” paper written by **Max Welling** and **Yee Whye Teh**. It presents the concept of **Stochastic Gradient Langevin Dynamics** (SGLD).

My motivation is to present the mathematical concepts that pushed SGLD forward. For those…

Several weeks ago, one of our business unit members told me that “life can be more convenient” if they were able to run our DL engines on their servers. Here, “servers” means running on Cygwin engines where Python is not always installed and, if it is, the version…

This post is a follow-up to my previous post. Recall that I discussed there whether, when a VAE is trained using the ELBO, the encoder converges to the desired optimal parameters (traditionally, we consider Gaussians, hence the parameters are the mean and standard deviation). It turned out that this is not the case…

The motivation for this post arose while learning the theory of VAEs and following their common implementations. As we know, a VAE is constructed of two networks: one (**the encoder**) is trained to map real data into a Gaussian distribution, aiming to optimize its KL divergence from the a…

The factorization problem is one of those problems that everyone can understand, and most of us feel that it is our opportunity: the divine breach into the mathematics hall of fame:

**Let N be an odd integer with a relatively big binary representation (e.g. …**