bicepjai 4 hours ago

This is one of my favorite philosophical questions to ponder. I always ask it in interviews as a warm-up to hear the candidate's thoughts. I've noticed that interviewees often clam up, thinking it's a technical question, so I've been rewording it from one interview to the next to make it less scary. The interviews are for data scientist roles.

Buttons840 4 hours ago

I haven't read the article, but my understanding is that a normal curve results from summing many independent samples from most common probability distributions, and that summing normal curves gives another normal curve.

All summation roads lead to normal curves. (There might be an exception for weird probability distributions that do not have a mean; I was surprised when I learned these exist.)

Life is full of sums. Height? That's a sum of genetics and nutrition, and both of those can be broken down into other sums. How long the treads last on a tire? That's a sum of all the times the tire has been driven, and all of those times driving are just sums of every turn and acceleration.
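
The "all summation roads lead to normal curves" claim can be checked with a rough simulation. The sketch below is only an illustration: the three source distributions, the sample sizes, the seed, and the helper names (`standardized_sums`, `skew_and_excess_kurtosis`) are all assumptions made for this example. It sums 200 independent draws at a time and measures how far the resulting sums are from normal using skewness and excess kurtosis, both of which are 0 for an exact normal distribution.

```python
import random
import statistics

def standardized_sums(draw, n_terms, n_sums, rng):
    """Build n_sums sums of n_terms draws each, rescaled to mean 0 and std 1."""
    sums = [sum(draw(rng) for _ in range(n_terms)) for _ in range(n_sums)]
    mu = statistics.fmean(sums)
    sigma = statistics.pstdev(sums)
    return [(s - mu) / sigma for s in sums]

def skew_and_excess_kurtosis(zs):
    """Third and fourth standardized moments; both are 0 for a normal curve."""
    skew = statistics.fmean(z ** 3 for z in zs)
    excess_kurtosis = statistics.fmean(z ** 4 for z in zs) - 3.0
    return skew, excess_kurtosis

rng = random.Random(0)  # arbitrary seed, only for repeatability

# Three clearly non-normal sources (illustrative choices, all with finite variance).
sources = {
    "uniform": lambda r: r.random(),
    "exponential": lambda r: r.expovariate(1.0),
    "coin flip": lambda r: float(r.random() < 0.5),
}

results = {}
for name, draw in sources.items():
    zs = standardized_sums(draw, n_terms=200, n_sums=5000, rng=rng)
    results[name] = skew_and_excess_kurtosis(zs)
    skew, kurt = results[name]
    print(f"{name:12s} skew={skew:+.3f}  excess kurtosis={kurt:+.3f}")
```

All three sources have finite variance, which is the case where the sums should look normal; the numbers printed should be small but not exactly zero, since this is a finite sample.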

I'm not a data scientist. I'm just a programmer that works with piles of poorly designed business logic.

How did I do in my interview? (I am looking for a job.)

abetusk 2 hours ago

Say I have N independent and identically distributed random variables with finite mean. Assuming their sum converges to a distribution, what distribution does it converge to?

Buttons840 2 hours ago

A normal distribution.

abetusk 2 hours ago

Lévy stable [0].

If I had added the condition that the random variables have finite variance, you'd be correct. Without the finite variance condition, the distribution is Lévy stable.

Lévy stable distributions can have finite mean but infinite variance. They can also have infinite mean and infinite variance. Only the finite-mean, finite-variance case gives a Gaussian.

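Non-Gaussian Lévy stable distributions are often described as "fat-tailed", "heavy-tailed" or "power law" distributions, because their tails decay as a power law. In some sense, Lévy stable distributions are more normal than the normal distribution, since they are the limits that sums of i.i.d. variables can converge to, with the Gaussian as one special case. It might be tempting to dismiss the infinite variance condition but, practically, it just means that ever larger values keep turning up as you draw more samples from the distribution.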
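
A rough way to see the difference is to average more and more draws and watch whether the averages settle down. The sketch below is an illustration under stated assumptions: it uses the standard Cauchy distribution, which is a stable distribution from the infinite-mean, infinite-variance case, as the heavy-tailed example, and the helper names (`cauchy`, `iqr_of_means`), sample sizes, and seed are made up for this example. It compares the interquartile range of sample means for normal draws against Cauchy draws.

```python
import math
import random
import statistics

def cauchy(rng):
    """One standard Cauchy draw via the inverse CDF: tan(pi * (u - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def normal(rng):
    """One standard normal draw."""
    return rng.gauss(0.0, 1.0)

def iqr_of_means(draw, n_terms, n_reps, rng):
    """Interquartile range of n_reps sample means, each over n_terms draws."""
    means = [statistics.fmean(draw(rng) for _ in range(n_terms)) for _ in range(n_reps)]
    q1, _median, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

rng = random.Random(1)  # arbitrary seed, only for repeatability

spread = {}
for n_terms in (1, 20, 400):
    spread[n_terms] = (
        iqr_of_means(normal, n_terms, 2000, rng),
        iqr_of_means(cauchy, n_terms, 2000, rng),
    )
    normal_iqr, cauchy_iqr = spread[n_terms]
    print(f"n={n_terms:4d}  normal IQR={normal_iqr:.3f}  Cauchy IQR={cauchy_iqr:.3f}")
```

For the normal draws the spread of the averages shrinks roughly like 1/sqrt(n). For the Cauchy draws it does not shrink, because the average of n standard Cauchy variables is itself standard Cauchy: averaging never tames the tail.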

This was one of Mandelbrot's main positions, that power laws were much more common than previously thought and should be adopted much more readily.

As an aside, if you do ever get asked this in an interview, don't expect to get the job if you answer correctly.

[0] https://en.wikipedia.org/wiki/L%C3%A9vy_distribution

hilliardfarmer 4 hours ago

A lot of times I can't tell if I'm the idiot or if everyone else is. I'd say this isn't an interesting question at all, and the article was horrible. I studied data science for a few years and I'm no expert, but it seems pretty obvious to me that if you make a series of 50/50 choices randomly, that's the shape you end up with, and there's really nothing more interesting about it than that.

smcin 2 hours ago

Counting the outcomes of a series of 50/50 choices gives a binomial distribution, which (very crudely) approximates a normal distribution.

But the counterintuitive thing about the CLT is that it applies to distributions that are not normal.
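
The 50/50 case can be checked exactly, without simulation. The number of heads in n fair flips has a binomial distribution, and the sketch below compares its probabilities with the normal density that has the same mean and variance. The function names (`binomial_pmf`, `normal_pdf`, `max_gap`) and the chosen values of n are assumptions made for this illustration.

```python
import math

def binomial_pmf(n, k, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def max_gap(n, p=0.5):
    """Largest pointwise gap between the binomial pmf and the matching normal pdf."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_pdf(k, mu, sigma))
        for k in range(n + 1)
    )

for n in (4, 16, 64, 256):
    print(f"n={n:3d}  max gap={max_gap(n):.6f}")
```

The gap shrinks as n grows, which is the sense in which the approximation is crude for a handful of choices and good for many.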

alanbernstein 2 hours ago

I don't think "obvious" is the right word here. It makes perfect sense when you understand it, but it's not a conclusion that most people could come to immediately without detailed, assisted study.