Remix.run Logo
quietbritishjim 21 hours ago

> Big disadvantages of matlab:

I will add to that:

* it does not support true 1d arrays; you have to artificially choose them to be row or column vectors.

Ironically, the snippet in the article shows that MATLAB has forced them into this awkward mindset; as soon as they get a 1d vector they feel the need to artificially make it into a 2d column. (BTW (Y @ X)[:,np.newaxis] would be more idiomatic for that than Y @ X.reshape(3, 1) but I acknowledge it's not exactly compact.)

They cleverly chose column concatenation as the last operation, hardly the most common matrix operation, to make it seem like it's very natural to want to choose row or column vectors. In my experience, writing matrix maths in numpy is much easier thanks to not having to make this arbitrary distinction. "It's this 1D array a row or a column?" is just over less thing to worry about in numpy. And I learned MATLAB first, do I don't think I'm saying that just because it's what I'm used to.

D-Machine 20 hours ago | parent [-]

* it does not support true 1d arrays; you have to artificially choose them to be row or column vectors.

I despise Matlab, but I don't think this is a valid criticism at all. It simply isn't possible to do serious math with vectors that are ambiguously column vs. row, and this is in fact a constant annoyance with NumPy that one has to solve by checking the docs and/or running test lines on a REPL or in a debugger. The fact that you have developed arcane invocations of "[:,np.newaxis]" and regular .reshape calls I think is a clear indication that the NumPy approach is basically bad in this domain.

You do actually need to make a decision on how to handle 0 or 1-dimensional vectors, and I do not think that NumPy (or PyTorch, or TensorFlow, or any Python lib I've encountered) is particularly consistent about this, unless you ingrain certain habits to always call e.g. .ravel or .flatten or [:, :, None] arcana, followed by subsequent .reshape calls to avoid these issues. As much as I hated Matlab, this shaping issue was not one I ran into as immediately as I did with NumPy and Python Tensor libs.

EDIT: This is also a constant issue working with scikit-learn, and if you regularly read through the source there, you see why. And, frankly, if you have gone through proper math texts, they are all extremely clear about column vs row vectors and notation too, and all make it clear whether column vs. row vector is the default notation, and use superscript transpose accordingly. It's not that you can't figure it out from context, it is that having to figure it out and check seriously damages fluent reading and wastes a huge amount of time and mental resources, and terrible shaping documentation and consistency is a major sore point for almost all popular Python tensor and array libraries.

quietbritishjim 13 hours ago | parent [-]

> It simply isn't possible to do serious math with vectors that are ambiguously column vs. row ... if you have gone through proper math texts

(There is unhelpful subtext here that I can't possibly have done serious math, but putting that aside...) On the contrary, most actual linear algebra is easier when you have real 1D arrays. Compare an inner product form in Matlab:

   x' * A * y
vs numpy:

   x @ A @ y
OK, that saving of one character isn't life changing, but the point is that you don't need to form row and column vectors first (x[None,:] @ A @ y[:,None] - which BTW would give you a 1x1 matrix rather than the 0D scalar you actually want). You can just shed that extra layer of complexity from your mind (and your formulae). It's actually Matlab where you have to worry more - what if x and y were passed in as row vectors? They probably won't be but it's a non-issue in numpy.

> math texts ... are all extremely clear about column vs row vectors and notation too, and all make it clear whether column vs. row vector is the default notation, and use superscript transpose accordingly.

That's because they use the blunt tool of matrix multiplication for composing their tensors. If they had an equivalent of the @ operator then there would be no need, as in the above formula. (It does mean that, conversely, numpy needs a special notation for the outer product, whereas if you only ever use matrix multiplication and column vectors then you can do x * y', but I don't think that's a big deal.)

> This is also a constant issue working with scikit-learn, and if you regularly read through the source there, you see why.

I don't often use scikit-learn but I tried to look for 1D/2D agreement issues in the source as you suggested. I found a couple, and maybe they weren't representative, but they were for functions that could operate on a single 1D vector or could be passed as a 2D numpy array but, philosophically, with a meaning more like "list of vectors to operate on in parallel" rather than an actual matrix. So if you only care about 1d arrays then you can just pass it in (there's a np.newaxis in the implementation, but you as the user don't need to care). If you do want to take advantage of passing multiple vectors then, yes, you would need to care about whether those are treated column-wise or row-wise but that's no different from having to check the same thing in Matlab.

Notably, this fuss is precisely not because you're doing "real linear algebra" - again, those formulae are (usually) easiest with real 1D arrays. It when you want to do software-ish things, like vectorise operations as part of a library function, that you might start to worry about axes.

> unless you ingrain certain habits to always call e.g. .ravel or .flatten or [:, :, None] arcana

You shouldn't have to call .ravel or .flatten if you want a 1D array - you should already have one! Unless you needlessly went to the extra effort of turning it into a 2D row/column vector. (Or unless you want to flatten an actual multidimensional array to 1D, which does happen; but that's the same as doing A(:) in Matlab.)

Writing foo[:, None] vs foo[None, :] is no different from deciding whether to make a column or row vector (respectively) in MATLAB. I will admit it's a bit harder to remember - I can never remember which index is which (but I also couldn't remember without checking back when I used Matlab either). But the numpy notation is just a special case of a more general and flexible indexing system (e.g. it works for higher dimensions too). Plus, as I've said, you should rarely need it in practice.