quietbritishjim 13 hours ago

> It simply isn't possible to do serious math with vectors that are ambiguously column vs. row ... if you have gone through proper math texts

(There is unhelpful subtext here that I can't possibly have done serious math, but putting that aside...) On the contrary, most actual linear algebra is easier when you have real 1D arrays. Compare an inner product form in Matlab:

   x' * A * y

vs numpy:

   x @ A @ y

OK, saving one character isn't life-changing, but the point is that you don't need to form row and column vectors first (x[None,:] @ A @ y[:,None] - which BTW would give you a 1x1 matrix rather than the 0D scalar you actually want). You can just shed that extra layer of complexity from your mind (and your formulae). It's actually Matlab where you have to worry more - what if x and y were passed in as row vectors? They probably won't be, but in numpy the question never arises.
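
To make the 0D-vs-1x1 point concrete, here is a quick sketch. The values of x, A and y are made up purely for illustration.

```python
import numpy as np

# Made-up example values, purely for illustration.
x = np.array([1.0, 2.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
y = np.array([5.0, 6.0])

# With real 1D arrays, @ contracts the axes directly and yields a 0-D scalar.
s = x @ A @ y

# Forcing explicit row and column vectors yields a 1x1 matrix instead.
m = x[None, :] @ A @ y[:, None]

print(np.shape(s))  # ()
print(m.shape)      # (1, 1)
```

Both hold the same number; the 2D version just wraps it in two extra axes that you then have to strip off again.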

> math texts ... are all extremely clear about column vs row vectors and notation too, and all make it clear whether column vs. row vector is the default notation, and use superscript transpose accordingly.

That's because they use the blunt tool of matrix multiplication for composing their tensors. If they had an equivalent of the @ operator then there would be no need, as in the above formula. (It does mean that, conversely, numpy needs a special notation for the outer product, whereas if you only ever use matrix multiplication and column vectors then you can do x * y', but I don't think that's a big deal.)
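
For what it's worth, here is a small sketch of the outer-product options in numpy, with made-up vectors. np.outer is the dedicated function; the other two lines are the broadcasting and "column times row" spellings, which should all agree.

```python
import numpy as np

# Made-up example vectors of different lengths.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0])

outer_a = np.outer(x, y)             # dedicated outer-product function
outer_b = x[:, None] * y[None, :]    # elementwise product with broadcasting
outer_c = x[:, None] @ y[None, :]    # column @ row, the analogue of x * y'

print(outer_a.shape)  # (3, 2)
```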

> This is also a constant issue working with scikit-learn, and if you regularly read through the source there, you see why.

I don't often use scikit-learn, but I tried looking for 1D/2D agreement issues in the source as you suggested. I found a couple, which may not be representative. They were in functions that can operate on a single 1D vector but also accept a 2D numpy array, where the 2D array philosophically means something more like "list of vectors to operate on in parallel" than an actual matrix. So if you only care about 1D arrays then you can just pass one in (there's a np.newaxis in the implementation, but you as the user don't need to care). If you do want to take advantage of passing multiple vectors then, yes, you need to care about whether they are treated column-wise or row-wise, but that's no different from having to check the same thing in Matlab.
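
Roughly the pattern I mean, as a sketch. row_norms is a hypothetical helper I made up for illustration, not scikit-learn code, and the choice that each row is one vector is my assumption.

```python
import numpy as np

def row_norms(X):
    """Hypothetical helper: Euclidean norm of one 1D vector,
    or of each row of a 2D "list of vectors"."""
    X = np.asarray(X, dtype=float)
    single = X.ndim == 1
    if single:
        # Treat a lone vector as a batch containing one row.
        X = X[np.newaxis, :]
    norms = np.sqrt((X ** 2).sum(axis=1))
    # Hand back a scalar if the caller passed a single vector.
    return norms[0] if single else norms

print(row_norms([3.0, 4.0]))                 # 5.0
print(row_norms([[3.0, 4.0], [6.0, 8.0]]))   # [ 5. 10.]
```

The caller with a plain 1D array never sees the np.newaxis; only the caller passing a batch has to know that rows are the vectors.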

Notably, this fuss is precisely not because you're doing "real linear algebra" - again, those formulae are (usually) easiest with real 1D arrays. It's when you want to do software-ish things, like vectorise operations as part of a library function, that you might start to worry about axes.

> unless you ingrain certain habits to always call e.g. .ravel or .flatten or [:, :, None] arcana

You shouldn't have to call .ravel or .flatten if you want a 1D array - you should already have one! Unless you needlessly went to the extra effort of turning it into a 2D row/column vector. (Or unless you want to flatten an actual multidimensional array to 1D, which does happen; but that's the same as doing A(:) in Matlab.)
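
A small sketch of both cases, using made-up arrays. The order="F" argument asks for column-major flattening, which is the ordering Matlab's A(:) uses.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])   # already 1D, nothing to flatten
col = v[:, None]                # only needs .ravel() because we made it 2D

M = np.arange(6).reshape(2, 3)  # an actual 2D array: [[0, 1, 2], [3, 4, 5]]
flat_c = M.ravel()              # row-major (numpy default): [0 1 2 3 4 5]
flat_f = M.ravel(order="F")     # column-major, like A(:): [0 3 1 4 2 5]

print(v.shape, col.shape, col.ravel().shape)  # (3,) (3, 1) (3,)
```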

Writing foo[:, None] vs foo[None, :] is no different from deciding whether to make a column or row vector (respectively) in Matlab. I will admit it's a bit harder to remember - I can never remember which index is which (though back when I used Matlab I couldn't remember without checking either). But the numpy notation is just a special case of a more general and flexible indexing system (e.g. it works for higher dimensions too). Plus, as I've said, you should rarely need it in practice.
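
For reference, a quick sketch of which is which, with a made-up foo, plus the same notation on a 2D array.

```python
import numpy as np

foo = np.array([1.0, 2.0, 3.0])   # made-up 1D array

col = foo[:, None]   # new axis last  -> column vector, shape (3, 1)
row = foo[None, :]   # new axis first -> row vector, shape (1, 3)

# The same notation extends to higher dimensions.
T = np.zeros((2, 3))
T_mid = T[:, None, :]   # new axis in the middle, shape (2, 1, 3)

print(col.shape, row.shape, T_mid.shape)  # (3, 1) (1, 3) (2, 1, 3)
```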