xelxebar · 3 months ago
> Subordination of detail

The paper doesn't really explore this concept well, IMHO. However, after a lot of time reading and writing APL applications, I have found that it points at a way of managing complexity radically different from abstraction.

We're inundated with abstraction barriers: APIs, libraries, modules, packages, interfaces, you name it. The consequences of this approach are almost cliché at this point: dizzyingly high abstraction towers, developers as mere API-gluers, disconnect from the underlying hardware, performance that is hard to reason about, etc.

APL makes it really convenient to take a different tack. Instead of designing abstractions, we can carefully design our data to be easily operated on with simple expressions. Where you would normally see a library function or DSL term, this approach just uses primitives directly. For example, we can create a hash map of vector values and interned keys with something like
Standard operations are then immediately accessible:

    k v⍪←↓⍉↑(2 0.33)(2 0.01)(3 0.92)   ⍝ insert values
    k{str[⍺] ⍵}⌸v                      ⍝ pretty print
    k v⌿⍨←⊂k≠str⍳⊂'buggy'              ⍝ deletion
What I find really nice about this approach is that each expression is no longer a black box, making it really natural to customize expressions for specific needs. For example, insertion into a hashmap would normally need code for potentially adding a new key, but above we're making use of a common invariant: we only need to append values to existing keys. If this were a library API, there would either be an unused code path here, lots of variants on the insertion function, or some sophisticated type inference to do dead-code elimination. Those approaches end up leaking non-domain concerns into our codebase. But by subordinating detail instead of hiding it, we give ourselves access to as much domain-specific detail as necessary, while letting the non-relevant detail sit silently in the background until needed.

Of course, doing things like this in APL demands a lot of familiarity with APL expressions, but honestly, I don't think that ends up being much more work than deeply learning the Python ecosystem or anything equivalent. In practice, the individual APL symbols really do fade into the background and you start seeing semantically meaningful phrases instead, similar to how we read English words and phrases atomically and not one letter at a time.
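For readers who don't speak APL, here is a hypothetical Python sketch of the parallel-array scheme described above (names like `str_tab`, `insert`, and `lookup` are my own; the APL version keeps three flat arrays: the interned key strings, the key indices `k`, and the values `v`):

```python
# Hypothetical sketch of the parallel-array "hash map":
# str_tab interns the key strings; k and v are parallel arrays of
# (key-index, value) rows, so one key can own several values.
str_tab = ["buggy", "fast", "slow"]    # interned keys
k = []                                  # key indices, parallel to v
v = []                                  # values

def insert(key, value):
    # Relies on the same invariant as the APL line: the key is
    # assumed to already be present in the intern table.
    k.append(str_tab.index(key))
    v.append(value)

insert("fast", 0.33)
insert("fast", 0.01)
insert("slow", 0.92)

def lookup(key):
    # Linear search for the key index, then gather matching rows.
    i = str_tab.index(key)
    return [val for ki, val in zip(k, v) if ki == i]

print(lookup("fast"))                   # → [0.33, 0.01]
```

The point of the sketch is that "insert" under the append-only invariant really is just two appends; there is no hidden branch for the missing-key case.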
jonahx · 3 months ago
To rephrase crudely: "inline everything". This is infeasible in most languages, but if your language is concise and expressive enough, it becomes possible again to a large degree. I always think about how Arthur Whitney just really hates scrolling, let alone 20 open files and chains of "jump to definition". When the whole program fits on a page, all that vanishes. You navigate with eye movements.
DHRicoF · 3 months ago
> k v⍪←↓⍉↑(2 0.33)(2 0.01)(3 0.92) ⍝ insert values
> k{str[⍺] ⍵}⌸v ⍝ pretty print
> k v⌿⍨←⊂k≠str⍳⊂'buggy' ⍝ deletion

I like your funny words. No, really, I should spend some time learning APL. But your idea deeply resonates with my struggle of the last few weeks. I have legacy Python code with too much coupling, and every prior attempt to "improve things" added more abstraction over a plainly wrong data model. You can't infer, reading the code linearly, which methods mutate their input objects. Some do, some don't. Sometimes the same input argument is returned even without mutation. I would prefer some magic string that could be analyzed and understood over this sea of indirection, with factories returning different calculators that in some instances don't even share the same interface. Sorry for the rant.
rak1507 · 3 months ago
The problem is that this isn't a "hashmap" in any meaningful sense, because all the operations are O(n).
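To make the complexity concern concrete, here is a rough Python illustration (my own, not the APL code): looking up a key in the parallel-array scheme is a linear scan, while a real hash map is amortized O(1).

```python
import timeit

n = 100_000
keys = [f"k{i}" for i in range(n)]
vals = list(range(n))

def array_lookup(key):
    return vals[keys.index(key)]   # O(n) scan, analogous to APL's ⍳

d = dict(zip(keys, vals))          # real hash map: amortized O(1)

worst = keys[-1]                   # key stored last: worst case for the scan
slow = timeit.timeit(lambda: array_lookup(worst), number=100)
fast = timeit.timeit(lambda: d[worst], number=100)
print(f"linear scan: {slow:.4f}s   hash map: {fast:.6f}s")
```

On any realistic machine the scan is orders of magnitude slower at this size, which is exactly why Dyalog added the I-Beam hashing hint mentioned downthread.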
TuringTest · 3 months ago
> APL makes it really convenient to take a different tack. Instead of designing abstractions, we can carefully design our data to be easily operated on with simple expressions. Where you would normally see a library function or DSL term, this approach just uses primitives directly

But in doing that, aren't you simply moving the complexity of the abstractions from the function code into the data structure? Sure, you can use generic operators now, but then you need to carefully understand what the value pairs represent in terms of domain logic, and how to properly maintain the correct structure in every operation. Someone reading the program for the first time will have just the same difficulties understanding what the code does, not in terms of primitives, but in terms of business-domain meanings.

I mean, there is an improvement in your approach, but I don't think it comes from putting the complexity in a different place; it comes from getting to see the code and the actual values at the same time. My insight is that what makes complex programming easy is this juxtaposition of runtime data and code operations, visible together. That's why IDE tools have been building better and better debuggers and inspectors to let you see what the program is doing at each step. In that context, creating good, concise abstractions is a good thing, whether you abstract parts of the operations or parts of the data structure.
jodrellblank · 3 months ago
> "immediately accessible" Most of your 'insert values' code is wrangling the data from what the interpreter lets you type into the form you need it; ⍪← do the appending which is what you want, ↓⍉↑()()() are input parsing and transform which you don't care about but have to do to get around APL limitations and the interpreter's input parsing limitations. 9/11ths of the symbols in that line are APL boilerplate. 81% noise. Deletion with k v⌿⍨←⊂k≠str⍳⊂'buggy' too; enclosing 'buggy' for index to search for it as a single element in a nested array, keeping in mind that (⍴'a')≠(⍴'aa') in case you had a single character key. Making a Boolean array which is nothing to do with your problem domain - it's what you have to do to deal with APL not offering you the thing you want. Saying "we can create a hash map of vector values" is misleading because there isn't any hashing going on. Your code doesn't check for duplicate keys. You can't choose the hash, can't tune the hash for speed vs. random distribution. The keys are appended in order which makes searching slower than a sorted array (and you can't sort array k without messing up its links to array v, same with using nub ∪ on it to make it a set) and at scale the array search becomes slow - even with Dyalog APL's bit-packing and SIMD accelerated searches behind the scenes - so there is a magic interpreter command (I-Beam 1500)[1] which marks an array for internal hashing for faster lookups. But remember the leaky internal abstractions because ⍪← isn't really appending; APL's model is nearly immutable, the array is streamed through and updated into a new array, so the hash has to work with that and you need to understand it and keep it in mind for the times it won't work and some code will invalidate the hash and end up slower. 
----

Good ideas of language and tooling design such as "pit of success", "there's only one way to do it", "the first way that comes to mind should be the right way to do it", and "different work should look different" are all missing from APL; code that looks workable often has some small internal detail why it isn't, and there's no obvious way to do common things. Compare it with Python or C# or similar languages, which have syntax for kv={'a':1, 'b':2} that "just works": if you miss parens or miss a colon it looks clearly wrong, throws editor squiggly underlines, and gives compile-time errors. There are many fewer trivial typos which look plausible and run error-free but screw things up. Make a mistake and get an error, and the error won't be LENGTH ERROR or DOMAIN ERROR; it will be something mostly helpful. The map behaves how one would expect: deleting doesn't stream the entire dictionary through memory and make a new copy; there's no need to manually link indices between different arrays; a failed key lookup gives null or a helpful error; the key can be any reasonable data type in the language; etc.

APL implementations are reliant on magic interpreter functions for IO (⎕NGET, ⎕CSV, ⎕JSON and friends). Error handling is bad, logging is bad, debugging is bad, because an entire expression is executed as one thing, and with hooks and forks, one thing which can't easily be split into smaller pieces. E.g. if ↓⍉↑ is a valid transform that returns some data but it's not the transform you want, how do you find out why not? How do you find out what it is doing instead of what you thought it was doing? How do you tell that k v⌿⍨←⊂... needed that enclose on the boolean matrix? Trying without the enclose gets LENGTH ERROR; then what?
The language and tooling give you minimal to zero help with any of this; you're left with experimentation. And that sucks, because you can't actually type in the 2D arrays you want to work with, and the code doesn't work for some mismatch of shape/nesting/enclosing/primitives that you don't understand, so the experiments meant to clarify what's happening fail for the exact same reasons you needed them in the first place. One needs a precise feel for all the different array shapes (why ⊂3 doesn't seem to do anything, why ⊆ and ⊂ are different, which things are a 1D array and which are a 2D array with one row, or a boxed 1D array, or a nested 1D array as a scalar inside a 2D array, or which things work by scalar extension[3]) before one can do much of anything, including the experimenting and learning.

Where is the 'immediately accessible' pattern "if a key is in the dictionary, update it; if it's not there, add it"? In Python it's if/else and "key in map". In C# it's if/else and "map.Contains(key)". In your APL design we start with the deletion code and enclose the key, search for it in str to get an index-of and hope it only comes up once; then immediately we find branching is difficult. Do we use ⍣ with a boolean arg to choose whether it runs or not? Do we always append but do a boolean compress on the data to append something-or-nothing? Do we use dfns and guards? Do we switch to a tradfn and use :if/:else? Do we make an array of functions and index which one to execute, or an array of names and index which one to eval ⍎? There's no immediate answer, so instead of skipping over this trivial non-issue as in any other language and getting on with the problem, we descend into "how to reimplement a basic thing in APL" for branching.

> "What I find really nice about this approach is that each expression is no longer a black box, making it really natural to customize expressions for specific needs."
It's an argument Aaron Hsu has made, but it's like using Up-Goer 5 or speaking Toki Pona, where you can't say "fire engine" but can say "car of man with job to stop fire".[4]

[1] https://docs.dyalog.com/latest/CheatSheet%20-%20I-Beams.pdf
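For contrast, the "update it if present, add it if not" pattern asked about above really is one unremarkable branch in Python, with no array surgery:

```python
# Upsert into a real hash map: one branch, no parallel arrays.
counts = {"fast": 2}

def upsert(d, key, value):
    if key in d:
        d[key] += value     # update existing entry
    else:
        d[key] = value      # add new entry

upsert(counts, "fast", 1)
upsert(counts, "slow", 5)
print(counts)               # → {'fast': 3, 'slow': 5}
```

(The accumulate-on-update semantics here are an arbitrary choice for illustration; the point is that presence-testing and branching are built in.)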