| ▲ | amluto 4 hours ago | |
It does seem like a KP-like algorithm ought to be able to optimize the break positions without extreme algorithmic difficulty aside from the inputs being considerably more complex than for Latin block print: the cost function for a proposed line is a straightforward [0] calculable function of the contents of the line, and I think one could make a dynamic programming algorithm that tracks, for each input position, the cost of the optimal layout of all text up to that position with a break at that position. This gives an algorithm that takes cubic time. (For input length n, you need to fill in n values in the table. Each value scans the entire table before that position and does a calculation with complexity linear in the proposed line length.) As a practical matter, there’s an input length n and there is some upper bound B on a credible line length as measured in code points, so there are only at most n*B credible proposed lines to evaluate, which also limits the useful look back on the table to B positions, so I think the time complexity could be reduced to O(n*B^2) without making the results worse on reasonable inputs, and this is probably quite tolerable. [0] Straightforward once you’ve implemented the whole Arabic rendering stack, anyway. I am certainly not qualified to calculate this function :) | ||