While I'm always down with research into applied restricted languages, I'm not sure this is the path to mainstream parallel programming. It's one thing to identify that parallel execution is safe, and it's quite another to exploit that parallelism efficiently. E.g. in a pure functional language, you can automatically extract an enormous amount of parallelism, but doing so eagerly and naively will lead to a very slow program, as the communication and synchronisation overheads will dwarf any performance benefits of parallel execution.
In practice, it appears that programmers don't find it problematic to explicitly indicate when things should run in parallel, and the main task of the language is to impose restrictions to avoid accidental sharing and race conditions. Rust is probably the most hyped example of such a language, but it's by no means the first or only one.
Also, the elephant in the room is that in many common cases, the programming model is such that you get massive parallelism almost for free - e.g. when writing server-side web programs, where parallelism inside of one request may not matter (and often the database will take care of it when it does), because you have thousands of concurrent requests anyway.
I don't see the connection between the Halting Problem and automatic parallelization. The Halting Problem itself plays no noteworthy role in practical algorithm verification anyway, because in practice halting is not proved for arbitrary programs, but only for some very specific ones.
Theoretically speaking, parallel computation sometimes makes proving that a computation halts unnecessary. For example, a parallel OR of two boolean comparisons may succeed even though evaluating them sequentially would lead to an infinite loop.
Is there another connection? Isn't the problem with parallel computation rather that there is no unique description of it? AFAIK, reductions in the untyped lambda calculus can describe any kind of sequential computation, just like a Turing Machine can. In contrast, the pi calculus does not offer this uniqueness. Or is there such a description for _any_ parallel computation?
Anyway, maybe someone else can explain the connection between the Halting Problem and parallel computation, if there is one, because it's not clear to me from the article.
The actual halting-ness of the program (or of subsections of the program) seems to be a nice bonus of the language's real goal: accurate worst-case estimation of running time for all computations. That then feeds into the automatic parallelisation, because it helps the compiler decide how to parallelise different computations against each other based on their estimated running times.
The fact that the language is not Turing-complete is a necessary consequence of being able to accurately determine a worst-case execution time for some code.
I always took Turing's Halting Problem as a sort of proxy for the general unknowableness of complex programs. As in, 'halting' could have been substituted with 'somewhere in the program, variable x gets set to 5' and you could still construct a program for which it would be impossible to prove whether or not x gets set to 5. And of course the proof of the Halting Problem relies on the possibility of crafting programs that are deliberately hard to analyse - something that may or may not happen in practice.
And so for me it all boils down to 'complexity is complex'. Which is kind of obvious to anyone that's worked on large programs, but might not be so intuitively obvious in some other fields. And so that's why there are always bugs. However hard you try, there is so much complexity in a large piece of software that you can't ever squash them all.
And parallel processing adds another layer of complexity for sure. But I don't see how worrying about the Halting Problem helps deal with that any better than some of the language constructs we've already seen.
Yep! That general version of the halting problem is Rice's Theorem, and that's exactly the thinking that proves it: if we had an algorithm to determine any given non-trivial semantic property of a program, we'd be able to use that algorithm to solve the halting problem.
As for parallel processing, it can be fun to think about how things would work if you had a CPU with infinite cores. Any time your program wants to explore two or more options, you always have cores to explore those options in parallel. This sort of thinking is the root of the "non-deterministic" 'N' in "NP"... with infinite cores you can solve NP problems (including NP-complete ones) in polynomial time.
With infinitely many cores that are all connected to the same bus while each core has a unique id, you actually can solve the halting problem.
You can basically brute force infinitely many inputs at once.
This doesn't work. If none of your TMs halt, you don't know whether they would if you waited a little longer.
You can simulate this with a single TM that runs one step of each machine at a time. This also hints at the equivalence other people have spoken of.
What you describe would solve an NP problem: each NP problem has a successful path that answers the question in polynomial time, and that path must be taken by at least one core.
However, this does not help you with a program that would loop forever.
What you are describing is essentially a nondeterministic Turing machine. "What can you compute"-wise, NTMs are equivalent with TMs, and the halting problem still applies.
NTMs are interesting when it comes to computation time, however. Open question: can a TM compute in polynomial time everything an NTM can compute in polynomial time?
I'm not sure that that's true. Some of the tricky cases for the halting problem boil down to the spiritual equivalent of trying to determine whether "this sentence is false" is true or false. It's not; it's neither. No infinite number of cores can help you there.
> 'halting' could have been substituted with 'somewhere in the program, variable x gets set to 5' and you could still construct a program for which it would be impossible to prove whether or not x gets set to 5.
Rice's theorem extends the Halting Problem in a very general way. The idea is that for any program 'foo', we can invent a new program 'bar; foo;' which will only execute 'foo' once 'bar' has halted. Since the halting problem is undecidable in general, and 'bar' can be any program, it's undecidable whether such a program will eventually execute 'foo'. Any property that depends on whether or not 'foo' gets executed is hence undecidable.
_However_: there are many properties which _don't_ depend on whether 'foo' will eventually be executed. For example 'does this program take more than 1000 steps to halt?' is decidable, since we can just run the program for up to 1000 steps to see if it halts or not.
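To make that concrete, here's a rough sketch (Rust, with a made-up toy State type and step function standing in for a real interpreter): a step-bounded question is decidable because we can simply run the program for at most that many steps and then stop.

// Sketch: "does this program halt within `max_steps` steps?" is decidable,
// because we can run a step-bounded interpreter. `State` and `step` are
// hypothetical stand-ins for a real program representation.
#[derive(Clone, Copy, PartialEq)]
enum State {
    Running(u64), // toy program state: just a countdown
    Halted,
}

fn step(s: State) -> State {
    match s {
        State::Running(0) => State::Halted,
        State::Running(n) => State::Running(n - 1),
        State::Halted => State::Halted,
    }
}

fn halts_within(mut s: State, max_steps: u64) -> bool {
    for _ in 0..max_steps {
        if s == State::Halted {
            return true;
        }
        s = step(s);
    }
    s == State::Halted
}

fn main() {
    println!("{}", halts_within(State::Running(10), 1000));     // true
    println!("{}", halts_within(State::Running(10_000), 1000)); // false
}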
More subtly, we can check things about the syntax, e.g. 'does this program contain a call to the "LaunchTheMissiles" function?'. That's easy to decide by just looking at all of the tokens in the code. Of course, this doesn't tell us whether such a call would ever be reached; e.g. it would flag a program like
if (false) { LaunchTheMissiles(); } else { ImplementWorldPeace(); }
This program would never launch the missiles, but it's often enough to spot _potential_ issues, even if our errors are overly-conservative.
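To make the syntactic check concrete, here's a rough sketch (the naive punctuation-splitting "tokeniser" is my own, purely for illustration):

// Purely syntactic check: flag any source text that mentions a call to
// LaunchTheMissiles, whether or not that call could ever be reached.
fn mentions_launch(source: &str) -> bool {
    source
        .split(|c: char| !c.is_alphanumeric() && c != '_')
        .any(|token| token == "LaunchTheMissiles")
}

fn main() {
    let program = "if (false) { LaunchTheMissiles(); } else { ImplementWorldPeace(); }";
    // Prints true, even though the call can never execute.
    println!("{}", mentions_launch(program));
}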
Such false positives can be really annoying when we try to retrofit such checks onto languages which weren't designed for them, e.g. trying to spot pointer errors in C code or potential nulls in Java code. In contrast, designing them into the language from the start can be very useful - that's what type systems do. For example, consider the following code:
String foo() { if (true) { return "hello"; } else { return 123; } }
This is perfectly correct, since it claims to return a String and it will return a String. However, every type checker in the world will reject this program due to the mere presence of the 'return 123' branch.
This is quite a common pattern in programming language research: problems which are undecidable in most languages can be made trivial by designing the language with that problem in mind.
In this case their 'while' example in C is very illustrative: there's not enough information in the code to know which (if any) of the loop iterations rely on the previous ones having been executed first. The fact that it _could_ rely on execution order means we can't safely parallelise it.
In contrast, a call like 'map(something, myList)' implies that each element's processing is independent (hence why MapReduce is popular for parallelism!).
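To make the independence point concrete, here's a rough sketch in Rust using plain std::thread (the `expensive` function is just a placeholder for per-element work): because no element's result depends on any other element, the computation is correct regardless of scheduling.

use std::thread;

// Placeholder for some per-element work; all that matters is that it
// depends on nothing but its own input.
fn expensive(x: u64) -> u64 {
    (0..1_000u64).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i))
}

// A map is trivially parallelisable: hand each element to its own thread,
// with no locking and no ordering concerns.
fn parallel_map(input: Vec<u64>) -> Vec<u64> {
    let handles: Vec<_> = input
        .into_iter()
        .map(|x| thread::spawn(move || expensive(x)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    println!("{:?}", parallel_map(vec![1, 2, 3, 4]));
}

Spawning one thread per element is of course hopelessly naive - which is exactly the overhead problem raised at the top of the thread - but nothing about the result depends on execution order, and that's the property a compiler can exploit.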
> every type checker in the world will reject this program due to the mere presence of the 'return 123' branch
Just to expand on this a bit: It would be trivial to extend a type checker to cover this case. But a) It wouldn't be terribly useful because this is a rather stupid way to write code, and b) It would always be possible to contrive yet another example that the "extended type checker" would fail on, because static analysis can never completely replace running the program. That's basically Rice's Theorem.
So how do we make this barely not Turing complete? We make a list/array type a first-class citizen in the VM itself, and then we only add operations whose bounds are determined by that list or by some other condition that cannot be directly mutated by the user's own code.
I don't understand this paragraph. If they do have do-while loops, how do they prevent Turing completeness?
You can require a proof that some variable is strictly decreasing on each iteration, and then ensure that the loop exits once the variable reaches zero or less.
ATS has this kind of proof built into the language, so that the type system helps you prove that a loop will always exit, and if it can't, it's a compile error.
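For illustration, here's a rough sketch of the "strictly decreasing variable" idea in plain Rust (so, unlike ATS, nothing here is machine-checked; the termination argument only lives in a comment):

// The variant `b` strictly decreases on every iteration (since a % b < b
// whenever b != 0), and the loop exits when it reaches zero, so the loop
// is guaranteed to terminate.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

fn main() {
    println!("{}", gcd(48, 36)); // 12
}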
It's not entirely clear how useful this is in practice. While you may be able to prove that a loop always exits, that proves nothing about how long it will take before it does. A loop that never terminates and one that terminates some time after the Milky Way has merged with Andromeda are pretty much equivalent from a practical standpoint.
From a computational perspective you're right, but for languages which allow formal verification (like ATS) there is a big difference between infinite loops and long-running loops.
In particular, if we only care whether something exists, then we can accept a long-running computation as a proof without having to run it, whereas an infinite loop could "prove" something false. For example, consider a function which takes a list, an index, and a proof that the index is within the list's bounds: we can perform the lookup safely without ever inspecting the contents of the proof (the language ensures it's valid).
On the other hand, we can write an infinite loop which _claims_ to prove anything we like, but never actually has to cough up anything. For example:
ProofOfRiemannConjecture myProof { while (true) {} }
If I got it correctly, they do not have do-while, it was only shown as an example operation. Instead, they only use "for each", i.e., bounded loops directly on the data structures list/array.
That's mostly correct. :) We used `do-while` to demonstrate that we _could_ replicate the behaviors of a normal assembly language within the dependency graph representation.
However, we do have a restricted version of `do-while`, implemented as described in the Sequential Algorithms RFC[1]. We want to push people toward writing parallelizable code by having the default constructs and data types encourage that, but we know certain algorithms are only possible in a sequential form, and some of them (like the Newton-Raphson method) have an unpredictable number of iterations. So we have an escape hatch in the language that lets you do classic iterative programming, but wrapped in a maximum iteration counter, so we can still assign a maximum expected runtime to it (and be sure it halts, perhaps with an error).
[1]:
https://github.com/alantech/alan/blob/main/rfcs/007%20-%20Se...
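For illustration only, here is a rough sketch (in Rust, not Alan syntax) of the shape described above: Newton-Raphson wrapped in a hard iteration cap, so the loop provably halts, a worst-case runtime can be assigned, and hitting the cap surfaces as an error.

// Newton-Raphson square root with a maximum iteration counter. The cap
// guarantees termination; non-convergence becomes an explicit error.
fn bounded_sqrt(x: f64, tolerance: f64, max_iters: u32) -> Result<f64, &'static str> {
    let mut guess = if x > 1.0 { x / 2.0 } else { 1.0 };
    for _ in 0..max_iters {
        let next = 0.5 * (guess + x / guess);
        if (next - guess).abs() < tolerance {
            return Ok(next);
        }
        guess = next;
    }
    Err("did not converge within the iteration limit")
}

fn main() {
    println!("{:?}", bounded_sqrt(2.0, 1e-12, 100)); // Ok(~1.4142135623730951)
    println!("{:?}", bounded_sqrt(2.0, 1e-12, 2));   // Err("did not converge ...")
}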
I think this is saying that arrays are immutable at runtime and that the bounds of the array are checked at compile time.
By doing this, loops can be checked before execution to rule out infinite loops, constraining one of the constructs that would otherwise make the language Turing complete.
I think the underlying good idea here is that Turing completeness is maybe overrated. You could argue that most good programming languages are good because of what they _don't_ let you do - constraints around how state is mutated are core to object-oriented and functional programming. Adding some more constraints to ensure complete and reliable static analysis sounds like an excellent idea to me.
That's not enough. You also have to constrain what the mapped function is doing. For example, it could create one or more brand new arrays or for that matter perform any infinite computation without mutating the original array. Proving that the mapped function _does not_ do any of that requires that it not be Turing Complete.
You make sure all operations on lists and arrays terminate. For instance by requiring the user to annotate where progress is being made on the bounded array.
Of course, there are some programs that are intended to run potentially forever, e.g. servers. Does Alan have some way to allow a non-terminating top level while constraining all the pieces underneath?
Yes, we didn't want to make this article too long or confusing, but we have a static, compile-time-defined event system[1] to allow for applications that wait for external inputs to start executing again. Each event handler is guaranteed to halt even if the application continues to run.
[1]:
https://alan-lang.org/alan_overview.html#a-static-event-syst...
We give up on automated program analysis far too easily - even for "Turing complete" languages.
To be truly Turing complete requires infinite memory. That clearly doesn't hold for any current programming language.
Mathematical proofs are only applicable where their assumptions are valid. So why would we believe that Turing's halting analysis is applicable where memory isn't infinite - i.e., everywhere in practice?
Sure, automated program analysis might not be easy, but that doesn't mean we should just give up. Indeed I have a simple algorithm (although relatively inefficient) to solve the halting problem where memory size is fixed and finite: enumerate every state until either a state is repeated or the program halts. This completes in O(2^N) time where N is the size of the memory.
But I'm sure someone clever will come up with a more efficient algorithm than that. I already have several ideas that this margin is too small to contain.
For me the tricky bits are where a loop terminates if and only if Goldbach's conjecture is true (or similar) - but perhaps a compiler could require an "unsafe" block for such code (in the vein of Rust).
> To be truly Turing complete requires infinite memory. That clearly doesn't hold for any current programming language.
> Indeed I have a simple algorithm (although relatively inefficient) to solve the halting problem where memory size is fixed and finite: enumerate every state until either a state is repeated or the program halts. This completes in O(2^N) time where N is the size of the memory.
"Relatively inefficient" is a colossal understatement. Anything that we would nowadays categorize as a "computer" has a state space so huge that, while technically finite, it is effectively infinite for all practical purposes.
For example: the ATmega328 microcontroller in an Arduino has 2KB of RAM (assuming we ignore registers and the possibility of storing data in Flash). That's a total of 2^16384 possible states. If every cubic Planck length of the observable universe was capable of checking one of these states per Planck time, and did so from the Big Bang until the heat death of the universe (let's say 10^100 years in the future), they would still have enumerated only a googol'th of a googol'th of a googol'th of a googol'th [... repeat about 10 more times...] of the state space.
We could of course imagine trying to find a better algorithm, but since this problem is easily shown to be NP-hard, it is very unlikely that we will find something that guarantees good enough performance to make this kind of analysis possible.
We haven't "given up" on automated program analysis. It's just that the fully general case is intractable, and it takes a lot of hard work and ingenuity to find strategies that can separate the tractable problems we care about from the intractable ones.
> That clearly doesn't hold for any current programming language.
To be fair, programming language != implementation. A language can be Turing complete even though no implementation of it can be.
> to solve the halting problem where memory size is fixed and finite
Just to clarify any lost reader: The halting problem describes halting on Turing Machines, the algorithm presented only works because Turing Machines that have finite memory are not Turing Machines.
> For me the tricky bits are where a loop terminates if and only if Goldbach's conjecture is true (or similar) - but perhaps a compiler could require an "unsafe" block for such code (in the vein of Rust).
This is for me the most interesting part of your comment and something I have also thought about. It led me down an interesting rabbit hole: most of the lines of code we write don't require Turing completeness. E.g. when we write `for i in 0..10` we are iterating over a finite range, so we can prove we will finish (which appears to be easier than proving we will not finish). If you loop a finite number of times and do a finite amount of work each time, you will eventually finish.
This made me realize that, if you know your input, the only regions of code where something like what you are proposing would apply are `while True` loops, recursive code, stream consumption (which in reality is just a fancy version of `while True`), functions where you cannot prove that the number of iterations is finite (e.g. some inner part of the loop may reset the outer loop), and so on (there are some other examples out of scope here).
From those, it looks to me like only the last group would truly benefit; the rest are relatively easy to spot. But notice the big caveat: all of this only works at runtime (when you know the amount of work you have to do per loop). At compile time, it only really works for very small regions of code. So as far as I can reason, this kind of static analysis would not be as beneficial as one may originally think.
> So as far as I can reason, this kind of static analysis would not be as beneficial as one may originally think.
There are many advantages to total (terminating/coterminating) programming languages. It's not so much about throwing some random Java/C/whatever codebase at a tool and expecting useful results; it's more about making languages themselves more consistent, simpler, etc.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.106...
The halting problem proof will work fine for finite sizes, I think. Failing due to an infinite loop will just be replaced by running out of memory. The main caveat would be that if the solver fills memory, then the diagonalized solver (which is slightly larger) won't fit on the machine.
It does not. A Turing machine with a finite tape is called a Linear Bounded Automaton. Halting is decidable for LBAs, but we don't know of any particularly fast algorithms for doing so.
For finite sized memory, the problem can be solved by recording every system state (registers, memory values etc.) and seeing if the same state is ever reached twice. If so, the program will never terminate, and vice versa.
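A rough sketch of that state-recording approach (the toy Machine and its step function are made up for illustration; a real machine's state space would of course be astronomically larger, as pointed out elsewhere in the thread):

use std::collections::HashSet;

// A tiny deterministic machine: a "program counter" and one 8-bit register,
// so the whole state space is small enough to enumerate.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Machine {
    pc: u8,
    acc: u8,
}

// One step of execution; None means the machine has halted.
fn step(m: &Machine) -> Option<Machine> {
    match m.pc {
        0 => Some(Machine { pc: 1, acc: m.acc.wrapping_add(7) }),
        1 if m.acc == 0 => None, // halt
        1 => Some(Machine { pc: 0, acc: m.acc }),
        _ => None,
    }
}

// Record every state visited: a repeated state means the machine loops
// forever; reaching a halt state means it halts.
fn halts(mut m: Machine) -> bool {
    let mut seen = HashSet::new();
    loop {
        if !seen.insert(m.clone()) {
            return false; // state repeated: infinite loop
        }
        match step(&m) {
            Some(next) => m = next,
            None => return true,
        }
    }
}

fn main() {
    println!("{}", halts(Machine { pc: 0, acc: 0 })); // true
}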
"Mathematical proofs are only applicable where their assumptions are valid. So why would we believe that the where memory isn't infinite (everywhere in practice) that Turing's halting analysis is applicable?"
Because adding in the assumption that memory is finite doesn't make the analysis easier, it makes it _harder_. There's a reason the abstract mathematical model assumes infinite memory; it isn't just that an abstract model can assume anything into existence, so why not - it's to _simplify_ it. Now, in addition to all the other things the mathematical proof has to worry about, it has to be constructed in such a way that if at any point memory is exceeded, it also "halts". That is _way_ easier said than done.
As a homework exercise, try to take the halting problem proof and update it to include the possibility that either the halting-detection machine or the machine it is simulating (within the boundaries of the halting-detection machine's resources!) runs out of memory. Pro tip: if you think you cracked it in a couple of minutes, you did not. For instance, rather than just waving at the fact that the halting-detection machine is running a universal Turing machine with some "linear" factor of expansion in memory usage and/or runtime - and not worrying, since you have infinite amounts of both - your proof is now going to be intimately concerned with the details of exactly _how_ the universal Turing machine is encoded. What you end up with at the end won't be some neat claim about how the halting problem is impossible; you're going to end up with some rather complicated contingent claim with terms that include specific details about the size of the encoding and the amount of memory given to the detection machine.
If you do this work honestly and thoroughly, you'll quickly learn that adding finiteness of resource doesn't make the problem _more_ tractable. Rather than a cute little statement about the halting problem, you'll end up with an incredibly complicated statement with dozens of terms in it about what the detection machine can and can not do, and the resulting mathematical statement will be much harder to use.
"This completes in O(2^N) time where N is the size of the memory."
In other words, our computers are finite state machines, which is basically true, so attack them with finite state machine tools. The problem is that even if your "finite state machine" you're modeling is, say, a Commodore 64, and even if you ignore external input and just simulate the machine itself, you have an intractable problem on your hands. The problem grows exponentially as you add _each individual bit_. Modern computers have a lot more _bits_ than a Commodore 64. This turns out not to be a useful approach.
If you're going to go this route, what you're looking for are sub-Turing programming languages [1] that allow you to write programs in a reduced-power environment that allows more useful assertions to be made. If you are legitimately interested in this, there's a lot of research occurring on the fun fringes of programming language research. But... I can tell you that they are certainly far from a magic wand either. At a minimum, it requires a huge mindset change, and it is very difficult to fit even some common problems/algorithms into the requisite constraints. And "completes in a finite period of time", even if you can prove it, can be a rather vague promise to make.
[1]: Example from a quick google:
http://ainfosec.github.io/crema/
> Further, it has been exacerbated by recent computing mechanisms, such as multithreading, NUMA, GPGPU, and distributed computing, which have moved away from the "single tape" Turing Machine model of computation that most programming languages are founded upon.
Stopped reading here. No programming language with arrays resembles a single-tape Turing machine. The correct model would be the random-access machine (RAM).
That's unfortunate, because they weren't talking about the physical execution; they were referring to the determinism, decidability, and serial-sequential nature of program execution.
I am curious in which way programming languages with arrays are not equivalent to the Turing Machine model. It's not clear to me what you mean by "resembles" and why that would be the same thing as "founded upon".
All Turing-complete programming languages/models of computation can simulate each other with polynomial overhead. What I think is weird is that the article specifically mentions that programming languages are built on single-tape Turing machines (as opposed to Turing machines in general, or random-access machines), because single-tape Turing machines are highly impractical, and programming them would feel completely different from programming with any real programming language. For example, an operation as basic as array access takes a number of steps linear in the size of the array in the single-tape model.
Could some of these compiler-friendly loops be done in traditional languages by specifying that the data structure or array bounds are not modified during the execution of the statements inside the loop? Could for-each be leveraged to be a for-each-constant?
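For what it's worth, something close to a "for-each-constant" already falls out of languages with immutable references. A rough sketch in Rust (my own example, not from the article):

// The slice is borrowed immutably for the duration of the loop, so neither
// its length nor its contents can change while it runs; the trip count is
// fixed before the first iteration.
fn sum_of_squares(values: &[i64]) -> i64 {
    let mut total = 0;
    for &v in values {
        // Any attempt to mutate `values` here is rejected by the borrow
        // checker - exactly the guarantee a compiler needs in order to
        // treat the loop bound as constant.
        total += v * v;
    }
    total
}

fn main() {
    let data = vec![1, 2, 3, 4];
    println!("{}", sum_of_squares(&data)); // 30
}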
This article appeals to so many results in complexity theory but can anybody tell me if they are valid? I did pass a basic course in theory of computation but can't verify these myself.
Also see Blockstack's Clarity language for another example of a non-Turing-complete language:
https://docs.blockstack.org/references/language-overview
Clarity bans loops altogether, which is theoretically exactly the same as only allowing "bounded loops", since any bounded loop can be rewritten as, well, not a loop. :)