💾 Archived View for dioskouroi.xyz › thread › 29399844 captured on 2021-12-05 at 23:47:19. Gemini links have been rewritten to link to archived content
I'm really excited about Julia 1.8 and Diffractor:
https://github.com/JuliaDiff/Diffractor.jl
Keno Fischer gave a presentation in a discussion moderated by Simon Peyton Jones here:
https://www.youtube.com/watch?v=mQnSRfseu0c
Also combined with Enzyme, which can differentiate through static paths at the LLVM bitcode level:
I wonder what this will enable at the frontier of what is computationally tractable in the combo of physics + ml.
Julia is the Haskell of numerical computing.
It seems like AD is a solved problem, which it is not. I can personally think of two instances in which I have hand-rolled my own gradients: in the context of recommender systems (dense matrices are evil) and right now in the context of real-time collision detection (memory allocations are evil).
One of the domains where I hope Julia will excel is precisely as you comment on physics + ml (I check frequently if MuJoCo source code is available at last).
Anyway, I encourage everyone with any background in scientific computing or DS to have a look at Julia. The ecosystem is nowhere near Python's yet, but the language is very good and the tooling is getting better. The performance Julia provides without interfacing C/C++ or Fortran is not just a convenience, it has architectural consequences. It's not about coding faster, it's about coding "further".
> The performance Julia provides without interfacing C/C++ or Fortran is not just a convenience, it has architectural consequences.
These statements set the expectations really high. Yet they omit the ugly truth of increasingly slow compilation times, poor compilation caching, and a JIT whose overall performance can suffer from type instability.
Yeah, just as Haskell has a nice quicksort example prominently displayed on its front page, Julia is sleek for scientific computing, yet it can bite you really badly as soon as you start wrapping the developed model into a real-world REST app.
Do not get me wrong, I see some of the benefits of Julia. On the other hand, I do not think this uncritical hype about "coding 'further'" does good for the language.
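For readers unfamiliar with the "type instability" mentioned above, here is a minimal sketch of what it looks like. The two functions are hypothetical and made up for illustration; `Base.return_types` is used only to ask the compiler what it infers.

```julia
# Hypothetical functions for illustration; not taken from the thread.

# The return type depends on the *value* of x (Int or Float64),
# so the compiler can only infer a Union.
unstable(x) = x > 0 ? 1 : 1.0

# The return type depends only on the *type* of x.
stable(x) = x > 0 ? one(x) : zero(x)

# Ask the compiler what it infers when the argument is an Int.
unstable_type = only(Base.return_types(unstable, (Int,)))
stable_type = only(Base.return_types(stable, (Int,)))

println("unstable: ", unstable_type, "  concrete? ", isconcretetype(unstable_type))
println("stable:   ", stable_type, "  concrete? ", isconcretetype(stable_type))
```

A non-concrete inferred type generally forces callers to handle several possibilities at run time, which is where the performance cost tends to come from.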
Compilation times are still an issue, but what do you mean by 'increasingly slow' compilation times? As far as I know and can tell, compilation times have been getting _faster_, not slower, and by a significant amount.
Let's even put raw numbers to it. DifferentialEquations.jl usage has seen compile times drop from 22 seconds to 3 seconds over the last few months.
https://github.com/SciML/DifferentialEquations.jl/issues/786
Yeah, I should not have used "increasingly", yet what you are pointing out is an example of an improvement in a single module, not the language compiler & llvm. It does not actually prove the original comment wrong.
Almost all of the improvements came from changes to Julia's Base that improved compilation times and changes to tooling which suggest trivial changes to improve compilation times. In v1.5 or so this would've been a heroic feat. In v1.7 this is something any undergrad could do in a weekend to any module. So yes that proves that things have changed quite a bit.
It's not due to effort that is specific to a single module. It reflects an improvement in the community's understanding of what patterns cause slow compile times and also tooling that helps improve compile times by reducing method invalidations.
There may well have also been improvements in the Julia compiler.
I manage the MuJoCo.jl wrapper, and having the source code won't (naively) help with AD. Internally, there are a number of iterative algorithms that you wouldn't want to automatically differentiate, and if analytical derivatives were apparent they would be in there (the MuJoCo creator was my advisor). As it stands, finite differencing is the best way to extract derivative/gradient information from the underlying physics model of MuJoCo, and is the suggested method.
We had a paper that included MuJoCo (with custom defined adjoints) within Julia's DiffEq framework to learn continuous control policies (
https://arxiv.org/abs/2012.06684
) which revealed a whole host of other issues, namely that gradients are not all that helpful when your optimization landscape is insanely non-linear. But that's a different problem.
Open to chatting about it; Julia is definitely the tool for this kind of work.
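As a rough illustration of the finite-differencing approach mentioned above, here is a minimal central-difference sketch. It assumes the simulator can be treated as a black-box function from a parameter vector to a scalar; `blackbox` is a hypothetical stand-in and has nothing to do with the actual MuJoCo API.

```julia
# Central-difference gradient of a black-box scalar function f at x.
# Costs two evaluations of f per input dimension.
function fd_gradient(f, x::AbstractVector{<:Real}; h = 1e-6)
    g = zeros(Float64, length(x))
    for i in eachindex(x)
        xp = Float64.(x)          # fresh copy, perturbed upward in slot i
        xp[i] += h
        xm = Float64.(x)          # fresh copy, perturbed downward in slot i
        xm[i] -= h
        g[i] = (f(xp) - f(xm)) / (2h)
    end
    return g
end

# Hypothetical stand-in for an opaque simulator output.
blackbox(x) = sin(x[1]) * x[2]^2

g = fd_gradient(blackbox, [0.5, 2.0])
println(g)
```

The step size `h` is a trade-off between truncation error and floating-point cancellation, so the default here is only a plausible starting point.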
> namely that gradients are not all that helpful when your optimization landscape is insanely non-linear
Did you try multiple shooting?
https://diffeqflux.sciml.ai/dev/examples/multiple_shooting/
I want to restructure the docs because I have come to think that any non-trivial fitting problem requires multiple shooting in order to be stable. Otherwise the loss becomes dominated by "loss values calculated beyond where the simulation has already fallen off". So I'd like to dig into some of the non-fitting examples and see if this or some other tricks are the right answer.
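To make the idea concrete, here is a minimal sketch that contrasts a single-shooting loss with a simplified multiple-shooting loss. It is not the DiffEqFlux implementation: the ODE (`du/dt = -p*u`), the explicit Euler integrator, and the choice to restart each segment from the observed data point (instead of adding a continuity penalty) are all assumptions made for brevity.

```julia
# Explicit Euler for du/dt = -p*u; returns the state at steps 0..n.
function simulate(p, u0, dt, n)
    us = Vector{Float64}(undef, n + 1)
    us[1] = u0
    for i in 1:n
        us[i + 1] = us[i] + dt * (-p * us[i])
    end
    return us
end

# Single shooting: one long simulation from the first data point.
function single_shooting_loss(p, data, dt)
    pred = simulate(p, data[1], dt, length(data) - 1)
    return sum(abs2, pred .- data)
end

# Simplified multiple shooting: restart from the data at the start of each
# segment, so an early divergence cannot dominate the whole loss.
function multiple_shooting_loss(p, data, dt, seglen)
    loss = 0.0
    for s in 1:seglen:(length(data) - 1)
        e = min(s + seglen, length(data))
        pred = simulate(p, data[s], dt, e - s)
        loss += sum(abs2, pred .- data[s:e])
    end
    return loss
end

data = simulate(1.0, 1.0, 0.1, 20)   # synthetic "measurements" with p = 1
println(single_shooting_loss(2.0, data, 0.1))
println(multiple_shooting_loss(2.0, data, 0.1, 5))
```

Both losses vanish at the true parameter; the difference is in how they behave away from it, where the short segments keep each prediction anchored to the data.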
I'm not thinking just about AD. I would love lots and lots of little simulation modules instead of monolithic solutions.
Now for an uninteresting story: the reason to have a look at MuJoCo is personal. I studied Mechanics I, II and Analytical Mechanics at college and thought non-continuous mechanics was a solved problem. When doing my master's thesis about helicopter simulation I needed at some point to model ground contact and I thought "easy, just add a damped spring". I feel that thesis was nice for an undergrad overall, but that part, as you can imagine, was a complete and utter piece of crap. And now in my forties I want to have vengeance.
> Julia is the Haskell of numerical computing.
Is that a compliment or an insult? When Haskell is involved I can never tell.
Me neither, and yet I maintain my opinion.
That's because Haskell is like the programming language equivalent of The Gig That Changed The World[0], although I guess Algol60 also has a strong claim to that title.
Let me explain by quoting these paragraphs from Roger Ebert's review[1] of 24 Hour Party People:
> _As the film opens, Wilson is attending the first, legendary Sex Pistols concert in Manchester, England. (...) Wilson is transfixed by the Pistols as they sing "Anarchy in the U.K." and sneer at British tradition. He tells the camera that everyone in the audience will leave the room transformed and inspired, and then the camera pans to show a total of 42 people, two or three of them half-heartedly dancing in the aisles._
Sounds like the average language designer entranced and inspired by their first time grokking Haskell.
> _Wilson features the Pistols and other bands on his Manchester TV show. Because of a ban by London TV, his show becomes the only venue for punk rock. Turns out he was right about the Pistols. They let loose something that changed rock music. And they did it in the only way that Wilson could respect, by thoroughly screwing up everything they did, and ending in bankruptcy and failure, followed by Sid Vicious' spectacular murder-suicide flameout. The Sex Pistols became successful because they failed; if they had succeeded, they would have sold out, or become diluted or commercial. I saw Johnny Rotten a few years ago at Sundance, still failing, and it made me feel proud of him._
I could rephrase that last sentence as "I checked out a conference talk by Simon Peyton Jones from a few years ago, still avoiding success at all costs[2], and it made me feel proud of him" and it would be absolutely true.
And no, I also did not expect to find a parallel between Haskell and punk music, but there you go.
[0]
https://openculture.com/2015/06/the-sex-pistols-1976-manches...
[1]
https://www.rogerebert.com/reviews/24-hour-party-people-2002
[2]
https://www.youtube.com/watch?v=re96UgMk6GQ&t=1372s
I wonder what language would be considered post-punk?
Scala (experimenting with various styles) comes to mind.
How about proto punk? Something like ML?
This is the best HN comment I have seen for quite some time.
If you watch the linked SPJ video, I timestamped it at the moment where he explains what he means by "avoiding success at all costs", but also: if you watch a bit longer he shares the story of a GHC compiler bug where _if_ you compiled a module on Windows, _and_ in a different directory than the current directory, _and_ it contained a type error... it would tell you there was a type error, and then delete the source file.
And instead of people going berserk over this, all he got was some friendly emails from people who just wanted to inform him they bumped into this problem, but it's fine, they had a workaround. Because Haskell people in those days were used to Breaking changes with a capital B.
I think that's at least a little bit punk
> it would tell you there was a type error, and then delete the source file
Which makes all the sense because Haskell is about avoiding bugs at compile time
I think I have good news for you. DeepMind acquired and open-sourced the project a few months ago.
https://github.com/deepmind/mujoco
I know, I know, but the source code is not yet available. The GitHub repo is a placeholder for now.
I thought Julia is the Dylan of numerical computing. :)
>One of the domains where I hope Julia will excel is precisely as you comment on physics + ml (I check frequently if MuJoCo source code is available at last).
Somebody already did a comparison of DifferentialEquations.jl's standard usage against MuJoCo and DiffTaichi and showed that it outperformed them by about 5x-10x.
https://arxiv.org/abs/2012.06684
https://homes.cs.washington.edu/~thickstn/ctpg-project-page/...
That comparison reports raw iteration counts to show that it is algorithmically faster, but the time per iteration is also low, for many reasons showcased in the SciMLBenchmarks, where these solvers routinely outperform C and Fortran solvers (
https://github.com/SciML/SciMLBenchmarks.jl
). So it's excelling pretty well, and things like the automated discovery of black hole dynamics are all done using the universal differential equation framework enabled by the SciML tools (see
https://arxiv.org/abs/2102.12695
for that application).
What we are missing, however, is that right now these simulations are all written as raw differential equations, so we do need a better set of modeling tools. That said, MuJoCo and DiffTaichi are not great physical modeling environments for building real systems; instead we would point to Simulink and Modelica as what is really useful for building real-world systems. So it would be cool if there was a modeling language in Julia which extends that universe and directly does optimal code generation for the Julia solvers... and that's what ModelingToolkit.jl is (
https://github.com/SciML/ModelingToolkit.jl
). That project is still pretty new, but there's already enough to show some large-scale models outperforming Dymola on examples that require symbolic tearing and index reduction, which is far more than what physical simulation environments used for non-scientific purposes (MuJoCo and DiffTaichi) are able to do. See the workshop for details (
https://www.youtube.com/watch?v=HEVOgSLBzWA
). And that's just the top level details, there's a whole Julia Computing product called JuliaSim (
https://juliacomputing.com/products/juliasim/
) which is then being built on these pieces to do things like automatically generate ML-accelerated components and add model building GUIs.
That said, MuJoCo and DiffTaichi have much better visualizations and animations than MTK. Our focus so far has been on the core routines, making them fast, scalable, stable, and extensive. You'll need to wait for the near future (or build something with Makie) if you want the pretty pictures of the robot to happen automatically. That said, Julia's Makie visualization system has already been shown to be sufficiently powerful for this kind of application (
https://nextjournal.com/sdanisch/taking-your-robot-for-a-wal...
), so we're excited to see where that will go in the future.
MuJoCo was purchased by DeepMind and open-sourced a few weeks ago:
https://deepmind.com/blog/announcements/mujoco
this talk and the referenced pair talk by matt bauman were really excellent.
basically the big takeaway that i walked away with was that they defined a sort of pseudoclosure in the guts of the language before the compiler which relaxes some of the behaviors of a closure (ie what gets captured) which then enables the creation of larger optimizable regions for the compiler in the context of doing autodiff.
the demo by matt bauman where they do autodiff on a function that prompts for and has the user type in the name of another julia mathematical function was really impressive!
> basically the big takeaway that i walked away with was that they defined a sort of pseudoclosure in the guts of the language before the compiler which relaxes some of the behaviors of a closure (ie what gets captured) which then enables the creation of larger optimizable regions for the compiler in the context of doing autodiff.
Yes! That's essentially correct.
If anyone else is wondering what the AD stands for in "Next Generation AD" it's "algorithmic differentiation".
Also known as automatic differentiation or autodiff.
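For anyone who has not seen AD before, the simplest flavor (forward mode) can be sketched with dual numbers. This is a toy illustration, not how Diffractor or any of the packages in this thread work internally; the `Dual` type and `derivative` helper are made up for the example.

```julia
# A dual number carries a value and its derivative side by side.
struct Dual
    val::Float64   # f(x)
    der::Float64   # f'(x)
end

# Each primitive propagates derivatives using the usual calculus rules.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
Base.sin(a::Dual) = Dual(sin(a.val), cos(a.val) * a.der)

# Differentiate f at x by seeding the derivative slot with 1.
derivative(f, x) = f(Dual(x, 1.0)).der

h(x) = x * sin(x) + x
dh = derivative(h, 2.0)
println(dh)   # analytically: sin(2) + 2cos(2) + 1
```

The point is that the derivative falls out of running the ordinary program on a different number type; no symbolic manipulation or finite differencing is involved.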
What's special about this compared to other ADs?
Reading the front page it seems to focus on efficient higher order derivatives, is that it?
Diffractor solves a couple of issues that are inter-related. I'm gonna contrast with Zygote which is the current "standard" AD in Julia and the most fancy one we have (though there are a couple other AD packages that are useful in certain situations that Zygote is bad at - Diffractor will cover some of these but not all).
Essentially, what Zygote does is insert itself into the compiler pipeline at the lowered code stage to perform the AD transform. That is it operates on the form of Julia code before we perform any type analysis, devirtualization or optimizations. Essentially, you can think of Zygote as a lisp-style macro applied automatically/dynamically to every function being called starting at a particular entry point. This works pretty well, but has a couple limitations.
The first is that it has absolutely no semantic information available (since it operates on non-inferred code), so it can't use that for optimizations or things like data layout planning, which are important optimizations for a production-grade AD system. Essentially, it's not allowed to know that the `+` symbol it sees is actually the `Base.+` function that does addition, so must make the most pessimistic assumptions. This issue doesn't actually show up that much in machine learning use cases, but it's a big issue when you need to AD scalar code (which happens frequently in various differentiable programming contexts).
The next issue with running at this stage is that a lot of existing julia code is written with some mental model of the capabilities of type inference and the optimizer. For example, destructuring code like `a, b = f(x)`, people don't think about at all, but semantically that allocates several tuples and then indexes into them to take them apart. By running the AD transform, you basically double (at least), the complexity of every operation, so in a number of cases patterns that used to completely optimize away are now terribly slow, because they are no longer optimized (and then AD transformed on top of that).
A related issue is that because you cannot interleave optimization with the AD transform, if you want to perform nested (i.e. higher order) differentiation, you're gonna get exponential code generation, which you then have to hope the optimizer will cut down again for you, which it often can't all the way, but even if it could, generating exponential code in the first place is bad, because you're gonna wait an exponential amount of time for the compiler to be done with it (and super-exponential in practice, because the compiler is not linear in the size of the input problem), so Zygote is basically unusable for anything beyond second order (and even at second order is a struggle).
Lastly, there is also a technical issue, which is that code that operates on lowered IR isn't technically allowed to build any closures, but the AD transform has to do that in order to put the code for the backwards pass somewhere. Zygote gets around this by taking advantage of the fact that the AD transform is pure, so it basically runs everything twice (once to generate the forward pass, once to generate the backwards pass when execution gets to it) and it knows that things match up because the input code is the same. That mostly works, but this dependency isn't visible to the runtime system, so you can get into issues where code is updated in between the forwards and the backwards pass, which breaks Zygote in all sorts of ways.
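To see why the backwards pass naturally lives in closures, here is a hand-written sketch of reverse mode for `sin(a * b)`. The rule functions are hypothetical and only loosely modeled on the "return the value plus a pullback" convention; this is not Zygote's or Diffractor's generated code.

```julia
# Each rule returns the primal result and a closure for the reverse pass.
function mul_rule(a, b)
    y = a * b
    pullback = ybar -> (ybar * b, ybar * a)   # closes over a and b
    return y, pullback
end

function sin_rule(x)
    y = sin(x)
    pullback = ybar -> ybar * cos(x)          # closes over x
    return y, pullback
end

# Reverse-mode gradient of sin(a * b), written out by hand.
function grad_sin_mul(a, b)
    p, mul_back = mul_rule(a, b)   # forward pass, collecting closures
    y, sin_back = sin_rule(p)
    pbar = sin_back(1.0)           # backward pass, in reverse order
    abar, bbar = mul_back(pbar)
    return y, (abar, bbar)
end

y, (da, db) = grad_sin_mul(0.5, 2.0)
println((y, da, db))
```

An AD transform has to produce something like `grad_sin_mul` automatically, including the closures, which is exactly the part that lowered-IR transforms are not supposed to create.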
Anyway, Diffractor is designed to fix all of this by "simply" moving the AD transform stage from lowered IR to post-inference (where the optimizer sits). The issue is that Julia the language, currently doesn't really allow semantic transformations post inference (after all optimizers are supposed to make things faster, but not change the outcome of things) and in particular, running optimizers is always optional and the language may choose not to do it. So to get there, we need to do a couple of things:
1. We need to have some sort of wedge into the semantics of the language that allows for optimization-time changes that are semantic. For this, I added `OpaqueClosures`, which are essentially like regular closures, except that they do not have semantics as to their capture lists or the code that they run. Now, this may be confusing to people from some functional languages where all closures have such semantics (you can see SPJ ask this question in the linked discussion we had - didn't think about it ahead of time, because I'm so used to our semantics, so my answer was a bit muddled), but essentially in Julia, the contents of closures are semantically visible and optimizations over closure boundaries are mostly prohibited, because ordinary closures do not close over the world age (i.e. if you have `f = x->sin(x)` and then say `sin(x) = 9999`, then the next execution of `f` will return 9999). So opaque closures basically change this and say "nothing in the system is allowed to look at the code they contain, or the capture list, and also we capture the world age". More importantly though, we now have a data structure (the capture list of an opaque closure) that is allowed to be changed by the optimizer, so we basically drive a truck through that.
2. We need the actual mechanism to move the transform to inference time. This isn't super hard, but we haven't quite finished this yet, and it's something I'm hoping to get to over the next couple of months. It's somewhat intertwined with making the compiler in general more accessible to packages outside the core language. There's about 8-10 different packages that want to do compiler-y things on Julia IR and we really need to figure out a good solution to make that generally possible. But as always, designing core language APIs is a bit tricky, because you're gonna be stuck with them for a while.
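The world-age behavior of ordinary closures described in point 1 can be reproduced in a few lines. This sketch uses a made-up function `g` in place of `sin` so that no Base method gets overwritten, and it has to be run at top level (REPL or script), because that is where each statement sees the newest method definitions.

```julia
# `g` stands in for `sin`; it is a hypothetical function for this example.
g(x) = 1

f = x -> g(x)            # ordinary closure: does NOT capture the world age
first_result = f(0)
println(first_result)    # 1

g(x) = 9999              # redefine the method after the closure exists
second_result = f(0)
println(second_result)   # 9999, the closure picks up the new definition
```

Because the result of calling `f` can change like this, the optimizer cannot freely treat the body of an ordinary closure as fixed, which is the restriction opaque closures are meant to lift.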
There's also some other nice bits and pieces in Diffractor. The big one is that I did a fair bit of theoretical exploration at the top of this project to really understand higher order AD. When I first took differential geometry (10 years ago now), I used to joke with my classmates that I had no idea what a second derivative was, because all the textbooks basically just say "look, the tangent bundle of a smooth manifold is a smooth manifold" and never actually really go into higher order derivatives at all. Anyway, I really sat down to work all that out and in the course of that I came across a way to do higher order derivatives more efficiently (under suitable assumptions of what the compiler does) than just nesting the first order transform. In retrospect, it seems pretty obvious, but I can honestly say that I didn't think about that until I worked out the theory. In the course of it, I also managed to make very precise the notion that reverse-mode AD and pullbacks of cotangent vectors are the same thing. This connection was pretty well known in the oral tradition, but I never really saw a convincing writeup of it (and my result extends to higher orders of course).
Other than that, it directly uses ChainRules.jl as its AD rule system, where Zygote used to keep its own rules (Zygote and ChainRules developed concurrently, and Zygote was later adapted to look in both places, but it's a bit of a mess) and it also has a unified forward mode that (in theory, once it's robust enough) subsumes both ForwardDiff.jl and TaylorSeries.jl.
In theory, this should all make for a pretty good AD system, but there's a fair bit of work still to be done to make it robust (though all the pieces except the julia-level mechanism for actually moving the AD pass are done and tested) and as always, time is very limited :).
This is a great writeup. Any chance it can end up in a blog post?
Maaayybee ... Writing a blog post always seems like a time commitment, and I really should be doing other things ;). Somehow squeezing in a big HN comment while I'm waiting for my builds to go through doesn't feel like the same, but I'm probably just deluding myself ;).
I've been investigating Julia AD recently - really hoping for arbitrary code diff, but nothing really works right now.
How broad will Diffractor's support of the language be? E.g. will it support mutation / exceptions / recursion?
Mutation is tricky, because basically the only way to do it sanely is to copy all data into the residual, but since most arrays aren't actually mutated, that's extremely wasteful. I've been hoping to address this by changing mutability in the language more generally (e.g.
https://github.com/JuliaLang/julia/pull/42465
) to make immutable arrays the default, at which point there wouldn't be a penalty anymore. I've had requests to do the mutable copying optionally, but it's a bit tricky, because it needs rule system integration and the rule system currently doesn't reason about mutation.
As for exceptions and recursion, shouldn't be a problem, just needs to be implemented.
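The mutation problem can be shown with a hand-written pullback. This is an illustrative sketch with made-up function names, not code from Diffractor or ChainRules: one version keeps a reference to the input array, the other copies it into what the comment above calls the residual.

```julia
# Forward pass of y = sum(x.^2); the pullback needs x to compute 2x * ybar.

# Keeps a reference to the caller's array: cheap, but fragile.
function sumsq_aliasing(x)
    y = sum(abs2, x)
    pullback = ybar -> 2 .* x .* ybar
    return y, pullback
end

# Copies the input: safe under mutation, but pays for the copy every time.
function sumsq_copying(x)
    xc = copy(x)
    y = sum(abs2, xc)
    pullback = ybar -> 2 .* xc .* ybar
    return y, pullback
end

x = [1.0, 2.0, 3.0]
y_alias, pb_alias = sumsq_aliasing(x)
y_copy, pb_copy = sumsq_copying(x)

x .= 0.0                    # mutate the input after the forward pass

println(pb_alias(1.0))      # [0.0, 0.0, 0.0], silently wrong
println(pb_copy(1.0))       # [2.0, 4.0, 6.0], the intended gradient
```

Since most arrays are never mutated between the forward and backward pass, copying all of them just in case is the waste being described.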
Thanks for the reply!
As a julia user of five-ish years one of my favorite things that you guys do is write really in depth blog posts which I can read ten times and try to puzzle out.
I am sure they take a huge amount of work but my appetite for them has never been satisfied!
The basic answer is it (similar to enzyme) is hooked into a compiler so it can optimize before and after AD. This is important because the typical approach can lead to code that is much harder for the compiler to optimize.
It would be awesome to get library/macro support for something like this in Rust. It's LLVM, so theoretically it should be possible to hook in?
I've always thought that AD needs something akin to a "compile" step.
The Enzyme team has been doing something like that (AD at the LLVM step making it available to all LLVM languages) with Rust:
-
https://internals.rust-lang.org/t/automatic-differentiation-...
-
https://github.com/rust-ml/oxide-enzyme
Multidimensional array construction is something I had looked forward to in Julia. I am not convinced that the approach taken in Julia 1.7 compares favorably with other language implementations (ignoring R).
In my view the numpy syntax has more clarity for this task.
[[[1,2],[3,4]],[[5,6],[7,8]]]
compared with
[1 2;3 4;;;5 6;7 8]
It is not immediately clear why [1,2,3,4] is equivalent to [1;2;3;4], but [1,2,,3,4] (and [1,2;3,4] vs [1 2;3 4] etc.) is not equivalent to [1;2;;3;4].
For creating a 3d slice, I expected that ";", ";;", ";;;" would each refer to incrementing a specific dimension. E.g., it seems intuitive that if you can create a 2d matrix with [1 2;3 4], then you should be able to make a 3d tensor with [1 2;3 4;;5 6;7 8]
The semicolon use is actually completely consistent, ";", ";;", ";;;" etc. do indeed refer to incrementation of the corresponding dimension. Try [1;2;3], then [1;2;3 ;; 4;5;6] and then [1;2;3 ;; 4;5;6 ;;; 7;8;9;; 10;11;12]
The confusion arises because "," and whitespace also have overlapping meanings in array notation. "," is used for regular vector creation, and whitespace for concatenation in the second dimension. The coexistence of those two different notations is a bit uneasy, but I doubt that "," and whitespace will be deprecated, since they are so entrenched and familiar. And for 1D and 2D arrays (which probably make up >99% of all literal array use) it's also more elegant and clean.
Maybe this will help: "," separators only work for 1D arrays, you cannot use them while making 2D or higher arrays. Whitespace is used when you want to create the array writing the data down row-wise, so your innermost dimension in writing is actually the second dimension of the array. The semicolons are for completely consistently going from the first to the n'th dimension, with the corresponding number of ";" in each dimension.
I think it would be hard to come up with a nice way to express these in a unified notation.
The way numpy does this with lots of brackets isn't really very convenient when working in 2D, which is the more common case.
To address the specific examples:
[1,2,3,4] is literal vector creation, and also "," is just the regular way you create a list of inputs to a function. [1;2;3;4] is concatenation along the first dimension, so it must be the same as [1,2,3,4].
[1,2,,3,4] has no meaning, because repeated "," hasn't been given any syntactical meaning. But maybe that would have been a good idea?
[1,2;3,4] mixes literal vector syntax and vertical concatenation. The only reasonable interpretation would be that it's the same as [1;2;3;4], so maybe it could have been allowed, but ";" is supposed to concatenate _arrays_, with a special case for scalars (0-dimensional arrays), it's not clear to me what would be concatenated in [1,2;3,4].
[1 2; 3 4] on the other hand, concatenates two row vectors vertically, so this has a clear meaning. It can't be equivalent to [1;2;;3;4], since that has 1 and 2 lying along a _column_ not a row.
A 3D tensor can't be [1 2;3 4;;5 6;7 8], since it only has ";;" while concatenation along the 3rd dimension must be ";;;". The notation [1 2;3 4;;; 5 6;7 8] works for this, but mixing whitespace notation and ";" notation is confusing.
So, clearly this is all a bit complicated, but it is a solution to a somewhat complicated problem, where you both need to allow new, consistent, notation, while simultaneously keeping the historical notation, which is in fact better in the most common (lower-dimensional) cases.
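The valid cases discussed above can be checked directly. This sketch needs Julia 1.7 or later for the multi-semicolon syntax; the variable names are just for illustration, and the forms described above as meaningless are deliberately left out.

```julia
# "," and ";" both give a 4-element vector.
v_comma = [1, 2, 3, 4]
v_semi = [1; 2; 3; 4]
println(v_comma == v_semi)

# Whitespace writes rows; ";;" places columns side by side.
rows = [1 2; 3 4]         # two row vectors stacked vertically
cols = [1; 2;; 3; 4]      # two column vectors placed side by side
println(cols == [1 3; 2 4])

# Pure semicolon notation, innermost dimension first.
m = [1; 2; 3;; 4; 5; 6]
println(size(m))          # (3, 2)

# ";;;" concatenates along the third dimension.
t = [1 2; 3 4;;; 5 6; 7 8]
println(size(t))          # (2, 2, 2)
println(t[:, :, 2] == [5 6; 7 8])
```

Note that `rows` and `cols` hold the same numbers in the same written order but are transposes of each other, which is the source of the confusion in the original question.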
Yeah, I looked up the manual and completely fail to understand how the syntax is supposed to be read and written.
I'd love to figure out where the disconnect is, and how we can make the manual more clear. It's a pretty simple rule: the number of semicolons specifies the dimension in which you "move". I've seen two disconnects, but yours might be different:
* Julia's arrays are column major but when you use spaces to write them, you do so in row major fashion. This new syntax enables a column major input: `[1 2; 3 4]` is equivalent to `[1; 3;; 2; 4]`.
* When you're using spaces, it might feel "funny" to jump from one semicolon (which concatenates the rows in the first dimension) to three semicolons (which concatenates the matrices in the third dimension) in an expression like `[1 2; 3 4;;; 5 6; 7 8]`, but the key is that spaces first build rows and then the semicolons concatenate them along a particular dimension.
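The first point in the list above is easy to verify in the REPL. A small sketch (Julia 1.7 or later), using `vec` to expose the order in which the elements are stored:

```julia
A = [1 2; 3 4]           # written row by row, using spaces
B = [1; 3;; 2; 4]        # written column by column, using semicolons only

println(A == B)          # true: the same 2x2 matrix

# Column-major storage: the first column comes first in memory.
memory_order = vec(A)
println(memory_order)    # [1, 3, 2, 4]
```

So the semicolon-only form lists the elements in the same order they are stored, while the space-separated form lists them in the order people usually read a matrix.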
Anyhow, if you can expand on what's causing trouble, it'd be great to figure out how to improve the description in the manual.
It's really hard to say in what way I don't understand something that I'm not sure I even understand.
I think one approach is to maybe fully explain the column major approach, and then introduce the row major approach, and lastly how those two interact.
> `[1 2; 3 4]` is equivalent to `[1; 3;; 2; 4]`.
is a hard-to-understand example.
Is the space equal to ;;? e.g., is [1 2; 3 4] the same as [1;;2;3;;4]?
The ;; is similar to the space, in that it separates elements in the second dimension. But they are not quite equivalent, because you cannot write [1;;2;3;;4]. With semicolons you have to start with the innermost dimension and work outwards. The space notation is there so that you can write the array in row-order, since that is very common.
There has got to be a way to sprinkle emojis into the documentation:
https://github.com/under-Peter/OMEinsum.jl#learn-by-examples
Agreed, I also found that rather confusing.
This is a really nice write up, hope we get one of these at least once every few releases.
The multi-; syntax is something that both looks weird at first glance but is also _really_ convenient and satisfying in its consistency. The weirdness factor will likely go down as we get used to seeing this new construct, while the convenience/ease-of-use goes up - so overall a solid positive to the language.
We've done such release highlights for the last 3 releases, and certainly hope to continue!
https://julialang.org/blog/2021/11/julia-1.7-highlights/
https://julialang.org/blog/2021/03/julia-1.6-highlights/
https://julialang.org/blog/2020/08/julia-1.5-highlights/
Maybe this is the right forum to ask... Why is the debug system in Julia so terribly slow? It seems to me that Debug.jl (or whatever runs in VSCode) interprets the code rather than running it. The result is that for me debugging is just unusable.
The standard way to put breakpoints in an executable is to replace the instruction at which to stop with INT3 (or something analogous in other architectures). Then give the system a callback for your debugger when the CPU receives the interrupt.
Is there a way to make Julia's debugger do that?
We had a debugger like that a few years ago, but the experience was unsatisfying for people, because you got the "debugging optimized C++ code" experience with unreliable breakpoints and mostly unavailable debug variables. I took the decision to scrap that and instead put out something simple that's slow but robust and reliable. The plan was always to then use the JIT on top of that to create a "debug-specialized" (using statepoints for local variables rather than DWARF) version of the running code, which should give you perfect debuggability at minimal runtime cost, but it's a fair amount of work that nobody has wanted to do yet.
In general, traditional debugging has always taken a bit of a backseat in Julia, because people's code is usually decently functional, so they just run it in a Revise loop and write their state dumps directly into the code (you could deride that as printf debugging, but I think it has a bit of a bad rap, particularly in a live-reloading system, where you basically get an execution log of your revising expression on every file update).
There are still cases where a traditional debugger is useful, so I'm hoping someone will take that on at some point, but so far there've been higher priorities elsewhere.
Also do note that you can switch the debugger into compiled mode, which will be faster, but ignore breakpoints.
Thanks for this explanation.
One thing I've never quite grokked: if the debugger ignores breakpoints in compiled mode, how is running in debug compiled mode different than just...running the code normally?
> how is running in debug compiled mode different than just...running the code normally?
It's not. By default, the debugger will recursively interpret any nested function calls until it encounters any intrinsics, since breakpoints can be set inside any functions. Compiled mode means the debugger won't do that for functions in a certain module (e.g. Base) and instead invoke the function like you normally would, so breakpoints either inside those functions or -- if functions from other modules are called from inside the compiled functions -- also breakpoints inside those will be ignored.
The problem is that when you compile code you can have things like constant propagation make it so that your breakpoint just isn't in the compiled code. Finding the right balance to make it look like your code is running the way you expect is really hard.
Yes, it uses Debugger.jl, which relies on JuliaInterpreter.jl under the hood, so while you can tell the debugger to compile functions in certain modules, it will mostly interpret your code.
You might be interested in
https://github.com/JuliaDebug/Infiltrator.jl
, which uses an approach more similar to what you describe.
The other part of the answer is that currently JuliaInterpreter is really slow because it is a very naive interpreter. Speeding it up by a factor of 5 or so should be relatively easy if anyone wants to try.
I would not go as far as calling it very naive, there has certainly been some work put into optimizing performance within the current design.
There are probably some gains to be had by using a different storage format for the IR though as proposed in [1], but it is difficult to say how much of a difference that will make in practice.
[1]
https://github.com/JuliaDebug/JuliaInterpreter.jl/pull/309
I'm similarly disappointed in Debugger.jl, but I find that Infiltrator.jl often helps me get where I need to go for intra-function problems.
IMO, the best part about this is that 1.6 is officially the new LTS. Hopefully this finally ends people trying to use 1.0.x which at this point is really sub-par.
It's true that 1.0 feels really old at this point. We're also trying to improve the messaging that people should generally _not_ be using the LTS unless they work in a really deeply risk-averse organization. Almost everyone should just use the latest release. That messaging should hopefully make it matter less which release is the LTS. It just shouldn't matter to most people.
Been really pleased with Julia and happy about the continued progress. Package speedups on Windows are especially nice for me in this release.
Likewise. The core language is pretty amazing and this steady stream of improvements is very impressive and reassuring. Being able to easily install, run, and combine bleeding-edge research tools is fantastic.
I'm really enjoying exploring the probabilistic-programming corner of the Juliaverse and finding it much smoother to get up and running with than Python/R tooling.
"Package speedups on Windows are especially nice for me in this release."
This is huge! Despite my best efforts Julia has been practically unusable on Windows. There are lots of people at work who could probably replace Matlab with Julia, but this has been a complete showstopper.
"Julia v1.7 is also the first release which runs on Apple Silicon, for example the M1 family of ARM CPUs. "
I hope those benchmarks are coming in hot
>I hope those benchmarks are coming in hot
M1 is extremely good for PDEs because of its large cache lines.
https://github.com/SciML/DiffEqOperators.jl/issues/407#issue...
The JuliaSIMD tools which are internally used for BLAS instead of OpenBLAS and MKL (because they tend to outperform standard BLAS implementations for the operations we use
https://github.com/YingboMa/RecursiveFactorization.jl/pull/2...
) also generate good code for M1, so that was giving us some powerful use cases right off the bat even before the heroics allowed C/Fortran compilers to fully work on M1.
Still Tier 3 support, but hopefully it'll be Tier 1 very soon :)
(As far as I know, the community is really working on it! And I'm phenomenally excited)
I'm extremely excited about this. But there are still many problems, such as a ton of failing tests. For example:
https://github.com/JuliaLang/julia/issues/43164
.
Yeah, please file bugs if you find them!
I love how the Xoshiro PRNG Family is replacing Mersenne Twister more and more.
This release was a long time coming. Very glad it's now here!
So much hard work from the whole Julia community, it's great to see the release go live!
I used to love Julia, but it increasingly makes this Koan make more and more sense:
A martial arts student went to his teacher and said earnestly, “I am devoted to studying your martial system. How long will it take me to master it?” The teacher’s reply was casual, “Ten years.” Impatiently, the student answered, “But I want to master it faster than that. I will work very hard. I will practice every day, ten or more hours a day if I have to. How long will it take then?” The teacher thought for a moment, “20 years.”
(originally seen in the context of this article:
https://brianlui.dog/2020/05/10/beware-of-tight-feedback-loo...
)
In this analogy, is Julia the student? Or are you the student and Julia the martial art?
I'd be curious to hear more specific critiques if you don't mind
"We hope to be back in a few months to report on even more progress in version 1.8!"
1.8rc1 ?
We're a ways away from 1.8rc1, but we probably will have a feature freeze for 1.8 soonish (next month or so). Hopefully, 1.8 takes less time to release than 1.7 did.
Lots of nice goodies, quite interesting to follow on Julia's development.
time to give it another look!
try doing this year of Advent of Code in Julia!
Yes! Julia is quite elegant in AoC.
I had a blast doing it in Pluto last year:
https://github.com/fonsp/Pluto.jl
I’m doing the same this year. AoC in Julia and Pluto.
I tried it last year using JupyterLab.
I found Julia error messages so unhelpful that eventually I gave up and returned to Python.
The type-based error messages can be pretty opaque until you have a good grasp of the type system, so those can be harsh/seem unhelpful when you're trying to learn the language (and in this case solve daily problems with time constraints too). Were those the problem or did you have some other examples in mind?
i seem to be the last person in the world to prefer c-style syntax. but so much cool stuff happening in julia that it seems silly to avoid diving in on such a basic semantic nit.
I think it's an unfortunate result of people complaining about significant whitespace in Python.
Significant whitespace in Python is annoying indeed. Although that is specific to Python (because of the mixture of being statement-based, not having a strongly enforced type system, and the whitespace).
In f# I actually really like the significant whitespace.
I don't understand how it's _ever_ an actual problem and not just people being lazy / literally just starting out learning the language. Every Python editor / IDE is capable of, and immediately will, notify you of any formatting issues of the sort. Python also has great, trivial to use autoformatters like black[1] that essentially all production code should use and which ensure that you never, ever have to think about this again.
What realistic scenario is there where it actually makes sense deviating from the standard the language is trying to enforce? And even if you can think of one (I honestly cannot), is now having to write an “end” keyword all over your code really a tradeoff worth making?
[1]
My problem has been that I sometimes lose indentation, by accident, and this can break or, worse, alter the behavior. It is also a nuisance when I want to copy-paste code from somewhere else, and it doesn't go smoothly.
Once I needed to incorporate code from a colleague who used 2 spaces for indentation. That was a nightmare.
You're simply admitting to not using a formatter.
Either Black or PyCharm's Ctrl+Alt+L format should be able to handle that scenario. You can enable it to run automatically on file save in (n)vim, VSCode, or PyCharm. Or as a git hook. Or in a CI/CD pipeline.
Seriously, simply installing black through your favourite package manager and running `black .` in your project directory takes like 10 seconds flat. Done, you never have to think about tabs / 2 spaces / 4 spaces ever again.
Please use these tools, I promise both your own and your colleagues' lives will be greatly improved, and it takes no effort at all to use them.
If you say so. I've never heard of either of those, and have only the vaguest idea of what a 'formatter' might be. Is it not a solution to an unnecessary problem? And how can a formatter fix a lost indentation that changes the meaning of the code?
I do know for sure that my colleague will never use this, even if I might.
Fortunately, this isn't an issue in the languages I use most of the time.
(Another problem that's bothering me almost as much is that without brackets or `end` statements, the code looks asymmetrical - unbalanced and awkward, as if it's trailing off, having forgotten its purpose.)
> Is it not a solution to an unnecessary problem?
You literally stated the problem yourself, so obviously not.
Seeing how it’s hosted under the Python Software Foundation’s GitHub organization, that should underline how critical it is.
> I do know for sure that my colleague will never use this, even if I might.
You gain value even if you’re the only one using it, as it solves the issue you mentioned having.
What? I mean, it's a solution to an unnecessary problem caused by indentation being significant.
The problem wouldn't exist if whitespace wasn't significant, which was an unfortunate design choice. So it's a solution to an unnecessary problem.
f# has significant whitespace as well, yet I don't think it suffers from it at all.
In Python, if you have something like this:
while (cond)
    if (someothercond)
        //code
    else
        //morecode
    somefunction()
    //othercode
it can be unclear if "somefunction()" belongs in the else or not. This is IMO due to python being statement based.
While in f# the if evaluates to a value, so the type of it would more likely than not change and thus throw out a visible error. Hence me calling out the mixture specifically.
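To make the Python half of that concrete, here is a minimal sketch. The function names and the even/odd condition are made up for illustration and are not from the comment above. The two functions differ only in the indentation of one line, and both are valid Python, so nothing flags the difference:

```python
def call_inside_else(values):
    """The trailing append is indented into the else body, so it runs only on that branch."""
    log = []
    for v in values:
        if v % 2 == 0:
            log.append(("even", v))
        else:
            log.append(("odd", v))
            log.append(("call", v))  # belongs to the else branch
    return log


def call_after_if(values):
    """The same line dedented one level: it now runs on every loop iteration."""
    log = []
    for v in values:
        if v % 2 == 0:
            log.append(("even", v))
        else:
            log.append(("odd", v))
        log.append(("call", v))  # belongs to the loop body
    return log


if __name__ == "__main__":
    print(call_inside_else([1, 2]))  # "call" is logged for the odd value only
    print(call_after_if([1, 2]))     # "call" is logged for both values
```

Because `if` is a statement here and not an expression with a type, the interpreter accepts either layout silently, which is the situation the comment describes.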
You make a fair point, I guess I’m just used to it now. My IDE (PyCharm) adds indentation lines, so it’s never been particularly confusing for me.
I do think F# is very interesting and quite pretty, though I’ve never used it.
julia> using Pkg
julia> Pkg.update()
Alternatively: Type the ']' key (to enter the package management mode) and then type 'up' (short for update) to update all packages.
For more info on this:
https://pkgdocs.julialang.org/v1/repl/
Whatever "Julia" is.
Another douchily obscure HN post.