I've been using Copilot for 3 weeks in a typical web project: TypeScript, React, CSS. Feedback:
- When I declare a variable or function with a descriptive name, it correctly autocompletes what I want about 50% of the time, and it also takes the context into account; sometimes it just blows my mind how accurate it can be.
- Sometimes it doesn't autocomplete at all, even if I delete the line and start over.
- I changed the way I code: I rely on Copilot more and more. I start typing my code and I know it will correctly autocomplete my intention, and if it doesn't, it will a couple of lines later. If I stop to think, it goes ahead of me.
- It's now faster for me to get an answer from Copilot than searching on Stack Overflow.
Copilot is a great companion, please make it even better! Welcome to the future, my friends.
I have an honest question: do you feel you are saving time? And if so, is that actually the case?
Having clever code proposals is surely a big advantage and seems to be a big time saver.
But it also means that you have to constantly switch between writing your code and reading/analyzing code that you didn't write yourself, to check if it's correct, if it handles all the cases, etc.
These situations, where we keep switching from one task to another, are typically the kind of situations where we have a very poor perception of how much time has passed.
It would be interesting to investigate whether this actually saves time or not.
I feel it definitely saves time. If you can write a decent comment and the function name is descriptive, that's all you need; the functions write themselves. It's better if you follow the single-responsibility principle, or at least don't have too many side effects in your methods.
Oftentimes the comment isn't even needed and a method name alone is enough to get it churning out the right code. I might change a variable name and some minor stuff, but I don't have to search for the little things I normally forget... like framework syntax sugar I've used before but can't remember off the top of my head, though when I see it I know, oh yeah, that looks right...
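To make that concrete, here's a made-up illustration (not actual Copilot output): you type a descriptive name and a one-line docstring by hand, and the body is the sort of thing that gets filled in.

    # Typed by hand: the name and the docstring.
    # The body is the part a tool like Copilot tends to fill in.
    def group_orders_by_customer(orders):
        """Return a dict mapping customer_id to that customer's orders."""
        grouped = {}
        for order in orders:
            grouped.setdefault(order["customer_id"], []).append(order)
        return grouped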
The most insane part is that it even autocompletes comments. It's honestly scary sometimes.
Comment autocomplete is very neat. I use copilot for comments more than for code.
What kind of comment is the result? More “what” or more “why”?
Stuff like "This is " becomes "This is only temporary until we get more data".
Which sometimes is exactly what you wanted to say...
Copilot may be the unexpected solution to the holy grail of code reuse. :)
And bug re-use is now 50% faster! Just kidding… mostly.
Yes, I love it too. It takes about 2-3 seconds to generate the suggestions, so even when I know what I'm going to type next I just wait for it to do it for me, since I know there is a high chance it will get it right (I'm talking 5 to 6 lines of code). Plus it acts as reinforcing feedback that what I want to write is indeed correct.
It is definitely the future and it's here to stay and get better with time. I'm super excited!
I love it. It removes the more tedious parts of my work and I've found it to give me time to think about the overall system.
I'm wondering what it will cost me (read: my employer) once it is out of beta.
I'm willing to bet that it's gonna be a fair bit.
There are open versions of GPT so presumably, one could train an open version of Codex. It would be really terrible if Codex was made inaccessible to swaths of people in the name of commercial interests.
GitHub sent OpenAI something like 57 terabytes of data. Good luck scraping that.
(I helped build The Pile, the largest openly-available text dataset.)
You're right that you theoretically can do this, but doing it in practice requires either funding or time.
Yeah, I thought of mentioning that but wasn't sure how far into the weeds anyone would want to go, lol. Besides, I'm optimistic about what an enterprising individual is capable of when faced with those sorts of limits... It's those clear bounds that set creativity free.
By the way, I'm so fucking stoked that Shawn Presser of The Pile responded to me. Your work is proto-solarpunk incarnate. Really amazing contributions dude, can't wait to see what's next.
I'm really happy to hear that. Thank you.
When I started out, I only wanted to make some small contribution somewhere. It's really surreal that there are people rooting for me now. I'll do my best to continue to contribute in ways that I can.
You can too, by the way. There's not a lot of difference between me and you. I believe in you.
I was literally sitting here trying to stop the waves of sadness I'm feeling from not meeting my own expectations. Bumping into a kindred spirit that's getting shit done really helps. Thank you for the nudge.
No stress, friend :) Remember, small contributions really matter! You can do it!
Hello! I'm the founder of Denigma, an AI that explains code. It's available now, with a free demo and a VS Code extension, and an Emacs extension coming soon.
Denigma is a product that has been around for a while.
Denigma goes beyond a literal line-by-line explanation: it explains programming concepts and deduces the business logic and goal of the code.
I'm excited to see innovation in the field.
Here's a sample explanation from Denigma:
The code starts by initializing the trampoline. Then it checks to see if there is a passed_info structure in memory, and if so, copies the struct into the passed_info variable. Next, it sets up some variables for use later on:
- info_struct - The smp_information struct that will be used throughout this function
- longmode - Indicates whether or not we are running in 64 bit mode
- lv5 - Indicates whether or not we are running with L1V5 support enabled (if you're unsure what this means google "Intel VT-x" and "L1D cache")
- pagemap - The address of our page table entry array (this is where all of our virtual addresses go)
- x2apic - Indicates whether or not we are using an Intel X2APIC controller
The code is a function that is called when the kernel is booted. The function checks to see if the kernel was booted in long mode or not, and if it was, it sets up a trampoline to call the original kernel.
We're small, privacy-focused, and bootstrapped, and don't have the marketing budget of GitHub, so we'd appreciate you talking about Denigma in your communities.
Good on you for editing your comment. I was about to say you probably shouldn't say that :-) I just checked out Denigma, and it's quite wonderful actually. I may give it a try - the pricing is pretty good, less than a cup of coffee where I'm at.
Nice job.
I encourage everyone to try it on real-world (non-esoteric, non-homework) code from production. That's where it works best. :-)
I actually did! It worked a lot better than GitHub's Labs.
Very cool product, I'm curious if you have an offline version that an individual developer could pay for. At my Fortune-100 employer I read a ton of code that is often opaquely written and stored in a private git repository. Unfortunately it's a huge hurdle to approve a SaaS app which receives any data or code from our work environments.
A self-hosted version will be developed, but it will likely take a few months at minimum.
The hardware requirements (GPU) might be hefty. I'll try to see if I can optimize the size of the model to something smaller.
Here is a sample I just tried --
Very amusing that the explanation goes to great lengths to explain some nuance that doesn't exist ...and gets it completely wrong?
Is this because it was trained on some similar but different code that had this kind of nuance explained in comments?
Hmmmm, if it's not "too long to fit on one line and so instead it's written as 12", what is "setLong(12)"?
Where does this actually run?
Can I run this on-prem?
Can it search through a whole project and tell me what I'm doing?
Loving this!
Most (all?) of the commenters here seemed to miss the entire point of this feature:
"Create custom prompts
We provide a few preset prompts to get you started: three that explain what a particular block of code does, and another that generates example code for calling a function.
You can customize the prompt and stop sequence of a query in order to come up with new applications that use Codex to interpret code. Creating these can feel like more of an art than a science! Small changes in the formulation of the prompt and stop sequence can produce very different results. The three different “explain” examples showcase strategies that tend to produce useful responses from the model, but this is uncharted territory! We’re excited to see what you use this for."
This isn't for explaining code. That's just an example. This lets us create custom prompts for the OpenAI API using the codex model.
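For anyone wondering what that looks like in practice, here's a rough sketch against the OpenAI Python client of that era; the engine name, prompt format, and stop sequences here are assumptions for illustration, not the presets the extension ships with.

    import openai  # pip install openai (pre-1.0 style client)

    openai.api_key = "sk-..."  # your API key

    # A hypothetical custom prompt: ask Codex to explain a snippet in plain English.
    snippet = "items.sort(key=lambda x: (x.priority, -x.created_at))"
    prompt = (
        "# Code:\n"
        f"{snippet}\n"
        "# Plain-English explanation of what the code above does:\n#"
    )

    response = openai.Completion.create(
        engine="davinci-codex",    # assumed Codex engine name; availability varies
        prompt=prompt,
        max_tokens=80,
        temperature=0,
        stop=["# Code:", "\n\n"],  # stop sequences keep the model from rambling on
    )

    print(response.choices[0].text.strip())

Small tweaks to the prompt header or the stop list change the output a lot, which is exactly the "more of an art than a science" part the post mentions.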
When I first used Copilot I thought it wasn't very useful. But as I've used it more, I realize it helps a lot. A surprising amount of code is just boilerplate; e.g. a single if-statement is 3 lines, but you're not going to abstract away all your if statements, so you write similar if statements over and over again.
It especially helps when I'm programming in a language or library I'm not familiar with, because I don't quite understand the syntax but Copilot does.
This is exactly what concerns me about Copilot. Reducing the barrier to writing boilerplate will ensure a lot more of it gets written. Copilot will probably have a tendency to balloon SLOC, since its autocomplete means adding an extra 35 lines is just a click away.
Of course not everybody will use it this way and responsibly used it'll be fine yadda yadda yadda but it is going to shape developer behavior - most especially for developers being a bit lazy (which is all of us sometimes and some of us all of the time).
Also, SLOC _implies_ productivity, which will put a damper on people's reluctance to just add those 35 lines.
Somebody is going to have to dig through all of that code one day looking for obscure bugs. I suspect the difficulty of finding bugs is correlated with the square of SLOC (as opposed to writing, which is probably closer to linear), so on the other end of the code lifecycle Copilot could end up being a massive false economy.
This could be mitigated by extra-vigilant code reviews, but I've noticed that vigilance in code reviews is inversely correlated with PR length (there's a cartoon about this somewhere), so I'm slightly pessimistic it'll stem the boilerplate tsunami.
I'm also worried about needing to review boilerplate. I find it tedious to go through, and it makes mistakes harder to spot. As Yaron Minsky put it:
> "You can’t pay people enough to carefully debug boring boilerplate code. I’ve tried."
I've been finding it amazingly useful from the outset. It's like it reads my mind, generating whole correct functions from just a small comment or example, and it's great at extrapolating repetitive increments/transformations. Write a small comment telling it what you want and it's uncanny what it can spit out.
I've had the opposite experience so far, though I've admittedly not used it _that_ much yet. I've got this Python fiscal date library at work, and I wanted to add something like a `strftime()` method which would replace $P with the period, $Q with the quarter, etc., but treat $$ as a literal $ (so that e.g. $$P would result in $P). Simple enough, but I thought it would be cute to see if I could get Copilot to do it for me.
I spent at least half an hour trying to figure out what kind of prompt would result in a correct implementation. No matter how explicit I was, it kept giving answers that failed to handle the $$P -> $P conversion correctly. Finally I was able to get it to spit out an answer that at least had the right idea, but it had an unrelated bug in it.
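For reference, this is roughly the kind of method I was after; a minimal hand-written sketch where the `FiscalDate` class and its fields are hypothetical, not the code Copilot produced:

    import re

    class FiscalDate:
        """Hypothetical fiscal date with a period and a quarter."""

        def __init__(self, period: int, quarter: int):
            self.period = period
            self.quarter = quarter

        def format(self, template: str) -> str:
            """Replace $P with the period and $Q with the quarter, treating
            $$ as an escaped literal dollar sign (so $$P comes out as $P)."""
            def replace(match):
                token = match.group(0)
                if token == "$$":
                    return "$"
                return {"$P": str(self.period), "$Q": str(self.quarter)}[token]

            # $$ is listed first so escapes take precedence over substitutions.
            return re.sub(r"\$\$|\$P|\$Q", replace, template)

    # FiscalDate(3, 1).format("Period $P of Q$Q, literal: $$P")
    # -> "Period 3 of Q1, literal: $P"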
The package also has some functions to determine holidays, and I was impressed that Copilot could generate code for new holidays that mimicked my style, docstrings and all. But disappointingly it failed for any slightly obscure holiday (say, Eid al-Fitr), generating code that was just wrong, even when I hinted that it would have to use an alternative calendar library. It kept trying to use the Hebrew calendar for Islamic and Hindu holidays, I guess because I'd already imported it for Hanukkah.
It also seems useless for Spark ETL; it needs to be hand-held to such a degree that it would be easier to just write the code myself. My first attempt to play around with Copilot was to see if it could do a salted join (no luck, unless by "salted join" I meant something like "a join on a column called salt").
I'll keep playing around with it, maybe it's just a "me" problem.
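For anyone who hasn't run into the term, a salted join is roughly the pattern below: a hand-written PySpark sketch with made-up table paths and column names, not anything Copilot generated.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Made-up inputs: "events" is heavily skewed on user_id, "users" is the other side.
    events = spark.read.parquet("/data/events")  # columns: user_id, ...
    users = spark.read.parquet("/data/users")    # columns: user_id, ...

    NUM_SALTS = 8

    # Add a random salt to the skewed side so one hot key is spread over many partitions.
    events_salted = events.withColumn("salt", (F.rand() * NUM_SALTS).cast("int"))

    # Duplicate the other side once per salt value so every (key, salt) pair still matches.
    salts = spark.range(NUM_SALTS).select(F.col("id").cast("int").alias("salt"))
    users_salted = users.crossJoin(salts)

    joined = events_salted.join(users_salted, on=["user_id", "salt"]).drop("salt")

The idea is just to spread a single hot key across NUM_SALTS partitions, at the cost of duplicating the smaller side.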
I noticed that for very specific things like this, it doesn't always work. However, if you had several functions which needed slight variations of the same thing, then writing the first one manually would allow it to automatically guess the rest of them.
Maybe it's better at Java and Javascript than Python at the moment.
Have you tried it on test cases at all?
What exactly are you asking it to do?
I've found the same. It comes in particularly nicely when I add some kind of parsing in and out of a data structure and it auto-fills all the parameters.
Hello! I'm one of the founders of Cogram, a coding assistant for data scientists and ML practitioners. Cogram brings Copilot-like features to Jupyter Notebook.
You can use Cogram by writing ordinary code (and Cogram will complete it), or by writing comments (in which case Cogram completes the code that matches the comment). We've also launched a web app that generates SQL statements from descriptions in simple language.
We're just starting out and would love it if you could try it out, give feedback, and spread the word if you like. It doesn't cost anything -- we have a generous free tier (1000 completions / month).
I'm a fan of Copilot, but as given in their own example here (and gently mocked in the top response), the "explanations" generated seem to be no more useful than the archetypal _bad comments_ normally given as examples of how _not_ to comment code, in that they simply translate the statements "word-for-token" _without_ explanation or insight.
Nevertheless, given Copilot's success so far, hopefully the output will quickly improve.
We've had Code Explanation (based on Codex) [1] in production for over a month, and while it leaves a lot to be desired, it was surprising that 75% of users thought the explanations were useful to them.
[1]:
What's the use-case for this? Hopefully not for commenting code.
This only seems to be able to explain syntax, which seems pretty useless for anyone who isn't new to the language they're using. Am I missing something?
The use case is exactly what is stated on the box. You select a bit of code in a language/syntax that may be unfamiliar to you, and it tries to explain it better in English.
The point of code comments isn't to explain what the code does, unless you're writing a tutorial. The point is usually to explain the intention of the author. If a person knows the language, they don't need an explanation of what the code does.
I should try it with decompiled APKs in Smali.
I'm wondering the same thing. There is one example in the discussion where explaining syntax seems useful, a regular expression. There's a bunch of existing solutions for that, and the explanation generated by Copilot is wrong.
It's a pretty straightforward, understandable task - which then leads you towards making your own tweaks to it / writing something new.
It may also just be an interesting attempt at something that could be great depending on how well it worked - it's an experiment. A general "explain to me what this code does" that was as good as a human explaining overall what is going on could be very useful in more complicated parts of a codebase.
My final thought is that being new to a language and needing to dig around in a codebase is something I do quite a lot and so it could be useful even if it's quite basic.
My favorite Copilot moment recently is Rails migrations. I write the up migration, Copilot gets the down migration right 90% of the time. It's wild.
And these aren't vanilla "change" commands. It's triggers and check constraints and changing column types and stuff.
Not using Copilot yet; I would appreciate it if you could provide some real examples of what you typed and what Copilot suggested.
When I type:
    def up
      execute <<-SQL
        alter table users
        add constraint active_users_must_have_email
        check (active = false OR email is not null)
        NOT VALID
      SQL
i.e. I am adding a check constraint but making it NOT VALID so it won't lock the table while scanning
If I continue and write
    execute <<-SQL
    a
then it completes it with
alter table users
And then the next line I type
v
and it completes it with
    validate constraint active_users_must_have_email
    SQL
I then type
end
and it suggests:
    def down
      execute <<-SQL
        ALTER TABLE users
        DROP CONSTRAINT IF EXISTS active_users_must_have_email;
      SQL
    end
Which is exactly what I needed.
What impresses me is that a) it recognized that a NOT VALID constraint needs to be VALIDATEd, and b) it figured out what the down migration was.
Having used Copilot for only a couple of hours and found it annoying... this might be a bit less obstructive. Though contrary to the commenter, the samples in that thread didn't seem all that great to me. I'll try it out though, and since it's in the sidebar it'll hopefully be less distracting.
A quick test is actually a bit impressive.
I've been using Copilot these past days and it's kind of mind-blowing. I was doing some Rails tests, and after writing the factories it suggested exactly the right expectations. Did not expect that, kind of scary. Same for suggesting while commenting code. Wtf.
I'd love to keep using it, but I'm way too used to Sublime Text to change now (and I tried to customize VS Code). I know it's a long shot, but it'd be great if they released an extension for Sublime.
Neat! The explain feature is exactly what I had hoped to see built:
https://news.ycombinator.com/item?id=27813639
I made the first comment on that issue: tbh it does work better than expected on intentionally bad code, but not perfect. It could be useful to speed up documentation though.
Tried Copilot and gave up on it; it was only helping out where it didn't matter, the rare trivial parts. After a few mistakes I concluded it was dragging me down. Maybe I'm a hard customer for it, even though I work in ML and have been following it closely ever since it started.
Oh wonderful. Can I finally have something to explain non trivial Nix expressions to me? Love the ideas behind Nix, loathe the syntax.
Is this the same backend that is powering denigma.app?
Almost certainly not
---
That was also the first time I had heard of denigma.app and their tagline of
_We stress-tested it on the worst, most obscure code we could find.
That's why we're confident it will work on your complex codebase._
gave me a good chuckle since just trivially modifying their example ... _sort of_ worked
Given:
RUN apt-get update && exit 1 && apt-get install -y libnss3 libgtk-3-0 libx11-xcb1 libxss1 libasound2
It produced:
- Next, apt-get is used to update the system and exit 1 is used to stop any further commands from running.
- Then libnss3, libgtk-3-0, libx11-xcb1, libxss1, and libasound2 are installed using apt-get.
For good fun, I tried another small edit
RUN exit 1 && apt-get update && apt-get install -y libnss3 libgtk-3-0 libx11-xcb1 libxss1 libasound2
With an even _better_ explanation:
- The code then runs an exit command and uses apt-get to update its package list before installing libnss3, libgtk-3-0, libx11-xcb1, libxss1, and asound2.
I think having it produce good output on nonsensical / obviously wrong code would be a very different target than explaining commonly found code, so your expectation is a little unreasonable at this point.
Well, they're the ones that said "worst" and "most obscure," that wasn't me paraphrasing
But my heartburn is the intersection of these two bullet points:
"and exit 1 is used to stop any further commands from running", followed by _any other claim_
> obviously wrong code
Based on my experience with code reviews, I'd bet $5 I could put (or leave) "&& exit 1 &&" in the middle of a sea of RUN commands and it'd go unnoticed. So I guess it depends on who the target audience is for both this and the absolutely laughable Labs output: people who don't want to read what an if statement does, or people looking for the _meaning_ of the code?
I think the explanation is just as correct as the code.
Garbage in -> garbage out
Denigma works very poorly on short code, especially single lines. It warns you.
It is designed for explaining real-world code in context, and performs poorly on esoteric examples.
In case it wasn't clear, I modified your DevOps example, so it's not like that code came out of left field or was hand-crafted to trip you up. If you want to say "it is not good at reasoning about shell, and instead focuses on programming languages," that would be a much more plausible concession.
Since you're hanging out here, what is the explanation for how it can say "exit 1 is used to stop any further commands from running." and then carry on?
It's even more bizarre that it (seems to?) understand what "exit 1" does, but only when it is not the first command in the sequence?
I'm currently working on a custom, state-of-the-art zero-knowledge-proof smart contract framework, and as I'm writing circuits using it, Copilot correctly suggests an insane number of lines to me. It feels like I'm cheating.
Perhaps your framework is not as state-of-the-art as you believe?
Why is it so difficult to conceive that what these models produce is novel, and capable of high level technical and semantic correctness?
Transformer models aren't auto-complete engines. They're semantic graphs describing the probabilities of token positions within a sequence of a few thousand tokens. They go deep, with a nuanced and hierarchical representation of concepts. The representation is functional, conditionally altering the behavior of tokens within its sequence context. Because of the hierarchical nature of the internal rules, when the model "learns" things, it extracts meta-patterns from the training data, and can chain a seemingly arbitrary number of such rules together to produce an output.
Auto-complete systems generally fail at the first level of abstraction. If transformers have sufficient vocabulary size and high quality data, they're capable of learning any degree of abstractions represented within the training corpus.
They're not magic, or intelligent like human brains, but they're allowing programmers to model very complex systems and produce novel material. This ranges from constructed languages to the rules of programming to storytelling and poetry to protein sequencing and more.
You should be in awe of GPT-3, or at the very least, far less dismissive. It represents a quantum leap in the power of software, and we're barely scratching the surface of what's possible. Recurrence and episodic memory, along with Moore's law and/or a substantial decrease in the size of the models, could be the thing that achieves human-level AGI that lives on consumer-grade devices.
I’ll take the bait. What makes you say that?
Because (and this serves as a reply to the sibling comment too) GPT-3 is _just_ a relatively simple construct pumped full of data and compute. After training it is relatively static, and while the output might appear novel in some cases, I think it is _far more likely_ not that GPT-3 is more imaginative than people suspect, but that people are _less imaginative_ than we expect.
I'm not knocking your work; it may well be the first (or close to the first) instance of these techniques being applied in this way. But in my understanding and opinion, the very fact that Copilot can output code that seems novel and groundbreaking suggests the opposite. Occam's razor.
This is impressive. Is it still beta and invite-only?
Edit: I am wrong, it's still beta/waitlist for Copilot.
Copilot? I believe it's open to anyone now. This feature? It's a nightly build, according to the post.
Kinda off-topic, but I wonder what life is like for code reviewers and freelancers' clients now that Copilot exists.